FramePack

Packing Input Frame Context for Video Generation

With an innovative next-frame prediction neural network architecture, FramePack continuously generates videos by compressing input frame context to a fixed length, making the generation workload independent of video length.


Video Diffusion That Feels Like Image Diffusion

FramePack employs a next-frame prediction neural network structure to generate videos continuously by compressing input context to a fixed length, enabling length-invariant generation.

  • Processes a very large number of frames with 13B models, even on laptop GPUs
  • Requires only 6 GB of GPU memory
  • Can be trained with a much larger batch size, similar to image diffusion training
  • Generates 1-minute, 30 fps videos (1,800 frames)
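
To make the fixed-length packing idea above concrete, here is a minimal, illustrative PyTorch sketch. It is not FramePack's actual implementation; it only shows how pooling older frames more and more aggressively keeps the total number of context tokens bounded no matter how long the video gets. All function names and numbers here are our own.

```python
import torch
import torch.nn.functional as F

def pack_frame_context(frame_latents, tokens_per_side=8, max_levels=4):
    """Toy sketch: compress past frames into a bounded-length token context.

    Recent frames keep the most spatial detail; older frames are pooled
    down further, and everything beyond `max_levels` collapses into a
    single averaged "memory" entry, so the context length stops growing
    with video length.
    """
    recent_first = list(reversed(frame_latents))     # frame_latents[-1] is newest
    packed = []
    for age, latent in enumerate(recent_first[:max_levels]):   # latent: (C, H, W)
        side = max(tokens_per_side >> age, 1)        # 8, 4, 2, 1 tokens per side
        pooled = F.adaptive_avg_pool2d(latent.unsqueeze(0), side).squeeze(0)
        packed.append(pooled.flatten(1).T)           # (side*side, C) tokens
    if len(recent_first) > max_levels:
        tail = torch.stack(recent_first[max_levels:]).mean(dim=0)
        pooled = F.adaptive_avg_pool2d(tail.unsqueeze(0), 1).squeeze(0)
        packed.append(pooled.flatten(1).T)           # one summary token
    return torch.cat(packed, dim=0)                  # at most 64+16+4+1+1 = 86 tokens

# The packed context has the same shape for a 10-frame and a 1000-frame history:
frames = [torch.randn(16, 64, 64) for _ in range(1000)]
print(pack_frame_context(frames).shape)              # torch.Size([86, 16])
```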

Key Features

Minimal Memory Requirements

Generate 60-second, 30fps (1800 frames) videos with a 13B model using only 6GB VRAM. Laptop GPUs can handle it easily.
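
Running a 13B model in 6 GB of VRAM implies that the weights are streamed through the GPU rather than kept resident all at once. FramePack ships its own memory manager; the snippet below is only a generic sketch of that block-swapping idea under our own names, not FramePack's code.

```python
import torch
from torch import nn

class BlockOffloader:
    """Generic sketch: keep all weights in system RAM and move one block at a
    time onto the GPU, so peak VRAM is roughly one block plus activations
    instead of the whole model. Slower per step, but fits far smaller GPUs.
    """

    def __init__(self, blocks, device="cuda"):
        self.blocks = [block.to("cpu") for block in blocks]
        self.device = device

    @torch.no_grad()
    def forward(self, x):
        x = x.to(self.device)
        for block in self.blocks:
            block.to(self.device)   # load just this block into VRAM
            x = block(x)
            block.to("cpu")         # release VRAM before the next block
        return x

# Example: eight large blocks processed sequentially on a small GPU
if torch.cuda.is_available():
    blocks = [nn.Sequential(nn.Linear(4096, 4096), nn.GELU()) for _ in range(8)]
    print(BlockOffloader(blocks).forward(torch.randn(1, 4096)).shape)
```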

Instant Visual Feedback

As a next-frame prediction model, you'll directly see the generated frames, getting plenty of visual feedback throughout the entire generation process.
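
Because frames come out of a next-frame prediction model section by section, a preview can be refreshed long before the full clip is finished. The sketch below shows that pattern in general terms: `generate_next_section` is a hypothetical stand-in for the model call, and writing MP4 with imageio assumes the imageio-ffmpeg backend is installed.

```python
import numpy as np
import imageio.v2 as imageio

def generate_next_section(history):
    """Hypothetical stand-in for the model: returns the next 30 frames."""
    return [np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8) for _ in range(30)]

frames = []
for section in range(4):
    frames.extend(generate_next_section(frames))
    # Re-write the preview after every section so progress is visible
    # long before the final frame is generated.
    with imageio.get_writer("preview.mp4", fps=30) as writer:
        for frame in frames:
            writer.append_data(frame)
    print(f"after section {section + 1}: preview.mp4 has {len(frames)} frames")
```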

Compressed Input Context

Compresses input contexts to a constant length, making generation workload invariant to video length and supporting ultra-long video generation.

Standalone Desktop Software

Ships as a feature-complete desktop application with a minimal, standalone high-quality sampling system and built-in memory management.

Amazing Demos

  • Anime (anime.mp4)
  • Girl2 (girl2.mp4)
  • Boy (boy.mp4)
  • Boy2 (boy2.mp4)
  • Girl3 (girl3.mp4)
  • Girl4 (girl4.mp4)
  • Foxpink (foxpink.mp4)
  • Girlflower (girlflower.mp4)
  • Girl (girl.mp4)

How It Works

  1. Installation & Setup

    Clone FramePack from GitHub and install all dependencies in your environment.

  2. Define Your Initial Frame

    Upload an image or generate one from a text prompt to start your video sequence.

  3. Create Motion Prompts

    Describe the desired movement and action in natural language to guide the video generation, for example: "The girl dances gracefully, with clear movements, full of charm."

  4. Generate & Review

    FramePack generates your video frame by frame with impressive temporal consistency. Download and share your results.

No credit card required. Start creating amazing videos today.

Get Started

### Manual Installation on Windows

1. Open Command Prompt in the folder where you want FramePack installed and clone the repository
   git clone https://github.com/lllyasviel/FramePack.git
   cd FramePack

2. Create and activate a Python virtual environment (Python 3.10 recommended)
   python -m venv venv
   venv\Scripts\activate.bat

3. Upgrade pip and install dependencies
   python -m pip install --upgrade pip
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
   pip install -r requirements.txt

4. Install Triton and Sage Attention
   pip install triton-windows
   pip install https://github.com/woct0rdho/SageAttention/releases/download/v2.1.1-windows/sageattention-2.1.1+cu126torch2.6.0-cp312-cp312-win_amd64.whl
   Note: this example wheel targets CUDA 12.6, PyTorch 2.6, and Python 3.12 (cp312); adjust the URL to match your CUDA and Python versions.

5. Optional: Install Flash Attention
   pip install packaging ninja
   set MAX_JOBS=4
   pip install flash-attn --no-build-isolation

6. Launch the Gradio UI
   python demo_gradio.py
   Then open http://localhost:7860 in your browser.

### Manual Installation on Linux

1. Clone the repository (as in step 1 above) and create an independent Python 3.10 environment (recommended)

2. Install dependencies
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
   pip install -r requirements.txt

3. Start the GUI
   python demo_gradio.py
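
After either install, it is worth confirming that the CUDA build of PyTorch is the one that got installed, since a CPU-only build is a common cause of extremely slow generation. A quick check, run inside the activated environment:

```python
import torch

print("torch", torch.__version__, "| CUDA", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
```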

Research Paper

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

FramePack is a revolutionary video generation technology that compresses input contexts to a constant length, making the generation workload invariant to video length. Learn about our methods, architecture, and experimental results in detail.