# ComfyUI-FlashVSR_Stable

**Repository Path**: sheldongchen/ComfyUI-FlashVSR_Stable

## Basic Information

- **Project Name**: ComfyUI-FlashVSR_Stable
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: GPL-3.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-04
- **Last Updated**: 2026-05-04

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# ComfyUI-FlashVSR_Stable

**High-performance Video Super Resolution for ComfyUI with VRAM optimization.**

Run FlashVSR on 8GB-24GB+ GPUs without artifacts. Features intelligent resource management, 5 VAE options, and auto-downloading models.

[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![ComfyUI](https://img.shields.io/badge/ComfyUI-Compatible-green.svg)](https://github.com/comfyanonymous/ComfyUI)

---

Registry Link: https://registry.comfy.org/publishers/naxci1/nodes/ComfyUI-FlashVSR_Stable

---

## ✨ Key Features

- **🎬 Video Super Resolution**: 2x or 4x upscaling using FlashVSR diffusion models
- **🧠 5 VAE Options**: Choose from Wan2.1, Wan2.2, LightVAE, TAE variants for optimal VRAM/quality trade-off
- **📊 Pre-Flight Resource Check**: Intelligent VRAM estimation with settings recommendations
- **⚡ Auto-Download**: Models download automatically from HuggingFace if missing
- **🛡️ OOM Protection**: Automatic recovery with progressive fallback (tiled VAE → tiled DiT → chunking)
- **🔧 Unified Pipeline**: All modes share optimized processing logic

---

## 📋 Quick Links

- [Changelog](CHANGELOG.md) - Full version history
- [Sample Workflow](./workflow/FlashVSR.json)
- [HuggingFace Models](https://huggingface.co/JunhaoZhuang/FlashVSR-v1.1)

---

## Performance & VRAM Optimization

This node is optimized for various hardware configurations. Here are some guidelines:

### VRAM Tiers & Settings

| VRAM | Mode | Tiling | Chunk Size | Precision | Notes |
| :--- | :--- | :--- | :--- | :--- | :--- |
| **24GB+** | `full` or `tiny` | Disabled | 0 (All) | `bf16`/`auto` | Max quality/speed. |
| **16GB** | `tiny` | `tiled_vae=True` | 0 or ~100 | `bf16`/`auto` | Enable `keep_models_on_cpu`. |
| **12GB** | `tiny` | `tiled_vae=True`, `tiled_dit=True` | ~50 | `fp16` | Use `sparse_sage` attention. |
| **8GB** | `tiny-long` | **Required** | ~20 | `fp16` | Must use tiling and chunking. |

### Performance Enhancements
- **Attention Mode**: Use `sparse_sage_attention` for the best balance of speed and memory. `flash_attention_2` is faster but requires specific hardware/installation.
- **Precision**: `bf16` (BFloat16) is recommended for RTX 3000/4000/5000 series. It is faster and preserves dynamic range better than `fp16`.
- **Chunking**: Use `frame_chunk_size` to process videos in segments. This moves processed frames to CPU RAM, preventing VRAM saturation on long clips.
- **Resize Input**: If the input video is large (e.g., 1080p), use the `resize_factor` parameter to reduce input size to `0.5x` before processing. This drastically reduces VRAM usage and allows for 4x upscaling of the resized result (net 2x output). For small videos, leave at `1.0`.

### Pre-Flight Resource Check (NEW)

Before processing, FlashVSR now performs an intelligent pre-flight check that:

1. **Estimates VRAM Requirements**: Calculates approximate VRAM needed based on resolution, frames, scale, and settings.
2. **Checks Available Resources**: Uses `torch.cuda.mem_get_info()` for accurate real-time VRAM availability.
3. **Provides Recommendations**: If OOM is predicted, suggests optimal settings.

Example console output:
```
============================================================
🔍 PRE-FLIGHT RESOURCE CHECK
💻 RAM: 15.4GB / 95.8GB
💾 VRAM Available: 14.2GB
📊 Estimated VRAM Required: 12.8GB
✅ Safe to proceed. Estimated ~12.8GB needed, 14.2GB available.
============================================================
```

If VRAM is insufficient:
```
⚠️ Current settings require ~18.5GB but only 8.0GB available.
💡 Recommended Optimal Settings:
  • chunk_size = 32
  • tiled_vae = True
  • tiled_dit = True
  • resize_factor = 0.6
```

---

## 🎨 VAE Model Selection

### VAE Type Comparison

| VAE Type | VRAM Usage | Speed | Quality | Best For |
| :--- | :--- | :--- | :--- | :--- |
| **Wan2.1** | 8-12 GB | Baseline | ⭐⭐⭐⭐⭐ | Maximum quality, 24GB+ VRAM |
| **Wan2.2** | 8-12 GB | Baseline | ⭐⭐⭐⭐⭐ | Improved normalization for Wan2.2 models |
| **LightVAE_W2.1** | 4-5 GB | 2-3x faster | ⭐⭐⭐⭐ | 8-16GB VRAM, speed priority |
| **TAE_W2.2** | 6-8 GB | 1.5x faster | ⭐⭐⭐⭐ | Temporal consistency priority |
| **LightTAE_HY1.5** | 3-4 GB | 3x faster | ⭐⭐⭐⭐ | HunyuanVideo compatible, minimum VRAM |

### VAE Selection Guide

| Your VRAM | Recommended VAE | Additional Settings |
| :--- | :--- | :--- |
| **8GB** | `LightTAE_HY1.5` or `LightVAE_W2.1` | `tiled_vae=True`, `tiled_dit=True`, `chunk_size=16` |
| **12GB** | `LightVAE_W2.1` or `Wan2.1` | `tiled_vae=True` |
| **16GB** | Any VAE | Optional tiling for long videos |
| **24GB+** | `Wan2.1` or `Wan2.2` | Maximum quality, no restrictions |

### Auto-Download

All VAE models auto-download from HuggingFace if not found locally:

| VAE Selection | File | Direct Download Link |
| :--- | :--- | :--- |
| **Wan2.1** | `Wan2.1_VAE.pth` | [Download](https://huggingface.co/lightx2v/Autoencoders/blob/main/Wan2.1_VAE.pth) |
| **Wan2.2** | `Wan2.2_VAE.pth` | [Download](https://huggingface.co/lightx2v/Autoencoders/blob/main/Wan2.2_VAE.pth) |
| **LightVAE_W2.1** | `lightvaew2_1.pth` | [Download](https://huggingface.co/lightx2v/Autoencoders/blob/main/lightvaew2_1.pth) |
| **TAE_W2.2** | `taew2_2.safetensors` | [Download](https://huggingface.co/lightx2v/Autoencoders/blob/main/taew2_2.safetensors) |
| **LightTAE_HY1.5** | `lighttaehy1_5.pth` | [Download](https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.pth) |

---

## 📖 Best Practices / Settings Guide

### Low VRAM (8-12GB) Configuration

```
Mode: tiny-long
VAE: LightVAE_W2.1 or LightTAE_HY1.5
Tiled VAE: ✅ Enabled
Tiled DiT: ✅ Enabled
Chunk Size: 16-32
Resize Factor: 0.5-0.8
Keep Models on CPU: ✅ Enabled
```

### Medium VRAM (16GB) Configuration

```
Mode: tiny
VAE: Wan2.1 or LightVAE_W2.1
Tiled VAE: ✅ Enabled
Tiled DiT: Optional
Chunk Size: 50-100
Resize Factor: 1.0
Keep Models on CPU: Optional
```

### High VRAM (24GB+) Configuration

```
Mode: full or tiny
VAE: Wan2.1 or Wan2.2
Tiled VAE: ❌ Disabled
Tiled DiT: ❌ Disabled
Chunk Size: 0 (all frames)
Resize Factor: 1.0
Keep Models on CPU: ❌ Disabled
```

### Processing Summary

At the end of each run, you'll see a summary:

```
============================================================
📊 PROCESSING SUMMARY
⏱️ Total Processing Time: 130.08s (1.54 FPS)
📥 Input Resolution: 276x206 (200 frames)
📤 Output Resolution: 552x412 (200 frames)
📈 Peak VRAM Used: 12.4 GB
============================================================
```

---

## 🔧 Node Parameters

Hover over any input in ComfyUI to see tooltips. Full parameter list:

| Parameter | Description |
| :--- | :--- |
| **model** | FlashVSR model version |
| **mode** | `tiny` (fast), `tiny-long` (lowest VRAM), `full` (highest quality) |
| **vae_model** | VAE architecture (5 options, auto-download) |
| **scale** | Upscaling factor: 2x or 4x |
| **color_fix** | Wavelet color transfer. Highly recommended. |
| **tiled_vae** | Spatial tiling for VAE. Reduces VRAM, slower. |
| **tiled_dit** | Spatial tiling for DiT. Required for 4K output. |
| **tile_size** | Tile dimensions. Smaller = less VRAM. |
| **overlap** | Tile overlap for seamless blending. |
| **unload_dit** | Unload DiT before VAE decode. |
| **frame_chunk_size** | Process N frames at a time. 0 = all. |
| **enable_debug** | Verbose console logging. |
| **keep_models_on_cpu** | Offload to system RAM when idle. |
| **resize_factor** | To first reduce the size of large videos and then enlarge them, use a range of (0.3-1.0). |
| **attention_mode** | Attention kernel: `sparse_sage`, `flash_attention_2`, `sdpa`, `block_sparse` |

---

## 💻 Command-Line Interface (CLI)

FlashVSR includes a full-featured CLI that mirrors all ComfyUI node parameters for standalone video upscaling.

### Quick Start

```bash
# Basic 2x upscale
python cli_main.py --input video.mp4 --output upscaled.mp4 --scale 2

# 4x upscale with tiling for lower VRAM
python cli_main.py --input video.mp4 --output upscaled.mp4 --scale 4 \
    --tiled_vae --tiled_dit --tile_size 256 --tile_overlap 24

# Long video with chunking to prevent OOM
python cli_main.py --input long_video.mp4 --output upscaled.mp4 \
    --frame_chunk_size 50 --mode tiny-long

# Low VRAM mode (8GB GPUs)
python cli_main.py --input video.mp4 --output upscaled.mp4 --scale 2 \
    --vae_model LightVAE_W2.1 --tiled_vae --tiled_dit \
    --frame_chunk_size 20 --resize_factor 0.5

# Custom models directory
python cli_main.py --input video.mp4 --output upscaled.mp4 \
    --models_dir /path/to/your/models
```

### CLI Arguments Reference

All arguments map 1:1 with ComfyUI node inputs. Run `python cli_main.py --help` for full details.

#### Required Arguments

| Argument | Description |
| :--- | :--- |
| `--input`, `-i` | Input video file path (e.g., `video.mp4`) |
| `--output`, `-o` | Output video file path (e.g., `upscaled.mp4`) |

#### Pipeline Initialization (from FlashVSRNodeInitPipe)

| Argument | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| `--model` | choice | `FlashVSR-v1.1` | Model version: `FlashVSR`, `FlashVSR-v1.1` |
| `--mode` | choice | `tiny` | Operation mode: `tiny`, `tiny-long`, `full` |
| `--vae_model` | choice | `Wan2.1` | VAE model: `Wan2.1`, `Wan2.2`, `LightVAE_W2.1`, `TAE_W2.2`, `LightTAE_HY1.5` |
| `--force_offload` | flag | `True` | Force offload models to CPU after execution |
| `--no_force_offload` | flag | - | Disable force offloading |
| `--precision` | choice | `auto` | Precision: `fp16`, `bf16`, `auto` |
| `--device` | string | `auto` | Device: `cuda:0`, `cuda:1`, `cpu`, `auto` |
| `--attention_mode` | choice | `sparse_sage_attention` | Attention: `sparse_sage_attention`, `block_sparse_attention`, `flash_attention_2`, `sdpa` |

#### Processing Parameters (from FlashVSRNodeAdv)

| Argument | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| `--scale` | int | `2` | Upscaling factor: `2` or `4` |
| `--color_fix` | flag | `True` | Apply wavelet-based color correction |
| `--no_color_fix` | flag | - | Disable color correction |
| `--tiled_vae` | flag | `False` | Enable spatial tiling for VAE decoder |
| `--tiled_dit` | flag | `False` | Enable spatial tiling for DiT |
| `--tile_size` | int | `256` | Tile size for DiT processing (32-1024) |
| `--tile_overlap` | int | `24` | Overlap pixels between tiles (8-512) |
| `--unload_dit` | flag | `False` | Unload DiT before VAE decoding |
| `--sparse_ratio` | float | `2.0` | Sparse attention control (1.5-2.0) |
| `--kv_ratio` | float | `3.0` | Key/Value cache ratio (1.0-3.0) |
| `--local_range` | int | `11` | Local attention window: `9` or `11` |
| `--seed` | int | `0` | Random seed for reproducibility |
| `--frame_chunk_size` | int | `0` | Process N frames at a time (0 = all) |
| `--enable_debug` | flag | `False` | Enable verbose logging |
| `--keep_models_on_cpu` | flag | `True` | Keep models in CPU RAM when idle |
| `--no_keep_models_on_cpu` | flag | - | Keep models in VRAM |
| `--resize_factor` | float | `1.0` | Resize input before processing (0.1-1.0) |

#### Video I/O Parameters

| Argument | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| `--fps` | float | input FPS | Output video FPS |
| `--codec` | string | `libx264` | Video codec: `libx264`, `libx265`, `h264_nvenc` |
| `--crf` | int | `18` | Quality (0-51, lower = better) |
| `--start_frame` | int | `0` | Start frame index (0-indexed) |
| `--end_frame` | int | `-1` | End frame index (-1 = all frames) |
| `--models_dir` | string | `./models` | Custom models directory path |

---

## 🚀 Installation

### Step 1: Install the Node

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/naxci1/ComfyUI-FlashVSR_Stable.git
python -m pip install -r ComfyUI-FlashVSR_Stable/requirements.txt
```

> 📢 **Turing architecture or older GPUs (GTX 16 series, RTX 20 series, and earlier)**: Install `triton<3.3.0`:
> ```bash
> # Windows
> python -m pip install -U triton-windows<3.3.0
> # Linux
> python -m pip install -U triton<3.3.0
> ```

### Step 2: Download Models

Download the `FlashVSR` folder from [HuggingFace](https://huggingface.co/JunhaoZhuang/FlashVSR-v1.1):

```
ComfyUI/models/FlashVSR/
├── LQ_proj_in.ckpt
├── TCDecoder.ckpt
├── diffusion_pytorch_model_streaming_dmd.safetensors
└── Wan2.1_VAE.pth  (or auto-downloads)
```

> 💡 **VAE files auto-download** from HuggingFace if not present. Only the DiT model and other components need manual download.

### Step 3: Custom Model Paths (Optional)

By default, FlashVSR looks for models in `ComfyUI/models/FlashVSR/`. To use a different location (e.g., models on another drive):

1. Edit `model_paths.yaml` in the `ComfyUI-FlashVSR_Stable` directory
2. Set `flashvsr_model_path` to your custom path
3. Restart ComfyUI

**Example configurations:**

```yaml
# Windows (D: drive)
flashvsr_model_path: "D:/AI/Models/FlashVSR"

# Windows (alternative syntax)
flashvsr_model_path: "E:\\ComfyUI\\models\\FlashVSR"

# Linux/Mac
flashvsr_model_path: "/home/user/models/FlashVSR"
flashvsr_model_path: "/mnt/storage/AI/FlashVSR"

# Use default (leave empty)
flashvsr_model_path: ""
```

> 📂 **Auto-Download Support**: If model files don't exist, they will automatically download to the directory specified in `model_paths.yaml`. The custom path will be created if needed.
> 
> **Example**: If you set `flashvsr_model_path: "D:/AI/Models"`, models will automatically download to `D:/AI/Models/FlashVSR/` on first use.

---

## 🖼️ Preview

![Workflow Preview](./workflow/image1.png)

### Sample Workflow

[Download Workflow JSON](./workflow/FlashVSR.json)

---

## 🏷️ Recent Changes

See [CHANGELOG.md](CHANGELOG.md) for full version history.

---

## 🙏 Acknowledgments

- [FlashVSR](https://github.com/OpenImagingLab/FlashVSR) @OpenImagingLab  
- [Sparse_SageAttention](https://github.com/jt-zhang/Sparse_SageAttention_API) @jt-zhang
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) @comfyanonymous
- [Wan2.2](https://github.com/Wan-Video/Wan2.2) @Wan-Video
- [LightX2V](https://github.com/ModelTC/LightX2V) @ModelTC
- [LightX2V Autoencoders](https://huggingface.co/lightx2v/Autoencoders) @lightx2v

---

## 📄 License

MIT License - see [LICENSE](LICENSE) for details.