## TL;DR
> We leverage **"3DGS Avatar + Background Video"** as guidance for the video diffusion model to **insert and animate anyone into any scene following a given motion sequence**.
## 🔥🔥🔥 New Features/Updates
- (2025.07.22) We support Unified Sequence Parallel (USP) for multi-GPU inference.
- (2025.07.03) We have released the cross-character inference script to replace the person in the source video!
- (2025.07.02) Our [Project Page](https://myniuuu.github.io/AniCrafter) 🏠 is online!
- (2025.07.01) We have released the model and inference script to insert and animate the character into the background video following SMPLX motion sequences!
- If you find this work interesting, please do not hesitate to give a ⭐!
## 📰 CODE RELEASE
- [x] (2025.07.01) Release model checkpoint and cross-character inference script.
- [x] (2025.07.03) Release the complete cross-character inference script including data preprocessing (mask parsing + SMPLX estimation + background inpainting).
- [x] (2025.07.22) Support Unified Sequence Parallel (USP) for multi-GPU inference.
- [ ] Release training codes.
## ⚙️ Environment Setup
### 🌍 Virtual Environment
```
conda create -n anicrafter python=3.10
conda activate anicrafter
bash install_cu124.sh
pip install xfuser # Unified Sequence Parallel (USP) for multi-GPU inference
```
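Before downloading the large checkpoints, an optional one-line sanity check confirms that PyTorch was installed with working CUDA support:
```
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"
```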
### 📦 Download Checkpoints
```
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./Wan2.1-I2V-14B-720P
huggingface-cli download MyNiuuu/Anicrafter_release --local-dir ./Anicrafter_release
mv ./Anicrafter_release/gfpgan ./gfpgan
mv ./Anicrafter_release/pretrained_models ./pretrained_models
```
## 🏃 Cross-Character Inference from Background Video and Motions
Run the following commands to insert and animate the character into the background video following SMPLX motion sequences. The pipeline consists of the following key steps:
- Reconstructing a 3DGS Avatar from a single image using [LHM](https://github.com/aigc3d/LHM)
- Animating the 3DGS Avatar according to the SMPLX sequence to obtain spatially aligned avatar renderings
- Combining the avatar renderings and the background video to form the "Avatar + Background" condition (see the compositing sketch after this list)
- Running the diffusion model to obtain the final animation results
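As a rough illustration of the combining step, each frame of the "Avatar + Background" condition can be thought of as an alpha composite of the avatar rendering over the background frame. The snippet below is a minimal sketch, not the code used in `run_pipeline.py`; the exact blending in AniCrafter may differ:
```
# Minimal sketch: alpha-blend an RGBA avatar rendering over a background
# frame to form one frame of the "Avatar + Background" condition video.
# Illustrative only; the actual pipeline may blend differently.
import numpy as np

def composite(avatar_rgba: np.ndarray, background_rgb: np.ndarray) -> np.ndarray:
    """avatar_rgba: (H, W, 4) uint8; background_rgb: (H, W, 3) uint8."""
    alpha = avatar_rgba[..., 3:4].astype(np.float32) / 255.0
    fg = avatar_rgba[..., :3].astype(np.float32)
    bg = background_rgb.astype(np.float32)
    return (alpha * fg + (1.0 - alpha) * bg).astype(np.uint8)
```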
### Single GPU Inference
```
python run_pipeline.py \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--scene_path ./demo/videos/scene_000000 \
--save_root ./infer_result
```
### Multi-GPU Inference with Unified Sequence Parallel (USP) for Acceleration
```
torchrun \
--nproc_per_node=8 \
--master_port=16333 \
run_pipeline_usp.py \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--scene_path ./demo/videos/scene_000000 \
--save_root ./infer_result
```
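Conceptually, USP shards the video's latent sequence across GPUs so that each rank holds only its own chunk of activations. The snippet below is only a schematic of the split/regather idea, not xfuser's actual implementation (which also parallelizes attention internally):
```
# Schematic of sequence parallelism (not xfuser's real API). Assumes
# torch.distributed has been initialized (e.g., by torchrun) and that
# the frame count is divisible by the world size.
import torch
import torch.distributed as dist

def split_frames(latents: torch.Tensor) -> torch.Tensor:
    # Slice the frame dimension (dim 0) evenly across ranks.
    return latents.chunk(dist.get_world_size(), dim=0)[dist.get_rank()]

def gather_frames(local_chunk: torch.Tensor) -> torch.Tensor:
    # Collect every rank's chunk and reassemble the full sequence.
    chunks = [torch.empty_like(local_chunk) for _ in range(dist.get_world_size())]
    dist.all_gather(chunks, local_chunk)
    return torch.cat(chunks, dim=0)
```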
## 🏃 Cross-Character Inference from In-the-Wild Videos
Run the following commands to replace the person in the source video. The complete pipeline includes data preprocessing and consists of the following steps:
- Parsing human masks
- Estimating SMPLX parameters and rendering SMPLX mesh videos
- Inpainting the background based on the human masks (see the mask-dilation sketch after this list)
- Reconstructing a 3DGS Avatar from a single image using [LHM](https://github.com/aigc3d/LHM)
- Animating the 3DGS Avatar according to the SMPLX sequence to obtain spatially aligned avatar renderings
- Combining the avatar renderings and the background video to form the "Avatar + Background" condition
- Running the diffusion model to obtain the final animation results
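One inpainting hyper-parameter worth tuning is how much the human mask is dilated before inpainting: too tight and silhouette edges leak into the background, too loose and more content must be hallucinated. A minimal sketch of the idea (the kernel size here is an illustrative assumption, not the pipeline's default):
```
# Minimal sketch: dilate a binary human mask so the inpainted region
# fully covers the person. kernel_size is an illustrative assumption.
import cv2
import numpy as np

def dilate_mask(mask: np.ndarray, kernel_size: int = 15) -> np.ndarray:
    """mask: (H, W) uint8 with values in {0, 255}."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    return cv2.dilate(mask, kernel)
```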
### ⚙️ Additional Environment Setup
```
cd engine/pose_estimation
pip install mmcv==1.3.9
pip install -v -e third-party/ViTPose
pip install ultralytics
pip install av
cd ../..
pip install numpy==1.23.5
mkdir weights
cd weights
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/cutie-base-mega.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/i3d_rgb_imagenet.pt
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/ProPainter.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/raft-things.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/recurrent_flow_completion.pth
cd ..
# Alternatively, you can manually download the weights from https://github.com/sczhou/ProPainter/releases/tag/v0.1.0
```
### Single GPU Inference
```
# Mask + SMPLX + Inpainting + Avatar Recon + Rendering + Diffusion
# You can tune the hyper-parameters of the inpainting algorithm to obtain optimal results
python run_pipeline_with_preprocess.py \
--video_root ./demo/origin_videos/raw_video \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--save_root ./infer_result_replace
```
### Multi-GPU Inference with Unified Sequence Parallel (USP) for Acceleration
```
# Mask + SMPLX + Inpainting + Avatar Recon + Rendering + Diffusion
# You can tune the hyper-parameters of the inpainting algorithm to obtain optimal results
torchrun \
--nproc_per_node=8 \
--master_port=16333 \
run_pipeline_with_preprocess_usp.py \
--video_root ./demo/origin_videos/raw_video \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--save_root ./infer_result_replace
```
## Citation
```
@article{niu2025anicrafter,
title={AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models},
author={Niu, Muyao and Cao, Mingdeng and Zhan, Yifan and Zhu, Qingtian and Ma, Mingze and Zhao, Jiancheng and Zeng, Yanhong and Zhong, Zhihang and Sun, Xiao and Zheng, Yinqiang},
journal={arXiv preprint arXiv:2505.20255},
year={2025}
}
```
## Acknowledgements
We sincerely appreciate the code releases of the following projects: [LHM](https://github.com/aigc3d/LHM), [Unianimate-DiT](https://github.com/ali-vilab/UniAnimate-DiT), [Diffusers](https://github.com/huggingface/diffusers), and [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio).