# magic_animate_unofficial

**Repository Path**: elliotqi/magic_animate_unofficial

## Basic Information

- **Project Name**: magic_animate_unofficial
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-03-16
- **Last Updated**: 2024-03-16

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# magic_animate_unofficial

This unofficial training code is primarily adapted from [Magic Animate](https://github.com/magic-research/magic-animate) and [AnimateDiff](https://github.com/guoyww/AnimateDiff).

## ToDo

- [x] **Release Training Code.**
- [x] **Release pre-trained weights.**

## Features

- Trains with DeepSpeed at a resolution of 768×512 and a batch size of 2 per GPU on V100-32G.
- We changed the conditioning signal from DensePose to DWPose, which differs from Magic Animate. Feel free to revert this change if necessary.
- We employ a fine-grained image prompt from [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) instead of the empty text prompt used in Magic Animate. You can revert this change if required. A sketch of one way to obtain the `image_prompts` used below is given at the end of this README.

```python
from animatediff.magic_animate.resampler import Resampler

cross_attention_dim = 768  # cross-attention dimension of the UNet

# define a resampler
image_proj_model = Resampler(
    dim=cross_attention_dim,
    depth=4,
    dim_head=64,
    heads=12,
    num_queries=64,
    embedding_dim=1280,
    output_dim=cross_attention_dim,
    ff_mult=4,
)

# extract fine-grained features of the reference image for cross-attention guidance:
# project image_prompts from (batch_size, 257, 1280) to (batch_size, 64, 768)
# and use the result in place of the empty text embeddings.
encoder_hidden_states = image_proj_model(image_prompts)
```

## Requirements

See `requirements.txt`.

To support DWPose, which depends on MMDetection, MMCV, and MMPose:

```
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
```

## Data

To prepare your videos, create a JSON file containing a list of video directories, then update the `json_path` value in `./configs/training/aa_train_stage1.yaml` to point to your JSON file (a sketch of such a file is given at the end of this README). You can use the public [fashion dataset](https://drive.google.com/drive/folders/17-BoVYRnG6WLymJ4q2tw-JJp_TC3u52P?usp=sharing) for fast prototyping.

## Inference

We offer model weights ([Google Drive link](https://drive.google.com/file/d/1Zai8g2PRcYqTZ77bpZp4igg9ZZjosR4n/view?usp=sharing)) trained on 2,000 dance videos sourced from the web and subsequently fine-tuned on fashion videos.

```commandline
python3 infer.py
```

Please note that this weight was not trained from scratch; it was initialized from the official Magic Animate weight. It also does not use the IP-Adapter as we originally intended. The background artifacts can likely be attributed to the fact that very few of the training videos have a white background like the fashion videos. Admittedly, this model is far from perfect; we hope it can be of some help for fast prototyping.
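The Features snippet above consumes `image_prompts` of shape `(batch_size, 257, 1280)`, i.e. patch-level CLIP image features. Below is a minimal sketch of how such features could be extracted, assuming the ViT-H/14 image encoder used by IP-Adapter; the checkpoint name and image path are illustrative and not part of this repository.

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

# assumed encoder: the ViT-H/14 image encoder used by IP-Adapter (1280-dim hidden states)
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
)
processor = CLIPImageProcessor()

reference = Image.open("reference_image.png").convert("RGB")  # hypothetical reference image
pixel_values = processor(images=reference, return_tensors="pt").pixel_values

with torch.no_grad():
    # penultimate hidden states: (batch_size, 257, 1280) = 1 CLS token + 256 patch tokens
    image_prompts = image_encoder(pixel_values, output_hidden_states=True).hidden_states[-2]
```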
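The Data section expects a JSON file referenced by `json_path`; its exact schema is not documented here, so the sketch below assumes it is a plain JSON list of video directory paths. The directory paths and output filename are illustrative.

```python
import json

# assumed layout: one entry per video directory (paths are illustrative)
video_dirs = [
    "/data/videos/dance_0001",
    "/data/videos/dance_0002",
    "/data/videos/fashion_0001",
]

# write the file referenced by "json_path" in ./configs/training/aa_train_stage1.yaml
with open("train_videos.json", "w") as f:
    json.dump(video_dirs, f, indent=2)
```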