
# INSTA - Instant Volumetric Head Avatars

Wojciech Zielonka, Timo Bolkart, Justus Thies

Max Planck Institute for Intelligent Systems, Tübingen, Germany

Video · Paper · Project Website · Dataset · Face Tracker · INSTA Pytorch · Email


Official Repository for CVPR 2023 paper Instant Volumetric Head Avatars

This repository is based on [instant-ngp](https://github.com/NVlabs/instant-ngp); some features of the original code are not available in this work.

⚠ We also provide a PyTorch demo version of this project: INSTA Pytorch.
### Installation

The repository is based on `instant-ngp` [commit](https://github.com/NVlabs/instant-ngp/tree/e7631da9fca9d0f3467f826fccd7a5849b3f6309). The installation requirements are the same, therefore please follow the [guide](https://github.com/NVlabs/instant-ngp#building-instant-ngp-windows--linux). Remember to use the `--recursive` option during cloning.

```shell
git clone --recursive https://github.com/Zielon/INSTA.git
cd INSTA
cmake . -B build
cmake --build build --config RelWithDebInfo -j
```

### Usage and Requirements

After building the project you can either start training an avatar from scratch or load a snapshot. For training, we recommend a graphics card equal to or better than an `RTX 3090 24GB` and `32 GB` of RAM. Training on different hardware will likely require adjusting these options in the config:

```shell
"max_cached_bvh": 4000,            # How many BVH data structures are cached
"max_images_gpu": 1700,            # How many frames are loaded to the GPU. Adjust for a given GPU memory size.
"use_dataset_cache": true,         # Load images to RAM memory
"max_steps": 33000,                # Max training steps after which the test sequence will be recorded
"render_novel_trajectory": false,  # Dumps additional camera trajectories after max steps
"render_from_snapshot": false      # For the --no-gui option to directly render sequences
```

Rendering from a snapshot does not require a high-end GPU and can be performed even on a laptop. We have tested it on an `RTX 3080 8GB` laptop GPU. With the `--no-gui` option you can train and load a snapshot for rendering using the same config as with the `GUI`.

The viewer options are the same as in [instant-ngp](https://github.com/NVlabs/instant-ngp#keyboard-shortcuts-and-recommended-controls), with one additional key, `F`, to raycast the FLAME mesh.

Usage example:

```shell
# Run the example script without GUI
./run.sh

# Run cross reenactment based on deformation gradient transfer
./run_transfer.sh

# Training with GUI
./build/rta --config insta.json --scene data/obama --height 512 --width 512

# Loading from a checkpoint
./build/rta --config insta.json --scene data/obama/transforms_test.json --snapshot data/obama/snapshot.msgpack
```
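For a fully headless workflow, the commands below are a minimal sketch combining the flags shown above with `--no-gui`; it assumes the `obama` avatar is already unpacked under `data/` and that `--no-gui` accepts the same flags as the GUI runs.

```shell
# Headless training: a sketch, assuming --no-gui takes the same flags as the GUI run above.
# The test sequence is recorded automatically once max_steps is reached.
./build/rta --config insta.json --scene data/obama --height 512 --width 512 --no-gui

# Headless rendering from an existing checkpoint.
# Set "render_from_snapshot": true in the config so the sequence is rendered directly.
./build/rta --config insta.json --scene data/obama/transforms_test.json --snapshot data/obama/snapshot.msgpack --no-gui
```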

For better visualization you can use our GUI application.

### Dataset and Training

We are releasing part of our dataset together with publicly available preprocessed avatars from [NHA](https://github.com/philgras/neural-head-avatars), [NeRFace](https://github.com/gafniguy/4D-Facial-Avatars) and [IMavatar](https://github.com/zhengyuf/IMavatar). Each participant whose data was recorded in this study provided written consent for its release by signing this [document](./documents/Consent_general_english_ncs_video.pdf). Access to those sequences can be requested via [Google Forms](https://docs.google.com/forms/d/e/1FAIpQLSecX-7Arzv_qVQWFdicNxcxmPmSQx46y6TxnBBN67m0hvkXiA/viewform?usp=sharing&ouid=114977764432146378365).

[Available avatars](https://drive.google.com/drive/folders/1LsVvr7PPwGlyK0qiTuDVUz4ihreHJgut?usp=sharing): click the selected avatar to download its training dataset and checkpoint. The avatars have to be placed in the `data` folder.
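For orientation, the usage examples above expect a layout roughly like the following after unpacking a downloaded avatar into `data/`. This is only a sketch: `transforms_test.json` and `snapshot.msgpack` are the files referenced explicitly above; the remaining names are illustrative and the actual archive contents may differ.

```shell
# Assumed layout after unpacking a downloaded avatar (names partly illustrative)
data/
└── obama/                      # one folder per avatar
    ├── transforms_test.json    # test split used when rendering from a snapshot
    ├── snapshot.msgpack        # pretrained checkpoint
    └── ...                     # preprocessed frames and tracker output
```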
The output of the training (**Record Video** in the menu), including rendered frames, the checkpoint, etc., will be saved to `./data/{actor}/experiments/{config}/debug`. After the specified number of max steps, the program will automatically either render frames using novel cameras (`All` option in the GUI and `render_novel_trajectory` in the config) or only the camera currently selected in `Mode`, by default `Overlay\Test`.

### Dataset Generation

For the input generation, a conda environment and a few other repositories are needed. Simply run `install.sh` from the [scripts](https://github.com/Zielon/INSTA/tree/master/scripts) folder to prepare the workbench. Next, you can use the [Metrical Photometric Tracker](https://github.com/Zielon/metrical-tracker) to track a sequence. After the processing is done, run the `generate.sh` script to prepare the sequence. As input, please specify the absolute path of the tracker output. **For training we recommend at least 1000 frames.**

```shell
# 1) Run the Metrical Photometric Tracker for a selected actor
python tracker.py --cfg ./configs/actors/duda.yml

# 2) Generate a dataset using the script. Importantly, use absolute paths for the tracker input and the desired output.
./generate.sh /metrical-tracker/output/duda INSTA/data/duda 100   # {input} {output} {# of test frames from the end}
```

### Citation

If you use this project in your research please cite INSTA:

```bibtex
@proceedings{INSTA:CVPR2023,
  author = {Zielonka, Wojciech and Bolkart, Timo and Thies, Justus},
  title = {Instant Volumetric Head Avatars},
  journal = {Conference on Computer Vision and Pattern Recognition},
  year = {2023}
}
```