# Audio2Face

**Repository Path**: niki_liqing/Audio2Face

## Basic Information

- **Project Name**: Audio2Face
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-11-01
- **Last Updated**: 2022-03-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

## Audio to Face

An audio-to-blendshape implementation in PyTorch.

Reimplemented by: 刘宇昂

- Base models
  - LSTM using MFCC audio features
  - CNN using LPC features (a simplified version of [NVIDIA's audio-driven facial animation model](http://research.nvidia.com/publication/2017-07_Audio-Driven-Facial-Animation))

## Prerequisites

- Python 3
- PyTorch v0.3.0
- numpy
- librosa & audiolazy
- scipy
- etc.

## Files

- Scripts to run
  - `main.py`: change the net name and set the checkpoints folder to train different models
  - `test_model.py`: generate blendshape sequences from extracted audio features (requires audio features as input)
  - `synthesis.py`: generate blendshapes directly from an input wav (requires the input audio path as an argument)
- Classes
  - `models.py`: LSTM and CNN (simplified NvidiaNet) model classes
  - `models_testae.py`: advanced models with an autoencoder design
  - `dataset.py`: class for loading the dataset
- Input preprocessing
  - `misc/audio_mfcc.py`: extract MFCC features from input wav files
  - `misc/audio_lpc.py`: extract LPC features
  - `misc/combine.py`: combine individual audio-feature/blendshape files into a single file for data loading

## Usage

### Input

To build your own dataset, preprocess your wav/blendshape pairs with `misc/audio_mfcc.py` or `misc/audio_lpc.py`, then merge the resulting feature/blendshape files into a single feature file and a single blendshape file with `misc/combine.py` (illustrative sketches of these stages follow below).

### Training

Modify `main.py`: set the model to the one you need and specify the checkpoint folder.

### Evaluation

- Both `test_model.py` and `synthesis.py` can be used to generate blendshape sequences.
- `test_model.py` accepts extracted audio features (MFCC/LPC).
- `synthesis.py` takes a raw wav file as input.
- Supply the required arguments and it will produce a blendshape test file (see the end-to-end sketch at the bottom of this page).
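### Pipeline sketches

The snippets below are illustrative sketches of each stage, not the repository's actual code; function names, shapes, and hyperparameters are assumptions unless stated above.

`misc/audio_mfcc.py` presumably extracts one MFCC vector per blendshape frame so audio and animation stay aligned. A minimal sketch with librosa, assuming 16 kHz audio, 13 coefficients, and a 30 fps blendshape track:

```python
import librosa
import numpy as np

def extract_mfcc(wav_path, out_path, n_mfcc=13, target_fps=30):
    """Extract one MFCC vector per video frame (all settings are guesses)."""
    y, sr = librosa.load(wav_path, sr=16000)   # resample to a fixed rate
    hop = sr // target_fps                     # samples per blendshape frame
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=hop)
    np.save(out_path, mfcc.T)                  # shape: (num_frames, n_mfcc)

extract_mfcc("example.wav", "example_mfcc.npy")
```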
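`misc/audio_lpc.py` presumably fits linear-prediction coefficients on short overlapping windows, as in the NVIDIA paper's input stage. A sketch using `librosa.lpc` in place of whatever audiolazy routine the script actually uses; order, window size, and fps are guesses:

```python
import librosa
import numpy as np

def extract_lpc(wav_path, out_path, order=32, target_fps=30, win_ms=64):
    """Fit LPC coefficients on one analysis window per blendshape frame."""
    y, sr = librosa.load(wav_path, sr=16000)
    hop = sr // target_fps                      # window stride
    win = int(sr * win_ms / 1000)               # window length in samples
    coeffs = []
    for start in range(0, max(len(y) - win, 1), hop):
        frame = y[start:start + win]
        a = librosa.lpc(frame, order=order)     # a[0] is always 1.0
        coeffs.append(a[1:])                    # keep the `order` predictor terms
    np.save(out_path, np.stack(coeffs))         # shape: (num_frames, order)
```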
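`misc/combine.py` then only needs to concatenate the per-utterance files along the time axis, keeping features and blendshapes in the same file order. A sketch with hypothetical file naming:

```python
import glob
import numpy as np

def combine(pattern, out_path):
    """Concatenate per-utterance .npy files into one training array."""
    paths = sorted(glob.glob(pattern))          # identical ordering for both calls
    np.save(out_path, np.concatenate([np.load(p) for p in paths], axis=0))

combine("features/*_lpc.npy", "train_features.npy")
combine("blendshapes/*_bs.npy", "train_blendshapes.npy")
```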
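`dataset.py` presumably pairs each blendshape vector with the audio features around it. A hypothetical stand-in using a centered context window (the window size is a guess):

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class BlendshapeDataset(Dataset):
    """One item = a window of audio features plus the centered blendshape frame.

    Assumes the feature and blendshape arrays have the same number of frames.
    """
    def __init__(self, feature_path, blendshape_path, context=8):
        self.features = np.load(feature_path).astype(np.float32)
        self.targets = np.load(blendshape_path).astype(np.float32)
        self.context = context

    def __len__(self):
        return len(self.targets) - 2 * self.context

    def __getitem__(self, i):
        c = self.context
        window = self.features[i:i + 2 * c + 1]      # (2c + 1, n_features)
        return torch.from_numpy(window), torch.from_numpy(self.targets[i + c])
```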
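The LSTM baseline in `models.py` maps a sequence of MFCC frames to a sequence of blendshape weights. A minimal sketch; hidden size, depth, and the blendshape count (51 here) are assumptions:

```python
import torch
import torch.nn as nn

class LSTMBaseline(nn.Module):
    """MFCC frames in, one blendshape vector per frame out."""
    def __init__(self, n_features=13, n_blendshapes=51, hidden=128, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, n_blendshapes)

    def forward(self, x):            # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out)        # (batch, time, n_blendshapes)
```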
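The simplified NvidiaNet presumably convolves over a short LPC window and regresses the blendshape weights with a dense head. A rough sketch; channel counts and kernel sizes are guesses, and the full NVIDIA model's extra inputs (e.g. its emotional state vector) may be what the simplification drops:

```python
import torch
import torch.nn as nn

class SimplifiedNvidiaNet(nn.Module):
    """Convolutions over an LPC feature window, then a dense regression head."""
    def __init__(self, n_features=32, n_blendshapes=51):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),               # collapse the time axis
        )
        self.head = nn.Linear(128, n_blendshapes)

    def forward(self, x):              # x: (batch, window, n_features)
        x = x.transpose(1, 2)          # -> (batch, n_features, window)
        return self.head(self.conv(x).squeeze(-1))
```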
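Training in the spirit of `main.py`, reusing the hypothetical classes above; the loss, optimizer, and checkpoint naming are placeholders, not the repository's actual settings:

```python
import os
import torch
from torch.utils.data import DataLoader

dataset = BlendshapeDataset("train_features.npy", "train_blendshapes.npy")
loader = DataLoader(dataset, batch_size=64, shuffle=True)
model = SimplifiedNvidiaNet(n_features=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

os.makedirs("checkpoints", exist_ok=True)
for epoch in range(100):
    for windows, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(windows), targets)   # regress blendshape weights
        loss.backward()
        optimizer.step()
    torch.save(model.state_dict(), "checkpoints/cnn_%03d.pth" % epoch)
```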
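End to end, `synthesis.py` presumably chains feature extraction, a trained model, and a writer for the blendshape test file. A speculative sketch reusing `extract_lpc` and `SimplifiedNvidiaNet` from above; all paths are placeholders:

```python
import numpy as np
import torch

extract_lpc("input.wav", "input_lpc.npy")
features = np.load("input_lpc.npy").astype(np.float32)

model = SimplifiedNvidiaNet(n_features=32)
model.load_state_dict(torch.load("checkpoints/cnn_099.pth"))
model.eval()

context = 8                               # must match the training window
frames = []
with torch.no_grad():
    for i in range(context, len(features) - context):
        window = torch.from_numpy(features[i - context:i + context + 1])
        frames.append(model(window.unsqueeze(0)).squeeze(0).numpy())
np.savetxt("output_blendshapes.txt", np.stack(frames))  # one frame per row
```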