# Audio2Face

**Repository Path**: niki_liqing/Audio2Face

## Basic Information

- **Project Name**: Audio2Face
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-11-01
- **Last Updated**: 2022-03-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

## Audio to Face

An audio-to-blendshape implementation in PyTorch.

Reimplemented by: 刘宇昂

- Base models
  - LSTM using MFCC audio features
  - CNN using LPC features (a simplified version of [NVIDIA's audio-driven facial animation model](http://research.nvidia.com/publication/2017-07_Audio-Driven-Facial-Animation))

## Prerequisites

- Python 3
- PyTorch v0.3.0
- numpy
- librosa & audiolazy
- scipy
- etc.

## Files

- Scripts to run
  - `main.py`: change the net name and set the checkpoints folder to train different models
  - `test_model.py`: generate blendshape sequences from extracted audio features (requires audio features as input)
  - `synthesis.py`: generate blendshapes directly from an input wav (requires the input audio path as an argument)
- Classes
  - `models.py`: LSTM and CNN (simplified NvidiaNet) model classes
  - `models_testae.py`: advanced models with an autoencoder design
  - `dataset.py`: class for loading the dataset
- Input preprocessing
  - `misc/audio_mfcc.py`: extract MFCC features from input wav files
  - `misc/audio_lpc.py`: extract LPC features
  - `misc/combine.py`: combine individual audio-feature/blendshape files into a single file for data loading

## Usage

### Input

To build your own dataset, preprocess your wav/blendshape pairs with `misc/audio_mfcc.py` or `misc/audio_lpc.py`, then merge the resulting feature/blendshape files into a single feature file and a single blendshape file with `misc/combine.py` (illustrative sketches of these stages follow below).

### Training

Modify `main.py`: set the model to the one you need and specify the checkpoint folder.

### Evaluation

- Both `test_model.py` and `synthesis.py` can be used to generate blendshape sequences.
- `test_model.py` accepts extracted audio features (MFCC/LPC).
- `synthesis.py` takes a raw wav file as input.
- Supply the required arguments and it will produce a blendshape test file (see the end-to-end sketch at the bottom of this page).
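### Pipeline sketches

The snippets below are illustrative sketches of each stage, not the repository's actual code; function names, shapes, and hyperparameters are assumptions unless stated above.

`misc/audio_mfcc.py` presumably extracts one MFCC vector per blendshape frame so audio and animation stay aligned. A minimal sketch with librosa, assuming 16 kHz audio, 13 coefficients, and a 30 fps blendshape track:

```python
import librosa
import numpy as np

def extract_mfcc(wav_path, out_path, n_mfcc=13, target_fps=30):
    """Extract one MFCC vector per video frame (all settings are guesses)."""
    y, sr = librosa.load(wav_path, sr=16000)   # resample to a fixed rate
    hop = sr // target_fps                     # samples per blendshape frame
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=hop)
    np.save(out_path, mfcc.T)                  # shape: (num_frames, n_mfcc)

extract_mfcc("example.wav", "example_mfcc.npy")
```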
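`misc/audio_lpc.py` presumably fits linear-prediction coefficients on short overlapping windows, as in the NVIDIA paper's input stage. A sketch using `librosa.lpc` in place of whatever audiolazy routine the script actually uses; order, window size, and fps are guesses:

```python
import librosa
import numpy as np

def extract_lpc(wav_path, out_path, order=32, target_fps=30, win_ms=64):
    """Fit LPC coefficients on one analysis window per blendshape frame."""
    y, sr = librosa.load(wav_path, sr=16000)
    hop = sr // target_fps                      # window stride
    win = int(sr * win_ms / 1000)               # window length in samples
    coeffs = []
    for start in range(0, max(len(y) - win, 1), hop):
        frame = y[start:start + win]
        a = librosa.lpc(frame, order=order)     # a[0] is always 1.0
        coeffs.append(a[1:])                    # keep the `order` predictor terms
    np.save(out_path, np.stack(coeffs))         # shape: (num_frames, order)
```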
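`misc/combine.py` then only needs to concatenate the per-utterance files along the time axis, keeping features and blendshapes in the same file order. A sketch with hypothetical file naming:

```python
import glob
import numpy as np

def combine(pattern, out_path):
    """Concatenate per-utterance .npy files into one training array."""
    paths = sorted(glob.glob(pattern))          # identical ordering for both calls
    np.save(out_path, np.concatenate([np.load(p) for p in paths], axis=0))

combine("features/*_lpc.npy", "train_features.npy")
combine("blendshapes/*_bs.npy", "train_blendshapes.npy")
```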
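`dataset.py` presumably pairs each blendshape vector with the audio features around it. A hypothetical stand-in using a centered context window (the window size is a guess):

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class BlendshapeDataset(Dataset):
    """One item = a window of audio features plus the centered blendshape frame.

    Assumes the feature and blendshape arrays have the same number of frames.
    """
    def __init__(self, feature_path, blendshape_path, context=8):
        self.features = np.load(feature_path).astype(np.float32)
        self.targets = np.load(blendshape_path).astype(np.float32)
        self.context = context

    def __len__(self):
        return len(self.targets) - 2 * self.context

    def __getitem__(self, i):
        c = self.context
        window = self.features[i:i + 2 * c + 1]      # (2c + 1, n_features)
        return torch.from_numpy(window), torch.from_numpy(self.targets[i + c])
```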
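The LSTM baseline in `models.py` maps a sequence of MFCC frames to a sequence of blendshape weights. A minimal sketch; hidden size, depth, and the blendshape count (51 here) are assumptions:

```python
import torch
import torch.nn as nn

class LSTMBaseline(nn.Module):
    """MFCC frames in, one blendshape vector per frame out."""
    def __init__(self, n_features=13, n_blendshapes=51, hidden=128, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, n_blendshapes)

    def forward(self, x):            # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out)        # (batch, time, n_blendshapes)
```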
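The simplified NvidiaNet presumably convolves over a short LPC window and regresses the blendshape weights with a dense head. A rough sketch; channel counts and kernel sizes are guesses, and the full NVIDIA model's extra inputs (e.g. its emotional state vector) may be what the simplification drops:

```python
import torch
import torch.nn as nn

class SimplifiedNvidiaNet(nn.Module):
    """Convolutions over an LPC feature window, then a dense regression head."""
    def __init__(self, n_features=32, n_blendshapes=51):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),               # collapse the time axis
        )
        self.head = nn.Linear(128, n_blendshapes)

    def forward(self, x):              # x: (batch, window, n_features)
        x = x.transpose(1, 2)          # -> (batch, n_features, window)
        return self.head(self.conv(x).squeeze(-1))
```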
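Training in the spirit of `main.py`, reusing the hypothetical classes above; the loss, optimizer, and checkpoint naming are placeholders, not the repository's actual settings:

```python
import os
import torch
from torch.utils.data import DataLoader

dataset = BlendshapeDataset("train_features.npy", "train_blendshapes.npy")
loader = DataLoader(dataset, batch_size=64, shuffle=True)
model = SimplifiedNvidiaNet(n_features=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

os.makedirs("checkpoints", exist_ok=True)
for epoch in range(100):
    for windows, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(windows), targets)   # regress blendshape weights
        loss.backward()
        optimizer.step()
    torch.save(model.state_dict(), "checkpoints/cnn_%03d.pth" % epoch)
```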
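End to end, `synthesis.py` presumably chains feature extraction, a trained model, and a writer for the blendshape test file. A speculative sketch reusing `extract_lpc` and `SimplifiedNvidiaNet` from above; all paths are placeholders:

```python
import numpy as np
import torch

extract_lpc("input.wav", "input_lpc.npy")
features = np.load("input_lpc.npy").astype(np.float32)

model = SimplifiedNvidiaNet(n_features=32)
model.load_state_dict(torch.load("checkpoints/cnn_099.pth"))
model.eval()

context = 8                               # must match the training window
frames = []
with torch.no_grad():
    for i in range(context, len(features) - context):
        window = torch.from_numpy(features[i - context:i + context + 1])
        frames.append(model(window.unsqueeze(0)).squeeze(0).numpy())
np.savetxt("output_blendshapes.txt", np.stack(frames))  # one frame per row
```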