# fac-via-ppg **Repository Path**: atomai/fac-via-ppg ## Basic Information - **Project Name**: fac-via-ppg - **Description**: No description available - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-12-11 - **Last Updated**: 2020-12-20 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams This repository hosts the code we used to prepare our interspeech'19 paper titled "[Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams](https://www.isca-speech.org/archive/Interspeech_2019/pdfs/1778.pdf)" ### Install This project uses `conda` to manage all the dependencies, you should install [anaconda](https://anaconda.org/) if you have not done so. ```bash # Clone the repo git clone https://github.com/guanlongzhao/fac-via-ppg.git cd $PROJECT_ROOT_DIR # Install dependencies conda env create -f environment.yml # Activate the installed environment conda activate ppg-speech # Compile protocol buffer to get the data_utterance_pb2.py file protoc -I=src/common --python_out=src/common src/common/data_utterance.proto # Include src in your PYTHONPATH export PYTHONPATH=$PROJECT_ROOT_DIR/src:$PYTHONPATH ``` If `conda` complains that some packages are missing, it is very likely that you can find a similar version of that package on anaconda's archive. ### Run unit tests ```bash cd test # Remember to make this script executable ./run_coverage.sh ``` This only does a few sanity checks, don't worry if the test coverage looks low :) Depending on your git configs, you may or may not need to recreate the symbolic links in `test/data`. ### Train PPG-to-Mel model Change default parameters in `src/common/hparams.py:create_hparams()`. The training and validation data should be specified in text files, see `data/filelists` for examples. ```bash cd src/script python train_ppg2mel.py ``` The `FP16` mode will not work, unfortunately :( ### Train WaveGlow model Change the default parameters in `src/waveglow/config.json`. The training data should be specified in the same manner as the PPG-to-Mel model. ```bash cd src/script python train_waveglow.py ``` ### View training progress You should find a dir `log` in all of your output dirs, that is the `LOG_DIR` you should use below. ```bash tensorboard --logdir=${LOG_DIR} ``` ### Generate speech synthesis Use `src/script/generate_synthesis.py`, you can find pre-trained models in the [Links](#Links) section. ```bash generate_synthesis.py [-h] --ppg2mel_model PPG2MEL_MODEL --waveglow_model WAVEGLOW_MODEL --teacher_utterance_path TEACHER_UTTERANCE_PATH --output_dir OUTPUT_DIR ``` ### Links - Syntheses and pre-trained models: [link to model and syntheses](https://drive.google.com/file/d/1nye-CAGyz3diM5Q80s0iuBYgcIL_cqrs/view?usp=sharing) - Training data (L2-ARCTIC recordings after noise removal): [link to training data](https://drive.google.com/file/d/1WnBHAfjEKdFTBDv5D6DxRnlcvfiODBgy/view?usp=sharing) - Demo: [link to audio samples](https://guanlongzhao.github.io/demo/fac-via-ppg) ### Citation Please kindly cite the following paper if you use this code repository in your work, ``` @inproceedings{zhao2019ForeignAC, author={Guanlong Zhao and Shaojin Ding and Ricardo Gutierrez-Osuna}, title={{Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams}}, year=2019, booktitle={Proc. Interspeech 2019}, pages={2843--2847}, doi={10.21437/Interspeech.2019-1778}, url={http://dx.doi.org/10.21437/Interspeech.2019-1778} } ```