# ctts

**Repository Path**: aucki6144/ctts

## Basic Information

- **Project Name**: ctts
- **Description**: Controllable Text-to-speech system, based on FastSpeech2
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2023-11-03
- **Last Updated**: 2024-05-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# CTTS: Controllable Text-To-Speech

## Links

[Controllable TTS - Grader (Github)](https://github.com/aucki6144/ctts_grader)

[Controllable TTS - Grader (Gitee)](https://gitee.com/aucki6144/ctts_grader)

[Controllable TTS (Gitee)](https://gitee.com/aucki6144/ctts)


## Quickstart

### GUI
Start the Gradio GUI with:
```commandline
python .\gui.py
```

### Dependencies
You can install the Python dependencies with
```
pip3 install -r requirements.txt
```

> Attention: gradio's matplotlib version requirement conflicts with that of FastSpeech2. Install gradio first, then roll matplotlib back to 3.2.2.
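
After following the note above, a quick check can confirm which matplotlib version ended up installed. The snippet below is a minimal sketch and not part of the repository; the helper name `matplotlib_pin_status` is hypothetical, and the expected version is simply the one quoted in the note.

```python
from importlib.metadata import PackageNotFoundError, version

# Version taken from the note above; adjust it if the project changes its pin.
EXPECTED_MATPLOTLIB = "3.2.2"


def matplotlib_pin_status(expected=EXPECTED_MATPLOTLIB):
    """Return 'ok', 'mismatch', or 'missing' for the installed matplotlib."""
    try:
        installed = version("matplotlib")
    except PackageNotFoundError:
        # matplotlib is not installed in this environment at all
        return "missing"
    return "ok" if installed == expected else "mismatch"


if __name__ == "__main__":
    print(f"matplotlib pin status: {matplotlib_pin_status()}")
```

A result of `mismatch` suggests that the rollback step still needs to be run.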

### Inference

Arrange the config files in the following structure:
```
  .
  └── config
      └── DATASET_NAME
          ├── model.yaml
          ├── preprocess.yaml
          └── train.yaml
```
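
Before running inference, it may help to verify that the three config files are where the scripts expect them. The following is a small sketch under that assumption; the helper `missing_config_files` is hypothetical and only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# File names come from the layout above; the helper itself is hypothetical.
REQUIRED_CONFIGS = ("model.yaml", "preprocess.yaml", "train.yaml")


def missing_config_files(dataset_name, config_root="config"):
    """Return the required config files that are absent for a dataset."""
    dataset_dir = Path(config_root) / dataset_name
    return [name for name in REQUIRED_CONFIGS if not (dataset_dir / name).is_file()]


if __name__ == "__main__":
    # ESD_en is the default dataset name mentioned in this README.
    missing = missing_config_files("ESD_en")
    print("all config files present" if not missing else f"missing: {missing}")
```

Run it from the repository root so that the relative `config` path resolves correctly.
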
The model uses the ESD_en dataset by default. For English single-speaker TTS, run:
```commandline
python .\synthesize.py -t "YOUR_CONTENT"
```
Other parameters are optional. For details, use ``--help`` or read the code in ``synthesize.py``:
```commandline
python .\synthesize.py --help
```
Some commonly used parameters:

``-m`` or ``--model``: name of model used.

``-s`` or ``--speaker_id``: specify the speaker id in multi-speaker datasets.

``-e`` or ``--emotion_id``: specify the emotion id in multi-emotion datasets.

``-r`` or ``--restore_step``: load the model checkpoint saved at a particular step.

The generated utterances are written to ``output/result/``.
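
When synthesizing many utterances from another script, the flags listed above can be assembled programmatically. The sketch below is illustrative only: `build_synthesize_command` is a hypothetical helper, it uses just the short flags documented in this README, and the example values are placeholders rather than IDs or steps known to exist in any checkpoint.

```python
import sys


def build_synthesize_command(text, model=None, speaker_id=None,
                             emotion_id=None, restore_step=None):
    """Build an argument list for synthesize.py from the documented flags."""
    command = [sys.executable, "synthesize.py", "-t", text]
    optional_flags = (
        ("-m", model),
        ("-s", speaker_id),
        ("-e", emotion_id),
        ("-r", restore_step),
    )
    for flag, value in optional_flags:
        if value is not None:  # omitted flags fall back to the script's defaults
            command += [flag, str(value)]
    return command


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_synthesize_command("Hello world", model="ESD_en",
                                   speaker_id=0, emotion_id=1)
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root; passing a list rather than a single string avoids shell quoting problems in the text to synthesize.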

## Training

### Preprocess

Preprocess the dataset with the following command:

```commandline
python .\preprocess.py -m ESD_en
```

The TextGrid files generated by MFA (Montreal Forced Aligner) should be placed in ``./preprocessed_data/DATASET_NAME/``.

### Training

Train the model with the following command:

```commandline
python .\train.py -m ESD_en
```

Configure the pretrain path with the parameter ``-pp`` or ``-pretrain_path``.