# deepvoice3

**Repository Path**: muconaiqi/deepvoice3

## Basic Information

- **Project Name**: deepvoice3
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-12-25
- **Last Updated**: 2021-12-25

## README

# Deep Voice 3

## **Work In Progress**
To check the current status, see [this issue](https://github.com/Kyubyong/deepvoice3/issues/9).

This is a TensorFlow implementation of [DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH](https://arxiv.org/pdf/1710.07654.pdf). For now I'm focusing on single-speaker synthesis.

### Data

I'm experimenting with [Nick Offerman's audiobook files](https://www.audible.com/pd/Fiction/The-Adventures-of-Tom-Sawyer-Audiobook/B01HQMQLWK?source_code=AUDORWS0628169HI5use) for fun and with [The LJ Speech Dataset](https://keithito.com/LJ-Speech-Dataset), which is in the public domain.

## File Description

  * hyperparams.py: hyperparameters
  * prepro.py: creates inputs and targets, i.e., mel spectrograms, magnitude spectrograms, and done flags (dones).
  * data_load.py
  * utils.py: several custom operational functions.
  * modules.py: building blocks for the networks.
  * networks.py: encoder, decoder, and converter
  * train.py: training
  * synthesize.py: inference
  * test_sents.txt: some test sentences from the paper.
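
The "dones" target mentioned for prepro.py may be the least obvious of the three. Below is a minimal sketch of how per-frame done flags could be built while padding a spectrogram to a fixed length. The function name `pad_with_done_flags`, the `max_frames` parameter, and the flag convention (0 while speech continues, 1 from the last real frame onward) are assumptions made for illustration; prepro.py may use different names or the opposite convention.

```python
from typing import List, Sequence, Tuple


def pad_with_done_flags(
    frames: Sequence[Sequence[float]], max_frames: int
) -> Tuple[List[List[float]], List[int]]:
    """Zero-pad `frames` to `max_frames` rows and build one done flag per row.

    Hypothetical helper, not taken from prepro.py.
    """
    n_frames = len(frames)
    if n_frames == 0:
        raise ValueError("need at least one frame")
    if n_frames > max_frames:
        raise ValueError("utterance is longer than max_frames")

    dim = len(frames[0])
    # Padding rows are all-zero feature vectors.
    padding = [[0.0] * dim for _ in range(max_frames - n_frames)]
    padded = [list(frame) for frame in frames] + padding

    # Assumed convention: 0 while speech continues,
    # 1 from the last real frame through the end of the padding.
    dones = [0] * (n_frames - 1) + [1] * (max_frames - n_frames + 1)
    return padded, dones


# Toy "mel spectrogram": 3 frames with 2 bins each.
mel = [[0.2, 0.5], [0.4, 0.1], [0.3, 0.3]]
padded, dones = pad_with_done_flags(mel, max_frames=5)
print(dones)  # [0, 0, 1, 1, 1]
```

A target of this shape lets the decoder learn when to stop generating frames at inference time, when the true utterance length is unknown.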
  
## Papers that referenced this repo

  * [Fitting New Speakers Based on a Short Untranscribed Sample](https://arxiv.org/abs/1802.06984)