
automatic-data-generation

Automatic data generation with conditional variational autoencoders (CVAEs) -- Internship by Stéphane

Install

Requirements: Python 3.6, pip, and virtualenv

virtualenv venv
. venv/bin/activate
pip install -e .

You might need to download some NLTK resources:

  >>> import nltk
  >>> nltk.download('punkt')

Dataset

The module automatic_data_generation/data/base_dataset.py defines the abstract class Dataset, which provides the interface for representing a training dataset. To support a new dataset format, write a class that inherits from Dataset and implements its abstract methods.
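
As an illustration of the inheritance pattern only, here is a minimal sketch. The Dataset class below is a stand-in written for this example, and the method name load, the CsvDataset class, and the CSV column names are all hypothetical; the real abstract methods are the ones declared in base_dataset.py.

```python
from abc import ABC, abstractmethod
import csv
import io


class Dataset(ABC):
    """Stand-in for the project's abstract Dataset (method names are hypothetical)."""

    @abstractmethod
    def load(self, stream):
        """Read records from a text stream and return (sentence, intent) pairs."""

    def size(self, stream):
        # Concrete helper that relies on the abstract method.
        return len(self.load(stream))


class CsvDataset(Dataset):
    """Hypothetical format: a header row, then one 'sentence,intent' row per line."""

    def load(self, stream):
        reader = csv.DictReader(stream)
        return [(row["sentence"], row["intent"]) for row in reader]


sample = io.StringIO(
    "sentence,intent\n"
    "will it rain today,GetWeather\n"
    "play some jazz,PlayMusic\n"
)
pairs = CsvDataset().load(sample)
```

Because load is abstract, instantiating Dataset directly raises a TypeError, which is how a missing method in a new format class would show up.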

Training

Use the script automatic_data_generation/train_and_eval_cvae.py to train a model, generate sentences, and evaluate their quality.

python automatic_data_generation/train_and_eval_cvae.py --dataset-size 200 --n-generated 1000 --n-epochs 5 --none-size 100 --none-type subtitles --restrict-to-intent GetWeather PlayMusic
  • --dataset-size: number of sentences in the training dataset
  • --none-size: number of None sentences to be added to the training dataset
  • --none-type: type of None sentences
  • --restrict-to-intent: list of intents to filter on for training
  • --n-epochs: number of epochs for training
  • --n-generated: number of generated sentences
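
To show how the documented flags fit together, here is a hedged argparse sketch. It is not the script's actual parser: the argument types, the absence of defaults, and the use of nargs="+" for --restrict-to-intent are assumptions inferred from the example command.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented flags; the real script's
    # types and defaults may differ.
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("--dataset-size", type=int, help="number of training sentences")
    parser.add_argument("--none-size", type=int, help="number of None sentences to add")
    parser.add_argument("--none-type", type=str, help="type of None sentences")
    parser.add_argument("--restrict-to-intent", nargs="+", help="intents to keep for training")
    parser.add_argument("--n-epochs", type=int, help="number of training epochs")
    parser.add_argument("--n-generated", type=int, help="number of sentences to generate")
    return parser


# Parse the same options as the example command.
args = build_parser().parse_args(
    "--dataset-size 200 --n-generated 1000 --n-epochs 5 --none-size 100 "
    "--none-type subtitles --restrict-to-intent GetWeather PlayMusic".split()
)
```

With this sketch, args.restrict_to_intent is a list of intent names, so several intents can be passed after a single flag, as in the example command.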

Output folder

An output folder is created with the following elements:

  • model: a folder with a model.pth file and its associated config.json
  • tensorboard: a folder with the checkpoints for tensorboard
  • run.pkl: a dictionary with all runtime parameters
  • train_*.csv: the training dataset
  • train_*_augmented.csv: the training dataset augmented with generated sentences
  • validate_*.csv: the validation dataset
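
A run can be inspected afterwards with a small helper like the one below. This is a sketch under assumptions: summarize_run is a hypothetical function, and it assumes run.pkl is a plain pickled dictionary and that the CSV files sit directly in the output folder.

```python
import pickle
from pathlib import Path


def summarize_run(folder):
    """Collect the documented artifacts from an output folder (layout assumed from the list above)."""
    folder = Path(folder)
    # Only unpickle files you produced yourself: pickle can execute arbitrary code.
    with open(folder / "run.pkl", "rb") as f:
        params = pickle.load(f)  # assumed to be a plain pickled dict
    augmented = sorted(p.name for p in folder.glob("train_*_augmented.csv"))
    # train_*.csv also matches the augmented files, so exclude them explicitly.
    train = sorted(p.name for p in folder.glob("train_*.csv") if p.name not in augmented)
    validate = sorted(p.name for p in folder.glob("validate_*.csv"))
    return {"params": params, "train": train, "augmented": augmented, "validate": validate}
```

Note that the pattern train_*.csv also matches train_*_augmented.csv, so the helper filters the augmented files out of the plain training list.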

About

Automatic data generation implemented by Stephane during an internship at Snips.
