# Lingvo
## What is it?
Lingvo is a framework for building neural networks in TensorFlow, particularly
sequence models.
A list of publications using Lingvo can be found [here](PUBLICATIONS.md).
## Quick start
### Installation
There are multiple ways to set up Lingvo: through pip, docker, or cloning the
repository.
**pip:**
The easiest way to get started is to install the
[Lingvo pip package](https://pypi.org/project/lingvo) (available for Python 3.6
and 3.7).
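A minimal sketch of the pip route (the package name comes from the PyPI link
above; pin a version if you need reproducibility):
```shell
# Install the Lingvo package from PyPI.
pip3 install lingvo
```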
For an example of how to develop and train a custom model, see the
[codelab](https://colab.research.google.com/github/tensorflow/lingvo/blob/master/codelabs/introduction.ipynb).
**docker:**
Docker configurations are provided. Instructions can be found in the comments
at the top of each file.
* [lib.dockerfile](docker/lib.dockerfile) has the Lingvo pip package
preinstalled.
* [dev.dockerfile](docker/dev.dockerfile) can be used to build Lingvo from
sources using Bazel.
[How to install docker.](https://docs.docker.com/install/linux/docker-ce/ubuntu/)
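For example, building and entering the lib image might look like this (a
sketch; the tag `lingvo:lib` is illustrative, and the authoritative invocation
is in the comments at the top of each dockerfile):
```shell
# Build an image with the Lingvo pip package preinstalled
# (the tag "lingvo:lib" is just an example name).
docker build -t lingvo:lib -f docker/lib.dockerfile .
# Start an interactive shell in the container.
docker run --rm -it lingvo:lib bash
```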
**From sources:**
The prerequisites are:
* a TensorFlow 2.0 [installation](https://www.tensorflow.org/install/),
* a `C++` compiler (only g++ 7.3 is officially supported), and
* the bazel build system.
Refer to [docker/dev.dockerfile](docker/dev.dockerfile) for a set of working
requirements.
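Before building, it can help to sanity-check the toolchain (a quick sketch;
the exact version output depends on your setup):
```shell
python3 -c "import tensorflow as tf; print(tf.__version__)"  # expect a 2.x release
g++ --version    # g++ 7.3 is the only officially supported compiler
bazel version
```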
`git clone` the repository, then use bazel to build and run targets directly,
as in the sketch below.
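A minimal sketch of the from-source route, using the trainer target that
appears later in this guide:
```shell
git clone https://github.com/tensorflow/lingvo.git
cd lingvo
# Build the trainer binary; add --config=cuda for GPU support.
bazel build -c opt //lingvo:trainer
```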
### Running the MNIST image model
#### Preparing the input data
**pip:**
```shell
mkdir -p /tmp/mnist
python3 -m lingvo.tools.keras2ckpt --dataset=mnist
```
**bazel:**
```shell
mkdir -p /tmp/mnist
bazel run -c opt //lingvo/tools:keras2ckpt -- --dataset=mnist
```
The following files will be created in `/tmp/mnist`:
* `mnist.data-00000-of-00001`: 53MB.
* `mnist.index`: 241 bytes.
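To confirm the conversion succeeded, you can list the directory:
```shell
ls -lh /tmp/mnist
# mnist.data-00000-of-00001  mnist.index
```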
#### Running the model
**pip:**
```shell
cd /tmp/mnist
curl -O https://raw.githubusercontent.com/tensorflow/lingvo/master/lingvo/tasks/image/params/mnist.py
python3 -m lingvo.trainer --run_locally=cpu --mode=sync --model=mnist.LeNet5 --logdir=/tmp/mnist/log
```
**bazel:**
```shell
# cpu:
bazel build -c opt //lingvo:trainer
# gpu:
bazel build -c opt --config=cuda //lingvo:trainer
bazel-bin/lingvo/trainer --run_locally=cpu --mode=sync --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr
```
After about 20 seconds, the loss should drop below 0.3 and a checkpoint will be
saved, as shown below. Stop the trainer with Ctrl+C.
```
trainer.py:518] step: 205, steps/sec: 11.64 ... loss:0.25747201 ...
checkpointer.py:115] Save checkpoint
checkpointer.py:117] Save checkpoint done: /tmp/mnist/log/train/ckpt-00000205
```
Some artifacts will be produced in `/tmp/mnist/log/control`:
* `params.txt`: hyper-parameters.
* `model_analysis.txt`: model sizes for each layer.
* `train.pbtxt`: the training `tf.GraphDef`.
* `events.*`: a TensorBoard events file.
As well as in `/tmp/mnist/log/train`:
* `checkpoint`: a text file containing information about the checkpoint files.
* `ckpt-*`: the checkpoint files.
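Since `events.*` is a standard TensorBoard events file, you can inspect the run
with TensorBoard (assuming it is installed, e.g. alongside TensorFlow):
```shell
tensorboard --logdir=/tmp/mnist/log
# then open http://localhost:6006 in a browser
```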
Now, let's evaluate the model on the "Test" dataset. In a normal training
setup, the trainer and the evaler run at the same time as two separate
processes (see the sketch after the commands below).
**pip:**
```shell
python3 -m lingvo.trainer --job=evaler_test --run_locally=cpu --model=mnist.LeNet5 --logdir=/tmp/mnist/log
```
**bazel:**
```shell
bazel-bin/lingvo/trainer --job=evaler_test --run_locally=cpu --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr
```
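For the concurrent setup described above, both jobs simply point at the same
`--logdir`; with the bazel build it might look like this (a sketch; run the two
commands in separate terminals):
```shell
# terminal 1: the trainer
bazel-bin/lingvo/trainer --run_locally=cpu --mode=sync \
  --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr

# terminal 2: the evaler, polling the same logdir for new checkpoints
bazel-bin/lingvo/trainer --job=evaler_test --run_locally=cpu \
  --model=image.mnist.LeNet5 --logdir=/tmp/mnist/log --logtostderr
```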
Kill the job with Ctrl+C when it starts waiting for a new checkpoint.
```
base_runner.py:177] No new check point is found: /tmp/mnist/log/train/ckpt-00000205
```
The evaluation accuracy can be found slightly earlier in the logs.
```
base_runner.py:111] eval_test: step: 205, acc5: 0.99775392, accuracy: 0.94150388, ..., loss: 0.20770954, ...
```
### Running the machine translation model
To run a more elaborate model, you'll need a cluster with GPUs. Please refer to
[`third_party/py/lingvo/tasks/mt/README.md`](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/README.md)
for more information.
## Current models
### Automatic Speech Recognition
* [asr.librispeech.Librispeech960Grapheme](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/asr/params/librispeech.py)<sup>1,2</sup>
* [asr.librispeech.Librispeech960Wpm](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/asr/params/librispeech.py)<sup>1,2</sup>
### Car
* [car.kitti.StarNetCarModel0701](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/kitti.py)<sup>3</sup>
* [car.kitti.StarNetPedCycModel0704](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/kitti.py)<sup>3</sup>
* [car.waymo.StarNetVehicle](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/waymo.py)<sup>3</sup>
* [car.waymo.StarNetPed](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/car/params/waymo.py)<sup>3</sup>
### Image
* [image.mnist.LeNet5](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/image/params/mnist.py)<sup>4</sup>
### Language Modelling
* [lm.one_billion_wds.WordLevelOneBwdsSimpleSampledSoftmax](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/lm/params/one_billion_wds.py)<sup>5</sup>
### Machine Translation
* [mt.wmt14_en_de.WmtEnDeTransformerBase](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmt14_en_de.py)<sup>6</sup>
* [mt.wmt14_en_de.WmtEnDeRNMT](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmt14_en_de.py)<sup>6</sup>
* [mt.wmtm16_en_de.WmtCaptionEnDeTransformer](https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/mt/params/wmtm16_en_de.py)<sup>6</sup>
\[1]: [Listen, Attend and Spell](https://arxiv.org/pdf/1508.01211.pdf). William
Chan, Navdeep Jaitly, Quoc V. Le, and Oriol Vinyals. ICASSP 2016.
\[2]: [End-to-end Continuous Speech Recognition using Attention-based Recurrent
NN: First Results](https://arxiv.org/pdf/1412.1602.pdf). Jan Chorowski, Dzmitry
Bahdanau, Kyunghyun Cho, and Yoshua Bengio. arXiv 2014.
\[3]:
[StarNet: Targeted Computation for Object Detection in Point Clouds](https://arxiv.org/pdf/1908.11069.pdf).
Jiquan Ngiam, Benjamin Caine, Wei Han, Brandon Yang, Yuning Chai, Pei Sun, Yin
Zhou, Xi Yi, Ouais Alsharif, Patrick Nguyen, Zhifeng Chen, Jonathon Shlens, and
Vijay Vasudevan. arXiv 2019.
\[4]:
[Gradient-based learning applied to document recognition](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf).
Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. IEEE 1998.
\[5]:
[Exploring the Limits of Language Modeling](https://arxiv.org/pdf/1602.02410.pdf).
Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu.
arXiv 2016.
\[6]: [The Best of Both Worlds: Combining Recent Advances in Neural Machine
Translation](http://aclweb.org/anthology/P18-1008). Mia X. Chen, Orhan Firat,
Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Mike
Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz
Kaiser, Zhifeng Chen, Yonghui Wu, and Macduff Hughes. ACL 2018.
## References
* [API Docs](https://tensorflow.github.io/lingvo/)
* [Codelab](https://colab.research.google.com/github/tensorflow/lingvo/blob/master/codelabs/introduction.ipynb)
Please cite this [paper](https://arxiv.org/abs/1902.08295) when referencing
Lingvo.
```bibtex
@misc{shen2019lingvo,
    title={Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling},
    author={Jonathan Shen and Patrick Nguyen and Yonghui Wu and Zhifeng Chen and others},
    year={2019},
    eprint={1902.08295},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```