# DeepPavlov

**Repository Path**: nebel/DeepPavlov

## Basic Information

- **Project Name**: DeepPavlov
- **Description**: DeepPavlov is an open-source conversational AI library built on TensorFlow and Keras. It is intended for NLP and dialog systems research, and for implementing and evaluating complex dialog systems
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 9
- **Created**: 2018-03-05
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

[![License Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/deepmipt/DeepPavlov/blob/master/LICENSE) ![Python 3.6](https://img.shields.io/badge/python-3.6-green.svg)

# DeepPavlov
### *We are in a really early Alpha release. You have to be ready for hard adventures.*

### *If you have updated to version 0.0.2, please re-download all pretrained models.*

An open-source conversational AI library, built on TensorFlow and Keras, and designed for

 * NLP and dialog systems research
 * implementation and evaluation of complex conversational systems

Our goal is to provide researchers with:

 * a framework for implementing and testing their own dialog models, with subsequent sharing of those models
 * a set of predefined NLP models / dialog system components (ML/DL/rule-based) and pipeline templates
 * a benchmarking environment for conversational models and systematized access to relevant datasets

and AI-application developers with:

 * a framework for building conversational software
 * tools for integrating applications with adjacent infrastructure (messengers, helpdesk software, etc.)

## Features

| Component | Description |
| --------- | ----------- |
| [Slot filling and NER components](deeppavlov/models/ner/README.md) | Based on a neural Named Entity Recognition network and fuzzy Levenshtein search to extract normalized slot values from text. The NER component reproduces the architecture from the paper [Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition](https://arxiv.org/pdf/1709.09686.pdf), which is inspired by the Bi-LSTM+CRF architecture from https://arxiv.org/pdf/1603.01360.pdf. |
| [Intent classification component](deeppavlov/models/classifiers/intents/README.md) | Based on the shallow-and-wide Convolutional Neural Network architecture from [Kim Y. Convolutional neural networks for sentence classification – 2014](https://arxiv.org/pdf/1408.5882). The model allows multilabel classification of sentences. |
| [Automatic spelling correction component](deeppavlov/models/spellers/error_model/README.md) | Based on [An Improved Error Model for Noisy Channel Spelling Correction by Eric Brill and Robert C. Moore](http://www.aclweb.org/anthology/P00-1037); uses a statistics-based error model, a static dictionary and an ARPA language model to correct spelling errors. |
| **Skill** | |
| [Goal-oriented bot](deeppavlov/skills/go_bot/README.md) | Based on the Hybrid Code Networks (HCNs) architecture from [Jason D. Williams, Kavosh Asadi, Geoffrey Zweig, Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning – 2017](https://arxiv.org/abs/1702.03274). It predicts responses in goal-oriented task dialogue. The model is quite customizable: embeddings, a slot filler and an intent classifier can each be used or omitted on demand. |
| **Embeddings** | |
| [Pre-trained embeddings for the Russian language](pretrained-vectors.md) | Word vectors for the Russian language pre-trained on a joint [Russian Wikipedia](https://ru.wikipedia.org/wiki/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0) and [Lenta.ru](https://lenta.ru/) corpus. |
## Basic examples

View the video demo of deploying a goal-oriented bot and a slot-filling model with a Telegram UI:

[![Alt text for your video](https://img.youtube.com/vi/yzoiCa_sMuY/0.jpg)](https://youtu.be/yzoiCa_sMuY)

* Run the goal-oriented bot with the Telegram interface:
    ```
    python deep.py interactbot configs/go_bot/gobot_dstc2.json -t <TELEGRAM_TOKEN>
    ```
* Run the goal-oriented bot with the console interface:
    ```
    python deep.py interact configs/go_bot/gobot_dstc2.json
    ```
* Run the slot-filling model with the Telegram interface:
    ```
    python deep.py interactbot configs/ner/slotfill_dstc2.json -t <TELEGRAM_TOKEN>
    ```
* Run the slot-filling model with the console interface:
    ```
    python deep.py interact configs/ner/slotfill_dstc2.json
    ```

## Conceptual overview

### Principles

The library is designed following these principles:

* end-to-end deep learning architecture as a long-term goal
* hybrid ML/DL/rule-based architecture as the current approach
* modular dialog system architecture
* component-based software engineering, maximization of reusability
* easy to extend and benchmark
* multiple components per NLP task with data-driven selection of suitable components

### Target Architecture

Target architecture of our library:

DeepPavlov is built on top of machine learning frameworks (TensorFlow, Keras). Other external libraries can be used to build basic components.

### Key Concepts

* `Agent` - a conversational agent communicating with users in natural language (text)
* `Skill` - a unit of interaction that fulfills a user's need. Typically, a user's need is fulfilled by presenting information or completing a transaction (e.g. answering a question from an FAQ, booking tickets, etc.); however, for some experiences success is defined as continued engagement (e.g. chit-chat)
* `Components` - atomic functionality blocks
* `Rule-based Components` - cannot be trained
* `Machine Learning Components` - can be trained only separately
* `Deep Learning Components` - can be trained separately and in end-to-end mode when joined in a chain
* `Switcher` - a mechanism by which the agent ranks and selects the final response shown to the user
* `Components Chainer` - a tool for building an agent/component pipeline from heterogeneous components (rule-based/ML/DL), which allows training and inferring the pipeline as a whole

### Contents

 * [Installation](#installation)
 * [Quick start](#quick-start)
 * [Technical overview](#technical-overview)
 * [Project modules](#project-modules)
 * [Config](#config)
 * [Training](#training)
 * [Train config](#train-config)
 * [Train parameters](#train-parameters)
 * [DatasetReader](#datasetreader)
 * [Dataset](#dataset)
 * [Inferring](#inferring)
 * [License](#license)
 * [Support and collaboration](#support-and-collaboration)
 * [The Team](#the-team)

## Installation

1. Create a virtual environment with `Python 3.6`:
    ```
    virtualenv env
    ```
2. Activate the environment:
    ```
    source ./env/bin/activate
    ```
3. Clone the repo and `cd` to the project root:
    ```
    git clone https://github.com/deepmipt/DeepPavlov.git
    cd DeepPavlov
    ```
4. Install the requirements:
    ```
    python setup.py develop
    ```
5. Install `spacy` dependencies:
    ```
    python -m spacy download en
    ```

## Quick start

To interact with our pre-trained models, they should be downloaded first:

```
python download.py [-all]
```

* the `[-all]` option is not required for the basic examples; it will download **all** our pre-trained models
* Warning! `[-all]` requires about 10 GB of free disk space

Then models can be interacted with or trained with the following command:

```
python deep.py <mode> <path_to_config>
```

* `<mode>` can be 'train', 'interact' or 'interactbot'
* `<path_to_config>` should be a path to an NLP pipeline json config

For the 'interactbot' mode you should specify a Telegram bot token in the `-t` parameter or in the `TELEGRAM_TOKEN` environment variable.

Available model configs are:

*configs/go_bot/gobot_dstc2.json*

*configs/intents/intents_dstc2.json*

*configs/ner/slotfill_dstc2.json*

*configs/error_model/brillmoore_wikitypos_en.json*
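The same two actions can also be triggered from Python code instead of the `deep.py` script. Below is a minimal, hedged sketch that assumes, as `deep.py` does, that the training and inference entry points mentioned in the Technical overview below accept a path to a pipeline config:

```python
# Hedged sketch: programmatic equivalents of `python deep.py train/interact <config>`.
# Assumes both functions accept a path to a pipeline JSON config, as deep.py passes one.
from deeppavlov.core.commands.train import train_model_from_config
from deeppavlov.core.commands.infer import interact_model

PIPELINE_CONFIG_PATH = 'configs/go_bot/gobot_dstc2.json'

train_model_from_config(PIPELINE_CONFIG_PATH)  # same as: python deep.py train <config>
interact_model(PIPELINE_CONFIG_PATH)           # same as: python deep.py interact <config>
```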
---

## Technical overview

### Project modules

| Module | Description |
| --- | --- |
| `deeppavlov.core.commands` | basic training and inferring functions |
| `deeppavlov.core.common` | registration and class initialization functionality, class method decorators |
| `deeppavlov.core.data` | basic `Dataset`, `DatasetReader` and `Vocab` classes |
| `deeppavlov.core.models` | abstract model classes and interfaces |
| `deeppavlov.dataset_readers` | concrete `DatasetReader` classes |
| `deeppavlov.datasets` | concrete `Dataset` classes |
| `deeppavlov.models` | concrete `Model` classes |
| `deeppavlov.skills` | `Skill` classes; skills are dialog models |
| `deeppavlov.vocabs` | concrete `Vocab` classes |
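For orientation, the snippet below gathers in one place the imports from these modules that appear later in this README; it is only a hedged illustration of where the key names live, not an exhaustive API listing.

```python
# Hedged orientation sketch: representative names from the modules listed above.
# All of these import paths are referenced elsewhere in this README.
from deeppavlov.core.commands.train import train_model_from_config  # training entry point
from deeppavlov.core.commands.infer import interact_model           # inference entry point
from deeppavlov.core.common.registry import register                # component registration decorator
from deeppavlov.core.data.dataset_reader import DatasetReader       # base DatasetReader class
from deeppavlov.core.models.component import Component              # abstract Component class
```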
### Config

An NLP pipeline config is a JSON file that contains one required element, `chainer`:

```json
{
  "chainer": {
    "in": ["x"],
    "in_y": ["y"],
    "pipe": [
      ...
    ],
    "out": ["y_predicted"]
  }
}
```

The chainer is a core concept of the DeepPavlov library: it builds a pipeline from heterogeneous components (rule-based/ML/DL) and allows the pipeline to be trained and inferred as a whole. Each component in the pipeline specifies its inputs and outputs as arrays of names, for example `"in": ["tokens", "features"]` and `"out": ["token_embeddings", "features_embeddings"]`, and you can chain the outputs of one component with the inputs of other components:

```json
{
  "name": "str_lower",
  "in": ["x"],
  "out": ["x_lower"]
},
{
  "name": "nltk_tokenizer",
  "in": ["x_lower"],
  "out": ["x_tokens"]
},
```

Each [Component](deeppavlov/core/models/component.py) in the pipeline must implement the `__call__` method and have a `name` parameter, which is its registered codename; it can have any other parameters, repeating its `__init__()` method arguments. Default values of `__init__()` arguments will be overridden with the config values during class instance initialization.

You can reuse components in the pipeline to process different parts of the data with the help of the `id` and `ref` parameters:

```json
{
  "name": "nltk_tokenizer",
  "id": "tokenizer",
  "in": ["x_lower"],
  "out": ["x_tokens"]
},
{
  "ref": "tokenizer",
  "in": ["y"],
  "out": ["y_tokens"]
},
```
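To make the component contract concrete, here is a minimal, hedged sketch of a custom registered component. The codename `my_str_lower`, the class name and its behavior are hypothetical, invented for illustration; the base class and decorator are the ones referenced elsewhere in this README.

```python
# Hedged sketch of a minimal pipeline component; `my_str_lower` and the class are
# hypothetical. In a config it could be wired up as
# {"name": "my_str_lower", "in": ["x"], "out": ["x_lower"]}.
from deeppavlov.core.common.registry import register
from deeppavlov.core.models.component import Component


@register('my_str_lower')  # registered codename used as "name" in the config
class MyStrLower(Component):
    def __init__(self, **kwargs):
        # any parameters from the config would arrive here, overriding __init__ defaults
        pass

    def __call__(self, batch):
        # receives the values named in "in", returns the values named in "out"
        return [utterance.lower() for utterance in batch]
```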
### Training

There are two abstract classes for trainable components: **Estimator** and **NNModel**.

[**Estimators**](deeppavlov/core/models/estimator.py) are fit once on any data with no batching or validation patience, so fitting can be done painlessly at the time of pipeline initialization. [Vocab](deeppavlov/core/data/vocab.py) is a good example of an Estimator. The `fit` method has to be implemented for each Estimator.

[**NNModel**](deeppavlov/core/models/nn_model.py) requires more complex training. It trains on the same data it predicts on, together with ground truth answers. The process takes multiple epochs with periodic validation and logging. The `train_on_batch` method has to be implemented for each NNModel.

Training is triggered by the `deeppavlov.core.commands.train.train_model_from_config()` function.

### Train config

Estimators that are trained should also have a `fit_on` parameter with a list of input parameter names. An NNModel should have an `in_y` parameter with a list of ground truth answer names. For example:

```json
[
  {
    "id": "classes_vocab",
    "name": "default_vocab",
    "fit_on": ["y"],
    "level": "token",
    "save_path": "vocabs/classes.dict",
    "load_path": "vocabs/classes.dict"
  },
  {
    "in": ["x"],
    "in_y": ["y"],
    "out": ["y_predicted"],
    "name": "intent_model",
    "save_path": "intents/intent_cnn",
    "load_path": "intents/intent_cnn",
    "classes_vocab": {
      "ref": "classes_vocab"
    }
  }
]
```

A config for training the pipeline has to have three additional elements: `dataset_reader`, `dataset` and `train`:

```json
{
  "dataset_reader": {
    "name": ...,
    ...
  },
  "dataset": {
    "name": ...,
    ...
  },
  "chainer": {
    ...
  },
  "train": {
    ...
  }
}
```

### Train Parameters

* `epochs` — maximum number of epochs to train the NNModel; defaults to `-1` (infinite)
* `batch_size`
* `metrics` — list of names of [registered metrics](deeppavlov/metrics) to evaluate the model on; the first one in the list is used for validation patience
* `metric_optimization` — one of `maximize` or `minimize`; defaults to `maximize`
* `validation_patience` — how many times in a row the validation metric has to not improve before training is stopped; defaults to `5`
* `val_every_n_epochs` — how often to validate the pipe; defaults to `-1` (never)
* `log_every_n_batches`, `log_every_n_epochs` — how often to calculate metrics on train data; default to `-1` (never)
* `validate_best`, `test_best` — flags to infer the best saved model on valid and test data; default to `true`

### DatasetReader

The `DatasetReader` class reads data and returns it in a specified format. A concrete `DatasetReader` class should be inherited from the base `deeppavlov.core.data.dataset_reader.DatasetReader` class and registered with a codename:

```python
from deeppavlov.core.common.registry import register
from deeppavlov.core.data.dataset_reader import DatasetReader

@register('dstc2_datasetreader')
class DSTC2DatasetReader(DatasetReader):
```

### Dataset

A `Dataset` forms the needed sets of data ('train', 'valid', 'test') and forms data batches. A concrete `Dataset` class should be registered and can be inherited from the `deeppavlov.core.data.dataset.Dataset` class. `deeppavlov.core.data.dataset.Dataset` is not an abstract class and can be used as a `Dataset` as well.

### Inferring

All components inherited from the `deeppavlov.core.models.component.Component` abstract class can be inferred. The `__call__()` method should return what the component can do. For example, a *tokenizer* should return *tokens*, a *NER recognizer* should return *recognized entities*, a *bot* should return a *reply*. The particular format of the returned data should be defined in `__call__()`.

Inferring is triggered by the `deeppavlov.core.commands.infer.interact_model()` function. There is no need for a separate JSON config for inferring.

## License

DeepPavlov is Apache 2.0 - licensed.

## Support and collaboration

If you have any questions, bug reports or feature requests, please feel free to post on our [Github Issues](https://github.com/deepmipt/DeepPavlov/issues) page. Please tag your issue with 'bug', 'feature request', or 'question'. Also, we'll be glad to see your pull requests adding new datasets, models, embeddings, etc.

## The Team

DeepPavlov is built and maintained by the [Neural Networks and Deep Learning Lab](https://mipt.ru/english/research/labs/neural-networks-and-deep-learning-lab) at [MIPT](https://mipt.ru/english/) within the [iPavlov](http://ipavlov.ai/) project (part of the [National Technology Initiative](https://asi.ru/eng/nti/)) and in partnership with [Sberbank](http://www.sberbank.com/).