# NLP

**Repository Path**: daiyizheng_admin/nlp

## Basic Information

- **Project Name**: NLP
- **Description**: Natural language processing technology stack
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-12-14
- **Last Updated**: 2021-05-09

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# NLP-tutorial

## Dependencies

- pytorch 1.5.1
- tensorflow 2.1
- transformers 3.3.1
- scikit-learn

## Common Basic Text Algorithms

## Machine Learning (sklearn)

- 01 [tree](https://gitee.com/daiyizheng/nlp/blob/master/01-sklearn-tutoral/01-Tree/tree.py)
- 02 [randomForest](https://gitee.com/daiyizheng/nlp/blob/master/01-sklearn-tutoral/02-RandomForest/randomForest.py)
- 03 [k-means](https://gitee.com/daiyizheng/nlp/blob/master/01-sklearn-tutoral/03-k-means/k-means.py)
- 04 [SVM](https://gitee.com/daiyizheng/nlp/blob/master/01-sklearn-tutoral/04-SVM/svm-linear.py)
- 05 [XGBoost](https://gitee.com/daiyizheng/nlp/blob/master/01-sklearn-tutoral/05-XGBoost/Xgboost.py)

## TensorFlow 2.x Basics

- 01 [Data Types](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/01-数据类型.ipynb)
- 02 [Tensor](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/02-Tensor.ipynb)
- 03 [Indexing and Slicing](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/03-索引切片.ipynb)
- 04 [Dimension Transformations](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/04-维度变换.ipynb)
- 05 [Broadcasting](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/05-Broadcasting.ipynb)
- 06 [Mathematical Operations](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/06-数学计算.ipynb)
- 07 [Forward Propagation](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/07-前向传播.ipynb)
- 08 [Concatenation and Splitting](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/08-合并与分割.ipynb)
- 09 [Data Statistics](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/09-数据统计.ipynb)
- 10 [Tensor Sorting](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/10-张量排序.ipynb)
- 11 [Padding and Tiling](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/11-填充和复制.ipynb)
- 12 [Tensor Clipping](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/12-张量与限幅.ipynb)
- 13 [Advanced Operations](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/13-高级操作.ipynb)
- 14 [Data Loading](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/14-数据加载.ipynb)
- 15 [Tensors in Practice](https://gitee.com/daiyizheng/nlp/blob/master/02-tensorflow2-tutorial/15-张量实战.ipynb)

## PyTorch Basics

## PaddlePaddle Basics

## Deep Learning (PyTorch, TensorFlow; partly adapted from https://github.com/graykode/nlp-tutorial)

1. Basic Embedding Models
   - 1-1. NNLM (Neural Network Language Model) - Predict the Next Word
     - Paper: [A Neural Probabilistic Language Model (2003)](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
     - Blog: [NNLM Principles](https://daiyizheng.github.io/2020/07/06/nnlm/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/1-1NNLM/NNLM-torch.py)
   - 1-2. Word2Vec (Skip-gram) - Embed Words and Plot the Result
     - Paper: [Distributed Representations of Words and Phrases and their Compositionality (2013)](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)
     - Blog: [Word2Vec Principles](https://daiyizheng.github.io/2020/07/05/word2vec/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/1-2Word2Vec/Word2Vec-torch.py)
   - 1-3. FastText (Application Level)
     - Paper: [Bag of Tricks for Efficient Text Classification (2016)](https://arxiv.org/pdf/1607.01759.pdf)
     - Blog: [FastText Principles](https://daiyizheng.github.io/2020/07/26/fasttext/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/1-3FastText/FastText-torch.py)
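The skip-gram model listed above trains on (center word, context word) pairs drawn from a sliding window. As a minimal, dependency-free sketch of that pair-extraction step (the toy sentence and window size are illustrative only, not the repository's code):

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as used by skip-gram Word2Vec.

    For each position i, every token within `window` positions of i
    (excluding i itself) becomes a context word for tokens[i].
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "natural language processing with neural networks".split()
pairs = skipgram_pairs(sentence, window=2)
```

In a real Word2Vec implementation these pairs (mapped to vocabulary indices) would feed an embedding layer trained with negative sampling or hierarchical softmax; the sketch only shows where the training signal comes from.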
2. CNN (Convolutional Neural Network)
   - 2-1. TextCNN
     - Paper: [Convolutional Neural Networks for Sentence Classification (2014)](http://www.aclweb.org/anthology/D14-1181)
     - Blog: [TextCNN Principles](https://daiyizheng.github.io/2020/08/27/textcnn/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/2-1TextCNN/TextCNN-torch.py)
3. RNN (Recurrent Neural Network)
   - 3-1. TextRNN - Predict the Next Step
     - Paper: [Finding Structure in Time (1990)](http://psych.colorado.edu/~kimlab/Elman1990.pdf)
     - Blog: [RNN Principles](https://daiyizheng.github.io/2020/06/06/rnn/#toc-heading-17)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/3-1TextRNN/TextRNN-torch.py)
   - 3-2. TextLSTM - Autocomplete
     - Paper: [Long Short-Term Memory (1997)]()
     - Blog: [RNN Principles](https://daiyizheng.github.io/2020/06/06/rnn/#toc-heading-17)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/3-2TextLSTM/TextLSTM-torch.py)
   - 3-3. Bi-LSTM - Predict the Next Word in a Long Sentence
     - Blog: [RNN Principles](https://daiyizheng.github.io/2020/06/06/rnn/#toc-heading-17)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/3-3Bi-LSTM/Bi-LSTM-torch.py)
4. Attention Mechanism
   - 4-1. Seq2Seq - Change Word
     - Paper: [Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014)](https://arxiv.org/pdf/1406.1078.pdf)
     - Blog: [Seq2Seq Principles](https://daiyizheng.github.io/2020/08/15/seq2seq/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/4-1Seq2Seq/Seq2Seq-torch.py)
   - 4-2. Seq2Seq with Attention - Translate
     - Paper: [Neural Machine Translation by Jointly Learning to Align and Translate (2014)](https://arxiv.org/abs/1409.0473)
     - Blog: [Seq2Seq with Attention Principles](https://daiyizheng.github.io/2020/08/16/seq2seq-with-attention/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/4-2Seq2Seq(Attention)/Seq2Seq(Attention)-torch.py)
   - 4-3. Bi-LSTM with Attention - Binary Sentiment Classification
     - Code: [torch]()
5. Models Based on the Transformer
   - 5-1. The Transformer - Translate
     - Paper: [Attention Is All You Need (2017)](https://arxiv.org/abs/1706.03762)
     - Blog: [Transformer Principles](https://daiyizheng.github.io/2020/08/26/transformer-yuan-li/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/5-1Transformer/Transformer-torch.py)
   - 5-2. BERT - Next-Sentence Classification & Masked-Token Prediction
     - Paper: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018)](https://arxiv.org/abs/1810.04805)
     - Blog: [BERT Principles](https://daiyizheng.github.io/2020/08/27/bert-yuan-li/)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/5-2BERT/BERT-torch.py)
   - 5-3. ELMo
     - Paper: [ELMo](#)
     - Blog: [ELMo Principles](https://daiyizheng.github.io/#)
     - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/05-nlp-tutorial/5-3Elmo/Elmo-ELMoForManyLangs.py)

## Deep Learning: BERT Derivatives

1. XLNet
   - Paper: []()
   - Blog: [](https://daiyizheng.github.io/#)
   - Code: [torch](https://gitee.com/daiyizheng/nlp/blob/master/#)

## Introductory Deep Learning Projects

### huggingface

## Advanced NLP
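The Transformer and BERT tutorials above all build on scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. As a dependency-free sketch of that single operation on plain Python lists (the toy 2×2 matrices are illustrative only, not the repository's code):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V on row-major nested lists."""
    d_k = len(K[0])
    # scores[i][j] = (query i . key j) / sqrt(d_k)
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in K]
              for q_row in Q]
    # Each row of weights is a probability distribution over the keys.
    weights = [softmax(row) for row in scores]
    # Output row i is the weights[i]-weighted average of the value rows.
    out = [[sum(w * v_row[c] for w, v_row in zip(w_row, V))
            for c in range(len(V[0]))]
           for w_row in weights]
    return weights, out

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
weights, out = scaled_dot_product_attention(Q, K, V)
```

Each row of `weights` sums to 1, so every output row is a convex combination of the value rows; the multi-head attention used in the Transformer tutorial runs several such operations in parallel on learned projections of Q, K, and V.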