# PaddleNLP
[简体中文🀄](./README.md) | **English🌎**
------------------------------------------------------------------------------------------
**PaddleNLP** is a Large Language Model (LLM) development suite based on the PaddlePaddle deep learning framework, supporting efficient large model training, lossless compression, and high-performance inference on various hardware devices. With its **simplicity** and **ultimate performance**, PaddleNLP is dedicated to helping developers achieve efficient industrial applications of large models.
## News 📢
* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0)**: Embrace large models and experience a complete upgrade. With a unified large model toolchain, we achieve full-process support for domestically produced computing chips. We fully support industrial-grade workflows for large models, such as PaddlePaddle's 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference. Our self-developed RsLoRA+ algorithm, the Unified Checkpoint full-process checkpoint storage mechanism, and generalized support for FastFFN and FusedQKV all speed up large model training and inference. We continuously track updates to mainstream models to provide efficient solutions.
* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: Our self-developed RsLoRA+ algorithm, with extremely fast convergence, significantly improves the convergence speed and training effectiveness of PEFT training. By introducing high-performance generation acceleration into the RLHF PPO algorithm, we have broken through the generation speed bottleneck in PPO training, achieving a significant lead in PPO training performance. We broadly support multiple large model training performance optimizations such as FastFFN and FusedQKV, making large model training faster and more stable.
* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.0)**: The LLM experience is fully upgraded, with a single unified toolchain entry point: the implementation code for pre-training, fine-tuning, compression, inference, and deployment is consolidated under the `PaddleNLP/llm` directory. The new [LLM Toolchain Documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) provides one-stop guidance from getting started with LLMs to business deployment and launch. The Unified Checkpoint full-process checkpoint storage mechanism greatly improves the portability of LLM storage. The efficient fine-tuning upgrade supports combining efficient fine-tuning with LoRA, and adds support for algorithms such as QLoRA.
* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Release of the [Full-process LLM toolchain](./llm), covering all aspects of pre-training, fine-tuning, compression, inference, and deployment, providing users with end-to-end LLM solutions and a one-stop development experience; built-in [4D parallel distributed Trainer](./docs/trainer.md), [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm/README.md#2-%E7%B2%BE%E8%B0%83), [self-developed INT8/INT4 quantization algorithms](./llm/README.md#4-%E9%87%8F%E5%8C%96), etc.; fully supports [LLaMA 1/2](./llm/config/llama), [BLOOM](./llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), [OPT](./llm/config/opt), and other mainstream LLMs.
## Features
### 🔧 Integrated training and inference on multiple hardware platforms
Our development suite supports large model training and inference on multiple hardware platforms, including NVIDIA GPUs, Kunlun XPUs, Ascend NPUs, Enflame GCUs, and Hygon DCUs. The toolkit's interface allows for quick hardware switching, significantly reducing the research and development costs associated with hardware transitions.
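The single-flag hardware switching described above can be sketched in plain Python: training code is written once against a common interface, and the device is selected by name. `DEVICE_REGISTRY`, `register_device`, and `run_step` below are hypothetical illustrations of the dispatch pattern, not PaddleNLP APIs.

```python
# Minimal sketch of device-agnostic dispatch: each backend registers a kernel
# under a device name, and the training step only changes a string flag.
# This is an illustrative pattern, not the PaddleNLP implementation.

DEVICE_REGISTRY = {}

def register_device(name):
    def wrap(fn):
        DEVICE_REGISTRY[name] = fn
        return fn
    return wrap

@register_device("gpu")
def gpu_matmul(a, b):
    return a * b  # stands in for a CUDA kernel

@register_device("npu")
def npu_matmul(a, b):
    return a * b  # stands in for an Ascend kernel

def run_step(device, a, b):
    """The training step never changes; only the device name does."""
    return DEVICE_REGISTRY[device](a, b)

print(run_step("gpu", 3, 4), run_step("npu", 3, 4))  # 12 12
```

Switching from one accelerator to another then amounts to changing the device string, which is the cost reduction the paragraph above refers to.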
### 🚀 Efficient and easy-to-use pre-training
We support 4D high-performance training with data parallelism, sharding parallelism, tensor parallelism, and pipeline parallelism. The Trainer supports configurable distributed strategies, reducing the cost associated with complex distributed combinations. The Unified Checkpoint large model storage format supports dynamic scaling of model parameter distribution during training, thereby reducing the migration cost caused by hardware switching.
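The Unified Checkpoint idea above can be sketched in plain Python: per-device parameter shards are merged into one canonical flat checkpoint, which can then be re-split for a different device count. `merge_shards` and `split_for_devices` are hypothetical helpers for illustration, not PaddleNLP APIs.

```python
# Sketch of the Unified Checkpoint concept: merge per-device shards into one
# flat checkpoint, then re-shard for a (possibly different) device count.
# merge_shards / split_for_devices are hypothetical, not PaddleNLP APIs.

def merge_shards(shards):
    """Concatenate per-device parameter slices into one flat checkpoint."""
    merged = []
    for shard in shards:
        merged.extend(shard)
    return merged

def split_for_devices(checkpoint, num_devices):
    """Re-shard a flat checkpoint evenly across a new device count."""
    assert len(checkpoint) % num_devices == 0, "parameters must divide evenly"
    size = len(checkpoint) // num_devices
    return [checkpoint[i * size:(i + 1) * size] for i in range(num_devices)]

# Trained on 4 devices, resumed on 2: with a unified format this is a re-split,
# not a hardware-specific conversion.
shards_4 = [[0, 1], [2, 3], [4, 5], [6, 7]]
unified = merge_shards(shards_4)
shards_2 = split_for_devices(unified, 2)
print(shards_2)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Because the stored format is independent of the original device layout, resuming on different hardware avoids a per-topology conversion step, which is the migration-cost reduction described above.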
### 🤗 Efficient fine-tuning and alignment
The fine-tuning and alignment algorithms are deeply integrated with zero-padding data streams and high-performance FlashMask operators, reducing invalid data padding and computation during training, and significantly improving the throughput of fine-tuning and alignment training.
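The benefit of the zero-padding data stream can be made concrete with a small sketch: padding every sample to the batch maximum wastes compute on pad tokens, while packing samples back-to-back into fixed-length sequences does not. `pack_greedy` below is a hypothetical first-fit packer for illustration, not the PaddleNLP implementation.

```python
# Sketch of why zero-padding data streams raise throughput: compare tokens
# consumed with per-batch padding versus greedy sample packing.
# pack_greedy is an illustrative helper, not a PaddleNLP API.

def padded_token_count(lengths):
    """Tokens consumed when every sample is padded to the longest one."""
    return max(lengths) * len(lengths)

def pack_greedy(lengths, max_seq_len):
    """First-fit packing: append each sample to the first bin it fits in."""
    bins = []
    for n in lengths:
        for b in bins:
            if sum(b) + n <= max_seq_len:
                b.append(n)
                break
        else:
            bins.append([n])
    return bins

lengths = [512, 100, 60, 900, 30]
print(padded_token_count(lengths))     # 4500 tokens with per-batch padding
bins = pack_greedy(lengths, max_seq_len=1024)
print(len(bins), sum(map(sum, bins)))  # 2 bins, 1602 real tokens
```

Here padding would spend 4500 token slots on 1602 tokens of real data; packing fits the same data into two 1024-token sequences, so almost no compute is wasted on padding.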
### 🎛️ Lossless compression and high-performance inference
The high-performance inference module of the large model toolkit incorporates dynamic insertion and operator fusion strategies throughout the entire process, greatly accelerating parallel inference speed. The underlying implementation details are encapsulated, enabling out-of-the-box high-performance parallel inference capabilities.
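One operator-fusion strategy mentioned in the release notes, FusedQKV, can be illustrated in miniature: the Q/K/V projections share the same input, so their weight matrices can be concatenated and applied in a single matmul instead of three. The toy weights below are hypothetical; this is a pure-Python sketch of the idea, not the actual fused kernel.

```python
# Sketch of FusedQKV-style fusion: concatenating the Q/K/V weight matrices
# column-wise turns three matmuls over the same input into one.
# Toy 2-dimensional weights chosen for illustration only.

def matmul(x, w):
    """Multiply a row vector x by matrix w (columns are output features)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w)]

x = [1.0, 2.0]                    # one token's hidden state
wq = [[1.0, 0.0], [0.0, 1.0]]     # separate projection weights
wk = [[2.0, 0.0], [0.0, 2.0]]
wv = [[0.5, 0.0], [0.0, 0.5]]

# Unfused: three matmuls over the same input.
q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)

# Fused: concatenate weights column-wise, run one matmul, then slice.
w_fused = [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]
qkv = matmul(x, w_fused)
assert qkv == q + k + v           # identical results, one kernel launch
print(qkv)  # [1.0, 2.0, 2.0, 4.0, 0.5, 1.0]
```

The fused form produces bit-identical outputs while issuing one kernel launch instead of three, which is why such fusions accelerate inference without any accuracy loss.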
------------------------------------------------------------------------------------------
## Supported Models
| Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Weight convert |
|--------------------------------------------|:--------:|:---:|:----:|:-------------:|:---:|:----:|:------------:|:--------------:|
| [LLaMA](./llm/config/llama)                | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [Qwen](./llm/config/qwen)                  | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
| [Mixtral](./llm/config/mixtral)            | ✅ | ✅ | ✅ | ❌ | 🚧 | 🚧 | 🚧 | 🚧 |
| [Baichuan/Baichuan2](./llm/config/llama)   | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ |
| [ChatGLM-6B](./llm/config/chatglm)         | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ❌ |
| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ |
| [Bloom](./llm/config/bloom)                | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ |
| [GPT-3](./llm/config/gpt-3)                | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
| [OPT](./llm/config/opt)                    | 🚧 | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
* ✅: Supported
* 🚧: In Progress
* ❌: Not Supported
Detailed list ๐ [Supported Model List](https://github.com/PaddlePaddle/PaddleNLP/issues/8663)
## Installation
### Prerequisites
- python >= 3.8
- paddlepaddle >= 3.0.0b0
### Pip Installation
```shell
pip install --upgrade paddlenlp
```
Or install the latest code from the develop branch with the following command:
```shell
pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html
```
For more information about PaddlePaddle installation, please refer to [PaddlePaddle's Website](https://www.paddlepaddle.org.cn).
------------------------------------------------------------------------------------------
## Quick Start
### Text generation with a large language model
PaddleNLP provides a convenient and easy-to-use Auto API, which can quickly load models and Tokenizers. Here, we use the `Qwen/Qwen2-0.5B` large model as an example for text generation:
```python
>>> from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
>>> model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="float16")
>>> input_features = tokenizer("你好！请自我介绍一下。", return_tensors="pd")  # "Hello! Please introduce yourself."
>>> outputs = model.generate(**input_features, max_length=128)
>>> tokenizer.batch_decode(outputs[0])
['我是一个AI语言模型，我可以回答各种问题，包括但不限于：天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗？']
# "I am an AI language model. I can answer all kinds of questions, including but not
# limited to: weather, news, history, culture, science, education, entertainment, etc.
# Is there anything you would like to know?"
```
### Pre-training for large language models
```shell
mkdir -p llm/data && cd llm/data
wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.bin
wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.idx
cd .. # change folder to PaddleNLP/llm
python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_pretrain.py ./config/llama/pretrain_argument.json
```
### SFT fine-tuning for large language models
```shell
mkdir -p llm/data && cd llm/data
wget https://bj.bcebos.com/paddlenlp/datasets/examples/AdvertiseGen.tar.gz && tar -zxvf AdvertiseGen.tar.gz
cd .. # change folder to PaddleNLP/llm
python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_finetune.py ./config/llama/sft_argument.json
```
For more steps in the entire large model workflow, please refer to the [Large Model Full-Process Toolchain](./llm).
For more PaddleNLP content, please refer to:
- [Model Library](./legacy/model_zoo): includes end-to-end usage of high-quality pre-trained models.
- [Multi-scenario Examples](./legacy/examples): learn how to use PaddleNLP to solve various NLP problems, including basic techniques, system applications, and extended applications.
- [Interactive Tutorial](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995): quickly learn PaddleNLP on the free computing platform AI Studio.
------------------------------------------------------------------------------------------
## Community
### Slack
To connect with other users and contributors, you are welcome to join our [Slack channel](https://paddlenlp.slack.com/).
### WeChat
Scan the QR code below with WeChat⬇️ to join the official technical exchange group. We look forward to your participation.
## Citation
If you find PaddleNLP useful in your research, please consider citing
```
@misc{paddlenlp,
title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
author={PaddleNLP Contributors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
year={2021}
}
```
## Acknowledgements
We have drawn on the excellent design of Hugging Face's [Transformers](https://github.com/huggingface/transformers)🤗 library for pretrained model usage, and we would like to express our gratitude to the authors at Hugging Face and its open source community.
## License
PaddleNLP is provided under the [Apache-2.0 License](./LICENSE).