# ModelLink

**Repository Path**: wwzhuo/ModelLink

## Basic Information

- **Project Name**: ModelLink
- **Description**: Ascend large language model repository
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: https://gitee.com/ascend/ModelLink/
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 1196
- **Created**: 2024-03-15
- **Last Updated**: 2024-10-12

## Categories & Tags

- **Categories**: Uncategorized
- **Tags**: None

## README
Simplified Chinese | English
ModelLink provides end-to-end solutions for large language models on Ascend chips, covering models, algorithms, and tasks.

## ModelLink Solution Overview

---

## Supported Features

ModelLink currently supports the following features for large model workflows:

* [Dataset Preparation for Pre-training](#jump11) / [Instruction Dataset Preparation for Fine-tuning](#jump12)
* [Pre-training](#jump13) / [Full-parameter Fine-tuning](#jump14) / [Low-parameter Fine-tuning](#jump15)
* [Inference: human-machine dialogue](#jump16)
* [Evaluation with numerous benchmarks](#jump17)
* [Acceleration features (acceleration algorithms + fusion operators)](#jump18)
* [Profiling data on Ascend chips](#jump19)

More novel and useful features for LLM training on Ascend are under development.

## Supported Models

ModelLink currently supports pre-training and fine-tuning for the following models:

| Model | Parameters | Fine-tuning | Inference | Evaluation | Dataset Support | Contributor |
|---|---|---|---|---|---|---|
| Aquila | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Aquila2 | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Community |
| Baichuan | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Baichuan | 13B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Baichuan2 | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Baichuan2 | 13B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Bloom | 7B1 | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Bloom | 176B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| CodeLlama | 34B | -- | inference | evaluation | alpaca_data.json | Contributed by Community |
| InternLM | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| InternLM | 65B | -- | -- | -- | alpaca_data.json | Contributed by Ascend |
| LLaMA | 7B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA | 13B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA | 33B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA | 65B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA2 | 7B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA2 | 13B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA2 | 34B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA2 | 70B | lora | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA3 | 8B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| LLaMA3 | 70B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Qwen | 7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Qwen | 14B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Qwen | 72B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
| Yi | 34B | -- | inference | evaluation | alpaca_data.json | Contributed by Community |
| Mixtral | 8x7B | -- | inference | evaluation | alpaca_data.json | Contributed by Ascend |
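Every entry above lists `alpaca_data.json` as its fine-tuning dataset. As a hedged illustration of what consuming that file involves, the sketch below maps one Alpaca-style record (`instruction`/`input`/`output` fields) to a prompt/target pair using the standard Alpaca prompt template; the exact template and preprocessing pipeline ModelLink applies may differ.

```python
# Hypothetical sketch: turning Alpaca-style records (as in the
# alpaca_data.json referenced above) into (prompt, target) pairs for
# instruction fine-tuning. Templates follow the publicly known Alpaca
# format; ModelLink's actual preprocessing may differ.

ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def to_pair(record: dict) -> tuple:
    """Map one Alpaca record to a (prompt, target) training pair."""
    if record.get("input"):
        prompt = ALPACA_WITH_INPUT.format(**record)
    else:
        prompt = ALPACA_NO_INPUT.format(instruction=record["instruction"])
    return prompt, record["output"]

sample = {
    "instruction": "Name three primary colors.",
    "input": "",
    "output": "Red, yellow, blue.",
}
prompt, target = to_pair(sample)
```

Records with a non-empty `input` field get the longer template; the prompt always ends at `### Response:` so the model learns to generate the target continuation.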
Training performance on Ascend hardware, compared against reference performance:

| Model | Parameters | Cluster Scale | Precision Mode | Performance | Reference Performance | Scripts |
|---|---|---|---|---|---|---|
| Aquila | 7B | 1x8 | BF16 | 2849 | 2874 | train |
| Aquila2 | 7B | 1x8 | FP16 | 3323 | 2673 | train |
| Baichuan | 7B | 1x8 | FP16 | 2685 | 2036 | train |
| Baichuan | 13B | 1x8 | FP16 | 1213 | 862 | train |
| Baichuan2 | 7B | 1x8 | BF16 | 2664 | 3969 | train |
| Baichuan2 | 13B | 1x8 | BF16 | 1668 | 2062 | train |
| Bloom | 7B1 | 1x8 | FP16 | 2034 | 2525 | train |
| Bloom | 176B | 12x8 | BF16 | 100 | 107 | train |
| CodeLlama | 34B | 2x8 | BF16 | 837 | 762 | train |
| InternLM | 7B | 1x8 | BF16 | 2776 | 2854 | train |
| InternLM | 65B | 4x8 | BF16 | 341 | 414 | train |
| LLaMA | 7B | 1x8 | FP16 | 3600 | 3804 | train |
| LLaMA | 13B | 1x8 | FP16 | 1895 | 2012 | train |
| LLaMA | 33B | 4x8 | FP16 | 621 | 776 | train |
| LLaMA | 65B | 4x8 | BF16 | 348 | 426 | train |
| LLaMA2 | 7B | 1x8 | BF16 | 4200 | 3850 | train |
| LLaMA2 | 13B | 1x8 | BF16 | 1990 | 1920 | train |
| LLaMA2 | 34B | 2x8 | BF16 | 690 | 796 | train |
| LLaMA2 | 70B | 8x8 | BF16 | 350 | 339 | train |
| LLaMA3 | 8B | 1x8 | BF16 | 2483 | 2674 | train |
| LLaMA3 | 70B | 8x8 | BF16 | 283 | -- | train |
| Qwen | 7B | 1x8 | BF16 | 2499 | 2867 | train |
| Qwen | 14B | 1x8 | BF16 | 1560 | 1578 | train |
| Qwen | 72B | 16x8 | BF16 | 285 | 345 | train |
| Yi | 34B | 2x8 | BF16 | 809 | 730 | train |
| Mixtral | 8x7B | 2x8 | BF16 | 1054 | 1139 | train |
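Reading the Performance and Reference Performance columns as a ratio makes the comparison easier to scan. A minimal sketch, using the LLaMA2 7B row above (the helper name and the units of the raw numbers are assumptions; only the two values come from the table):

```python
# Relative throughput: measured performance divided by reference
# performance. relative_performance is a hypothetical helper; the two
# values are the LLaMA2 7B row from the table above (4200 vs. 3850).

def relative_performance(measured: float, reference: float) -> float:
    """Ratio above 1.0 means the Ascend result exceeds the reference."""
    return measured / reference

ratio = relative_performance(4200, 3850)
print(f"LLaMA2-7B relative performance: {ratio:.2f}")
```

By this measure the LLaMA2 7B run comes out about 9% above its reference figure, while rows such as Baichuan2 7B (2664 vs. 3969) fall below 1.0.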
Evaluation results on common benchmarks, comparing Ascend accuracy against the reference implementation and the published benchmark score:

| Task | Subset | Model | Ascend | Reference | Benchmark |
|---|---|---|---|---|---|
| BBH | test | Llama7b | 0.334 | 0.333 | 0.335 |
| AGIEval | test | Llama7b | 0.210 | 0.210 | 0.206 |
| HumanEval | test | Llama7b | 0.128 | 0.128 | 0.128 |
| BoolQ | test | Llama7b | 0.742 | 0.742 | 0.754 |
| GSM8K | test | Llama7b | 0.102 | 0.103 | 0.100 |
| CEval | val | Llama7b | 0.408 | 0.404 | / |
| MMLU | test | Llama7b | 0.333 | 0.324 | 0.351 |