# ModelLink-0712-weight-mcore-2

**Repository Path**: liuyanghan/ModelLink-0712-weight-mcore-2

## Basic Information

- **Project Name**: ModelLink-0712-weight-mcore-2
- **Description**: Ascend large language model repository
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: 1.1
- **Homepage**: https://gitee.com/ascend/ModelLink/
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1207
- **Created**: 2024-07-19
- **Last Updated**: 2024-07-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README
ModelLink aims to provide end-to-end large language model solutions for Huawei Ascend chips, including models, algorithms, and downstream tasks.

## ModelLink Solution Overview

---

### Supported features

ModelLink currently supports the following features for large model workflows:

* [Dataset preparation for pre-training](#jump11) / [instruction dataset preparation for fine-tuning](#jump12)
* [Pre-training](#jump13) / [full-parameter fine-tuning](#jump14) / [low-parameter fine-tuning](#jump15)
* [Inference: human-machine dialogue](#jump16)
* [Evaluation with numerous benchmarks](#jump17)
* [Acceleration features (acceleration algorithms + fusion operators)](#jump18)
* [Profiling data on Ascend chips](#jump19)
* [Checkpoint conversion between Hugging Face and Megatron formats](#jump19)
* [Enable deterministic computing on Ascend](#jump21)

More novel and useful features are under development for LLM training on Ascend ...

### Supported Models

ModelLink currently supports pre-training and fine-tuning for the following models:

| Model | Parameters | Sequence length | Pretrain | Inference | LoRA | SFT | Chat | Evaluation | Contributor |
|---|---|---|---|---|---|---|---|---|---|
| Aquila | 7B | 2K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Aquila2 | 7B | 2K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 34B | 4K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| Baichuan | 7B | 4K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 13B | 4K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Baichuan2 | 7B | 4K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 13B | 4K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Bloom | 7B1 | 2K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 176B | 2K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| ChatGLM3 | 6B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| CodeLlama | 34B | 4K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| InternLM | 7B | 2K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 65B | 2K | pretrain | -- | -- | -- | -- | -- | 【Ascend】 |
| LLaMA | 7B | 2K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 13B | 2K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 33B | 2K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 65B | 2K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| LLaMA2 | 7B | 4K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 13B | 4K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 34B | 4K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| | 70B | 4K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| LLaMA3 | 8B | 8K | pretrain | generate | -- | -- | chat | eval | 【Ascend】 |
| | 70B | 8K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Qwen | 7B | 8K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 14B | 2K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 72B | 8K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Qwen1.5 | 0.5B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 1.8B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 4B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 7B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 14B | 8K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| | 32B | 8K | pretrain | generate | lora | -- | -- | eval | 【Community】 |
| | 72B | 8K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| Yi | 34B | 4K | pretrain | generate | -- | -- | -- | eval | 【Community】 |
| Mixtral | 8x7B | 32K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Mistral | 7B | 32K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| Gemma | 2B | 8K | pretrain | generate | -- | -- | -- | eval | 【Ascend】 |
| | 7B | 8K | pretrain | generate | lora | -- | -- | eval | 【Ascend】 |
| GPT3 | 175B | 2K | pretrain | -- | -- | -- | -- | -- | 【Community】 |
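The support matrix above is easy to mirror as plain data and query programmatically, e.g. to list every configuration with LoRA support. A minimal sketch (the data structure, the sample rows, and the helper name are illustrative, not part of ModelLink's API):

```python
# A few rows from the support matrix above, as (model, parameters,
# sequence length, supports LoRA) tuples. Sample only, not exhaustive.
SUPPORT_MATRIX = [
    ("LLaMA2", "7B", "4K", True),
    ("LLaMA2", "70B", "4K", True),
    ("LLaMA3", "8B", "8K", False),
    ("Qwen1.5", "72B", "8K", True),
    ("Gemma", "7B", "8K", True),
    ("GPT3", "175B", "2K", False),
]

def lora_capable(matrix):
    """Return (model, parameters) pairs whose row lists LoRA support."""
    return [(model, params) for model, params, _, lora in matrix if lora]

print(lora_capable(SUPPORT_MATRIX))
```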
### Model Performance

Training throughput of the models above compared with reference results (tokens/s per device):

| Model | Parameters | Sequence length | Cluster Scale | Precision Mode | Performance | Reference Performance |
|---|---|---|---|---|---|---|
| Aquila | 7B | 2K | 1x8 | BF16 | 2849 | 2874 |
| Aquila2 | 7B | 2K | 1x8 | FP16 | 3323 | 2673 |
| | 34B | 4K | 2x8 | BF16 | 854 | 732 |
| Baichuan | 7B | 4K | 1x8 | FP16 | 2685 | 2036 |
| | 13B | 4K | 1x8 | FP16 | 1213 | 862 |
| Baichuan2 | 7B | 4K | 1x8 | BF16 | 2664 | 3969 |
| | 13B | 4K | 1x8 | BF16 | 1668 | 2062 |
| Bloom | 7B1 | 2K | 1x8 | FP16 | 2034 | 2525 |
| | 176B | 2K | 12x8 | BF16 | 100 | 107 |
| ChatGLM3 | 6B | 8K | 1x8 | FP16 | 4297 | 4267 |
| CodeLlama | 34B | 4K | 2x8 | BF16 | 837 | 762 |
| InternLM | 7B | 2K | 1x8 | BF16 | 2776 | 2854 |
| | 65B | 2K | 4x8 | BF16 | 341 | 414 |
| LLaMA | 7B | 2K | 1x8 | FP16 | 3600 | 3804 |
| | 13B | 2K | 1x8 | FP16 | 1895 | 2012 |
| | 33B | 2K | 4x8 | FP16 | 621 | 776 |
| | 65B | 2K | 4x8 | BF16 | 348 | 426 |
| LLaMA2 | 7B | 4K | 1x8 | BF16 | 4200 | 3850 |
| | 13B | 4K | 1x8 | BF16 | 1990 | 1920 |
| | 34B | 4K | 2x8 | BF16 | 749 | 796 |
| | 70B | 4K | 4x8 | BF16 | 420 | 430 |
| LLaMA3 | 8B | 8K | 1x8 | BF16 | 2483 | 2674 |
| | 70B | 8K | 8x8 | BF16 | 283 | 355 |
| Qwen | 7B | 8K | 1x8 | BF16 | 2499 | 2867 |
| | 14B | 2K | 1x8 | BF16 | 1560 | 1578 |
| | 72B | 8K | 16x8 | BF16 | 285 | 345 |
| Qwen1.5 | 0.5B | 8K | 1x8 | BF16 | 22834 | 25306 |
| | 1.8B | 8K | 1x8 | BF16 | 13029 | 12181 |
| | 4B | 8K | 1x8 | BF16 | 5033 | 5328 |
| | 7B | 8K | 1x8 | BF16 | 2862 | 2621 |
| | 14B | 8K | 1x8 | BF16 | 1717 | 1702 |
| | 32B | 8K | 4x8 | BF16 | 751 | 708 |
| | 72B | 8K | 8x8 | BF16 | 301 | 317 |
| Yi | 34B | 4K | 2x8 | BF16 | 809 | 730 |
| Mixtral | 8x7B | 32K | 2x8 | BF16 | 487 | 610 |
| Mistral | 7B | 32K | 1x8 | BF16 | 2806 | 2734 |
| Gemma | 2B | 8K | 1x8 | BF16 | 6821 | 7602 |
| | 7B | 8K | 1x8 | BF16 | 2938 | 2607 |
| GPT3 | 175B | 2K | 16x8 | FP16 | 153 | -- |
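Reading "NxM" cluster scale as N nodes of M devices each, and assuming the performance columns are per-device throughput, the aggregate cluster throughput follows by multiplying the two. A minimal sketch (the helper is mine, not part of ModelLink, and the per-device reading is an assumption):

```python
def aggregate_throughput(cluster_scale: str, tokens_per_sec_per_device: int) -> int:
    """Estimate total cluster throughput from an 'NxM' scale string.

    Assumes 'NxM' means N nodes with M devices each, and that the
    table's Performance column is per-device tokens/s.
    """
    nodes, devices = (int(part) for part in cluster_scale.split("x"))
    return nodes * devices * tokens_per_sec_per_device

# Bloom 176B from the table: a 12x8 cluster at 100 tokens/s per device
print(aggregate_throughput("12x8", 100))  # → 9600
```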