# 9G-Train-llama

**Repository Path**: chenzhm23/9G-Train-llama

## Basic Information

- **Project Name**: 9G-Train-llama
- **Description**: Llama 2 training on NPU with BMTrain
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2024-01-30
- **Last Updated**: 2024-07-15

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

#### Introduction

Depends on CANN8.0.alpha001.

torch npu download: https://pytorch-package.obs.cn-north-4.myhuaweicloud.com/pta/Daily/v2.1.0/20231229.2/pytorch_v2.1.0_py38.tar.gz

#### Usage

##### Overwrite the local checkout with the latest repository code

Note that `git reset --hard` discards uncommitted local changes to tracked files.

```
git fetch --all && git reset --hard origin/master && git pull
```

##### Pre-training

1) Download the dataset [quick_start.zip](https://poc-resource.obs.cn-south-1.myhuaweicloud.com:443/%E6%95%B0%E6%8D%AE%E9%9B%86/quick_start.zip?AccessKeyId=HN0CCNB2WPQ4MZGZEJ8J&Expires=1737791406&Signature=uh2ROr1Ey%2BtEpHf9eCpKpjOY3KM%3D) and extract it to 9G-Train-llama/cpm/llama/quick_start. Then edit the model and dataset paths in the launch script 9G-Train-llama/apps/llama/pretrain_llama2_7b.sh to match your environment.

2) Single-node training of llama2-7b

```
cd 9G-Train-llama/apps/llama/
bash pretrain_llama2_7b.sh
```

3) Two-node training of llama2-70b. Run the following commands on both machines. The torchrun distributed-training IP, port, and node_rank in the script must be adjusted to your environment.

```
cd 9G-Train-llama/apps/llama/
bash pretrain_llama2_70b.sh
```

##### SFT

1) Download the dataset [flan_plain_0809.zip](https://poc-resource.obs.cn-south-1.myhuaweicloud.com:443/%E6%95%B0%E6%8D%AE%E9%9B%86/flan_plain_0809.zip?AccessKeyId=HN0CCNB2WPQ4MZGZEJ8J&Expires=1737791524&Signature=RWozMNeAGsEm7bTYpzoAVw71m04%3D) and extract it to 9G-Train-llama/cpm/llama/flan_plain_0809. Then edit the model and dataset paths in the launch script 9G-Train-llama/apps/llama/sft_llama2.sh to match your environment.

2) Single-node fine-tuning of llama2-7b

```
cd 9G-Train-llama/apps/llama/
bash sft_llama2_7b.sh
```

3) Two-node fine-tuning of llama2-70b

```
cd 9G-Train-llama/apps/llama/
bash sft_llama2_70b.sh
```

#### Contributing

1. Fork this repository
2. Create a Feat_xxx branch
3. Commit your code
4. Create a Pull Request
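
#### Appendix: torchrun settings for two-node runs (sketch)

The two-node steps ask you to adjust the torchrun IP, port, and node_rank in the launch scripts. As a rough illustration of what those settings typically control, the sketch below assembles a torchrun command line and prints it without running anything. Everything in it is an assumption rather than the content of the repository's scripts: the address `192.0.2.10` is a documentation placeholder, `train.py` is a hypothetical entry point, and 8 processes per node is only a guess at the device count. Only the `torchrun` flags themselves (`--nnodes`, `--node_rank`, `--nproc_per_node`, `--master_addr`, `--master_port`) are standard.

```shell
#!/bin/sh
# Dry-run sketch: prints a two-node torchrun command, does not launch training.
# All values below are placeholders (assumptions), not taken from the repo scripts.

MASTER_ADDR="${MASTER_ADDR:-192.0.2.10}"   # IP of the rank-0 machine (placeholder)
MASTER_PORT="${MASTER_PORT:-29500}"        # any free port, identical on both machines
NNODES=2                                   # two machines take part in the job
NODE_RANK="${NODE_RANK:-0}"                # 0 on the first machine, 1 on the second
NPROC_PER_NODE="${NPROC_PER_NODE:-8}"      # processes per machine (assumed device count)
TRAIN_SCRIPT="${TRAIN_SCRIPT:-train.py}"   # hypothetical training entry point

LAUNCH_CMD="torchrun --nnodes=${NNODES} --node_rank=${NODE_RANK} --nproc_per_node=${NPROC_PER_NODE} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} ${TRAIN_SCRIPT}"

# Print the command so it can be compared with the one in the launch script.
echo "${LAUNCH_CMD}"
```

On the second machine the same sketch would be run with `NODE_RANK=1`. The address and port must be the same on both machines, and only the node rank differs.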