# MM-RL

**Repository Path**: mr-lin314/MM-RL

## Basic Information

- **Project Name**: MM-RL
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 3
- **Forks**: 8
- **Created**: 2025-05-19
- **Last Updated**: 2025-07-16

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Environment Setup

## Installing torch and torch_npu

Install torch 2.5.1.

Install torch_npu 2.5.1 from release v7.0.0: https://gitee.com/ascend/pytorch/releases

## Installing the CANN packages

Download the recommended CANN version (8.1.RC1) and install it with:

```shell
bash Ascend-cann-toolkit_*.run --install-path=/path/to/install --full
source /path/to/install/ascend-toolkit/set_env.sh
bash Ascend-cann-kernels_*.run --install-path=/path/to/install --install
bash Ascend-cann-nnal_*.run --install-path=/path/to/install --install
```

## Installing MindSpeed-MM

Clone the MindSpeed-MM repository (https://gitee.com/ascend/MindSpeed-MM):

```shell
git clone https://gitee.com/ascend/MindSpeed-MM.git
```

## Installing Megatron

Run the following commands:

```shell
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
git checkout core_r0.8.0
```

## Installing MindSpeed

Run the following commands:

```shell
git clone https://gitee.com/ascend/MindSpeed.git
cd MindSpeed
git checkout 6f11a6c9
pip install -r requirements.txt
pip3 install -e .
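# Optional sanity check (not part of the original guide; it assumes the
# editable install exposes the module name "mindspeed" -- adjust if yours differs):
python3 - <<'EOF'
import importlib.util
spec = importlib.util.find_spec("mindspeed")
print("MindSpeed import OK" if spec else "mindspeed not found on this Python")
EOF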
```

Replace MindSpeed's `dot_product_attention.py` with the version from MindSpeed-MM:

```shell
cp MindSpeed-MM/examples/qwen2vl/dot_product_attention.py MindSpeed/mindspeed/core/transformer/dot_product_attention.py
```

## Assembling the MindSpeed-RL code

(1) First, clone this repository.

(2) Copy the `megatron` folder from Megatron, the `mindspeed` folder from MindSpeed, and the `mindspeed_rl`, `cli`, and `configs` folders from this repository into the MindSpeed-MM folder:

```
|—— checkpoint
|—— ci
|—— cli          (copied from MM-RL)
|—— configs      (copied from MM-RL)
|—— docs
|—— examples
|—— megatron     (copied from Megatron)
|—— mindspeed    (copied from MindSpeed)
|—— mindspeed_mm
|—— mindspeed_rl (copied from MM-RL)
```

(3) Copy this repository's `examples/grpo` folder into the `examples` folder of the MindSpeed-MM repository:

```
|—— cogvideox
|—— deepseekvl2
|—— diffusers
|—— grpo
|—— hunyuanvideo
|—— internvl2
|—— ...
```

## Installing VLLM

Run the following commands:

```shell
git clone --depth 1 --branch v0.8.5.post1 https://github.com/vllm-project/vllm.git
cd vllm
VLLM_TARGET_DEVICE=empty pip install . --extra-index-url https://download.pytorch.org/whl/cpu
```

## Installing VLLM-ASCEND

Run the following commands:

```shell
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh
pip install vllm-ascend==0.8.5rc1
pip install torch_npu-2.5.1*-cp310-cp310-linux_aarch64.whl  # installing vllm-ascend may override the torch_npu version; reinstall it afterwards
```

> Note: use your actual CANN and nnal installation paths.
> In the installed vllm_ascend package, edit `vllm_ascend/models/__init__.py` as follows:

```python
from vllm import ModelRegistry


def register_model():
    from .deepseek_mtp import CustomDeepSeekMTP  # noqa: F401
    from .deepseek_v2 import CustomDeepseekV2ForCausalLM  # noqa: F401
    from .deepseek_v2 import CustomDeepseekV3ForCausalLM  # noqa: F401
    from .qwen2_5_vl import \
        AscendQwen2_5_VLForConditionalGeneration  # noqa: F401
    from .qwen2_vl import AscendQwen2VLForConditionalGeneration  # noqa: F401

    ModelRegistry.register_model(
        "DeepSeekMTPModel",
        "vllm_ascend.models.deepseek_mtp:CustomDeepSeekMTP")

    ModelRegistry.register_model(
        "Qwen2VLForConditionalGeneration",
        "vllm_ascend.models.qwen2_vl:AscendQwen2VLForConditionalGeneration")

    # Comment out the following lines:
    # ModelRegistry.register_model(
    #     "Qwen2_5_VLForConditionalGeneration",
    #     "vllm_ascend.models.qwen2_5_vl:AscendQwen2_5_VLForConditionalGeneration"
    # )

    ModelRegistry.register_model(
        "DeepseekV2ForCausalLM",
        "vllm_ascend.models.deepseek_v2:CustomDeepseekV2ForCausalLM")

    ModelRegistry.register_model(
        "DeepseekV3ForCausalLM",
        "vllm_ascend.models.deepseek_v2:CustomDeepseekV3ForCausalLM")
```

## Installing apex

Follow https://gitee.com/ascend/apex and run the following commands:

```shell
git clone -b master https://gitee.com/ascend/apex.git
cd apex/
bash scripts/build.sh --python=3.xx  # match your Python version
cd apex/dist/
pip3 uninstall apex
pip3 install --upgrade apex-0.1+ascend-{version}.whl  # {version} encodes the Python version and CPU architecture
```

## Installing other dependencies

```shell
pip install -r requirements.txt
```

`requirements.txt` is located at the root of this repository.

## Installing transformers

Run the following commands:

```shell
git clone -b verl-fa-npu https://github.com/as12138/transformers.git
cd transformers
pip install huggingface-hub==0.30.0
python setup.py develop
```

## Installing the jemalloc high-performance memory allocator

To ensure that Ray processes can reclaim memory properly, install and enable the jemalloc library for memory management.

### Ubuntu

Install jemalloc from the distribution repositories (requires Ubuntu >= 20.04):

```shell
sudo apt install libjemalloc2
```

Before launching a task, preload jemalloc via an environment variable:

```shell
# arm64
export LD_PRELOAD="$LD_PRELOAD:/usr/lib/aarch64-linux-gnu/libjemalloc.so.2"
# x86_64
export LD_PRELOAD="$LD_PRELOAD:/usr/lib/x86_64-linux-gnu/libjemalloc.so.2"
```

### OpenEuler

Install jemalloc from the distribution repositories:

```shell
yum install jemalloc
```

If that fails, build and install from source. Download the latest stable release from https://github.com/jemalloc/jemalloc/releases/ :

```shell
tar -xvf jemalloc-{version}.tar.bz2
cd jemalloc-{version}
./configure --prefix=/usr/local
make
make install
```

Before launching a task, preload jemalloc via an environment variable:

```shell
export LD_PRELOAD="$LD_PRELOAD:/usr/local/lib/libjemalloc.so.2"
```

## Q&A

(1) When starting inside a container you may hit an error about a missing `ip` command; install it with:

```shell
sudo apt-get install iproute2
```

(2) If installing vllm-ascend fails with `fatal error: 'cstdint' file not found`, it is likely a gcc version issue; see
https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/softwareinst/instg/instg_0086.html?Mode=PmIns&OS=Ubuntu&Software=cannToolKit for a fix.

# Launch

## Basic launch

From the MindSpeed-MM directory, using qwen2_5_vl_3b as an example, run:

```shell
bash examples/grpo/grpo_trainer_qwen25vl_3b_integrated.sh
```

Note: replace the CANN and nnal environment paths in the script with your own installation paths.

The corresponding RL configuration file is `configs/grpo_trainer_qwen25vl_3b_integrated.yaml`.
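The jemalloc preload from the setup section and the launch command above can be combined in a small wrapper that selects the library path by CPU architecture. This is a minimal sketch, not part of the original guide; it assumes the Ubuntu library paths listed earlier (adjust the paths for OpenEuler or source builds):

```shell
#!/usr/bin/env sh
# Select the jemalloc shared library for the current CPU architecture
# (Ubuntu package paths from the jemalloc section above).
arch="$(uname -m)"
case "$arch" in
  aarch64) jemalloc=/usr/lib/aarch64-linux-gnu/libjemalloc.so.2 ;;
  x86_64)  jemalloc=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ;;
  *)       echo "unsupported arch: $arch" >&2; jemalloc="" ;;
esac
[ -n "$jemalloc" ] && export LD_PRELOAD="$LD_PRELOAD:$jemalloc"
echo "LD_PRELOAD=$LD_PRELOAD"
# Then launch from the MindSpeed-MM directory:
# bash examples/grpo/grpo_trainer_qwen25vl_3b_integrated.sh
```

Sourcing this before the training script keeps the per-architecture paths in one place instead of editing the export by hand on each machine.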