# MM-RL

**Repository Path**: sunnylee219/MM-RL

## Basic Information

- **Project Name**: MM-RL
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 8
- **Created**: 2025-05-19
- **Last Updated**: 2025-05-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Environment Setup

## Installing torch and torch_npu

Install torch 2.5.1.

For torch_npu, install the 2.5.1 build from release v0.7.0: https://gitee.com/ascend/pytorch/releases

## Installing the CANN packages

Download the recommended CANN version (8.1.RC1) and install it with the following commands:

```shell
bash Ascend-cann-toolkit_*.run --install-path=/path/to/install --full
source /path/to/install/ascend-toolkit/set_env.sh
bash Ascend-cann-kernels_*.run --install-path=/path/to/install --install
bash Ascend-cann-nnal_*.run --install-path=/path/to/install --install
```

## Installing MindSpeed-MM

Clone the MindSpeed-MM repository (https://gitee.com/ascend/MindSpeed-MM):

```shell
git clone https://gitee.com/ascend/MindSpeed-MM.git
```

## Installing Megatron

Clone the Megatron-LM repository and check out the required branch:

```shell
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
git checkout core_r0.8.0
```

## Installing MindSpeed

Run the following commands:

```shell
git clone https://gitee.com/ascend/MindSpeed.git
cd MindSpeed
git checkout 6f11a6c9
pip install -r requirements.txt
pip3 install -e .
```

Replace dot_product_attention.py in MindSpeed with the version from MindSpeed-MM:

```shell
cp MindSpeed-MM/examples/qwen2vl/dot_product_attention.py MindSpeed/mindspeed/core/transformer/dot_product_attention.py
```

## Integrating the MindSpeed-RL code files

(1) First, git clone this repository.

(2) Copy the megatron folder from Megatron, the mindspeed folder from MindSpeed, and the mindspeed_rl, cli, and configs folders from this repository into the MindSpeed-MM folder:

```
|—— checkpoint
|—— ci
|—— cli          (copied from MM-RL)
|—— configs      (copied from MM-RL)
|—— docs
|—— examples
|—— megatron     (copied from Megatron)
|—— mindspeed    (copied from MindSpeed)
|—— mindspeed_mm
|—— mindspeed_rl (copied from MM-RL)
```

(3) Copy the examples/grpo folder from this repository into the examples folder of the MindSpeed-MM repository (a shell sketch of these copies follows below):

```
|—— cogvideox
|—— deepseekvl2
|—— diffusers
|—— grpo
|—— hunyuanvideo
|—— internvl2
|—— ...
|—— ...
```
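The copy steps in (2) and (3) could be scripted roughly as follows. This is only a sketch: it assumes Megatron-LM, MindSpeed, MindSpeed-MM, and this repository (cloned as MM-RL) all sit side by side under the same parent directory; adjust the paths to your own layout.

```shell
# Sketch only: assumes Megatron-LM, MindSpeed, MindSpeed-MM, and MM-RL were
# cloned into the current directory with their default directory names.
cp -r Megatron-LM/megatron MindSpeed-MM/
cp -r MindSpeed/mindspeed MindSpeed-MM/
cp -r MM-RL/mindspeed_rl MM-RL/cli MM-RL/configs MindSpeed-MM/
cp -r MM-RL/examples/grpo MindSpeed-MM/examples/
```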
## Installing VLLM

Run the following commands:

```shell
git clone --depth 1 --branch v0.8.5.post1 https://github.com/vllm-project/vllm.git
cd vllm
VLLM_TARGET_DEVICE=empty pip install . --extra-index-url https://download.pytorch.org/whl/cpu
```

## Installing VLLM-ASCEND

Run the following commands:

```shell
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh
pip install vllm-ascend==0.8.5rc1
pip install torch_npu-2.5.1*-cp310-cp310-linux_aarch64.whl  # installing vllm-ascend may overwrite torch_npu, so reinstall it
```

> Note: use the actual installation paths of CANN and nnal.

In the actual installation directory of vllm_ascend, make the following changes to vllm_ascend/models/__init__.py:

```python
from vllm import ModelRegistry


def register_model():
    from .deepseek_mtp import CustomDeepSeekMTP  # noqa: F401
    from .deepseek_v2 import CustomDeepseekV2ForCausalLM  # noqa: F401
    from .deepseek_v2 import CustomDeepseekV3ForCausalLM  # noqa: F401
    from .qwen2_5_vl import \
        AscendQwen2_5_VLForConditionalGeneration  # noqa: F401
    from .qwen2_vl import AscendQwen2VLForConditionalGeneration  # noqa: F401

    ModelRegistry.register_model(
        "DeepSeekMTPModel",
        "vllm_ascend.models.deepseek_mtp:CustomDeepSeekMTP")

    ModelRegistry.register_model(
        "Qwen2VLForConditionalGeneration",
        "vllm_ascend.models.qwen2_vl:AscendQwen2VLForConditionalGeneration")

    # Comment out the following lines:
    # ModelRegistry.register_model(
    #     "Qwen2_5_VLForConditionalGeneration",
    #     "vllm_ascend.models.qwen2_5_vl:AscendQwen2_5_VLForConditionalGeneration"
    # )

    ModelRegistry.register_model(
        "DeepseekV2ForCausalLM",
        "vllm_ascend.models.deepseek_v2:CustomDeepseekV2ForCausalLM")

    ModelRegistry.register_model(
        "DeepseekV3ForCausalLM",
        "vllm_ascend.models.deepseek_v2:CustomDeepseekV3ForCausalLM")
```

## Installing transformers

Run the following commands:

```shell
git clone -b verl-fa-npu https://github.com/as12138/transformers.git
cd transformers
pip install huggingface-hub==0.30.0
python setup.py develop
```

## Installing apex

Install apex by following https://gitee.com/ascend/apex:

```shell
git clone -b master https://gitee.com/ascend/apex.git
cd apex/
bash scripts/build.sh --python=3.xx  # replace 3.xx with your Python version
```

## Installing the remaining dependencies

```shell
pip install -r requirements.txt
```

The requirements.txt file is located in this repository.

# Launch

## Basic launch

Change into the MindSpeed-MM directory and, taking qwen2_5_vl_3b as an example, run:

```shell
bash examples/grpo/grpo_trainer_qwen25vl_3b_integrated.sh
```

Note that the CANN and nnal environment setup in this script must be replaced with your own installation.

The corresponding RL configuration file is configs/grpo_trainer_qwen25vl_3b_integrated.yaml.
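For reference, if CANN and nnal were installed to a custom prefix as in the CANN installation step above, the environment lines to adjust in the launch script would look roughly like the sketch below. This is an assumption about the script's environment setup, not its exact contents; check the shipped script for the precise lines.

```shell
# Sketch only: point the script's environment setup at your own install prefix
# (here assumed to be /path/to/install, as in the CANN installation step).
source /path/to/install/ascend-toolkit/set_env.sh
source /path/to/install/nnal/atb/set_env.sh
```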