.. Source file: startup_method.rst. Last commit by zhangyinxia on 2026-01-07 16:41 +08:00: "down rank table startup".

Distributed Parallel Startup Methods
====================================

.. toctree::
  :maxdepth: 1
  :hidden:

  msrun_launcher
  dynamic_cluster
  mpirun

Startup Method
--------------

The GPU, Ascend and CPU platforms each support multiple startup methods. The three covered here are msrun, dynamic cluster and mpirun:

- msrun: msrun is a wrapper around dynamic cluster. It lets users launch a distributed job with a single command on each node, and it is available as soon as MindSpore is installed. It does not rely on third-party libraries or configuration files, provides disaster recovery and good security, and supports all three hardware platforms. It is the recommended startup method.
- Dynamic cluster: dynamic cluster requires the user to spawn multiple processes and export environment variables. It is the mechanism that msrun is built on. Use this method for Parameter Server training mode; for other distributed jobs, msrun is recommended.
- mpirun: this method relies on the open-source library OpenMPI, and its startup command is simple. Multi-machine jobs require password-free login between every pair of machines. It is recommended for users who already have experience with OpenMPI.
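
The three launch styles described above can be sketched as small Python helpers that only assemble the commands and environment variables, without starting anything. This is a minimal illustration under stated assumptions: the helper names, the script name ``net.py``, the port ``8118`` and the worker count of 8 are hypothetical values chosen for the example. The ``msrun`` options and ``MS_*`` variables shown are a small subset; the msrun and dynamic cluster pages remain the authoritative reference.

.. code-block:: python

   import shlex


   def msrun_command(script, worker_num=8, local_worker_num=8, master_port=8118):
       """Build the single msrun command that is run on each node."""
       return [
           "msrun",
           f"--worker_num={worker_num}",
           f"--local_worker_num={local_worker_num}",
           f"--master_port={master_port}",
           script,
       ]


   def dynamic_cluster_envs(worker_num=8, sched_host="127.0.0.1", sched_port=8118):
       """Build one environment dict per process: one scheduler, then the workers."""
       common = {
           "MS_WORKER_NUM": str(worker_num),
           "MS_SCHED_HOST": sched_host,
           "MS_SCHED_PORT": str(sched_port),
       }
       envs = [{**common, "MS_ROLE": "MS_SCHED"}]
       envs += [{**common, "MS_ROLE": "MS_WORKER"} for _ in range(worker_num)]
       return envs


   def mpirun_command(script, process_num=8):
       """Build an OpenMPI command that starts process_num Python processes."""
       return ["mpirun", "-n", str(process_num), "python", script]


   if __name__ == "__main__":
       # Print the commands instead of executing them (hypothetical script name).
       print(shlex.join(msrun_command("net.py")))
       print(shlex.join(mpirun_command("net.py")))
       for env in dynamic_cluster_envs(worker_num=2):
           print(env)

The sketch makes the trade-off visible: msrun and mpirun each need one command, while dynamic cluster needs one exported environment per process, including a separate scheduler process.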

.. warning::

   The rank_table startup method was deprecated in MindSpore 2.4.

The hardware support for the three startup methods is shown in the table below:

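=============== ========= ========= =============
Startup method  GPU       Ascend    CPU
=============== ========= ========= =============
msrun           Supported Supported Supported
Dynamic cluster Supported Supported Supported
mpirun          Supported Supported Not supported
=============== ========= ========= =============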
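
The support table can also be expressed as a small lookup, which may be convenient when a launch script has to pick a method for a given platform. This is only a sketch that mirrors the table; the names ``SUPPORT`` and ``supported_methods`` are hypothetical and not part of MindSpore.

.. code-block:: python

   # Hardware platforms supported by each startup method, copied from the table.
   SUPPORT = {
       "msrun": {"GPU", "Ascend", "CPU"},
       "dynamic cluster": {"GPU", "Ascend", "CPU"},
       "mpirun": {"GPU", "Ascend"},
   }


   def supported_methods(hardware):
       """Return the startup methods available on the given platform, in table order."""
       return [method for method, platforms in SUPPORT.items() if hardware in platforms]


   if __name__ == "__main__":
       for hardware in ("GPU", "Ascend", "CPU"):
           print(hardware, supported_methods(hardware))
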