MindSpore/mindformers

predict_qwen3.yaml 963 Bytes
sunyuxuan committed on 2025-06-26 20:04 +08:00: optimize config and fix qwen3_moe name in comment
seed: 0
output_dir: './output' # path to save checkpoint/strategy
load_checkpoint: ''
use_parallel: False
run_mode: 'predict'
use_legacy: False
load_ckpt_format: 'safetensors'
trainer:
  type: CausalLanguageModelingTrainer
  model_name: 'qwen3'
# default parallel of device num = 8 for Atlas 800T A2
parallel_config:
  data_parallel: 1
  model_parallel: 1
# HuggingFace file directory
pretrained_model_dir: '/path/hf_dir'
model:
  model_config:
    compute_dtype: "bfloat16"
    layernorm_compute_dtype: "float32"
    softmax_compute_dtype: "float32"
    rotary_dtype: "bfloat16"
    params_dtype: "bfloat16"
# mindspore context init config
context:
  mode: 0 #0--Graph Mode; 1--Pynative Mode
  enable_graph_kernel: False
  ascend_config:
    precision_mode: "must_keep_origin_dtype"
  max_device_memory: "59GB"
  save_graphs: False
  save_graphs_path: "./graph"
# parallel context config
parallel:
  parallel_mode: "MANUAL_PARALLEL"
  enable_alltoall: False
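
# Illustrative sketch, not part of the upstream file: one possible
# multi-device variant of the settings above. The device count (2) is an
# assumed example value, and the launch procedure is not covered here;
# consult the MindFormers documentation before relying on it.
# use_parallel: True
# parallel_config:
#   data_parallel: 1
#   model_parallel: 2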