seed: 0
output_dir: './output' # path to save checkpoint/strategy
load_checkpoint: ''
use_parallel: False
run_mode: 'predict'
use_legacy: False
load_ckpt_format: 'safetensors'
trainer:
  type: CausalLanguageModelingTrainer
  model_name: 'qwen3'
# default parallel of device num = 8 for Atlas 800T A2
parallel_config:
  data_parallel: 1
  model_parallel: 1
# HuggingFace file directory
pretrained_model_dir: '/path/hf_dir'
model:
  model_config:
    compute_dtype: "bfloat16"
    layernorm_compute_dtype: "float32"
    softmax_compute_dtype: "float32"
    rotary_dtype: "bfloat16"
    params_dtype: "bfloat16"
# mindspore context init config
context:
  mode: 0 # 0: Graph Mode; 1: PyNative Mode
  enable_graph_kernel: False
  ascend_config:
    precision_mode: "must_keep_origin_dtype"
  max_device_memory: "59GB"
  save_graphs: False
  save_graphs_path: "./graph"
# parallel context config
parallel:
  parallel_mode: "MANUAL_PARALLEL"
  enable_alltoall: False