MindSpore / mindspore

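[ST][MS][Large-cluster special topic][Dynamic networking] 13,000-worker cluster: some time after scheduler initialization completes, worker timeout logs are printed and the scheduler exits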

r2.2
kind/bug
device/ascend
sig/runtime
master
rca/algorithm
rct/oldrelease
ctl/solutiontest
#IA8OA3 leiwei2 4
Assignee: leiwei2

[ST][MS] Loading fails when the compile cache feature is enabled through context

kind/bug
master
attr/function
sig/parallel
rct/bugfix
rca/func/class/obj
ctl/solutiontest
#I9TLN4 duanjiali 4
Assignee: duanjiali

[ST][MS][Tutorial] generative_diffusion tutorial produces ERROR logs when training on 910B

kind/bug
attr/function
stage/coding
master
br_base
device/ascend
sig/ops
gitee
foruda
rca/others
rct/oldrelease
ctl/solutiontest
#I9TI1D baimz 4
Assignee: baimz

[ST][MS][Tutorial] shufflenet tutorial fails when training in pynative mode on 910 with RuntimeError: Launch kernel faile...

kind/bug
attr/function
stage/coding
br_base
device/ascend
sig/pynative
rct/bugfix
rca/codelogic
ctl/solutiontest
foruda
#I9T8AJ baimz 5
Assignee: baimz

[ST][MS][Tutorial] generative_diffusion/LSTM+CRF tutorials core dump when training in pynative mode on GPU

kind/bug
attr/function
stage/coding
br_base
sig/pynative
foruda
ctl/componenttest
rca/codelogic
rct/newfeature
#I9T86M baimz 5
Assignee: baimz

[ST][MS][Tutorial][br_base] Verifying advanced tutorials: test_ms_tutorial_advance_initializer_0001 and several other tutorials in pyn...

kind/bug
sig/pynative
br_base
stage/func-debug
gpu
attr/function
sig/runtime
sig/ascend
rct/newfeature
rca/codelogic
ctl/solutiontest
master
#I9T84S 胡蓉 7
Assignee: 胡蓉

[ST][MS][br_base] 910B environment, pynative mode: resnet50 with force_fp32 set and mixed precision O3 (where the input is a mixed-precision-converted f...

kind/bug
sig/ascend
br_base
device/ascend
attr/function
stage/func-debug
rct/oldrelease
rca/algorithm
ctl/solutiontest
#I9T6HL wenli 4
Assignee: wenli

[ST][MS][Tutorial] vision_transformer/generative_diffusion/gan/sentiment_analysis and several other...

kind/bug
attr/function
stage/coding
sig/pynative
br_base
device/ascend
foruda
rca/codelogic
rct/bugfix
ctl/solutiontest
#I9T5KL baimz 5
Assignee: baimz

[ST][MS][master] Ascend environment, resnet50 network, dynamic graph: adding ts to collect dump data raises TypeError: Failed cal...

kind/bug
master
device/ascend
sig/visualization
attr/function
stage/func-debug
gitee
rca/others
rct/bugfix
ctl/solutiontest
#I9SYFP wenli 5
Assignee: wenli

[ST][MS][OPS] Graph mode, dynamic shape scenario: when the network only uses x = tuple(x), the input and output values are not equal

master
attr/function
stage/func-debug
kind/bug
sig/ops
device/ascend
foruda
rca/algorithm
rct/newfeature
ctl/componenttest
#I9SXSJ 田桐 5
Assignee: 田桐

[ST][MS][NET][pangu-alpha][910B3 8p] pangu_alpha network training log contains enablement warning logs

attr/function
stage/func-debug
kind/bug
device/ascend
master
gitee
ctl/codereview
rca/others
rct/bugfix
docs
#I9SUTR zhongjicheng 4
Assignee: zhongjicheng

[ST][MS][master] Ascend environment, yolov5 dynamic graph, functional programming: with ts added to collect dump data, the jit feature causes training to fail

kind/bug
master
sig/visualization
attr/function
stage/func-debug
device/ascend
rca/others
ctl/rdselftest
rct/newfeature
gitee
#I9SP4R wenli 7
Assignee: wenli

[ST][MS][master] GPU environment, Bert network, batch_size set to 8: the training process hangs in the graph kernel stage

kind/bug
master
sig/akg
attr/function
stage/func-debug
br_base
gitee
rct/bugfix
rca/algorithm
ctl/solutiontest
#I9SMWG wenli 5
Assignee: wenli

[ST][MS][MF][gpt2_13b][910B 8P] Evaluation accuracy does not meet the target, ppl_acc:11.516 is up than 11.3726

master
stage/prec-tuning
attr/accuracy
device/ascend
sig/mindformers
rct/oldrelease
rca/others
ctl/solutiontest
#I9SGR0 zhangjie18 4
Assignee: zhangjie18

[ST][MS][MF][baichuan2_13b/qwen1_5_7b/][910B] Network inference and training fail, Sync stream error!

kind/bug
master
attr/function
stage/func-debug
sig/mslite
device/ascend
rct/oldrelease
rca/others
ctl/solutiontest
foruda
#I9S4J0 zhangjie18 8
Assignee: zhangjie18

[ST][MS][MF][baichuan2_7b/qwen][910B] Network inference fails, TypeError: The parameters number o...

kind/bug
master
attr/function
stage/func-debug
sig/mslite
device/ascend
rct/oldrelease
rca/others
ctl/solutiontest
foruda
#I9S4H1 zhangjie18 4
Assignee: zhangjie18

[ST][MS][MF][baichuan2_7b][910B] Network inference fails, TypeError: Failed calling Add with "Ad...

kind/bug
master
attr/function
stage/func-debug
sig/mindformers
device/ascend
gitee
foruda
rct/bugfix
rca/codelogic
ctl/rdselftest
rct/refactor
rca/codespec
ctl/solutiontest
#I9S4G5 zhangjie18 8
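Assignee: zhangjie18

[ST][MS][Large-cluster special topic] pangu_sigma network, single-card simulated compilation of the kbk flow: the first compilation succeeds, the second compilation fails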

kind/bug
attr/function
stage/coding
master
device/ascend
sig/ascend
rct/refactor
rca/timing/seq
ctl/solutiontest
foruda
#I9RPW1 baimz 2
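Assignee: baimz

[ST][MS][Large-cluster special topic] pangu_sigma network, single-card simulated compilation of the kbk flow: the second compilation takes longer than the first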

kind/bug
attr/function
stage/coding
master
sig/ascend
device/ascend
rct/refactor
rca/timing/seq
ctl/solutiontest
foruda
#I9RPW0 baimz 6
Assignee: baimz

[ST][MS][MF][baichuan2_13b][910B1 8/16p] Fine-tuning in the B1 environment reports an out-of-memory error

kind/bug
master
stage/func-debug
sig/mindformers
attr/function
device/ascend
gitee
rca/others
rct/oldrelease
ctl/rdselftest
foruda
#I9RIDG zhangjie18 4
Assignee: zhangjie18
