
MindSpore / mindspore

[ST][MS][Tutorial] generative_diffusion/LSTM+CRF tutorials core dump when training in pynative mode on GPU

kind/bug
attr/function
stage/coding
br_base
sig/pynative
foruda
ctl/componenttest
rca/codelogic
rct/newfeature
#I9T86M baimz 5
Assignee: baimz

[ST][MS][br_base] On 910B in pynative mode, resnet50 with force_fp32 set and mixed precision O3 (when the input is a mixed-precision-converted f...

kind/bug
sig/ascend
br_base
device/ascend
attr/function
stage/func-debug
rct/oldrelease
rca/algorithm
ctl/solutiontest
#I9T6HL wenli 4
Assignee: wenli

[ST][MS][Tutorial] vision_transformer/generative_diffusion/gan/sentiment_analysis and several other...

kind/bug
attr/function
stage/coding
sig/pynative
br_base
device/ascend
foruda
rca/codelogic
rct/bugfix
ctl/solutiontest
#I9T5KL baimz 5
Assignee: baimz

[ST][MS][NET][pangu-alpha][910B3 8p] pangu_alpha network training log contains enable-related warning messages

attr/function
stage/func-debug
kind/bug
device/ascend
master
gitee
ctl/codereview
rca/others
rct/bugfix
docs
#I9SUTR zhongjicheng 4
Assignee: zhongjicheng

[ST][MS][master] On GPU, Bert network with batch_size set to 8: the training process hangs in the graph kernel stage

kind/bug
master
sig/akg
attr/function
stage/func-debug
br_base
gitee
rct/bugfix
rca/algorithm
ctl/solutiontest
#I9SMWG wenli 5
Assignee: wenli

[ST][MF][codellama_34b][910B 4p] Network inference accuracy is abnormal

kind/bug
master
attr/function
stage/func-debug
sig/mindformers
device/ascend
rct/bugfix
ctl/solutiontest
rca/algorithm
gitee
foruda
#I9RD8V zhangjie18 4
Assignee: zhangjie18

[ST][MS][master] Using the pipeline network from the official tutorial, training on 8 devices with auto parallel and recursive_programming fails. Run...

kind/bug
master
sig/parallel
attr/function
stage/func-debug
br_base
foruda
rca/others
rct/bugfix
ctl/solutiontest
device/ascend
gitee
#I9R43Y wenli 7
Assignee: wenli

[ST][MS][master] On 910B in graph mode, dynamic-shape training of a parallel network with complex nested for-for loops fails. The poin...

kind/bug
master
sig/runtime
attr/function
stage/func-debug
device/ascend
gitee
rct/bugfix
rca/others
ctl/solutiontest
#I9R42U wenli 5
Assignee: wenli

[ST][MS][910A/B] Using the built-in function abs inside construct raises an error when a variable is passed in: Init] For 'EltWiseGrad', it does...

kind/bug
sig/runtime
attr/function
stage/func-debug
master
device/ascend
foruda
rca/codelogic
rct/oldrelease
ctl/componenttest
#I9R3YJ 胡蓉 4
Assignee: 胡蓉 (hu rong)

[ST][MI][master] Dynamic profiling can save specific steps to help locate faults, and the saved data can be loaded by MI; loading reports the error “ Call halEsche...

kind/bug
sig/visualization
attr/function
master
device/ascend
stage/func-debug
rct/oldrelease
rca/others
ctl/llt
#I9R0O9 chensijie_Remzz 4
Assignee: wangjie (wjchuee)

[ST][MS][master] On GPU, training with a dynamic-shape network script (provided by a teacher at Shenzhen Bay) fails with an error related to the Ones operator

kind/bug
master
sig/msadapter
attr/function
stage/func-debug
rca/others
rct/bugfix
ctl/solutiontest
#I9QZJQ wenli 5
Assignee: wenli

[ST][MS][MF][910B3][llama2/qwen/baichuan2] Network inference on 910b3 raises AttributeError: 'ForkAwar...

attr/function
stage/func-debug
kind/bug
device/ascend
master
sig/ds
gitee
ctl/solutiontest
rca/others
rct/newfeature
#I9QTVB sunjiawei999 4
Assignee: sunjiawei999

[ST][MS][master] On GPU with dynamic shape (single device, single input, dynamic H and W dimensions), training with model.train and verifying mindir import/export reports...

kind/bug
master
sig/ide
attr/function
stage/func-debug
rct/bugfix
rca/others
ctl/solutiontest
foruda
#I9QSIR wenli 4
Assignee: wenli

[ST][MS][master] On Ascend with enable_parallel_optimizer=True, after combining with a supported optimizer, does not use allredu...

kind/bug
master
device/ascend
sig/ascend
attr/function
stage/func-debug
dts-szv
rct/cann
br_base
rca/others
ctl/solutiontest
#I9QSCF wenli 5
Assignee: wenli

[ST][MS][MF][llama2_13b_lora] Inference fails after fine-tuning: For Operator[ReshapeAndCache], slot_mapp...

kind/bug
master
stage/func-debug
sig/mindformers
device/ascend
attr/function
gitee
rct/bugfix
rca/algorithm
ctl/solutiontest
foruda
#I9QO49 zhangjie18 4
Assignee: zhangjie18

[ST][MS] llama2-175B with silent detection enabled: fault injection in Feedward fails

kind/bug
attr/reliability
v2.2.14
r2.2
rca/others
rct/newfeature
ctl/solutiontest
#I9QCI1 duanjiali 5
Assignee: duanjiali

[ST][MS][master] resnet50 8-device training: the GE_ADPT component prints a large volume of log messages, causing the info log size to exceed the expected limit

kind/bug
sig/ascend
master
stage/func-debug
attr/function
device/ascend
rca/others
rct/newfeature
ctl/solutiontest
#I9QBXG wenli 4
Assignee: wenli

[ST][MS][MF][910B3]llama2/internlm/qwen/baichuan2/yi/codellama/qwen1.5/glm3/g...

attr/function
stage/func-debug
kind/bug
device/ascend
master
sig/ds
gitee
ctl/rdselftest
rca/others
rct/refactor
#I9QAST zhongjicheng 5
Assignee: zhongjicheng

[ST][MS][Mindie] Starting the service inside docker hits RuntimeError: Loading liblowlatency_collective...

kind/bug
sig/runtime
attr/function
stage/func-debug
device/ascend
master
rca/others
rct/bugfix
ctl/componenttest
foruda
#I9QASO Ai122 4
Assignee: Ai122

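[ST][MS][Large-cluster special project] mixtral network on 910B: ERROR logs are printed in the single-device simulated compilation log; compilation and training are not affected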

kind/bug
attr/function
stage/coding
master
sig/parallel
device/ascend
rca/others
rct/cann
ctl/solutiontest
#I9Q23U baimz 4
Assignee: baimz
https://gitee.com/mindspore/mindspore.git