
MindSpore / mindspore


[ST][MS][NET][resnet50 imagenet][910 1p/8p]FPS[14529] can not reach 17000

DONE
Bug-Report
Created on 2022-05-26 16:27

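Describe the current behavior (Mandatory)

Training performance of the resnet50, resnet50_boost, and resnet50_ge networks on the imagenet2012 dataset has regressed in the 910 environment.
resnet50_ge 8p training performance is 14530 fps, short of the 17000 target
resnet50_ge 1p training performance is 1824 fps, short of the 2100 target
resnet50_boost 8p training performance is 22367 fps, short of the 26000 target
resnet50_boost 1p training performance is 2818 fps, short of the 3400 target
resnet50 8p training performance is 14529 fps, short of the 17000 target
resnet50 1p training performance is 1824 fps, short of the 2100 target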
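
For reference, the size of each regression can be worked out from the figures above. The following is a minimal sketch; the dictionary layout and the helper name `shortfall_percent` are illustrative and not part of the test suite. On these numbers the shortfall ranges from roughly 13% to 17% of the target.

```python
# Measured vs. target throughput (fps), copied from the figures in this report.
RESULTS = {
    ("resnet50_ge", "8p"): (14530, 17000),
    ("resnet50_ge", "1p"): (1824, 2100),
    ("resnet50_boost", "8p"): (22367, 26000),
    ("resnet50_boost", "1p"): (2818, 3400),
    ("resnet50", "8p"): (14529, 17000),
    ("resnet50", "1p"): (1824, 2100),
}


def shortfall_percent(measured, target):
    """Return how far `measured` falls below `target`, as a percentage of `target`."""
    return (target - measured) / target * 100.0


for (network, devices), (measured, target) in RESULTS.items():
    gap = shortfall_percent(measured, target)
    print(f"{network:15s} {devices}: {measured:>6d} / {target:>6d} fps ({gap:.1f}% below target)")
```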

Environment (Mandatory)

  • Hardware Environment (Ascend/GPU/CPU):

Please delete the backend not involved:
/device ascend

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 1.7.0.Bxxx): r1.8 B010 commit_id:42306df4
    -- Python version (e.g., Python 3.7.5):
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04):
    -- GCC/Compiler version (if compiled from source):
    -- run package: HiAI/HISI_C82/20220518

  • Execute Mode (Mandatory) (PyNative/Graph):

Please delete the mode not involved:
/mode graph

Related testcase (Mandatory)

test_ms_resnet50_imagenet_ge_train_check_loss_910_8p_0001.py
test_ms_resnet50_imagenet_ge_train_check_perf_910_1p_0002.py
test_ms_resnet50_imagenet_boost_train_check_fps_910_1p_0002.py
test_ms_resnet50_imagenet_boost_train_check_loss_910_8p_0003.py
test_ms_resnet50_imagenet_train_check_loss_910_8p_0002.py
test_ms_resnet50_imagenet_train_check_perf_910_1p_0003.py

Steps to reproduce the issue (Mandatory)

  1. cd solution_test/cases/02network/00cv/resnet50/train
  2. pytest -s test_ms_resnet50_imagenet_train_check_loss_910_8p_0002.py

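Describe the expected behavior (Mandatory)

Network training succeeds
resnet50_ge 8p training performance reaches 17000 fps
resnet50_ge 1p training performance reaches 2100 fps
resnet50_boost 8p training performance reaches 26000 fps
resnet50_boost 1p training performance reaches 3400 fps
resnet50 8p training performance reaches 17000 fps
resnet50 1p training performance reaches 2100 fps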

Related log / screenshot (Mandatory)

epoch time: 145087.241 ms, per step time: 290.174 ms
epoch time: 70476.110 ms, per step time: 140.952 ms
epoch time: 70477.973 ms, per step time: 140.956 ms
epoch time: 70476.777 ms, per step time: 140.954 ms
epoch time: 70473.858 ms, per step time: 140.948 ms
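
The log above can be related to the reported fps figure. The sketch below parses the step times and converts them to throughput. It rests on two assumptions that the report does not state: that the log comes from the 8p run, and that each device processes 256 images per step. Under those assumptions the steady-state step time of about 140.95 ms works out to roughly the 14529 fps in the title, and reaching 17000 fps would need a step time of about 120.5 ms. The first epoch is left out because its step time is about twice the others, presumably due to warm-up.

```python
import re

# Log lines copied from this report.
LOG = """\
epoch time: 145087.241 ms, per step time: 290.174 ms
epoch time: 70476.110 ms, per step time: 140.952 ms
epoch time: 70477.973 ms, per step time: 140.956 ms
epoch time: 70476.777 ms, per step time: 140.954 ms
epoch time: 70473.858 ms, per step time: 140.948 ms
"""

STEP_TIME = re.compile(r"per step time: ([\d.]+) ms")

# Assumptions, NOT stated in the report: per-device batch size and device count.
PER_DEVICE_BATCH = 256
DEVICE_NUM = 8
TARGET_FPS = 17000  # 8p target from the report


def step_times_ms(log):
    """Extract every 'per step time' value (in ms) from the log text."""
    return [float(value) for value in STEP_TIME.findall(log)]


def throughput_fps(step_ms, per_device_batch=PER_DEVICE_BATCH, device_num=DEVICE_NUM):
    """Images per second across all devices for a given step time in ms."""
    return per_device_batch * device_num * 1000.0 / step_ms


times = step_times_ms(LOG)
steady = times[1:]  # skip the first epoch, which is much slower than the rest
mean_step = sum(steady) / len(steady)
fps = throughput_fps(mean_step)

# Step time that would be needed to hit the target under the same assumptions.
required_step = PER_DEVICE_BATCH * DEVICE_NUM * 1000.0 / TARGET_FPS

print(f"mean steady-state step time: {mean_step:.3f} ms")
print(f"estimated throughput:        {fps:.0f} fps")
print(f"step time needed for target: {required_step:.2f} ms")
```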

Special notes for this issue (Optional)

Assign to 曹杰文

Comments (7)

zhongjicheng created the Bug-Report
zhongjicheng added the attr/performance label
zhongjicheng added the sig/modelzoo label
zhongjicheng added the kind/bug label
zhongjicheng added the v1.8.0 label

Please assign a maintainer to check this issue.
@zhongjicheng

Please add labels (comp or sig). You can also visit https://gitee.com/mindspore/community/blob/master/sigs/dx/docs/labels.md to find more.
To have the code reviewed as soon as possible, please add a component (comp) or special interest group (sig) label to the Pull Request; labeled PRs are pushed directly to the responsible owner for review.
For example, if you are submitting code for the data component, you can comment:
//comp/data
You can also invite the data SIG to review the code by writing:
//sig/data
In addition, you can mark the type of this PR, for example as a bugfix or a feature request:
//kind/bug or //kind/feature
Congratulations, you have learned how to add labels with commands. Go ahead and add the labels in the comments below!

xiangjiawei007 edited the description
xiangjiawei007 changed the assignee from xiangjiawei007 to oacjiewen
oacjiewen added collaborator oacjiewen
oacjiewen changed the assignee from oacjiewen to anzhengqi

Operator fusion of convolution and BN failed.

zhaoting changed the assignee from anzhengqi to liangzelang
zhaoting added collaborator anzhengqi

@liangzelang Please adapt the UB operator fusion to the new run package.

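Appearance & Root Cause
Problem: resnet50_ge 8p training performance does not meet the target.
Cause: the compilation interface for fusion operators changed, which caused operator fusion to fail.

Fix Solution
Solution: adapt to the new fusion operator compilation interface.
https://e.gitee.com/mind_spore/repos/mindspore/mindspore/pulls/35111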

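liubuyu added the ctl/versionbuild label
liubuyu added the rca/func/class/obj label
liubuyu changed the milestone from B-SIG-ASCEND to B-SolutionTest
liubuyu added collaborator liubuyu
liubuyu changed the assignee from liubuyu to zhongjicheng
liubuyu changed the task status from TODO to VALIDATION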

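Regression version: 2022-05-30 r1.8 commit_id:5afb7c9d
Regression steps: see the reproduction steps in this issue
Basic functionality: readme has been updated
Test conclusion: regression passed
Regression tester: zhongjicheng
Regression date: 2022-06-06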
