Ascend / modelzoo

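[Zhongzhi] [University of Electronic Science and Technology of China] [ID2363] [DeepMatchVO] Poor performance and long step time under mixed precision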

DONE
Bug-Report
Created on 2021-12-20 21:07

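1. Symptom (with error log context):
After switching to mixed precision, the execution time is almost unchanged.

2. Software versions:
-- CANN version (e.g., CANN 3.0.x, 5.x.x):
-- TensorFlow/PyTorch/MindSpore version: 1.15
-- Python version (e.g., Python 3.7.5): 3.7.5
-- MindStudio version (e.g., MindStudio 2.0.0 (beta3)):
-- OS version (e.g., Ubuntu 18.04): Ubuntu

3. Test steps:
On the GPU, the reported time is about 0.087 s (printed every 100 steps). On the NPU, the reported time is about 0.57 s (printed every 100 steps), whether or not mixed precision is enabled.
GPU console output: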
Epoch: [55] [ 1941/ 4590] time: 0.0872
total/pixel/smooth loss: [0.272/0.096/0.000]

Epoch: [55] [ 2041/ 4590] time: 0.0871
total/pixel/smooth loss: [0.261/0.097/0.000]
NPU console output:
Epoch: [ 1] [ 301/ 4590] time: 0.5737
total/pixel/smooth loss: [0.043/0.000/0.030]

Epoch: [ 1] [ 401/ 4590] time: 0.5736
total/pixel/smooth loss: [0.047/0.000/0.035]
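
To put a number on the gap between the two console outputs above, the `time:` values can be pulled out of the log lines and compared. This is a minimal sketch: the regular expression and the helper name `mean_time` are illustrative and are not part of the DeepMatchVO training script; the log text is copied from the output above.

```python
import re

# Matches lines such as "Epoch: [55] [ 1941/ 4590] time: 0.0872".
TIME_RE = re.compile(
    r"Epoch:\s*\[\s*(\d+)\]\s*\[\s*(\d+)/\s*(\d+)\]\s*time:\s*([0-9.]+)"
)


def mean_time(log_text):
    """Average every 'time:' value found in a block of console output."""
    times = [float(m.group(4)) for m in TIME_RE.finditer(log_text)]
    if not times:
        raise ValueError("no 'time:' entries found in the log text")
    return sum(times) / len(times)


gpu_log = """
Epoch: [55] [ 1941/ 4590] time: 0.0872
total/pixel/smooth loss: [0.272/0.096/0.000]
Epoch: [55] [ 2041/ 4590] time: 0.0871
total/pixel/smooth loss: [0.261/0.097/0.000]
"""

npu_log = """
Epoch: [ 1] [ 301/ 4590] time: 0.5737
total/pixel/smooth loss: [0.043/0.000/0.030]
Epoch: [ 1] [ 401/ 4590] time: 0.5736
total/pixel/smooth loss: [0.047/0.000/0.035]
"""

gpu_time = mean_time(gpu_log)
npu_time = mean_time(npu_log)
slowdown = npu_time / gpu_time
print(f"GPU {gpu_time:.4f} s, NPU {npu_time:.4f} s, NPU/GPU = {slowdown:.2f}x")
```

With the four values quoted above, the NPU run comes out roughly 6.6 times slower than the GPU run.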
4. Log information:
JOBAHJDFBDAIGJFDCGBAGAAAAAAAAAAA
JOBBEDFEGCHBGGGCEFDFFBBAAAAAAAAA (mixed precision enabled)
These are the profiling files.

OBS link
URL:
https://e-share.obs-website.cn-north-1.myhuaweicloud.com?token=MXSEz1o7r6UnBj7UO7hvDN+LNb0H/m7rtDdj7YfeQ68wH27ZvAYPiEPUE+N8JYGJG+poQW1vKyUKf+GQnZCdBB/PBsCSrItikUigUUSwcXBidKCEKD4MCC0SWFcYsEfTO2Co7hxTpKwKq1fnNsPwxXqrQY9wkJdOhgyG9lg0cOM/bq4457PBHknC8fK3kE8M66eDZ4u81+FtLUc+QCSCUUt3OcaLd0PGW01xrMPPJvAf3adoQxCAv4PMBVKDv7aFIkI0BmrKnI1Z2V7oq6YzqWoNGw8w79ycT/tJzuYwK9QYdg8QCKdwmpJm88B9fLUogVKEZaaadbVuuBFKPdga9ws+woRiCxh4TVGC3LcdezheYyAHBb60Fai77GDnwU17uGfsCS3L/2TO94ULEvslfdfflA8L9mYjXiIVX4220XXoinu3dlBdB2xZcocdit3sy0xomyUzKHrbz+ebxwllzh+JV0qaIYMkm2/DLG/TuxZ717WTLJE95sAqMD9QWveGDLZOBl2s1X9kXMopx7pJ6JWj4MpwZn830hh29kFjdEo0AR0eSaNRU+W+/Q5pOhue

Extraction code:
123456

*Valid until: 2022/01/19 21:14:57 GMT+08:00

Comments (9)

kingskymoon created the Bug-Report
kingskymoon edited the description
kingskymoon edited the description
zhujianpeng set the assignee to 张晓龙
zhujianpeng changed the task status from TODO to Analysing

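Hello, please add the university and the model name.

张晓龙 changed the assignee from 张晓龙 to unassigned
张晓龙 set the assignee to zhujianpeng
张晓龙 changed the assignee from zhujianpeng to unassigned
张晓龙 added zhujianpeng as a collaborator
张晓龙 set the assignee to huangqinye
kingskymoon edited the title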

obs://obsdeepmatchvo/profiling/

Hello, could you provide the training script and the dataset so that we can reproduce this locally?

Hello,

Source code: obs://obsdeepmatchvo/DeepMatchVO-master_for_TensorFlow/

Training command: python3 train.py --dataset_dir='/home/DeepMatchVo_ID2363_for_TensorFlow/kitti/genertate_wty' --checkpoint_dir='/home/DeepMatchVo_ID2363_for_TensorFlow/ckpt_pro' --img_width=416 --img_height=128 --batch_size=4 --seq_length 3 --max_steps 500 --save_freq 3000 --learning_rate 0.001 --num_scales 1 --continue_train=False --match_num=100

Parameter notes: dataset_dir is the dataset path; checkpoint_dir is where the checkpoints are saved; information is printed once every 100 steps, so max_steps is set to 500.

Mixed precision can be switched at line 370 of deep_slam.py.
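
For readers unfamiliar with how this switch usually looks in a TensorFlow 1.15 script on Ascend, here is a minimal sketch. The helper `npu_precision_options` is hypothetical and the actual code at line 370 of deep_slam.py is not shown in this thread; the two mode strings are `precision_mode` values of the CANN TensorFlow adapter, and the mapping from "mixed precision off" to `allow_fp32_to_fp16` is an assumption.

```python
# precision_mode strings used by the CANN TensorFlow 1.15 adapter.
MIXED_PRECISION = "allow_mix_precision"
DEFAULT_PRECISION = "allow_fp32_to_fp16"


def npu_precision_options(use_mixed_precision):
    """Return the option a script would set on the NpuOptimizer custom op.

    Hypothetical helper for illustration only; it does not reproduce the
    code in deep_slam.py.
    """
    mode = MIXED_PRECISION if use_mixed_precision else DEFAULT_PRECISION
    return {"precision_mode": mode}


for flag in (False, True):
    print(flag, npu_precision_options(flag))

# In an NPU training script the value is typically applied as follows
# (not executed here, because it needs TensorFlow 1.15 and npu_bridge):
#   custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
#   custom_op.name = "NpuOptimizer"
#   custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes(mode)
```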

Dataset (dataset_dir): obs://obsdeepmatchvo/root/kitti/dataset/genertate_wty/

Hello, is there any update on this?

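李想 edited the title
吴定远 changed the linked repository from Ascend/modelzoo-his to Ascend/modelzoo

Hello, is there any progress?

Hello, has this issue been resolved? There has been no interaction for a long time, so we are closing this issue for now. If the problem persists, please follow up here. Thank you.

wangxiaodan1103 changed the task status from Analysing to DONE
颜亚文 changed the task status from DONE to Analysing
颜亚文 added 宋保强 as a collaborator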

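The three operators ConcatV2, StridedSlice, and TopK did not run on AI Core because the training script had disabled all fusion rules. After enabling the three fusion rules ZConcatExt2FusionPass, ConstToAttrStridedSliceFusion, and TopKFusionPass, and setting the precision_mode parameter to allow_fp32_to_fp16, the performance data for 50 steps is as follows: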
---> step: 0 duration = 486.7702262401581
Epoch: [ 1] [ 1/ 4590] time: 4.8678
total/pixel/smooth loss: [ 0.756 / 0.343 / 0.079]
---> step: 1 duration = 357.2175102233887
---> step: 2 duration = 0.08217692375183105
---> step: 3 duration = 0.07405567169189453
---> step: 4 duration = 0.07434725761413574
---> step: 5 duration = 0.0741739273071289
---> step: 6 duration = 0.07226443290710449
---> step: 7 duration = 0.07483863830566406
---> step: 8 duration = 0.07477927207946777
---> step: 9 duration = 0.07180070877075195
---> step: 10 duration = 0.0739748477935791
---> step: 11 duration = 0.07507085800170898
---> step: 12 duration = 0.0742793083190918
---> step: 13 duration = 0.07432818412780762
---> step: 14 duration = 0.07193398475646973
---> step: 15 duration = 0.07378101348876953
---> step: 16 duration = 0.07451605796813965
---> step: 17 duration = 0.07210206985473633
---> step: 18 duration = 0.07362771034240723
---> step: 19 duration = 0.0714104175567627
---> step: 20 duration = 0.0715949535369873
---> step: 21 duration = 0.07439875602722168
---> step: 22 duration = 0.07406878471374512
---> step: 23 duration = 0.07396578788757324
---> step: 24 duration = 0.07132816314697266
---> step: 25 duration = 0.0734710693359375
---> step: 26 duration = 0.07465171813964844
---> step: 27 duration = 0.07469034194946289
---> step: 28 duration = 0.07510900497436523
---> step: 29 duration = 0.07192850112915039
---> step: 30 duration = 0.07355785369873047
---> step: 31 duration = 0.0712735652923584
---> step: 32 duration = 0.07375311851501465
---> step: 33 duration = 0.07411599159240723
---> step: 34 duration = 0.07393836975097656
---> step: 35 duration = 0.07377123832702637
---> step: 36 duration = 0.0738685131072998
---> step: 37 duration = 0.07410073280334473
---> step: 38 duration = 0.07388877868652344
---> step: 39 duration = 0.07418417930603027
---> step: 40 duration = 0.07368040084838867
---> step: 41 duration = 0.07415080070495605
---> step: 42 duration = 0.07429385185241699
---> step: 43 duration = 0.07407760620117188
---> step: 44 duration = 0.0740506649017334
---> step: 45 duration = 0.07155537605285645
---> step: 46 duration = 0.07464265823364258
---> step: 47 duration = 0.07411932945251465
---> step: 48 duration = 0.07437729835510254
---> step: 49 duration = 0.07473373413085938
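
The fix described above can be sketched as a small script that writes a fusion switch file with the three named rules turned on. This is an illustration under stated assumptions, not the configuration actually used in this issue: it assumes the JSON layout with `Switch`, `GraphFusion`, and `UBFusion` keys used by CANN fusion switch files, assumes the three rules are graph fusion rules, and assumes every other rule stays off. The file name and the helper names are hypothetical.

```python
import json
import os
import tempfile

# Fusion rules named in the reply above.
ENABLED_PASSES = [
    "ZConcatExt2FusionPass",
    "ConstToAttrStridedSliceFusion",
    "TopKFusionPass",
]


def build_fusion_switch(enabled_passes, default="off"):
    """Build a fusion switch config: everything `default`, listed rules on.

    Assumption: the listed rules belong to the GraphFusion group.
    """
    graph_fusion = {"ALL": default}
    for name in enabled_passes:
        graph_fusion[name] = "on"
    return {"Switch": {"GraphFusion": graph_fusion, "UBFusion": {"ALL": default}}}


def write_fusion_switch(path, enabled_passes):
    """Write the config as JSON and return the dictionary that was written."""
    cfg = build_fusion_switch(enabled_passes)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=4)
    return cfg


# Hypothetical output location for the sketch.
cfg_path = os.path.join(tempfile.gettempdir(), "fusion_switch.cfg")
cfg = write_fusion_switch(cfg_path, ENABLED_PASSES)
print(json.dumps(cfg, indent=4))

# In the training script the file and the precision mode would then be passed
# to the NPU session config (not executed here; needs TF 1.15 and npu_bridge):
#   custom_op.parameter_map["fusion_switch_file"].s = tf.compat.as_bytes(cfg_path)
#   custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_fp32_to_fp16")
```

Steps 0 and 1 in the log above include graph compilation and warm-up, so only the later steps (about 0.07 s each) are comparable with the GPU figure of about 0.087 s.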

宋保强 removed 宋保强 as a collaborator
颜亚文 changed the task status from Analysing to DONE
