Hardware Environment (Ascend/GPU/CPU): Atlas 200 DK
Software Environment:
-- MindSpore version (source or binary): not used; I used the ATC command-line tool
-- Python version: Python 3.7.5 (environment installed by following the 200 DK tutorial)
-- OS platform and distribution: Linux Ubuntu 18.04
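(1) Use the PytorchToCaffe tool to convert part of the faster_rcnn PyTorch model into a Caffe model
(1.1) Modify the PytorchToCaffe tool so that it supports Proposal, an extension operator supported by the Ascend AI processor
Note: I use this tool because the department that collaborates with Huawei also uses a modified version of it.
(1.1.1) First, in PytorchToCaffe/Caffe/caffe.proto, add the LayerParameter and ProposalParameter definitions as described in the tutorial, then run protoc --python_out ./ caffe.proto so that the change takes effect.
(1.1.2) In PytorchToCaffe/pytorch_to_caffe.py, add support for the Proposal layer. The added code is as follows: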
def _proposal(raw, cls_prob, bbox_delta):
    # Run the original function to obtain the output tensors.
    rois, actual_rois_num = raw(cls_prob, bbox_delta)
    # Look up the Caffe blob names of the two inputs.
    bottom_blobs = []
    bottom_blobs.append(log.blobs(cls_prob))
    bottom_blobs.append(log.blobs(bbox_delta))
    # Register the two outputs and emit a Proposal layer.
    top_blobs = log.add_blobs([rois, actual_rois_num], name='proposal_blob')
    layer_name = log.add_layer(name='Proposal')
    layer = caffe_net.Layer_param(name=layer_name, type='Proposal',
                                  bottom=bottom_blobs, top=top_blobs)
    layer.proposal_param()
    log.cnet.add_layer(layer)
    return rois, actual_rois_num

F.proposal = Rp(F.proposal, _proposal)
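(1.1.3) In PyTorch's torch.nn.functional module (functional.py), add the following code:
def proposal(cls_prob, bbox_delta, img_inf0=None):
    batch_size = cls_prob.shape[0]
    batch_size = bbox_delta.shape[0]
    rois = torch.randn([batch_size, 5, 19])
    actual_rois_num = torch.randn([batch_size, 8])
    return rois, actual_rois_num
Model conversion does not carry over the computation performed inside a PyTorch layer, so this function only keeps the input and output dimensions of Proposal correct and performs no actual computation. Is it correct to handle it this way?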
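The idea behind this question can be illustrated with a small, self-contained sketch. It is not the PytorchToCaffe implementation: RecordingWrapper and placeholder_proposal are hypothetical names, plain lists stand in for tensor shapes, and the input shapes are made up. It only shows that a tracer of this kind records the layer type and the inputs and outputs it sees, regardless of what the wrapped function computes.

```python
class RecordingWrapper:
    """Hypothetical, simplified stand-in for a function-replacement wrapper."""

    def __init__(self, raw, layer_type, log):
        self.raw = raw                # original (or placeholder) function
        self.layer_type = layer_type  # layer type to record
        self.log = log                # list that collects recorded layers

    def __call__(self, *inputs):
        outputs = self.raw(*inputs)
        # Only the structure is recorded: layer type, inputs and outputs.
        self.log.append({
            "type": self.layer_type,
            "bottom_shapes": [list(x) for x in inputs],
            "top_shapes": [list(y) for y in outputs],
        })
        return outputs


def placeholder_proposal(cls_prob_shape, bbox_delta_shape):
    # Shape-only placeholder: no proposal computation is performed.
    batch_size = cls_prob_shape[0]
    return [batch_size, 5, 19], [batch_size, 8]


trace = []
proposal = RecordingWrapper(placeholder_proposal, "Proposal", trace)
# Illustrative input shapes, not taken from the model in this thread.
rois_shape, num_shape = proposal([1, 18, 38, 50], [1, 36, 38, 50])
print(trace[0])
```

In this sketch the recorded layer has exactly as many bottoms as the call had arguments, so an input that the target operator expects but the call never passes would not appear in the trace.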
(1.1.4) In Caffe/layer_param.py, add a method to class Layer_param() that generates the hyperparameters of the Proposal layer:
    def proposal_param(self):
        proposal_param = pb.ProposalParameter()
        proposal_param.feat_stride = 16
        proposal_param.base_size = 16
        proposal_param.min_size = 16
        proposal_param.pre_nms_topn = 3000
        proposal_param.post_nms_topn = 304
        proposal_param.iou_threshold = 0.7
        proposal_param.output_actual_rois_num = True
        self.param.proposal_param.CopyFrom(proposal_param)
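To make it easier to check what this method should end up writing into the .prototxt, here is a minimal sketch that renders a Proposal layer in protobuf text format from the same hyperparameter values. render_proposal_layer is a hypothetical helper, not part of PytorchToCaffe, and the layer, bottom and top names passed in below are placeholders chosen for illustration.

```python
def render_proposal_layer(name, bottoms, tops, params):
    """Render a Proposal layer as prototxt text (illustrative helper)."""
    lines = ["layer {", f'  name: "{name}"', '  type: "Proposal"']
    lines += [f'  bottom: "{b}"' for b in bottoms]
    lines += [f'  top: "{t}"' for t in tops]
    lines.append("  proposal_param {")
    for key, value in params.items():
        # Protobuf text format writes booleans as true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"    {key}: {value}")
    lines += ["  }", "}"]
    return "\n".join(lines)


# Values copied from proposal_param() above.
params = {
    "feat_stride": 16,
    "base_size": 16,
    "min_size": 16,
    "pre_nms_topn": 3000,
    "post_nms_topn": 304,
    "iou_threshold": 0.7,
    "output_actual_rois_num": True,
}

text = render_proposal_layer(
    "Proposal1",                   # placeholder layer name
    ["cls_prob", "bbox_delta"],    # placeholder bottom names
    ["rois", "actual_rois_num"],   # placeholder top names
    params,
)
print(text)
```

Comparing such a rendering with the layer block in the generated .prototxt is a quick way to confirm how many bottom entries the converter actually wrote.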
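(1.1.5) Use the PytorchToCaffe tool modified as described above to convert part of the Faster_rcnn model into a Caffe model.
The PyTorch model definition and conversion code are available at this link. The converted .prototxt file is in Attachment 1 (toy03.prototxt). The .caffemodel file is too large, so I have not uploaded it for now. For easier review, the model is visualized as follows:
As the visualization shows, this "partial Faster_rcnn model" is the Faster_RCNN backbone plus the Region Proposal Network.
(2) Convert the Caffe model to an om model with the ATC command; this step reports an error
(2.1) Run a script to convert the model
The aipp_faster_rcnn.cfg file passed via --insert_op_conf="aipp_faster_rcnn.cfg" is the sample file provided in the community example.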
#!/bin/sh
export PATH=/usr/local/python3.7.5/bin:$PATH:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/ccec_compiler/bin:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/bin
export PYTHONPATH=$PYTHONPATH:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/te.egg:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/te:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/topi.egg:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/topi:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/auto_tune.egg:/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/python/site-packages/schedule_search.egg
export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/lib64:/usr/local/Ascend/ascend-toolkit/20.0.RC1/driver/lib64:/usr/local/Ascend/ascend-toolkit/20.0.RC1/add-ons:/usr/local/python3.7.5/lib
export SLOG_PRINT_TO_STDOUT=1
export ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/20.0.RC1/opp
/usr/local/Ascend/ascend-toolkit/20.0.RC1/atc/bin/atc \
--model="faster_rcnn.prototxt" \
--weight="faster_rcnn.caffemodel" \
--framework=0 \
--output="/root/modelzoo/exp09/device/faster_rcnn" \
--soc_version=Ascend310 \
--insert_op_conf="aipp_faster_rcnn.cfg"
(2.2) The following error occurs
[EVENT] FE(6056,atc):2020-11-12-22:54:02.317.747 [fusion_engine/graph_optimizer/fe_graph_optimizer.cpp:246]OptimizeOriginalGraph:"[FE_PERFORMANCE]The time cost of FEGraphOptimizer::OptimizeQuantGraph is [1064293] micro second."
Segmentation fault (core dumped)
root@UbuntuForAscend:/home/pyf/Downloads/Convert_faster_rcnn_exp# 2020-11-12 22:54:03,732 6067 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,762 6062 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,833 6060 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,863 6061 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,889 6065 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,902 6066 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,930 6063 PCOMPILE Master process dead. worker process quiting..
2020-11-12 22:54:03,946 6064 PCOMPILE Master process dead. worker process quiting..
/usr/local/python3.7.5/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 35 leaked semaphores to clean up at shutdown
len(cache))
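The complete error log is in Attachment 2 (log.txt).
With the GE graph dump switch enabled (export DUMP_GE_GRAPH=1), the .pbtxt and .txt graphs generated in the current directory are in Attachment 3 (附件3.rar).
I tried removing the Proposal layer, and the conversion then succeeds. My preliminary conclusion is that the problem lies either in the atc conversion tool or in the .caffemodel produced by the PyTorch-to-Caffe conversion.
The attachments are on the network drive:
https://cloud.tsinghua.edu.cn/d/31db2689e4db4a9f9083/
The .caffemodel file is at:
https://cloud.tsinghua.edu.cn/d/2f869e9438c14ac5be31/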
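Neither the log nor your steps show why the error occurs. Could you provide the weight file so that we can try the conversion ourselves and gather more information? Thanks.
@zhutian Hello, the weight file has been uploaded to https://cloud.tsinghua.edu.cn/d/2f869e9438c14ac5be31/
Is it accessible from the external network? This is what I see after opening it:
@zhengtao The reply above automatically pulled the Chinese characters into the link. Try this one: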
https://cloud.tsinghua.edu.cn/d/2f869e9438c14ac5be31/
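Version 20.1 has now been released. You can download the development environment package for this version from the Resource Center at https://www.huaweicloud.com/ascend/home.html.
On version 20.1, I converted your model with the following commands: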
export PATH=/usr/local/python3.7.5/bin:$PATH:/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/ccec_compiler/bin:/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/bin
export PYTHONPATH=$PYTHONPATH:/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/python/site-packages:/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/python/site-packages/auto_tune.egg/auto_tune:/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/python/site-packages/schedule_search.egg:/home/c75/Ascend/ascend-toolkit/20.1.rc1/opp/op_impl/built-in/ai_core/tbe
export LD_LIBRARY_PATH=/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/lib64:/home/c75/Ascend/ascend-toolkit/20.1.rc1/driver/lib64:/home/c75/Ascend/ascend-toolkit/20.1.rc1/add-ons:/usr/local/python3.7.5/lib:/home/c75/Ascend/ascend-toolkit/20.1.rc1/acllib/lib64
export SLOG_PRINT_TO_STDOUT=1
export ASCEND_OPP_PATH=/home/c75/Ascend/ascend-toolkit/20.1.rc1/opp
/home/c75/Ascend/ascend-toolkit/20.1.rc1/atc/bin/atc --input_shape="blob1:1,3,600,800" --weight="/home/c75/work/tinghua_fastercnn/toy_fasterrcnn_with_proposal.caffemodel" --check_report=/home/c75/modelzoo/toy03/device/network_analysis.report --input_format=NCHW --output="/home/c75/modelzoo/toy03/device/toy03" --soc_version=Ascend310 --framework=0 --model="/home/c75/work/tinghua_fastercnn/toy03.prototxt"
It reports the error:
2020-11-13 02:46:12 E11019: Op[Proposal1]'s input[2] is not linked in weight file
This may mean that input 2 of the Proposal operator is not connected correctly in your weight file.
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.766.886 [framework/domi/parser/caffe/caffe_parser.cc:991]4625 AddNode:Caffe layer name:Proposal1, layer type Proposal
[INFO] GE(4625,atc):2020-11-13-02:57:24.766.953 [framework/domi/parser/caffe/caffe_parser.cc:1144]4625 AddTensorDescToOpDescByIr:After GetOpDescFromOperator op[Proposal1] type[Proposal] have all input size: 3, caffe_input_size:2 blob_size 0 output size: 2
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.018 [framework/domi/parser/caffe/caffe_parser.cc:1166]4625 AddTensorDescToOpDescByIr:op [Proposal1], type[Proposal], update input(0) with name cls_prob success
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.050 [framework/domi/parser/caffe/caffe_parser.cc:1166]4625 AddTensorDescToOpDescByIr:op [Proposal1], type[Proposal], update input(1) with name bbox_delta success
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.074 [framework/domi/parser/caffe/caffe_parser.cc:1175]4625 AddTensorDescToOpDescByIr:op [Proposal1], type[Proposal], update output(0) with name rois success
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.090 [framework/domi/parser/caffe/caffe_parser.cc:1175]4625 AddTensorDescToOpDescByIr:op [Proposal1], type[Proposal], update output(1) with name actual_rois_num success
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.104 [framework/domi/parser/caffe/caffe_parser.cc:1005]4625 AddNode:After AddTensorDescToOpDescByIr op[Proposal1] type[Proposal] have input size: 3, output size: 2
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.767.140 [framework/domi/parser/caffe/caffe_custom_parser_adapter.cc:44]4625 ParseParams:Caffe layer name = Proposal1, layer type= Proposal, parse params
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.207 [framework/domi/parser/caffe/caffe_parser.cc:1009]4625 AddNode:After op parser op[Proposal1] type[Proposal] have input size: 3, output size: 2
[INFO] GE(4625,atc):2020-11-13-02:57:24.767.220 [framework/domi/parser/caffe/caffe_parser.cc:1013]4625 AddNode:Enter caffe parser. op name:Proposal1, type:Proposal
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.767.681 [framework/domi/parser/caffe/caffe_parser.cc:1214]4625 AddEdges:Start add edge: From view2:0 To Proposal1:1.
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.767.750 [framework/domi/parser/caffe/caffe_parser.cc:1214]4625 AddEdges:Start add edge: From conv15:0 To Proposal1:0.
[INFO] GE(4625,atc):2020-11-13-02:57:24.768.389 [framework/domi/parser/caffe/caffe_parser.cc:1356]4625 AddEdge4Output:output in top_blob: Proposal1
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.768.407 [framework/domi/parser/caffe/caffe_parser.cc:1361]4625 AddEdge4Output:Start add edge for out node: From Proposal1:0 To toy_fasterrcnn_Node_Output:0.
[INFO] GE(4625,atc):2020-11-13-02:57:24.768.413 [framework/domi/parser/caffe/caffe_parser.cc:1356]4625 AddEdge4Output:output in top_blob: Proposal1
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.768.426 [framework/domi/parser/caffe/caffe_parser.cc:1361]4625 AddEdge4Output:Start add edge for out node: From Proposal1:1 To toy_fasterrcnn_Node_Output:1.
[DEBUG] GE(4625,atc):2020-11-13-02:57:24.768.675 [common/graph/./compute_graph.cc:745]4625 DFSTopologicalSorting:node_vec.push_back Proposal1
[INFO] GE(4625,atc):2020-11-13-02:57:24.768.728 [framework/domi/parser/caffe/caffe_parser.cc:2469]4625 GetLeafNodeTops:The top of out node [Proposal1] is [proposal_blob1]
[INFO] GE(4625,atc):2020-11-13-02:57:24.768.755 [framework/domi/parser/caffe/caffe_parser.cc:2469]4625 GetLeafNodeTops:The top of out node [Proposal1] is [proposal_blob2]
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.204 [framework/domi/parser/caffe/caffe_parser.cc:1752]4625 Parse:out node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.265 [framework/domi/parser/caffe/caffe_parser.cc:1752]4625 Parse:out node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.271 [framework/domi/parser/caffe/caffe_parser.cc:1750]4625 Parse:node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.887 [common/graph/./compute_graph.cc:1033]4625 Dump:node name = conv15, out data node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.948 [common/graph/./compute_graph.cc:1033]4625 Dump:node name = view2, out data node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.954 [common/graph/./compute_graph.cc:1028]4625 Dump:node name = Proposal1.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.960 [common/graph/./compute_graph.cc:1033]4625 Dump:node name = Proposal1, out data node name = toy_fasterrcnn_Node_Output.
[INFO] GE(4625,atc):2020-11-13-02:57:24.769.966 [common/graph/./compute_graph.cc:1033]4625 Dump:node name = Proposal1, out data node name = toy_fasterrcnn_Node_Output.
[DEBUG] GE(4625,atc):2020-11-13-02:57:25.493.601 [framework/domi/parser/caffe/caffe_parser.cc:2078]4625 ParseLayerField:Parse result(name : Proposal1)
[INFO] GE(4625,atc):2020-11-13-02:57:25.493.628 [framework/domi/parser/caffe/caffe_parser.cc:2037]4625 ParseLayerParameter:Parse layer Proposal1
[DEBUG] GE(4625,atc):2020-11-13-02:57:25.493.642 [framework/domi/parser/caffe/caffe_parser.cc:2312]4625 ConvertLayerParameter:Caffe layer name: Proposal1 , layer type: Proposal.
[INFO] GE(4625,atc):2020-11-13-02:57:25.493.663 [framework/domi/parser/caffe/caffe_custom_parser_adapter.cc:79]4625 ParseWeights:layer: Proposal1 blobs_size: 0 bottom_size: 2
[ERROR] GE(4625,atc):2020-11-13-02:57:25.497.945 [framework/domi/parser/caffe/caffe_parser.cc:2356]4625 CheckNodes: ErrorNo: -1(failed) Op[Proposal1]'s input 2 is not linked.
E11019: Op[Proposal1]'s input[2] is not linked.
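This is the information related to your Proposal1. Judging from
GetOpDescFromOperator op[Proposal1] type[Proposal] have all input size: 3, caffe_input_size:2 blob_size 0 output size: 2
does your Proposal have three inputs while the Caffe model actually provides only two, leaving one input unconnected?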
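The comparison made in this reply can be automated when scanning a long ATC log. The sketch below is only an illustration: the regular expression is written against the single log line quoted above, and treating "all input size" as the number of inputs the operator expects and "caffe_input_size" as the number of bottoms found in the Caffe model is an assumption that follows the reading given in the reply.

```python
import re

# Log line copied from the output quoted above.
LOG_LINE = (
    "AddTensorDescToOpDescByIr:After GetOpDescFromOperator op[Proposal1] "
    "type[Proposal] have all input size: 3, caffe_input_size:2 "
    "blob_size 0 output size: 2"
)

PATTERN = re.compile(
    r"op\[(?P<op>[^\]]+)\] type\[(?P<type>[^\]]+)\] have all input size: "
    r"(?P<expected>\d+), caffe_input_size:(?P<provided>\d+)"
)


def missing_inputs(line):
    """Return (op name, number of unconnected inputs), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match["op"], int(match["expected"]) - int(match["provided"])


result = missing_inputs(LOG_LINE)
print(result)
```

A positive count for an operator points at the same situation as E11019: the operator has an input slot that no layer in the Caffe model feeds.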
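@zhutian Hello. According to the official documentation, for detection models the .prototxt file must be modified by hand: add a new input node img_info, then add img_info as an input to layers such as Proposal. I have modified the .prototxt file by hand; it is at the link below. Could you take another look at what the problem is?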
https://cloud.tsinghua.edu.cn/d/450409d4b1124d4baf9a/
E19000: Path[/home/c75/Ascend/ascend-toolkit/20.1.rc1/x86_64-linux/opp/op_impl/custom/ai_core/tbe/config/ascend310]'s realpath is empty, errmsg[The file path does not exist.]
E11004: caffe net input_shape size[2] is not equal input size[3].
It looks like the img_info node you added has an input dimension of 2, while 3 is required. Could you check where this input shape dimension is defined?
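One way to look into E11004 is to compare the input declarations in the hand-edited .prototxt with their input_shape blocks. The sketch below rests on an assumption the thread does not confirm, namely that the error may be related to a mismatch between the number of input entries and the number of input_shape blocks. count_inputs_and_shapes is a hypothetical helper, and SAMPLE is a made-up fragment, not the file from this thread.

```python
import re


def count_inputs_and_shapes(prototxt_text):
    """Return (list of declared input names, number of input_shape blocks)."""
    names = re.findall(
        r'^\s*input\s*:\s*"([^"]+)"', prototxt_text, flags=re.MULTILINE
    )
    shapes = re.findall(
        r"^\s*input_shape\s*\{", prototxt_text, flags=re.MULTILINE
    )
    return names, len(shapes)


# Made-up fragment: two inputs declared, but only one input_shape block.
SAMPLE = '''
input: "blob1"
input_shape {
  dim: 1
  dim: 3
  dim: 600
  dim: 800
}
input: "img_info"
'''

names, shape_count = count_inputs_and_shapes(SAMPLE)
mismatch = len(names) != shape_count
print(names, shape_count, mismatch)
```

If the counts agree in the real file, the next thing to check is the dim entries of the img_info shape itself, as suggested in the reply above.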