
Ascend/pytorch

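Error when adapting the PyTorch version of RT-DETR: followed the official adaptation guide, using AMP and a fused optimizer, with no changes to the network's internal structure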

DONE
Bug-Report
Created on 2024-04-03 13:54

1. Symptom (with error log context):
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.722.640 [engine.cc:1628]2468852 ReportExceptProc:[FINAL][FINAL]Task exception! device_id=0, stream_id=16, task_id=14864, type=5(MEMCPY_ASYNC), failuremode =1, retCode=0x217, [sdma copy error]
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.255 [runtime.cc:4657]2468852 SetWatchDogDevStatus:[FINAL][FINAL]There is errInfo of devId=0, tsId=0
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.355 [device_error_proc.cc:707]2468852 ProcessSdmaErrorInfo:[FINAL][FINAL]report error module_type=3, module_name=EE8888
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.372 [device_error_proc.cc:707]2468852 ProcessSdmaErrorInfo:[FINAL][FINAL]The error from device(0), serial number is 5. there is a sdma error, sdma channel is 0, the channel exist the following problems: The SMMU returns a Terminate error during page table translation.. the value of CQE status is 2. the description of CQE status: When the SQE translates a page table, the SMMU returns a Terminate error.it's config include: setting1=0xc0060008c090000, setting2=0xff009000ff004c, setting3=0x1f, sq base addr=0x60006240d000
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.473 [stream.cc:3187]2468852 EnterFailureAbort:[FINAL][FINAL]stream_id=16 enter failure abort.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.486 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.497 [stream.cc:2440]2468852 WaitForTask:[FINAL][FINAL]Task Wait: device_id=0, stream_id=16 is Abort, RunningState=0
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.508 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.518 [logger.cc:473]2468852 StreamSynchronize:[FINAL][FINAL]Stream synchronize failed, stream_id=16
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.581 [api_c.cc:819]2468852 rtStreamSynchronizeWithTimeout:[FINAL][FINAL]ErrCode=507013, desc=[sdma copy error], InnerCode=0x7150063
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.592 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]report error module_type=3, module_name=EE8888
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.724.604 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]rtStreamSynchronizeWithTimeout execute failed, reason=[sdma copy error]
[ERROR] ASCENDCL(2468852,python):2024-04-03-11:37:01.724.629 [stream.cpp:143]2468852 aclrtSynchronizeStreamWithTimeout: [FINAL][FINAL]synchronize stream failed, runtime result = 507013
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:01.726.508 [device_msg_handler.cc:158]2468852 HandleMsgInHostBuf:[FINAL][FINAL]
DEVICE[0] PID[2468852]:
EXCEPTION STREAM:
Exception info:TGID=2574935, model id=65535, stream id=16, stream phase=SCHEDULE
Message info[0]:RTS_HWTS: hwts sdma error, slot_id=33, stream_id=16
Other info[0]:time=2024-04-03-11:37:01.699.592, function=hwts_sdma_error_slot_proc, line=758, error code=0x20b
EH9999: Inner Error!
The error from device(0), serial number is 5. there is a sdma error, sdma channel is 0, the channel exist the following problems: The SMMU returns a Terminate error during page table translation.. the value of CQE status is 2. the description of CQE status: When the SQE translates a page table, the SMMU returns a Terminate error.it's config include: setting1=0xc0060008c090000, setting2=0xff009000ff004c, setting3=0x1f, sq base addr=0x60006240d000[FUNC:ProcessSdmaErrorInfo][FILE:device_error_proc.cc][LINE:707]
rtStreamSynchronizeWithTimeout execute failed, reason=[sdma copy error][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50]
EH9999 synchronize stream failed, runtime result = 507013[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):

DEVICE[0] PID[2468852]:
EXCEPTION STREAM:
Exception info:TGID=2574935, model id=65535, stream id=16, stream phase=SCHEDULE
Message info[0]:RTS_HWTS: hwts sdma error, slot_id=33, stream_id=16
Other info[0]:time=2024-04-03-11:37:01.699.592, function=hwts_sdma_error_slot_proc, line=758, error code=0x20b
Epoch: [2] [ 0/337] eta: 0:52:42 lr: 0.000100 loss: 21.2900 (21.2900) loss_vfl: 0.5198 (0.5198) loss_bbox: 0.0769 (0.0769) loss_giou: 1.0470 (1.0470) loss_vfl_aux_0: 0.5577 (0.5577) loss_bbox_aux_0: 0.0718 (0.0718) loss_giou_aux_0: 0.9758 (0.9758) loss_vfl_aux_1: 0.5585 (0.5585) loss_bbox_aux_1: 0.0749 (0.0749) loss_giou_aux_1: 0.9818 (0.9818) loss_vfl_aux_2: 0.5608 (0.5608) loss_bbox_aux_2: 0.0690 (0.0690) loss_giou_aux_2: 0.9999 (0.9999) loss_vfl_aux_3: 0.4978 (0.4978) loss_bbox_aux_3: 0.0852 (0.0852) loss_giou_aux_3: 1.0473 (1.0473) loss_vfl_aux_4: 0.4940 (0.4940) loss_bbox_aux_4: 0.0831 (0.0831) loss_giou_aux_4: 1.0486 (1.0486) loss_vfl_aux_5: 0.5345 (0.5345) loss_bbox_aux_5: 0.0930 (0.0930) loss_giou_aux_5: 1.0554 (1.0554) loss_vfl_dn_0: 0.3185 (0.3185) loss_bbox_dn_0: 0.1040 (0.1040) loss_giou_dn_0: 1.2817 (1.2817) loss_vfl_dn_1: 0.3376 (0.3376) loss_bbox_dn_1: 0.0993 (0.0993) loss_giou_dn_1: 1.2253 (1.2253) loss_vfl_dn_2: 0.3502 (0.3502) loss_bbox_dn_2: 0.0954 (0.0954) loss_giou_dn_2: 1.1862 (1.1862) loss_vfl_dn_3: 0.3489 (0.3489) loss_bbox_dn_3: 0.0929 (0.0929) loss_giou_dn_3: 1.1807 (1.1807) loss_vfl_dn_4: 0.3488 (0.3488) loss_bbox_dn_4: 0.0909 (0.0909) loss_giou_dn_4: 1.1773 (1.1773) loss_vfl_dn_5: 0.3503 (0.3503) loss_bbox_dn_5: 0.0898 (0.0898) loss_giou_dn_5: 1.1794 (1.1794) time: 9.3845 data: 3.5033 max mem: 3490
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.431 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.469 [task_info.cc:2748]2468852 DoCompleteSuccessForMemcpyAsyncTask:[FINAL][FINAL]mem async copy error, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.489 [task_info.cc:2706]2468852 PrintErrorInfoForMemcpyAsyncTask:[FINAL][FINAL]Memory async copy failed, device_id=0, stream_id=16, task_id=14864, flip_num=189, copy_type=2, memcpy_type=0, copy_data_type=0, length=250000
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.500 [task_info.cc:2721]2468852 PrintErrorInfoForMemcpyAsyncTask:[FINAL][FINAL]Memory async copy failed, device_id=0, stream_id=16, task_id=14864, flip_num=189, copy_type=2, memcpy_type=0, copy_data_type=0, length=250000, src_addr=0x12403eddaea0, dst_addr=0x12403eddaea0.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.528 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.537 [stream.cc:1454]2468852 GetError:[FINAL][FINAL]report error module_type=2, module_name=EI9999
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.545 [stream.cc:1454]2468852 GetError:[FINAL][FINAL]Memory async copy failed, device_id=0, stream_id=16, task_id=14864, flip_num=189, copy_type=2, memcpy_type=0, copy_data_type=0, length=250000
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.610 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.629 [context.cc:306]2468852 Synchronize:[FINAL][FINAL]sync stream fail, stream_id=16, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.643 [context.cc:312]2468852 Synchronize:[FINAL][FINAL]Synchronize streams, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.651 [logger.cc:875]2468852 DeviceSynchronize:[FINAL][FINAL]Device synchronize failed.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.682 [api_c.cc:1899]2468852 rtDeviceSynchronize:[FINAL][FINAL]ErrCode=507013, desc=[sdma copy error], InnerCode=0x7150063
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.691 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]report error module_type=3, module_name=EE8888
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.702 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]rtDeviceSynchronize execute failed, reason=[sdma copy error]
[ERROR] ASCENDCL(2468852,python):2024-04-03-11:37:03.947.723 [device.cpp:284]2468852 aclrtSynchronizeDevice: [FINAL][FINAL]wait for compute device to finish failed, runtime result = 507013.
[ERROR] APP(2468852,python):2024-04-03-11:37:03.949.171 [log_inner.cpp:76]2468852 InitNpuBindings.cpp:THPModule_npu_shutdown:75: "[PTA]:"NPU shutdown synchronize device failed.""
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.210 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.220 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.228 [context.cc:306]2468852 Synchronize:[FINAL][FINAL]sync stream fail, stream_id=16, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.237 [context.cc:312]2468852 Synchronize:[FINAL][FINAL]Synchronize streams, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.244 [logger.cc:875]2468852 DeviceSynchronize:[FINAL][FINAL]Device synchronize failed.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.255 [api_c.cc:1899]2468852 rtDeviceSynchronize:[FINAL][FINAL]ErrCode=507013, desc=[sdma copy error], InnerCode=0x7150063
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.262 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]report error module_type=3, module_name=EE8888
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.949.270 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]rtDeviceSynchronize execute failed, reason=[sdma copy error]
[ERROR] ASCENDCL(2468852,python):2024-04-03-11:37:03.949.291 [device.cpp:284]2468852 aclrtSynchronizeDevice: [FINAL][FINAL]wait for compute device to finish failed, runtime result = 507013.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.629 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.654 [stream.cc:1451]2468852 GetError:[FINAL][FINAL]Stream Synchronize failed, stream_id=16, retCode=0x217, [sdma copy error].
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.662 [context.cc:306]2468852 Synchronize:[FINAL][FINAL]sync stream fail, stream_id=16, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.672 [context.cc:312]2468852 Synchronize:[FINAL][FINAL]Synchronize streams, retCode=0x7150063.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.679 [logger.cc:875]2468852 DeviceSynchronize:[FINAL][FINAL]Device synchronize failed.
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.693 [api_c.cc:1899]2468852 rtDeviceSynchronize:[FINAL][FINAL]ErrCode=507013, desc=[sdma copy error], InnerCode=0x7150063
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.700 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]report error module_type=3, module_name=EE8888
[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:04.020.718 [error_message_manage.cc:50]2468852 FuncErrorReason:[FINAL][FINAL]rtDeviceSynchronize execute failed, reason=[sdma copy error]
[ERROR] ASCENDCL(2468852,python):2024-04-03-11:37:04.020.745 [device.cpp:284]2468852 aclrtSynchronizeDevice: [FINAL][FINAL]wait for compute device to finish failed, runtime result = 507013.
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.704 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_MALLOC_FREE, applyTotal = 203, applySucc = 203, releaseTotal = 108, releaseSucc = 108
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.756 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_MALLOC_FREE_HOST, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.766 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_CONTEXT, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.773 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_SET_RESET_DEVICE, applyTotal = 4, applySucc = 4, releaseTotal = 1, releaseSucc = 1
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.781 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_EVENT, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.788 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_STREAM, applyTotal = 2, applySucc = 2, releaseTotal = 1, releaseSucc = 1
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.794 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_DVPP_MALLOC_FREE, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.801 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_RECORD_RESET_EVENT, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.807 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DATA_BUFFER, applyTotal = 29426801, applySucc = 29426801, releaseTotal = 29426801, releaseSucc = 29426801
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.814 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_TENSOR_DESC, applyTotal = 29426801, applySucc = 29426801, releaseTotal = 29426801, releaseSucc = 29426801
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.821 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.828 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DATASET, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.834 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_LOAD_UNLOAD_MODEL, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.841 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_AIPP, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.847 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_ATTR, applyTotal = 3520046, applySucc = 3520046, releaseTotal = 3520046, releaseSucc = 3520046
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.854 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_HANDLE, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.877 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_CHANNEL_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.884 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_PIC_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.889 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_ROI_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.904 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_RESIZE_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.910 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_JPEGE_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.916 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VDEC_CHANNEL_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.922 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VENC_CHANNEL_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.927 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_STREAM_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.933 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VDEC_FRAME_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.943 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VENC_FRAME_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.949 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_CHANNEL, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.955 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VDEC_CHANNEL, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.961 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_VENC_CHANNEL, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.972 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_DVPP_BATCH_PIC_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.978 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_GROUP_INFO, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.984 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_PROF_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.993 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_PROF_SUB_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.076.999 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_MODEL_CONFIG, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.005 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_QUEUE_ID, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.010 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_QUEUE_ATTR, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.017 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_QUEUE_ROUTE, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.023 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_QUEUE_ROUTE_LIST, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.029 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_QUEUE_ROUTE_QUERY, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.034 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_MBUF, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.040 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_GRAPH_DUMP_OPTION, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.046 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_ALLOCATOR_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.051 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_CREATE_DESTROY_ALLOCATOR_BINARY_DESC, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.057 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_RESERVE_RELEASE_MEMORY_ADDRESS, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.063 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_MALLOC_FREE_PHYSICAL_MEMORY, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.068 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_MAP_UNMAP_MEMORY, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[EVENT] ASCENDCL(2468852,python):2024-04-03-11:37:10.077.074 [resource_statistics.cpp:113]2468852 TraverseStatistics: The ResourceType:ACL_STATISTICS_LOAD_UNLOAD_BINARY, applyTotal = 0, applySucc = 0, releaseTotal = 0, releaseSucc = 0
[TRACE] GE(2468852,python):2024-04-03-11:37:10.077.110 [status:INIT] [ge_api.cc:362]2468852 GEFinalize:GEFinalize start
[TRACE] GE(2468852,python):2024-04-03-11:37:10.125.289 [status:RUNNING] [ge_api.cc:374]2468852 GEFinalize:Finalizing environment
[EVENT] FE(2468852,python):2024-04-03-11:37:10.296.683 [fusion_statistic_writer.cc:147]2468852 WriteAllFusionInfoToJsonFile:"Start write fusion result to file fusion_result.json"
[EVENT] TEFUSION(2468852,python):2024-04-03-11:37:11.356.591 [fusion_api.cc:125]2468852 TbeFinalize Reuse binary op count is [0]
[EVENT] TUNE(2468852,python):2024-04-03-11:37:12.521.314 [cann_kb_pyfunc_mgr.cpp:127][CANNKB][Tid:2468852]"CannKbPyfuncMgr: enter PyObjectDeinit function, reference_[1]"
[EVENT] TUNE(2468852,python):2024-04-03-11:37:12.521.374 [cann_kb_pyfunc_mgr.cpp:138][CANNKB][Tid:2468852]"CannKbPyfuncMgr: PyObjectDeinit function end successfully!"
[EVENT] FE(2468852,python):2024-04-03-11:37:12.654.562 [fusion_manager.cc:204]2468852 Finalize:"[FE_PERFORMANCE]The time cost of FusionManager::Finalize is [1298075] micro second."
[EVENT] FFTS(2468852,python):2024-04-03-11:37:12.654.736 [engine_manager.cc:91]2468852 Finalize:"[FFTS_PERFORMANCE]The time cost of EngineManager::Finalize is [37] micro second."
[TRACE] GE(2468852,python):2024-04-03-11:37:13.393.531 [status:STOP] [ge_api.cc:402]2468852 GEFinalize:GEFinalize finished
[INFO] RUNTIME(2468852,python):2024-04-03-11:37:19.627.809 [runtime.cc:1608] 2468852 ~Runtime: deconstruct runtime.
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.748.836 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 152]DEV_MEM dev0 mem stats(Bytes): module_name=RUNTIME, module_id=7 current_alloced_size=0, alloced_peak_size=167968768.
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.102 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 152]DEV_MEM dev0 mem stats(Bytes): module_name=APP, module_id=33 current_alloced_size=1797259264, alloced_peak_size=4188012544.
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.112 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 152]DEV_MEM dev0 mem stats(Bytes): module_name=GE, module_id=45 current_alloced_size=0, alloced_peak_size=1299197952.
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.122 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 158]DEV_MEM dev0 Cached_size:597688320Bytes
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.140 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 152]HOST_MEM mem stats(Bytes): module_name=RUNTIME, module_id=7 current_alloced_size=0, alloced_peak_size=11341824.
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.149 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 158]HOST_MEM Cached_size:12582912Bytes
[EVENT] DRV(2468852,python):2024-04-03-11:37:19.749.158 [ascend][curpid: 2468852, 2468852][drv][devmm][_devmm_mem_stats_show 158]DVPP_MEM dev0 Cached_size:0Bytes
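
For triage, the fields of the `Memory async copy failed` lines in a log like the one above can be pulled out with a small script. This is only a minimal sketch using the Python standard library: `parse_memcpy_failure` is a hypothetical helper name, not part of torch_npu or CANN, and the `key=value` layout is assumed from the lines shown in this report.

```python
import re

# One key=value pair, e.g. "stream_id=16" or "src_addr=0x12403eddaea0".
# Assumption: values are either decimal integers or 0x-prefixed hex.
_KV = re.compile(r"(\w+)=(0x[0-9a-fA-F]+|\d+)")


def parse_memcpy_failure(line):
    """Return the key=value fields of a 'Memory async copy failed' line
    as a dict of ints, or None if the line is not such a message."""
    marker = "Memory async copy failed,"
    pos = line.find(marker)
    if pos < 0:
        return None
    fields = {}
    for key, value in _KV.findall(line[pos:]):
        # Hex for addresses, decimal for everything else.
        fields[key] = int(value, 16) if value.startswith("0x") else int(value)
    return fields


# Sample copied from the task_info.cc:2721 line of the log above.
sample = (
    "[ERROR] RUNTIME(2468852,python):2024-04-03-11:37:03.947.500 "
    "[task_info.cc:2721]2468852 PrintErrorInfoForMemcpyAsyncTask:[FINAL][FINAL]"
    "Memory async copy failed, device_id=0, stream_id=16, task_id=14864, "
    "flip_num=189, copy_type=2, memcpy_type=0, copy_data_type=0, length=250000, "
    "src_addr=0x12403eddaea0, dst_addr=0x12403eddaea0."
)

info = parse_memcpy_failure(sample)
print(info["length"], hex(info["src_addr"]), hex(info["dst_addr"]))
```

In this log the parsed `src_addr` and `dst_addr` are identical and `length` is 250000, which may be worth mentioning when reporting the failing copy.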

Comments (4)

wenshuaishuai123 created the Bug-Report 1 year ago

Can the issue still be reproduced?

huangyunlong changed the task status from TODO to DONE 1 year ago
