Ascend / pytorch
torch_npu reports an error when handling concurrent requests: 219 NPU error, error code is 500002
DONE
#I9FAYL
Requirement
Agent-Chu
Created on 2024-04-09 21:04
环境910B单机16卡机器  一、问题现象(附报错日志上下文): 对APE大模型进行3并发测试,报错。 ``` (py39) root@gzxj-sys-rpm46kwprrx:~/APE# ./run_test.sh /root/miniconda3/envs/py39/lib/python3.9/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional. warnings.warn( /root/miniconda3/envs/py39/lib/python3.9/site-packages/torchvision/transforms/functional_pil.py:5: UserWarning: The torchvision.transforms.functional_pil module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional. warnings.warn( [04/10 06:46:19 detectron2]: Arguments: Namespace(config_file='configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py', webcam=False, video_input=None, input=None, output=None, confidence_threshold=0.1, opts=['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'], text_prompt=None, with_box=True, with_mask=False, with_sseg=False) Please 'pip install xformers' Please 'pip install xformers' Please 'pip install apex' Please 'pip install xformers' =========== args.opts ============ ['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'] ANTLR runtime and generated code versions disagree: 4.9.3!=4.8 ANTLR runtime and generated code versions disagree: 4.9.3!=4.8 ======== shape of rope freq torch.Size([1024, 64]) ======== ======== shape of rope freq torch.Size([4096, 64]) ======== [04/10 06:46:24 ape.data.detection_utils]: Using builtin metadata 'image_count' for dataset '['lvis_v1_train+coco_panoptic_separated']' [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([1203]) num_classes: 1256 [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: pad fed_loss_cls_weights with type cat and value 0 [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: pad fed_loss_classes with tensor([1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255]) [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: tensor([ 1.0000, 1.0000, 3.1623, 7.3485, 43.8520, 25.0998, 5.5678, 8.3066, 2.6458, 3.3166, 1.0000, 5.4772, 7.0711, 6.7082, 5.2915, 10.6771, 13.8924, 4.5826, 9.5394, 5.5678, 38.3275, 43.8634, 9.3274, 8.7750, 3.3166, 6.8557, 4.5826, 6.8557, 8.3666, 42.8719, 4.3589, 23.0434, 3.3166, 46.6798, 10.6301, 5.0990, 2.2361, 7.4833, 8.5440, 5.6569, 11.3137, 24.9600, 3.4641, 7.2111, 3.3166, 41.0731, 9.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]) [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([1256]) num_classes: 1256 [04/10 06:46:24 ape.data.detection_utils]: Using builtin metadata 'image_count' for dataset '['openimages_v6_train_bbox_nogroup']' [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([601]) num_classes: 601 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 0 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: lvis_v1_train+coco_panoptic_separated [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing+stuff [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 1 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: objects365_train_fixname [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 2 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: openimages_v6_train_bbox_nogroup [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 3 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: visualgenome_77962_box_and_region [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 4 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: sa1b_6m [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 5 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: refcoco-mixed_group-by-image [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 6 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: gqa_region_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 7 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: phrasecut_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 8 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: flickr30k_separateGT_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 9 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: refcoco-mixed [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:47:13 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /root/APE/model_final.pth ... [04/10 06:47:13 fvcore.common.checkpoint]: [Checkpointer] Loading from /root/APE/model_final.pth ... 
Namespace(config_file='configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py', webcam=False, video_input=None, input=None, output=None, confidence_threshold=0.1, opts=['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'], text_prompt=None, with_box=True, with_mask=False, with_sseg=False) INFO: Started server process [75357] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://0.0.0.0:8198 (Press CTRL+C to quit) /root/APE/ape/modeling/text/clip_wrapper_eva02.py:117: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at /opt/_internal/cpython-3.9.0/lib/python3.9/site-packages/torch/include/ATen/core/LegacyTypeDispatch.h:74.) attention_mask[i, : end_token_idx[i] + 1] = 1 /root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.) return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined] [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk INFO: 10.92.54.160:60802 - "POST /infer HTTP/1.1" 500 Internal Server Error ERROR: Exception in ASGI application Traceback (most recent call last): File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi result = await app( # type: ignore[func-returns-value] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__ return await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__ await super().__call__(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__ raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__ await self.app(scope, receive, _send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__ await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__ await 
self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 776, in app await route.handle(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 77, in app await wrap_app_handling_exceptions(app, request)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 72, in app response = await func(request) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app raw_response = await run_endpoint_function( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 193, in run_endpoint_function return await run_in_threadpool(dependant.call, **values) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool return await anyio.to_thread.run_sync(func, *args) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable, File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread return await future File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run result = context.run(func, *args) File "/root/APE/demo/api.py", line 144, in interface predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image( File "/root/APE/demo/predictor_lazy.py", line 212, in run_on_image predictions = self.predictor(image, text_prompt, mask_prompt) File "/root/APE/ape/engine/defaults.py", line 99, in __call__ predictions = self.model([inputs])[0] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/ape_deta.py", line 39, in forward losses = self.model_vision(batched_inputs, do_postprocess=do_postprocess) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_detr_segm_vl.py", line 428, in forward ) = self.transformer( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_transformer_vl.py", line 605, in forward keep_inds_topk 
= keep_inds[keep_inds_mask] RuntimeError: InnerRun:/usr1/02/workspace/j_ywhtRpPk/pytorch/torch_npu/csrc/framework/OpParamMaker.cpp:219 NPU error, error code is 500002 [Error]: A GE error occurs in the system. Rectify the fault based on the error information in the log, or you can ask us at follwing gitee link by issues: https://gitee.com/ascend/pytorch/issue EH9999: Inner Error! EH9999 [Exec][Op]Execute op failed. op type = NonZero, ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): [WARNING] nms proposals (0) < 900, running naive topk INFO: 10.92.54.160:60898 - "POST /infer HTTP/1.1" 500 Internal Server Error ERROR: Exception in ASGI application Traceback (most recent call last): File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi result = await app( # type: ignore[func-returns-value] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__ return await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__ await super().__call__(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__ raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__ await self.app(scope, receive, _send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__ await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 776, in app await route.handle(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 77, in app await wrap_app_handling_exceptions(app, request)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 72, in app response = await func(request) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app raw_response = await run_endpoint_function( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 193, in run_endpoint_function return await run_in_threadpool(dependant.call, **values) File 
"/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool return await anyio.to_thread.run_sync(func, *args) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable, File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread return await future File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run result = context.run(func, *args) File "/root/APE/demo/api.py", line 144, in interface predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image( File "/root/APE/demo/predictor_lazy.py", line 212, in run_on_image predictions = self.predictor(image, text_prompt, mask_prompt) File "/root/APE/ape/engine/defaults.py", line 99, in __call__ predictions = self.model([inputs])[0] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/ape_deta.py", line 39, in forward losses = self.model_vision(batched_inputs, do_postprocess=do_postprocess) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_detr_segm_vl.py", line 428, in forward ) = self.transformer( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_transformer_vl.py", line 605, in forward keep_inds_topk = keep_inds[keep_inds_mask] RuntimeError: InnerRun:/usr1/02/workspace/j_ywhtRpPk/pytorch/torch_npu/csrc/framework/OpParamMaker.cpp:219 NPU error, error code is 500002 [Error]: A GE error occurs in the system. Rectify the fault based on the error information in the log, or you can ask us at follwing gitee link by issues: https://gitee.com/ascend/pytorch/issue EH9999: Inner Error! EH9999 [Exec][Op]Execute op failed. 
op type = NonZero, ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): [04/10 06:53:56 detectron2]: ./tmp/1712731968.7509215.jpg: detected 7 instances in 67.40s INFO: 10.92.54.160:60698 - "POST /infer HTTP/1.1" 200 OK [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [04/10 06:54:44 detectron2]: ./tmp/1712732025.6573904.jpg: detected 14 instances in 58.55s INFO: 10.92.54.160:35880 - "POST /infer HTTP/1.1" 200 OK [04/10 06:54:45 detectron2]: ./tmp/1712732026.2869112.jpg: detected 14 instances in 59.58s INFO: 10.92.54.160:36146 - "POST /infer HTTP/1.1" 200 OK [04/10 06:54:56 detectron2]: ./tmp/1712732039.161774.jpg: detected 14 instances in 57.36s INFO: 10.92.54.160:36822 - "POST /infer HTTP/1.1" 200 OK [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [04/10 06:55:44 detectron2]: ./tmp/1712732085.083786.jpg: detected 14 instances in 59.38s INFO: 10.92.54.160:39896 - "POST /infer HTTP/1.1" 200 OK EH9999: Inner Error! rtStreamSynchronize execute failed, reason=[vector core exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50] EH9999 synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): E90003: Compile operator failed, cause: Template constraint, detailed information: check_op_cap func failed, check_type: op_select_format, op_type:LinSpace failed, failure details: Compile_info: empty_compile_info Inputs: {'shape': (1,), 'ori_shape': (), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} {'shape': (1,), 'ori_shape': (), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} {'shape': (1,), 'ori_shape': (1,), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'int32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} Outputs: {'shape': (-2,), 'ori_shape': (-2,), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [-2], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} Attrs: []. 
TraceBack (most recent call last): The error from device(chipId:1, dieId:0), serial number is 12, there is an aivec error exception, core id is 34, error code = 0x800000, dump info: pc start: 0x1240c140d0a8, current: 0x1240c140d300, vec error info: 0xd115465300, mte error info: 0x3403000096, ifu error info: 0x23c9f37f1f880, ccu error info: 0x5f082e8a43b6e9e3, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd000288, para base: 0x1241803e6400.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1165] The extend info: errcode:(0x800000, 0, 0) errorStr: The DDR address of the MTE instruction is out of range. fixp_error0 info: 0x3000096, fixp_error1 info: 0x34 fsmId:1, tslot:0, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1177] Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1677] AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1454] Aicore kernel execute failed, device_id=1, stream_id=2, report_stream_id=2, task_id=49049, flip_num=0, fault kernel_name=Cast_e87590d11ccda8b259ab6b1ea7212319_high_performance_210000000, program id=121, hash=3394887288916785353.[FUNC:GetError][FILE:stream.cc][LINE:1454] [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1454] rtStreamSynchronizeWithTimeout execute failed, reason=[vector core exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50] Assert ((rt_ret) == 0) failed[FUNC:DoRtStreamSyncWithTimeout][FILE:utils.cc][LINE:40] [Exec][Op]Execute op failed. op type = NonMaxSuppressionV3, ge result = 1343225857[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] ./run_test.sh: line 54: 75357 Aborted (core dumped) python3.9 demo/api.py --config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py --with-box --opts train.init_checkpoint="/root/APE/model_final.pth" model.model_language.cache_dir="" model.model_vision.select_box_nums_for_evaluation=500 model.model_vision.text_feature_bank_reset=True model.model_vision.backbone.net.xattn=False
```

2. Software versions:
-- CANN version: 7.0.RC1.10
-- Python version: 3.9
-- OS version: Ubuntu 18.04
-- Architecture: x86_64

3. Test steps
The APE large model was adapted to run on the 910B and tested through the following FastAPI script:

```
# Copyright (c) Facebook, Inc. and its affiliates.
import argparse
import json
import multiprocessing as mp
import os
import tempfile
import time
import warnings
from collections import abc
import sys

import numpy as np
import tqdm
import torch
import torch_npu

from detectron2.config import LazyConfig, get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.evaluation.coco_evaluation import instances_to_coco_json

# from detectron2.projects.deeplab import add_deeplab_config
# from detectron2.projects.panoptic_deeplab import add_panoptic_deeplab_config
from detectron2.utils.logger import setup_logger

from predictor_lazy import VisualizationDemo

import base64
import io
import gc
import uvicorn
import requests
from ctypes import *
from PIL import Image
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# constants
WINDOW_NAME = "APE"


def setup_cfg(args):
    # load config from file and command-line arguments
    cfg = LazyConfig.load(args.config_file)
    print("=========== args.opts ============", args.opts)
    cfg = LazyConfig.apply_overrides(cfg, args.opts)
    if "output_dir" in cfg.model:
        cfg.model.output_dir = cfg.train.output_dir
    if "model_vision" in cfg.model and "output_dir" in cfg.model.model_vision:
        cfg.model.model_vision.output_dir = cfg.train.output_dir
    if "train" in cfg.dataloader:
        if isinstance(cfg.dataloader.train, abc.MutableSequence):
            for i in range(len(cfg.dataloader.train)):
                if "output_dir" in cfg.dataloader.train[i].mapper:
                    cfg.dataloader.train[i].mapper.output_dir = cfg.train.output_dir
        else:
            if "output_dir" in cfg.dataloader.train.mapper:
                cfg.dataloader.train.mapper.output_dir = cfg.train.output_dir
    if "model_vision" in cfg.model:
        cfg.model.model_vision.test_score_thresh = args.confidence_threshold
    else:
        cfg.model.test_score_thresh = args.confidence_threshold
    # default_setup(cfg, args)
    setup_logger(name="ape")
    setup_logger(name="timm")
    return cfg


def get_parser():
    parser = argparse.ArgumentParser(description="Detectron2 demo for builtin configs")
    parser.add_argument(
        "--config-file",
        default="configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml",
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument("--webcam", action="store_true", help="Take inputs from webcam.")
    parser.add_argument("--video-input", help="Path to video file.")
    parser.add_argument(
        "--input",
        nargs="+",
        help="A list of space separated input images; "
        "or a single glob pattern such as 'directory/*.jpg'",
    )
    parser.add_argument(
        "--output",
        help="A file or directory to save output visualizations. "
        "If not given, will show output in an OpenCV window.",
    )
    parser.add_argument(
        "--confidence-threshold",
        type=float,
        default=0.1,
        help="Minimum score for instance predictions to be shown",
    )
    parser.add_argument(
        "--opts",
        help="Modify config options using the command-line 'KEY VALUE' pairs",
        default=[],
        nargs=argparse.REMAINDER,
    )
    parser.add_argument("--text-prompt", default=None)
    parser.add_argument("--with-box", action="store_true", help="show box of instance")
    parser.add_argument("--with-mask", action="store_true", help="show mask of instance")
    parser.add_argument("--with-sseg", action="store_true", help="show mask of class")
    return parser


class Req(BaseModel):
    image: str
    text: str


@app.post('/infer')
def interface(req: Req):
    image, text_prompt = req.image, req.text
    if not image or not text_prompt or '<' in text_prompt or '>' in text_prompt:
        return {"error": "input error"}
    image = Image.open(io.BytesIO(base64.b64decode(image))).convert("RGB")
    fn = time.time()
    try:
        images = []
        os.makedirs('./tmp', exist_ok=True)
        image_path = f"./tmp/{fn}.jpg"
        image.save(image_path)
        images.append(image_path)
        for path in tqdm.tqdm(images, disable=not args.output):
            # use PIL, to be consistent with evaluation
            try:
                img = read_image(path, format="BGR")
            except Exception as e:
                print("*" * 60)
                print("fail to open image: ", e)
                print("*" * 60)
                continue
            start_time = time.time()
            predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image(
                img,
                text_prompt=text_prompt,
                with_box=args.with_box,
                with_mask=args.with_mask,
                with_sseg=args.with_sseg,
            )
            logger.info(
                "{}: {} in {:.2f}s".format(
                    path,
                    "detected {} instances".format(len(predictions["instances"]))
                    if "instances" in predictions
                    else "finished",
                    time.time() - start_time,
                )
            )
            results = []
            if "instances" in predictions:
                results = instances_to_coco_json(
                    predictions["instances"].to(demo.cpu_device), path
                )
                for result in results:
                    result["category_name"] = metadata.thing_classes[result["category_id"]]
                    result["image_name"] = result["image_id"]
            if args.output:
                os.makedirs(args.output, exist_ok=True)
                if os.path.isdir(args.output):
                    assert os.path.isdir(args.output), args.output
                    out_filename = os.path.join(args.output, os.path.basename(path))
                else:
                    assert len(args.input) == 1, "Please specify a directory with args.output"
                    out_filename = args.output
                out_filename = out_filename.replace(".webp", ".png")
                out_filename = out_filename.replace(".crdownload", ".png")
                out_filename = out_filename.replace(".jfif", ".png")
                visualized_output.save(out_filename)
                for i in range(len(visualized_outputs)):
                    out_filename = (
                        os.path.join(args.output, os.path.basename(path)) + "." + str(i) + ".png"
                    )
                    visualized_outputs[i].save(out_filename)
                with open(out_filename + ".json", "w") as outp:
                    json.dump(results, outp)
            gc.collect()
    finally:
        os.remove(f'./tmp/{fn}.jpg')
    return {'result': results}


if __name__ == "__main__":
    import setproctitle

    setproctitle.setproctitle("APE")
    torch.npu.set_device('npu:1')

    # init model
    mp.set_start_method("spawn", force=True)
    args = get_parser().parse_args()
    setup_logger(name="fvcore")
    setup_logger(name="ape")
    logger = setup_logger()
    logger.info("Arguments: " + str(args))

    cfg = setup_cfg(args)
    demo = VisualizationDemo(cfg, args=args)
    uvicorn.run(app, port=8198, host="0.0.0.0")
```

After starting this script, load is generated with JMeter. With 1 or 2 concurrent requests there is no problem; from 3 concurrent requests onward the errors above start to appear and the process crashes.
环境910B单机16卡机器  一、问题现象(附报错日志上下文): 对APE大模型进行3并发测试,报错。 ``` (py39) root@gzxj-sys-rpm46kwprrx:~/APE# ./run_test.sh /root/miniconda3/envs/py39/lib/python3.9/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional. warnings.warn( /root/miniconda3/envs/py39/lib/python3.9/site-packages/torchvision/transforms/functional_pil.py:5: UserWarning: The torchvision.transforms.functional_pil module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional. warnings.warn( [04/10 06:46:19 detectron2]: Arguments: Namespace(config_file='configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py', webcam=False, video_input=None, input=None, output=None, confidence_threshold=0.1, opts=['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'], text_prompt=None, with_box=True, with_mask=False, with_sseg=False) Please 'pip install xformers' Please 'pip install xformers' Please 'pip install apex' Please 'pip install xformers' =========== args.opts ============ ['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'] ANTLR runtime and generated code versions disagree: 4.9.3!=4.8 ANTLR runtime and generated code versions disagree: 4.9.3!=4.8 ======== shape of rope freq torch.Size([1024, 64]) ======== ======== shape of rope freq torch.Size([4096, 64]) ======== [04/10 06:46:24 ape.data.detection_utils]: Using builtin metadata 'image_count' for dataset '['lvis_v1_train+coco_panoptic_separated']' [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([1203]) num_classes: 1256 [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: pad fed_loss_cls_weights with type cat and value 0 [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: pad fed_loss_classes with tensor([1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255]) [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: tensor([ 1.0000, 1.0000, 3.1623, 7.3485, 43.8520, 25.0998, 5.5678, 8.3066, 2.6458, 3.3166, 1.0000, 5.4772, 7.0711, 6.7082, 5.2915, 10.6771, 13.8924, 4.5826, 9.5394, 5.5678, 38.3275, 43.8634, 9.3274, 8.7750, 3.3166, 6.8557, 4.5826, 6.8557, 8.3666, 42.8719, 4.3589, 23.0434, 3.3166, 46.6798, 10.6301, 5.0990, 2.2361, 7.4833, 8.5440, 5.6569, 11.3137, 24.9600, 3.4641, 7.2111, 3.3166, 41.0731, 9.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 
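For reference, the JMeter test plan itself is not included in the issue, but an equivalent load can be generated with a short Python client that POSTs to the /infer endpoint defined in the script above. The sketch below is only an illustration and assumes details not given in the issue: the server address (127.0.0.1:8198, matching the uvicorn.run call), a local test image named test.jpg, and the text prompt "person".

```
# Hypothetical load-test client (not part of the issue). It mirrors the described JMeter
# setup: N simultaneous POSTs to the /infer endpoint served by the FastAPI script above.
import base64
import concurrent.futures

import requests

URL = "http://127.0.0.1:8198/infer"  # host/port assumed from uvicorn.run(app, port=8198, host="0.0.0.0")
IMAGE_PATH = "test.jpg"              # placeholder test image
TEXT_PROMPT = "person"               # placeholder prompt
CONCURRENCY = 3                      # the concurrency level at which the crash is reported


def one_request() -> int:
    # Build the JSON body expected by the Req model: a base64-encoded image plus a text prompt.
    with open(IMAGE_PATH, "rb") as f:
        payload = {
            "image": base64.b64encode(f.read()).decode("utf-8"),
            "text": TEXT_PROMPT,
        }
    resp = requests.post(URL, json=payload, timeout=300)
    return resp.status_code


if __name__ == "__main__":
    # Fire CONCURRENCY requests at the same time, as the JMeter plan does.
    with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        codes = list(pool.map(lambda _: one_request(), range(CONCURRENCY)))
    print("HTTP status codes:", codes)
```

Setting CONCURRENCY to 1 or 2 should reproduce the working case described above, and 3 the failing one, if the failure is indeed tied to the number of simultaneous requests.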
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]) [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([1256]) num_classes: 1256 [04/10 06:46:24 ape.data.detection_utils]: Using builtin metadata 'image_count' for dataset '['openimages_v6_train_bbox_nogroup']' [04/10 06:46:24 ape.modeling.ape_deta.deformable_criterion]: fed_loss_cls_weights: torch.Size([601]) num_classes: 601 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 0 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: lvis_v1_train+coco_panoptic_separated [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing+stuff [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 1 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: objects365_train_fixname [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 2 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: openimages_v6_train_bbox_nogroup [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 3 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: visualgenome_77962_box_and_region [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 4 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: sa1b_6m [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 5 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: refcoco-mixed_group-by-image [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 6 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: gqa_region_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 7 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: phrasecut_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 8 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: flickr30k_separateGT_train [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_id: 9 [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_name: refcoco-mixed [04/10 06:46:24 ape.modeling.ape_deta.deformable_detr]: dataset_entity: thing [04/10 06:47:13 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /root/APE/model_final.pth ... [04/10 06:47:13 fvcore.common.checkpoint]: [Checkpointer] Loading from /root/APE/model_final.pth ... 
Namespace(config_file='configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py', webcam=False, video_input=None, input=None, output=None, confidence_threshold=0.1, opts=['train.init_checkpoint=/root/APE/model_final.pth', 'model.model_language.cache_dir=', 'model.model_vision.select_box_nums_for_evaluation=500', 'model.model_vision.text_feature_bank_reset=True', 'model.model_vision.backbone.net.xattn=False'], text_prompt=None, with_box=True, with_mask=False, with_sseg=False) INFO: Started server process [75357] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://0.0.0.0:8198 (Press CTRL+C to quit) /root/APE/ape/modeling/text/clip_wrapper_eva02.py:117: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at /opt/_internal/cpython-3.9.0/lib/python3.9/site-packages/torch/include/ATen/core/LegacyTypeDispatch.h:74.) attention_mask[i, : end_token_idx[i] + 1] = 1 /root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.) return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined] [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk INFO: 10.92.54.160:60802 - "POST /infer HTTP/1.1" 500 Internal Server Error ERROR: Exception in ASGI application Traceback (most recent call last): File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi result = await app( # type: ignore[func-returns-value] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__ return await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__ await super().__call__(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__ raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__ await self.app(scope, receive, _send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__ await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__ await 
self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 776, in app await route.handle(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 77, in app await wrap_app_handling_exceptions(app, request)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 72, in app response = await func(request) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app raw_response = await run_endpoint_function( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 193, in run_endpoint_function return await run_in_threadpool(dependant.call, **values) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool return await anyio.to_thread.run_sync(func, *args) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable, File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread return await future File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run result = context.run(func, *args) File "/root/APE/demo/api.py", line 144, in interface predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image( File "/root/APE/demo/predictor_lazy.py", line 212, in run_on_image predictions = self.predictor(image, text_prompt, mask_prompt) File "/root/APE/ape/engine/defaults.py", line 99, in __call__ predictions = self.model([inputs])[0] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/ape_deta.py", line 39, in forward losses = self.model_vision(batched_inputs, do_postprocess=do_postprocess) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_detr_segm_vl.py", line 428, in forward ) = self.transformer( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_transformer_vl.py", line 605, in forward keep_inds_topk 
= keep_inds[keep_inds_mask] RuntimeError: InnerRun:/usr1/02/workspace/j_ywhtRpPk/pytorch/torch_npu/csrc/framework/OpParamMaker.cpp:219 NPU error, error code is 500002 [Error]: A GE error occurs in the system. Rectify the fault based on the error information in the log, or you can ask us at follwing gitee link by issues: https://gitee.com/ascend/pytorch/issue EH9999: Inner Error! EH9999 [Exec][Op]Execute op failed. op type = NonZero, ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): [WARNING] nms proposals (0) < 900, running naive topk INFO: 10.92.54.160:60898 - "POST /infer HTTP/1.1" 500 Internal Server Error ERROR: Exception in ASGI application Traceback (most recent call last): File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi result = await app( # type: ignore[func-returns-value] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__ return await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__ await super().__call__(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__ raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__ await self.app(scope, receive, _send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__ await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__ await self.middleware_stack(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 776, in app await route.handle(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle await self.app(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 77, in app await wrap_app_handling_exceptions(app, request)(scope, receive, send) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app raise exc File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app await app(scope, receive, sender) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/routing.py", line 72, in app response = await func(request) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app raw_response = await run_endpoint_function( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/fastapi/routing.py", line 193, in run_endpoint_function return await run_in_threadpool(dependant.call, **values) File 
"/root/miniconda3/envs/py39/lib/python3.9/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool return await anyio.to_thread.run_sync(func, *args) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable, File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread return await future File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run result = context.run(func, *args) File "/root/APE/demo/api.py", line 144, in interface predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image( File "/root/APE/demo/predictor_lazy.py", line 212, in run_on_image predictions = self.predictor(image, text_prompt, mask_prompt) File "/root/APE/ape/engine/defaults.py", line 99, in __call__ predictions = self.model([inputs])[0] File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/ape_deta.py", line 39, in forward losses = self.model_vision(batched_inputs, do_postprocess=do_postprocess) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_detr_segm_vl.py", line 428, in forward ) = self.transformer( File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/root/miniconda3/envs/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl return forward_call(*args, **kwargs) File "/root/APE/ape/modeling/ape_deta/deformable_transformer_vl.py", line 605, in forward keep_inds_topk = keep_inds[keep_inds_mask] RuntimeError: InnerRun:/usr1/02/workspace/j_ywhtRpPk/pytorch/torch_npu/csrc/framework/OpParamMaker.cpp:219 NPU error, error code is 500002 [Error]: A GE error occurs in the system. Rectify the fault based on the error information in the log, or you can ask us at follwing gitee link by issues: https://gitee.com/ascend/pytorch/issue EH9999: Inner Error! EH9999 [Exec][Op]Execute op failed. 
op type = NonZero, ge result = 4294967295[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): [04/10 06:53:56 detectron2]: ./tmp/1712731968.7509215.jpg: detected 7 instances in 67.40s INFO: 10.92.54.160:60698 - "POST /infer HTTP/1.1" 200 OK [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [04/10 06:54:44 detectron2]: ./tmp/1712732025.6573904.jpg: detected 14 instances in 58.55s INFO: 10.92.54.160:35880 - "POST /infer HTTP/1.1" 200 OK [04/10 06:54:45 detectron2]: ./tmp/1712732026.2869112.jpg: detected 14 instances in 59.58s INFO: 10.92.54.160:36146 - "POST /infer HTTP/1.1" 200 OK [04/10 06:54:56 detectron2]: ./tmp/1712732039.161774.jpg: detected 14 instances in 57.36s INFO: 10.92.54.160:36822 - "POST /infer HTTP/1.1" 200 OK [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [WARNING] nms proposals (0) < 900, running naive topk [04/10 06:55:44 detectron2]: ./tmp/1712732085.083786.jpg: detected 14 instances in 59.38s INFO: 10.92.54.160:39896 - "POST /infer HTTP/1.1" 200 OK EH9999: Inner Error! rtStreamSynchronize execute failed, reason=[vector core exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50] EH9999 synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] TraceBack (most recent call last): E90003: Compile operator failed, cause: Template constraint, detailed information: check_op_cap func failed, check_type: op_select_format, op_type:LinSpace failed, failure details: Compile_info: empty_compile_info Inputs: {'shape': (1,), 'ori_shape': (), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} {'shape': (1,), 'ori_shape': (), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} {'shape': (1,), 'ori_shape': (1,), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'int32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [1], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'is_first_layer': False, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} Outputs: {'shape': (-2,), 'ori_shape': (-2,), 'format': 'ND', 'sub_format': 0, 'ori_format': 'ND', 'dtype': 'float32', 'addr_type': 0, 'ddr_base_prop': 0, 'total_shape': [-2], 'slice_offset': (), 'L1_addr_offset': 0, 'L1_fusion_type': -1, 'L1_workspace_size': -1, 'valid_shape': (), 'split_index': 0, 'range': (), 'ori_range': (), 'atomic_type': '', 'input_c_values': -1} Attrs: []. 
TraceBack (most recent call last): The error from device(chipId:1, dieId:0), serial number is 12, there is an aivec error exception, core id is 34, error code = 0x800000, dump info: pc start: 0x1240c140d0a8, current: 0x1240c140d300, vec error info: 0xd115465300, mte error info: 0x3403000096, ifu error info: 0x23c9f37f1f880, ccu error info: 0x5f082e8a43b6e9e3, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd000288, para base: 0x1241803e6400.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1165] The extend info: errcode:(0x800000, 0, 0) errorStr: The DDR address of the MTE instruction is out of range. fixp_error0 info: 0x3000096, fixp_error1 info: 0x34 fsmId:1, tslot:0, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1177] Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1677] AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1454] Aicore kernel execute failed, device_id=1, stream_id=2, report_stream_id=2, task_id=49049, flip_num=0, fault kernel_name=Cast_e87590d11ccda8b259ab6b1ea7212319_high_performance_210000000, program id=121, hash=3394887288916785353.[FUNC:GetError][FILE:stream.cc][LINE:1454] [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1454] rtStreamSynchronizeWithTimeout execute failed, reason=[vector core exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50] Assert ((rt_ret) == 0) failed[FUNC:DoRtStreamSyncWithTimeout][FILE:utils.cc][LINE:40] [Exec][Op]Execute op failed. op type = NonMaxSuppressionV3, ge result = 1343225857[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161] ./run_test.sh: line 54: 75357 Aborted (core dumped) python3.9 demo/api.py --config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py --with-box --opts train.init_checkpoint="/root/APE/model_final.pth" model.model_language.cache_dir="" model.model_vision.select_box_nums_for_evaluation=500 model.model_vision.text_feature_bank_reset=True model.model_vision.backbone.net.xattn=False ``` 二、软件版本:  -- CANN 版本: 7.0.RC1.10 -- Python 版本:3.9 -- 操作系统版本: Ubuntu 18.04 -- arch : x86_64 三、测试步骤 在910b上适配了APE大模型,并使用fastapi 代码进行测试: ``` # Copyright (c) Facebook, Inc. and its affiliates. 
import argparse
import json
import multiprocessing as mp
import os
import tempfile
import time
import warnings
from collections import abc
import sys

import numpy as np
import tqdm
import torch
import torch_npu
from detectron2.config import LazyConfig, get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.evaluation.coco_evaluation import instances_to_coco_json
# from detectron2.projects.deeplab import add_deeplab_config
# from detectron2.projects.panoptic_deeplab import add_panoptic_deeplab_config
from detectron2.utils.logger import setup_logger
from predictor_lazy import VisualizationDemo

import base64
import io
import gc
import uvicorn
import requests
from ctypes import *
from PIL import Image
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# constants
WINDOW_NAME = "APE"


def setup_cfg(args):
    # load config from file and command-line arguments
    cfg = LazyConfig.load(args.config_file)
    print("=========== args.opts ============", args.opts)
    cfg = LazyConfig.apply_overrides(cfg, args.opts)
    if "output_dir" in cfg.model:
        cfg.model.output_dir = cfg.train.output_dir
    if "model_vision" in cfg.model and "output_dir" in cfg.model.model_vision:
        cfg.model.model_vision.output_dir = cfg.train.output_dir
    if "train" in cfg.dataloader:
        if isinstance(cfg.dataloader.train, abc.MutableSequence):
            for i in range(len(cfg.dataloader.train)):
                if "output_dir" in cfg.dataloader.train[i].mapper:
                    cfg.dataloader.train[i].mapper.output_dir = cfg.train.output_dir
        else:
            if "output_dir" in cfg.dataloader.train.mapper:
                cfg.dataloader.train.mapper.output_dir = cfg.train.output_dir
    if "model_vision" in cfg.model:
        cfg.model.model_vision.test_score_thresh = args.confidence_threshold
    else:
        cfg.model.test_score_thresh = args.confidence_threshold
    # default_setup(cfg, args)
    setup_logger(name="ape")
    setup_logger(name="timm")
    return cfg


def get_parser():
    parser = argparse.ArgumentParser(description="Detectron2 demo for builtin configs")
    parser.add_argument(
        "--config-file",
        default="configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml",
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument("--webcam", action="store_true", help="Take inputs from webcam.")
    parser.add_argument("--video-input", help="Path to video file.")
    parser.add_argument(
        "--input",
        nargs="+",
        help="A list of space separated input images; "
        "or a single glob pattern such as 'directory/*.jpg'",
    )
    parser.add_argument(
        "--output",
        help="A file or directory to save output visualizations. "
        "If not given, will show output in an OpenCV window.",
    )
    parser.add_argument(
        "--confidence-threshold",
        type=float,
        default=0.1,
        help="Minimum score for instance predictions to be shown",
    )
    parser.add_argument(
        "--opts",
        help="Modify config options using the command-line 'KEY VALUE' pairs",
        default=[],
        nargs=argparse.REMAINDER,
    )
    parser.add_argument("--text-prompt", default=None)
    parser.add_argument("--with-box", action="store_true", help="show box of instance")
    parser.add_argument("--with-mask", action="store_true", help="show mask of instance")
    parser.add_argument("--with-sseg", action="store_true", help="show mask of class")
    return parser


class Req(BaseModel):
    image: str
    text: str


@app.post('/infer')
def interface(req: Req):
    image, text_prompt = req.image, req.text
    if not image or not text_prompt or '<' in text_prompt or '>' in text_prompt:
        return {"error": "input error"}
    image = Image.open(io.BytesIO(base64.b64decode(image))).convert("RGB")
    fn = time.time()
    try:
        images = []
        os.makedirs('./tmp', exist_ok=True)
        image_path = f"./tmp/{fn}.jpg"
        image.save(image_path)
        images.append(image_path)
        for path in tqdm.tqdm(images, disable=not args.output):
            # use PIL, to be consistent with evaluation
            try:
                img = read_image(path, format="BGR")
            except Exception as e:
                print("*" * 60)
                print("fail to open image: ", e)
                print("*" * 60)
                continue
            start_time = time.time()
            predictions, visualized_output, visualized_outputs, metadata = demo.run_on_image(
                img,
                text_prompt=text_prompt,
                with_box=args.with_box,
                with_mask=args.with_mask,
                with_sseg=args.with_sseg,
            )
            logger.info(
                "{}: {} in {:.2f}s".format(
                    path,
                    "detected {} instances".format(len(predictions["instances"]))
                    if "instances" in predictions
                    else "finished",
                    time.time() - start_time,
                )
            )
            results = []
            if "instances" in predictions:
                results = instances_to_coco_json(
                    predictions["instances"].to(demo.cpu_device), path
                )
                for result in results:
                    result["category_name"] = metadata.thing_classes[result["category_id"]]
                    result["image_name"] = result["image_id"]
            if args.output:
                os.makedirs(args.output, exist_ok=True)
                if os.path.isdir(args.output):
                    assert os.path.isdir(args.output), args.output
                    out_filename = os.path.join(args.output, os.path.basename(path))
                else:
                    assert len(args.input) == 1, "Please specify a directory with args.output"
                    out_filename = args.output
                out_filename = out_filename.replace(".webp", ".png")
                out_filename = out_filename.replace(".crdownload", ".png")
                out_filename = out_filename.replace(".jfif", ".png")
                visualized_output.save(out_filename)
                for i in range(len(visualized_outputs)):
                    out_filename = (
                        os.path.join(args.output, os.path.basename(path)) + "." + str(i) + ".png"
                    )
                    visualized_outputs[i].save(out_filename)
                with open(out_filename + ".json", "w") as outp:
                    json.dump(results, outp)
        gc.collect()
    finally:
        os.remove(f'./tmp/{fn}.jpg')
    return {'result': results}


if __name__ == "__main__":
    import setproctitle

    setproctitle.setproctitle("APE")
    torch.npu.set_device('npu:1')
    # init model
    mp.set_start_method("spawn", force=True)
    args = get_parser().parse_args()
    setup_logger(name="fvcore")
    setup_logger(name="ape")
    logger = setup_logger()
    logger.info("Arguments: " + str(args))
    cfg = setup_cfg(args)
    demo = VisualizationDemo(cfg, args=args)

    uvicorn.run(app, port=8198, host="0.0.0.0")
```

After starting this script, load was applied with JMeter. With 1 or 2 concurrent requests there are no problems; starting from 3 concurrent requests the errors above appear and the program crashes.
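For reference, the same load pattern can also be reproduced without JMeter. The sketch below is illustrative only and not part of the original report: it assumes the service above is reachable at http://127.0.0.1:8198 (the port hard-coded in `uvicorn.run`) and that a local `test.jpg` exists as a placeholder input; the payload fields follow the `Req` model in the script (`image`: base64-encoded JPEG, `text`: prompt string).

```
# Hypothetical concurrent client for the /infer endpoint above (not from the original issue).
import base64
import concurrent.futures

import requests

URL = "http://127.0.0.1:8198/infer"  # assumed host; port taken from uvicorn.run(...) above
CONCURRENCY = 3                      # the report says failures start at 3 concurrent requests

# Build one request body matching the Req model: base64 image plus a text prompt.
with open("test.jpg", "rb") as f:    # placeholder image path
    payload = {
        "image": base64.b64encode(f.read()).decode("utf-8"),
        "text": "person,car",        # example prompt; must not contain '<' or '>'
    }

def post_once(i):
    # Each call makes the server save a temp image and run one NPU inference pass.
    resp = requests.post(URL, json=payload, timeout=600)
    return i, resp.status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    for i, status in pool.map(post_once, range(CONCURRENCY)):
        print(f"request {i}: HTTP {status}")
```

Under these assumptions, setting CONCURRENCY to 1 or 2 matches the normal responses in the log above, while 3 reproduces the vector core exception and the process abort.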