
Ascend/pytorch

torchvision and PyTorch versions do not match

DONE
Requirement
Created on 2023-03-16 15:01

"Couldn't load custom C++ ops. This can happen if your PyTorch and "
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
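
The check that the error message asks for can be sketched in a few lines of Python. This is only an illustration: the helper names `major_minor` and `versions_match` are made up for this sketch, and the table lists a handful of upstream PyTorch/torchvision release pairs, not the requirements in the Ascend README, which may be stricter.

```python
# Sketch: compare the installed torch and torchvision release lines.
# KNOWN_PAIRS holds a few upstream release pairings (torch -> torchvision).
# It is an assumption that these are the only pairs of interest; the
# Ascend/pytorch README remains the authority for NPU builds.
KNOWN_PAIRS = {
    "1.10": "0.11",
    "1.11": "0.12",
    "1.12": "0.13",
    "1.13": "0.14",
    "2.0": "0.15",
}


def major_minor(version):
    """Reduce a version such as '1.11.0+cpu' to its 'major.minor' part."""
    core = version.split("+")[0]  # drop a local build tag like '+cpu'
    return ".".join(core.split(".")[:2])


def versions_match(torch_version, torchvision_version):
    """Return True/False for a known torch release line, None if unknown."""
    expected = KNOWN_PAIRS.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected


if __name__ == "__main__":
    try:
        import torch
        import torchvision
    except Exception as exc:  # either package missing or failing to import
        print("could not import torch/torchvision:", exc)
    else:
        print("torch:", torch.__version__)
        print("torchvision:", torchvision.__version__)
        print("match:", versions_match(torch.__version__, torchvision.__version__))
```

A result of `False` points at a plain version mismatch. A result of `True` alongside the same error would suggest looking at how torchvision was built instead.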

Comments (9)

wangh created the requirement 2 years ago

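Please add labels. You can also visit https://gitee.com/ascend/community/blob/master/labels.md to find more.
To get your code reviewed as quickly as possible, please add labels to this issue; a labeled issue can be routed directly to the responsible person for review.
More labels are listed at https://gitee.com/ascend/community/blob/master/labels.md
Taking a model-training submission as an example: if you are submitting model training code, you can comment like this:
//train/model
You can also mark the type of this issue, for example as a bugfix or a feature request:
//kind/bug or //kind/feature
Congratulations, you have learned how to add labels with commands. Now go ahead and add a label in the comments below!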

Hello, could you share a screenshot of the exact steps you ran? If the torchvision and PyTorch versions do not match, please follow the README exactly.

(MindSpore) [ma-user TransMVSNet-master]$bash scripts/train.sh
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

FutureWarning,
[W OperatorEntry.cpp:133] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::is_pinned(Tensor self, Device? device=None) -> (bool)
registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: BackendSelect
previous kernel: registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:606
new kernel: registered at /usr1/workspace/FPTA_Daily_Plugin_open_date/CODE/torch_npu/csrc/aten/PinMemory.cpp:43 (function registerKernel)
[W OperatorEntry.cpp:133] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::_pin_memory(Tensor self, Device? device=None) -> (Tensor)
registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: BackendSelect
previous kernel: registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:606
new kernel: registered at /usr1/workspace/FPTA_Daily_Plugin_open_date/CODE/torch_npu/csrc/aten/PinMemory.cpp:43 (function registerKernel)
[W OperatorEntry.cpp:133] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::empty.memory_format(int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None, int? memory_format=None) -> (Tensor)
registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: CPU
previous kernel: registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:606
new kernel: registered at /usr1/workspace/FPTA_Daily_Plugin_open_date/CODE/torch_npu/csrc/aten/common/EmptyTensor.cpp:115 (function registerKernel)
[W OperatorEntry.cpp:133] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::empty_strided(int[] size, int[] stride, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor)
registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: CPU
previous kernel: registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:606
new kernel: registered at /usr1/workspace/FPTA_Daily_Plugin_open_date/CODE/torch_npu/csrc/aten/common/EmptyTensor.cpp:115 (function registerKernel)
[W OperatorEntry.cpp:133] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::_has_compatible_shallow_copy_type(Tensor self, Tensor from) -> (bool)
registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: (catch all)
previous kernel: registered at /usr1/v1.11.0/pytorch/build/aten/src/ATen/RegisterCompositeImplicitAutograd.cpp:10303
new kernel: registered at /usr1/workspace/FPTA_Daily_Plugin_open_date/CODE/torch_npu/csrc/aten/ops/HasCompatibleShallowCopyType.cpp:37 (function registerKernel)
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/resource_handle_pb2.py:23: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_pb=_b('\n(tensorboardX/proto/resource_handle.proto\x12\x0ctensorboardX"r\n\x13ResourceHandleProto\x12\x0e\n\x06\x64\x65vice\x18\x01 \x01(\t\x12\x11\n\tcontainer\x18\x02 \x01(\t\x12\x0c\n\x04name\x18\x03 \x01(\t\x12\x11\n\thash_code\x18\x04 \x01(\x04\x12\x17\n\x0fmaybe_type_name\x18\x05 \x01(\tB/\n\x18org.tensorflow.frameworkB\x0eResourceHandleP\x01\xf8\x01\x01\x62\x06proto3')
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/resource_handle_pb2.py:42: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_options=None, file=DESCRIPTOR),
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/resource_handle_pb2.py:84: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_end=172,
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_shape_pb2.py:23: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_pb=_b('\n%tensorboardX/proto/tensor_shape.proto\x12\x0ctensorboardX"|\n\x10TensorShapeProto\x12/\n\x03\x64im\x18\x02 \x03(\x0b\x32".tensorboardX.TensorShapeProto.Dim\x12\x14\n\x0cunknown_rank\x18\x03 \x01(\x08\x1a!\n\x03\x44im\x12\x0c\n\x04size\x18\x01 \x01(\x03\x12\x0c\n\x04name\x18\x02 \x01(\tB2\n\x18org.tensorflow.frameworkB\x11TensorShapeProtosP\x01\xf8\x01\x01\x62\x06proto3')
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_shape_pb2.py:42: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_options=None, file=DESCRIPTOR),
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_shape_pb2.py:63: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_end=179,
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/types_pb2.py:24: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_pb=_b('\n\x1etensorboardX/proto/types.proto\x12\x0ctensorboardX*\xc2\x05\n\x08\x44\x61taType\x12\x0e\n\nDT_INVALID\x10\x00\x12\x0c\n\x08\x44T_FLOAT\x10\x01\x12\r\n\tDT_DOUBLE\x10\x02\x12\x0c\n\x08\x44T_INT32\x10\x03\x12\x0c\n\x08\x44T_UINT8\x10\x04\x12\x0c\n\x08\x44T_INT16\x10\x05\x12\x0b\n\x07\x44T_INT8\x10\x06\x12\r\n\tDT_STRING\x10\x07\x12\x10\n\x0c\x44T_COMPLEX64\x10\x08\x12\x0c\n\x08\x44T_INT64\x10\t\x12\x0b\n\x07\x44T_BOOL\x10\n\x12\x0c\n\x08\x44T_QINT8\x10\x0b\x12\r\n\tDT_QUINT8\x10\x0c\x12\r\n\tDT_QINT32\x10\r\x12\x0f\n\x0b\x44T_BFLOAT16\x10\x0e\x12\r\n\tDT_QINT16\x10\x0f\x12\x0e\n\nDT_QUINT16\x10\x10\x12\r\n\tDT_UINT16\x10\x11\x12\x11\n\rDT_COMPLEX128\x10\x12\x12\x0b\n\x07\x44T_HALF\x10\x13\x12\x0f\n\x0b\x44T_RESOURCE\x10\x14\x12\x10\n\x0c\x44T_FLOAT_REF\x10\x65\x12\x11\n\rDT_DOUBLE_REF\x10\x66\x12\x10\n\x0c\x44T_INT32_REF\x10g\x12\x10\n\x0c\x44T_UINT8_REF\x10h\x12\x10\n\x0c\x44T_INT16_REF\x10i\x12\x0f\n\x0b\x44T_INT8_REF\x10j\x12\x11\n\rDT_STRING_REF\x10k\x12\x14\n\x10\x44T_COMPLEX64_REF\x10l\x12\x10\n\x0c\x44T_INT64_REF\x10m\x12\x0f\n\x0b\x44T_BOOL_REF\x10n\x12\x10\n\x0c\x44T_QINT8_REF\x10o\x12\x11\n\rDT_QUINT8_REF\x10p\x12\x11\n\rDT_QINT32_REF\x10q\x12\x13\n\x0f\x44T_BFLOAT16_REF\x10r\x12\x11\n\rDT_QINT16_REF\x10s\x12\x12\n\x0e\x44T_QUINT16_REF\x10t\x12\x11\n\rDT_UINT16_REF\x10u\x12\x15\n\x11\x44T_COMPLEX128_REF\x10v\x12\x0f\n\x0b\x44T_HALF_REF\x10w\x12\x13\n\x0f\x44T_RESOURCE_REF\x10xB,\n\x18org.tensorflow.frameworkB\x0bTypesProtosP\x01\xf8\x01\x01\x62\x06proto3')
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/types_pb2.py:36: DeprecationWarning: Call to deprecated create function EnumValueDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
type=None),
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/types_pb2.py:201: DeprecationWarning: Call to deprecated create function EnumDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_end=755,
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_pb2.py:28: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
dependencies=[tensorboardX_dot_proto_dot_resource__handle__pb2.DESCRIPTOR,tensorboardX_dot_proto_dot_tensor__shape__pb2.DESCRIPTOR,tensorboardX_dot_proto_dot_types__pb2.DESCRIPTOR,])
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_pb2.py:46: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_options=None, file=DESCRIPTOR),
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/tensor_pb2.py:151: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_end=588,
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/summary_pb2.py:26: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
dependencies=[tensorboardX_dot_proto_dot_tensor__pb2.DESCRIPTOR,])
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/summary_pb2.py:44: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_options=None, file=DESCRIPTOR),
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/tensorboardX/proto/summary_pb2.py:58: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
serialized_end=122,
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
current time 20230316_124845
creating new summary file
argv: ['--local_rank=0', '--logdir=./outputs/dtu_training', '--dataset=dtu_yao', '--batch_size=1', '--epochs=16', '--trainpath=/data/DTU/dtu/', '--trainlist=lists/dtu/train.txt', '--testlist=lists/dtu/val.txt', '--numdepth=192', '--ndepths=48,32,8', '--nviews=5', '--wd=0.0001', '--depth_inter_r=4.0,1.0,0.5', '--lrepochs=6,8,12:2', '--dlossw=1.0,1.0,1.0']
################################ args ################################
mode train <class 'str'>
model mvsnet <class 'str'>
device npu <class 'str'>
dataset dtu_yao <class 'str'>
trainpath /data/DTU/dtu/ <class 'str'>
testpath /data/DTU/dtu/ <class 'str'>
trainlist lists/dtu/train.txt <class 'str'>
testlist lists/dtu/val.txt <class 'str'>
epochs 16 <class 'int'>
lr 0.001 <class 'float'>
lrepochs 6,8,12:2 <class 'str'>
wd 0.0001 <class 'float'>
nviews 5 <class 'int'>
batch_size 1 <class 'int'>
numdepth 192 <class 'int'>
interval_scale 1.06 <class 'float'>
loadckpt None <class 'NoneType'>
logdir ./outputs/dtu_training <class 'str'>
resume False <class 'bool'>
summary_freq 10 <class 'int'>
save_freq 1 <class 'int'>
eval_freq 1 <class 'int'>
seed 1 <class 'int'>
pin_m False <class 'bool'>
local_rank 0 <class 'int'>
share_cr False <class 'bool'>
ndepths 48,32,8 <class 'str'>
depth_inter_r 4.0,1.0,0.5 <class 'str'>
dlossw 1.0,1.0,1.0 <class 'str'>
cr_base_chs 8,8,8 <class 'str'>
grad_method detach <class 'str'>
using_apex False <class 'bool'>
sync_bn False <class 'bool'>
opt_level O0 <class 'str'>
keep_batchnorm_fp32 None <class 'NoneType'>
loss_scale None <class 'NoneType'>
########################################################################
netphs:[48, 32, 8], depth_intervals_ratio:[4.0, 1.0, 0.5], grad:detach, chs:[8, 8, 8]
*

start at epoch 0
Number of model parameters: 1148924
mvsdataset kwargs {}
dataset train metas: 27097
mvsdataset kwargs {}
dataset test metas: 6174
Traceback (most recent call last):
File "train.py", line 402, in <module>
train(model, model_loss, optimizer, TrainImgLoader, TestImgLoader, start_epoch, args)
File "train.py", line 78, in train
loss, scalar_outputs, image_outputs = train_sample(model, model_loss, optimizer, sample, args)
File "train.py", line 159, in train_sample
outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ma-user/work/TransMVSNet-master/models/TransMVSNet.py", line 161, in forward
features.append(self.feature(img))
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ma-user/work/TransMVSNet-master/models/module.py", line 411, in forward
out = self.out1(intra_feat)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ma-user/work/TransMVSNet-master/models/dcn.py", line 79, in forward
mask=mask
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torchvision/ops/deform_conv.py", line 65, in deform_conv2d
_assert_has_ops()
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torchvision/extension.py", line 34, in _assert_has_ops
"Couldn't load custom C++ ops. This can happen if your PyTorch and "
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
THPModule_npu_shutdown success.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 91665) of binary: /home/ma-user/anaconda3/envs/MindSpore/bin/python
Traceback (most recent call last):
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/run.py", line 718, in run
)(*cmd_args)
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2023-03-16_12:49:03
host : notebook-23100b59-742d-4c21-91df-e9e7e16dcdbc.notebook-23100b59-742d-4c21-91df-e9e7e16dcdbc-distributed.default.svc.cluster.local
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 91665)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
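
The traceback above fails inside `_assert_has_ops()` when `deform_conv2d` is called, which means torchvision's compiled extension was not loaded. A quick way to reproduce that outside the training script might look like the sketch below. It assumes that `torchvision.ops.nms` depends on the same compiled extension as `deform_conv2d`, which holds for the torchvision releases I am aware of. The helper name `custom_ops_available` is made up for this example, and the probe runs on CPU tensors only, so it says nothing about NPU kernels.

```python
# Sketch: probe whether torchvision's compiled C++ ops can be called at all.
# Assumption: nms goes through the same compiled extension that
# deform_conv2d needs, so a failure here mirrors the error in the log.
def custom_ops_available():
    """Return True/False for the probe, or None if the packages do not import."""
    try:
        import torch
        from torchvision.ops import nms
    except Exception:
        # torch or torchvision is missing, or fails during import.
        return None

    boxes = torch.tensor([[0.0, 0.0, 1.0, 1.0]])  # one box in (x1, y1, x2, y2) form
    scores = torch.tensor([1.0])
    try:
        nms(boxes, scores, 0.5)  # tiny call that needs the compiled op
    except Exception:
        # Any failure is treated as "compiled ops unavailable" in this sketch.
        return False
    return True


if __name__ == "__main__":
    print("torchvision custom ops available:", custom_ops_available())
```

If this prints `False` in the same environment, the problem is in the torchvision install itself and not in the training script.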

Did you install the matching package versions as described in the README?

OK, I'll take a closer look.

I suggest building and installing from source.

OK, thanks a lot.

Has the problem been solved? If so, please close the issue.

Destiny changed the task status from TODO to DONE 2 years ago
