Error when running ChatGLM3-6B inference with torch 2.1 + torch_npu
Hardware: 910A
Driver: 23.0.0
CANN version: 7.0.0.beta1
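The original /root/test/glm3_6b.py is not included in the post; the failing call is roughly the following minimal reproduction (a sketch assembled from the traceback below; MODEL_PATH is a placeholder for the local ChatGLM3-6B weights):

```python
import torch
import torch_npu  # importing torch_npu registers the Ascend NPU backend with PyTorch
from transformers import AutoTokenizer, AutoModel

# Assumed local path or hub id of the ChatGLM3-6B weights; adjust for your setup.
MODEL_PATH = "THUDM/chatglm3-6b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).half().npu()
model = model.eval()

# The traceback below is raised inside this call (line 14 of the original script).
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```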
Error message:
Traceback (most recent call last):
  File "/root/test/glm3_6b.py", line 14, in <module>
    response, history = model.chat(tokenizer, "你好", history=[])
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1035, in chat
    outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/transformers/generation/utils.py", line 2651, in _sample
    outputs = self(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 937, in forward
    transformer_outputs = self.transformer(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 830, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 640, in forward
    layer_ret = layer(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 544, in forward
    attention_output, kv_cache = self.self_attention(
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/torch_glm3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 408, in forward
    query_layer = apply_rotary_pos_emb(query_layer, rotary_pos_emb)
NotImplementedError: Unknown device for graph fuser
Is torch.jit being used here?
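Yes. In the stock modeling_chatglm.py, apply_rotary_pos_emb (the innermost frame of the traceback) is decorated with @torch.jit.script, so it runs through the TorchScript executor, and the "Unknown device for graph fuser" error appears to come from the JIT graph fuser not recognizing the NPU device on this torch/CANN combination. One way to confirm that the function is indeed a scripted function is sketched below; the sys.modules lookup key is an assumption based on the cache path shown in the traceback:

```python
import sys
import torch

# After loading the model with trust_remote_code=True, the remote modeling code is
# registered in sys.modules under a "transformers_modules...modeling_chatglm" name.
chatglm_mod = next(m for name, m in sys.modules.items()
                   if name.endswith("modeling_chatglm"))

# True means apply_rotary_pos_emb was compiled by @torch.jit.script and therefore
# goes through the TorchScript executor (and its graph fuser) on every call.
print(isinstance(chatglm_mod.apply_rotary_pos_emb, torch.jit.ScriptFunction))
```

If the decorator is the trigger, removing the @torch.jit.script line from the cached modeling_chatglm.py (so the helper runs eagerly) is a possible workaround, though it has not been verified here.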
After investigation, this turned out to be a CANN version issue: downgrading CANN to 7.0.RC1 allows inference to run normally.
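After downgrading CANN to 7.0.RC1 (and reinstalling the matching torch_npu build if necessary), a quick sanity check before rerunning inference might look like this (a sketch; the printed versions depend on your environment):

```python
import torch
import torch_npu  # must import cleanly against the installed CANN runtime

print(torch.__version__)          # expected: 2.1.x
print(torch_npu.__version__)      # torch_npu build matching torch and CANN
print(torch.npu.is_available())   # True if the 910A driver and CANN stack are usable
print(torch.npu.device_count())   # number of visible NPUs
```

Running npu-smi info on the host can also confirm the driver version (23.0.0 here).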