
Ascend/pytorch

OLMo-7B model inference fails with an error

Analysing
Bug-Report
Created 2024-02-23 13:48

1. Symptom (with error log context):
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/transformers/generation/logits_process.py:453: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
sorted_indices_to_remove[..., -self.min_tokens_to_keep :] = 0
E39999: Inner Error!
E39999 An exception occurred during AICPU execution, stream_id:261, task_id:2528, errcode:21008, msg:inner error[FUNC:ProcessAicpuErrorInfo][FILE:device_error_proc.cc][LINE:726]
TraceBack (most recent call last):
Kernel task happen error, retCode=0x2a, [aicpu exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1612]
Aicpu kernel execute failed, device_id=0, stream_id=261, task_id=2528, errorCode=2a.[FUNC:PrintAicpuErrorInfo][FILE:task_info.cc][LINE:1457]
Aicpu kernel execute failed, device_id=0, stream_id=261, task_id=2528, fault op_name=[FUNC:GetError][FILE:stream.cc][LINE:1483]
rtStreamSynchronizeWithTimeout execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50]
synchronize stream failed, runtime result = 507018[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]

DEVICE[0] PID[2163110]:
EXCEPTION TASK:
Exception info:TGID=3586480, model id=65535, stream id=261, stream phase=SCHEDULE, task id=2528, task type=aicpu kernel, recently received task id=2528, recently send task id=2527, task phase=RUN
Message info[0]:aicpu=0,slot_id=0,report_mailbox_flag=0x5a5a5a5a,state=0x5210
Other info[0]:time=2024-02-23-11:43:58.049.374, function=proc_aicpu_task_done, line=970, error code=0x2a
Traceback (most recent call last):
File "/home/ma-user/work/olm_infer.py", line 20, in
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
File "/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/transformers/generation/utils.py", line 1765, in generate
return self.sample(
File "/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/transformers/generation/utils.py", line 2921, in sample
if unfinished_sequences.max() == 0:
RuntimeError: ACL stream synchronize failed.
[W NPUStream.cpp:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50]
EH9999 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[The same "NPU warning, error code is 507018 / EH9999: Inner Error!" block repeats two more times in the log.]

2. Software versions:
-- CANN version (e.g., CANN 3.0.x, 5.x.x): CANN-7.0.RC1
-- TensorFlow/PyTorch/MindSpore version: PyTorch 2.1.0
-- Python version (e.g., Python 3.7.5): Python 3.9.8
-- OS version (e.g., Ubuntu 18.04): EulerOS 2.0 (SP8)

3. Steps to reproduce:

import hf_olmo
import torch_npu
import torch
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use pre-compiled binary operators (disable JIT op compilation)
torch_npu.npu.set_compile_mode(jit_compile=False)

device = torch.device("npu:0")
olmo = AutoModelForCausalLM.from_pretrained("/home/ma-user/work/OLMo-7B", trust_remote_code=True)
print(type(olmo))
olmo = olmo.to(device).eval()
tokenizer = AutoTokenizer.from_pretrained("/home/ma-user/work/OLMo-7B", trust_remote_code=True)
message = ["python是"]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
print(inputs)
# Move inputs to the NPU (the upstream example uses 'cuda' here)
inputs = {k: v.to(device) for k, v in inputs.items()}
# olmo = olmo.to('cuda')
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])

Comments (6)

Bell_Meng created this Bug-Report 1 year ago

Removing do_sample=True makes inference work, but then sampling parameters such as top_k and top_p no longer take effect.
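The crash surfaces inside the top-k/top-p filtering that transformers applies only when do_sample=True (the UserWarning in the log points at logits_process.py, on the line that masks tokens outside the nucleus), which is consistent with greedy decoding working. As a rough illustration of what that sampling warper computes, here is a simplified pure-Python sketch of top-k followed by top-p (nucleus) filtering; it is not the actual transformers implementation and ignores details such as min_tokens_to_keep:

```python
import math

def top_k_top_p_filter(logits, top_k=50, top_p=0.95):
    """Return the token ids kept after top-k then top-p filtering.

    Simplified sketch of what transformers' TopKLogitsWarper and
    TopPLogitsWarper do to the logits before sampling.
    """
    # top-k: keep only the k highest-scoring tokens
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = order[:top_k]
    # softmax over the surviving logits (shifted by the max for stability)
    m = max(logits[i] for i in kept)
    exps = [math.exp(logits[i] - m) for i in kept]
    z = sum(exps)
    probs = [e / z for e in exps]
    # top-p: keep the smallest prefix whose cumulative probability
    # reaches the nucleus threshold
    result, cum = [], 0.0
    for tok, p in zip(kept, probs):
        result.append(tok)
        cum += p
        if cum >= top_p:
            break
    return result
```

If the AICPU kernel backing this filtering (e.g. the sort or masked assignment) is the one that faults, a possible unverified workaround is to drop do_sample=True for greedy decoding, or to compute the next-token distribution on the CPU and sample there.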

Which chip variant is this? Do you have the AICPU logs?

Ascend 910. Which logs do you mean by AICPU logs?

Check the logs under /root/ascend/log/debug/device-x.

huangyunlong changed the task status from TODO to Analysing 1 year ago

Has this been resolved? I'm running into the same problem.
