Error when quantizing the DeepSeek-R1-Distill-Qwen-7B model on a 310I Duo card

I followed the instructions at https://gitee.com/ascend/ModelZoo-PyTorch/blob/master/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Qwen-7B/README.md to run sparse quantization and hit the following error:
[root@localhost Qwen]# export ASCEND_RT_VISIBLE_DEVICES=0
[root@localhost Qwen]# export PYTORCH_NPU_ALLOC_CONF=expandable_segments:False
[root@localhost Qwen]# python3 quant_qwen.py \
> --model_path /home/HwHiAiUser/Projects/weights/DeepSeekR1-7B/ \
> --save_directory /home/HwHiAiUser/Projects/weights/w8a8/ds7b/ \
> --calib_file ../common/boolq.jsonl \
> --w_bit 4 \
> --a_bit 8 \
> --fraction 0.011 \
> --co_sparse True \
> --device_type npu \
> --use_sigma True \
> --is_lowbit True
/usr/local/lib64/python3.9/site-packages/torch_npu/contrib/transfer_to_npu.py:295: ImportWarning:
    *************************************************************************************************************
    The torch.Tensor.cuda and torch.nn.Module.cuda are replaced with torch.Tensor.npu and torch.nn.Module.npu now..
    The torch.cuda.DoubleTensor is replaced with torch.npu.FloatTensor cause the double type is not supported now..
    The backend in torch.distributed.init_process_group set to hccl now..
    The torch.cuda.* and torch.cuda.amp.* are replaced with torch.npu.* and torch.npu.amp.* now..
    The device parameters have been replaced with npu in the function below:
    torch.logspace, torch.randint, torch.hann_window, torch.rand, torch.full_like, torch.ones_like, torch.rand_like, torch.randperm, torch.arange, torch.frombuffer, torch.normal, torch._empty_per_channel_affine_quantized, torch.empty_strided, torch.empty_like, torch.scalar_tensor, torch.tril_indices, torch.bartlett_window, torch.ones, torch.sparse_coo_tensor, torch.randn, torch.kaiser_window, torch.tensor, torch.triu_indices, torch.as_tensor, torch.zeros, torch.randint_like, torch.full, torch.eye, torch._sparse_csr_tensor_unsafe, torch.empty, torch._sparse_coo_tensor_unsafe, torch.blackman_window, torch.zeros_like, torch.range, torch.sparse_csr_tensor, torch.randn_like, torch.from_file, torch._cudnn_init_dropout_state, torch._empty_affine_quantized, torch.linspace, torch.hamming_window, torch.empty_quantized, torch._pin_memory, torch.autocast, torch.load, torch.Generator, torch.set_default_device, torch.Tensor.new_empty, torch.Tensor.new_empty_strided, torch.Tensor.new_full, torch.Tensor.new_ones, torch.Tensor.new_tensor, torch.Tensor.new_zeros, torch.Tensor.to, torch.Tensor.pin_memory, torch.nn.Module.to, torch.nn.Module.to_empty
    *************************************************************************************************************

warnings.warn(msg, ImportWarning)
/usr/local/lib64/python3.9/site-packages/torch_npu/contrib/transfer_to_npu.py:250: RuntimeWarning: torch.jit.script and torch.jit.script_method will be disabled by transfer_to_npu, which currently does not support them, if you need to enable them, please do not use transfer_to_npu.
  warnings.warn(msg, RuntimeWarning)
2025-05-28 15:47:55,870 - msmodelslim-logger - WARNING - The current CANN version does not support importing the migration and migration_vit packages.
2025-05-28 15:47:55,877 - msmodelslim-logger - WARNING - The current CANN version does not support recall_window method.
2025-05-28 15:47:55,880 - msmodelslim-logger - WARNING - The current CANN version does not support LayerSelector quantile method.
2025-05-28 15:47:55,882 - msmodelslim-logger - WARNING - The file path '/home/HwHiAiUser/Projects/weights' may be insecure because it can be written by others.
2025-05-28 15:47:55,882 - msmodelslim-logger - INFO - write directory exists, write file to directory '/home/HwHiAiUser/Projects/weights/w8a8/ds7b/'
2025-05-28 15:47:55,882 - msmodelslim-logger - WARNING - The file path '/home/HwHiAiUser/Projects/weights' may be insecure because it can be written by others.
2025-05-28 15:47:55,884 - msmodelslim-logger - WARNING - The file path '/home/HwHiAiUser/Projects/weights' may be insecure because it can be written by others.
2025-05-28 15:47:55,885 - msmodelslim-logger - WARNING - The file path '/home/HwHiAiUser/Projects/weights' may be insecure because it can be written by others.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████| 2/2 [00:04<00:00,  2.21s/it]
2025-05-28 15:48:06,831 - msmodelslim-logger - WARNING - The file path '/home/HwHiAiUser/Projects/weights' may be insecure because it can be written by others.
2025-05-28 15:48:07,023 - msmodelslim-logger - INFO - Automatically disabling the last linear layer: lm_head based on the `disable_last_linear` parameter setting.
feature process:   0%|                                                                       | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 450, in rollback_names_process
    self.act_states = get_features(model, calib_data[:5], "features.npy", enable_tensor_dump)
  File "quant_funcs.py", line 1417, in quant_funcs.get_features
  File "quant_funcs.py", line 1418, in quant_funcs.get_features
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1164, in forward
    outputs = self.model(
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 854, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/usr/local/lib64/python3.9/site-packages/torch/nn/functional.py", line 2233, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: call aclnnEmbedding failed, detail:EZ9999: Inner Error!
EZ9999: [PID: 1020913] 2025-05-28-15:48:07.078.206 Parse dynamic kernel config fail.
        TraceBack (most recent call last):
       AclOpKernelInit failed opType
       GatherV2AiCore ADD_TO_LAUNCHER_LIST_AICORE failed.

[ERROR] 2025-05-28-15:48:07 (PID:1020913, Device:0, RankID:-1) ERR01100 OPS call acl api failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/HwHiAiUser/Projects/msit-master/msmodelslim/example/Qwen/quant_qwen.py", line 322, in <module>
    quantifier.convert(tokenized_calib_data, save_directory, args.disable_level, part_file_size=args.part_file_size, \
  File "/home/HwHiAiUser/Projects/msit-master/msmodelslim/example/Qwen/quant_qwen.py", line 244, in convert
    calibrator = Calibrator(self.model, self.quant_config, calib_data=tokenized_data, disable_level=disable_level)
  File "/usr/local/lib64/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 160, in __init__
    self.rollback_names_process(model)
  File "/usr/local/lib/python3.9/site-packages/msmodelslim/pytorch/llm_ptq/llm_ptq_tools/quant_tools.py", line 452, in rollback_names_process
    raise Exception("Please check the model and calibration data, "
Exception: ('Please check the model and calibration data, ensure that your model can run with `model(*(calib_data[0]))`.', RuntimeError('call aclnnEmbedding failed, detail:EZ9999: Inner Error!\nEZ9999: [PID: 1020913] 2025-05-28-15:48:07.078.206 Parse dynamic kernel config fail.\n        TraceBack (most recent call last):\n       AclOpKernelInit failed opType\n       GatherV2AiCore ADD_TO_LAUNCHER_LIST_AICORE failed.\n\n[ERROR] 2025-05-28-15:48:07 (PID:1020913, Device:0, RankID:-1) ERR01100 OPS call acl api failed'))
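
The traceback above is long, but the root cause is in a few lines: the `aclnnEmbedding` call fails while the `GatherV2AiCore` kernel is being initialized. As a rough triage aid, the sketch below pulls those fields out of a saved log. The function name `summarize_ascend_error` and the regular expressions are assumptions based only on the messages in this log, not on any documented CANN log format.

```python
import re

# Excerpt copied from the traceback above.
SAMPLE_LOG = """RuntimeError: call aclnnEmbedding failed, detail:EZ9999: Inner Error!
EZ9999: [PID: 1020913] 2025-05-28-15:48:07.078.206 Parse dynamic kernel config fail.
        TraceBack (most recent call last):
       AclOpKernelInit failed opType
       GatherV2AiCore ADD_TO_LAUNCHER_LIST_AICORE failed.

[ERROR] 2025-05-28-15:48:07 (PID:1020913, Device:0, RankID:-1) ERR01100 OPS call acl api failed
"""


def _unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def summarize_ascend_error(log_text: str) -> dict:
    """Hypothetical helper: extract the failing aclnn call, error codes,
    and kernel names from log text shaped like the excerpt above."""
    # Matches e.g. "call aclnnEmbedding failed"
    failed_calls = re.findall(r"call (aclnn\w+) failed", log_text)
    # Matches codes shaped like EZ9999 or ERR01100 (pattern is an assumption)
    codes = re.findall(r"\b(E[A-Z]\d{4}|ERR\d{5})\b", log_text)
    # Matches e.g. "GatherV2AiCore ADD_TO_LAUNCHER_LIST_AICORE failed"
    kernels = re.findall(r"(\w+) ADD_TO_LAUNCHER_LIST_AICORE failed", log_text)
    return {
        "failed_calls": _unique(failed_calls),
        "error_codes": _unique(codes),
        "kernels": _unique(kernels),
    }


if __name__ == "__main__":
    for key, values in summarize_ascend_error(SAMPLE_LOG).items():
        print(f"{key}: {', '.join(values)}")
```

Run on the excerpt, this reports `aclnnEmbedding` as the failed call and `GatherV2AiCore` as the kernel that could not be launched, which suggests the problem lies in operator/kernel setup for this CANN and device combination and not in the calibration data itself.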

OS version: openEuler release 22.03 (LTS-SP4)
CANN version: Ascend-cann-toolkit_8.0.0.alpha003_linux-x86_64.run
Driver version: Ascend-hdk-310p-npu-driver_24.1.0.1_linux-x86-64.run
msit was also installed from the latest branch.
