Following the example in the official documentation, I ran into an error:
INFO 04-15 06:19:23 model_runner.py:1094] Starting to load model /data/models/Qwen2.5-Coder-7B-Instruct...
2025-04-15 06:19:26,198 - mindformers[mindformers/core/context/build_context.py:322] - INFO - env: {'HCCL_DETERMINISTIC': 'false', 'ASCEND_LAUNCH_BLOCKING': '0', 'TE_PARALLEL_COMPILER': '0', 'CUSTOM_MATMUL_SHUFFLE': 'on', 'LCCL_DETERMINISTIC': '0', 'MS_ENABLE_GRACEFUL_EXIT': '0'}
[rank0]: Traceback (most recent call last):
[rank0]: File "", line 1, in
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/utils.py", line 986, in inner
[rank0]: return fn(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 230, in init
[rank0]: self.llm_engine = self.engine_class.from_engine_args(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 517, in from_engine_args
[rank0]: engine = cls(
[rank0]: ^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 273, in init
[rank0]: self.model_executor = executor_class(vllm_config=vllm_config, )
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 36, in init
[rank0]: self._init_executor()
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/executor/gpu_executor.py", line 35, in _init_executor
[rank0]: self.driver_worker.load_model()
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/worker/worker.py", line 155, in load_model
[rank0]: self.model_runner.load_model()
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/worker/model_runner.py", line 1096, in load_model
[rank0]: self.model = get_model(vllm_config=self.vllm_config)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/model_executor/model_loader/init.py", line 12, in get_model
[rank0]: return loader.load_model(vllm_config=vllm_config)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 363, in load_model
[rank0]: model = _initialize_model(vllm_config=vllm_config)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 116, in _initialize_model
[rank0]: return model_class(vllm_config=vllm_config, prefix=prefix)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/root/miniconda3/lib/python3.11/site-packages/vllm_mindspore/model_executor/models/mf_models/qwen2.py", line 92, in init
[rank0]: self.mf_config.model.model_config.parallel_config = (
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: AttributeError: 'NoneType' object has no attribute 'model_config'
Some parts of the example documentation are not written clearly enough and some information is missing; we will fix the example docs as soon as possible.
In the meantime, you can try to resolve your problem with the following steps:
Pull and load the hub.oepkgs.net/oedeploy/openeuler/aarch64/mindspore:20250411 image.
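A minimal sketch of this step is below; the NPU device and driver mounts are assumptions for a typical Ascend host (not taken from the official docs), so adapt them to your machine:
# Hedged sketch: pull the image and start an interactive container.
docker pull hub.oepkgs.net/oedeploy/openeuler/aarch64/mindspore:20250411
docker run -it --network=host \
    --device=/dev/davinci0 \
    --device=/dev/davinci_manager \
    --device=/dev/devmm_svm \
    --device=/dev/hisi_hdc \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /home:/home \
    hub.oepkgs.net/oedeploy/openeuler/aarch64/mindspore:20250411 /bin/bash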
Set the environment variables:
export vLLM_MODEL_BACKEND=MindFormers
export MINDFORMERS_MODEL_CONFIG=/home/predict_qwen2_5_7b_instruct.yaml
export vLLM_MODEL_MEMORY_USE_GB=18
export ASCEND_TOTAL_MEMORY_GB=29
export MS_ALLOC_CONF=enable_vmm:true
For the predict_qwen2_5_7b_instruct.yaml file, use:
https://gitee.com/mindspore/mindformers/blob/br_infer_deepseek_os/research/qwen2_5/predict_qwen2_5_7b_instruct.yaml
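For example, it can be fetched into /home so that MINDFORMERS_MODEL_CONFIG points at it; the raw-file URL pattern below is an assumption about Gitee's raw endpoint:
# Hedged sketch: download the yaml to the path referenced by MINDFORMERS_MODEL_CONFIG.
wget -O /home/predict_qwen2_5_7b_instruct.yaml \
    "https://gitee.com/mindspore/mindformers/raw/br_infer_deepseek_os/research/qwen2_5/predict_qwen2_5_7b_instruct.yaml"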
Taking Qwen2.5-Coder-7B as an example, download the Qwen2.5-Coder-7B-Instruct model from the community (https://www.modelscope.cn/models/Qwen/Qwen2.5-Coder-7B-Instruct/summary) and store it locally at /home/Qwen2.5-Coder-7B-Instruct.
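One way to download it is sketched below; the modelscope CLI flags are assumptions (cloning the ModelScope repo with git-lfs also works):
# Hedged sketch: download the weights from ModelScope to the local path used by the script.
pip install modelscope
modelscope download --model Qwen/Qwen2.5-Coder-7B-Instruct --local_dir /home/Qwen2.5-Coder-7B-Instruct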
Change the code to:
import vllm_mindspore # Add this line on the top of script.
from vllm import LLM, SamplingParams
# Sample prompts.
prompts = [
    "I am",
    "Today is",
    "What is"
]
# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.0, top_p=0.95)
# Create an LLM.
llm = LLM(model="/home/Qwen2.5-Coder-7B-Instruct")
# Generate texts from the prompts. The output is a list of RequestOutput objects
# that contain the prompt, generated text, and other information.
outputs = llm.generate(prompts, sampling_params)
# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")