
Ascend/pytorch

Error when launching the model: AssertionError: Please install FlashAttention first, e.g., with pip install flash-attn

DONE
Requirement
Created on 2024-07-17 11:12

Traceback (most recent call last):
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/cli/deploy.py", line 5, in <module>
    deploy_main()
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/utils/run_utils.py", line 27, in x_main
    result = llm_x(args, **kwargs)
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/llm/deploy.py", line 543, in llm_deploy
    model, template = prepare_model_template(args)
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/llm/infer.py", line 182, in prepare_model_template
    model, tokenizer = get_model_tokenizer(
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/llm/utils/model.py", line 5572, in get_model_tokenizer
    model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/llm/utils/model.py", line 4766, in get_model_tokenizer_phi
    return get_model_tokenizer_from_repo(
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/swift/llm/utils/model.py", line 947, in get_model_tokenizer_from_repo
    model = automodel_class.from_pretrained(
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
    module_obj = module_class.from_pretrained(model_dir, *model_args,
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/modelscope/utils/hf_util.py", line 76, in from_pretrained
    return ori_from_pretrained(cls, model_dir, *model_args, **kwargs)
  File "/root/miniconda3/envs/test/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3375, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 743, in __init__
    self.transformer = TelechatModel(config)
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 603, in __init__
    self.h = nn.ModuleList([TelechatBlock(config, _) for _ in range(config.num_hidden_layers)])
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 603, in <listcomp>
    self.h = nn.ModuleList([TelechatBlock(config, _) for _ in range(config.num_hidden_layers)])
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 512, in __init__
    self.self_attention = TelechatAttention(config, layer_idx)
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 357, in __init__
    self.core_attention_flash = FlashSelfAttention(
  File "/root/.cache/huggingface/modules/transformers_modules/TeleChat-12B/modeling_telechat.py", line 167, in __init__
    assert flash_attn_unpadded_func is not None, ('Please install FlashAttention first, '
AssertionError: Please install FlashAttention first, e.g., with pip install flash-attn

Installing flash-attn with pip then fails, reporting that CUDA-related information could not be found.
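For context, the assertion is raised by the model's own bundled modeling code (modeling_telechat.py), not by swift or transformers. Below is a minimal sketch of the optional-import guard pattern implied by the traceback; the exact import path, class signature, and parameter names are assumptions and may differ from the actual TeleChat-12B code.

import torch.nn as nn

# flash-attn is imported optionally; when it is missing, the sentinel stays
# None and the model refuses to construct its attention blocks.
try:
    from flash_attn.flash_attn_interface import flash_attn_unpadded_func
except ImportError:
    flash_attn_unpadded_func = None

class FlashSelfAttention(nn.Module):
    # Hypothetical constructor arguments, for illustration only.
    def __init__(self, causal=False, softmax_scale=None, attention_dropout=0.0):
        super().__init__()
        # This is the check that fails on machines without flash-attn
        # (line 167 in the traceback above).
        assert flash_attn_unpadded_func is not None, (
            'Please install FlashAttention first, '
            'e.g., with pip install flash-attn')
        self.causal = causal
        self.softmax_scale = softmax_scale
        self.dropout_p = attention_dropout

Because the guard sits in the remote modeling code loaded with trust_remote_code, the model cannot be instantiated at all unless flash-attn is importable in that environment.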

Comments (3)

czs1886 created the requirement 11 months ago
czs1886 edited the description 11 months ago

Installing flash-attn requires CUDA.

Is this currently not supported on CANN?
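A quick way to confirm the situation described above: flash-attn builds against the CUDA toolchain, so on an Ascend/CANN host, where the accelerator is exposed through torch_npu rather than torch.cuda, the pip build is expected to fail with the "CUDA not found" message. A small check, assuming torch is installed and torch_npu is present on Ascend machines:

import torch

# flash-attn needs a CUDA toolchain; if this prints False,
# `pip install flash-attn` will not build on this machine.
print("CUDA available:", torch.cuda.is_available())

# On Ascend/CANN setups the accelerator is exposed through torch_npu instead.
try:
    import torch_npu
    print("NPU available:", torch_npu.npu.is_available())
except ImportError:
    print("torch_npu is not installed")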

huangyunlong changed the task status from TODO to DONE 11 months ago

