
Ascend/pytorch


Model save fails with TypeError: torch_save() takes 2 positional arguments but 6 were given

DONE
Training issue
Created 2024-07-15 16:52

1. Problem description (with error log context):
Saving the model fails with TypeError: torch_save() takes 2 positional arguments but 6 were given

2. Software versions:
-- CANN version (e.g., CANN 3.0.x, 5.x.x):
-- TensorFlow/PyTorch/MindSpore version: PyTorch 2.1.0
-- Python version (e.g., Python 3.7.5): Python 3.9.10
-- MindStudio version (e.g., MindStudio 2.0.0 (beta3)):
-- OS version (e.g., Ubuntu 18.04): EulerOS 2.0 (SP10)

3. Steps to reproduce:

# Training code
trainer.train()
trainer.save_state()
torch.cuda.synchronize()
trainer.save_model(output_dir)

4. Log output:

  File "work_path/train.py", line 351, in train
    trainer.save_model(output_dir)
  File "transformer_path/transformers/trainer.py", line 2818, in save_model
    self._save(output_dir)
  File "transformer_path/transformers/trainer.py", line 2899, in _save
    self.model.save_pretrained(
  File "transformer_path/transformers/modeling_utils.py", line 1843, in save_pretrained
    save_function(shard, os.path.join(save_directory, shard_file))
  File "torch_npu_path/torch_npu/utils/serialization.py", line 256, in save
    return _get_npu_save_result(obj, f, pickle_module, pickle_protocol, True, _disable_byteorder_record)
  File "torch_npu_path/torch_npu/utils/serialization.py", line 237, in _get_npu_save_result
    result = torch.serialization.save(obj, f, pickle_module, pickle_protocol, True, _disable_byteorder_record)
TypeError: torch_save() takes 2 positional arguments but 6 were given
[ERROR] 2024-07-15-16:49:25 (PID:1799707, Device:0, RankID:0) ERR99999 UNKNOWN application exception
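
The traceback shows torch_npu calling torch.serialization.save with six positional arguments, while the function that actually receives the call is named torch_save and accepts only two. The sketch below reproduces that mechanism without torch installed. The `serialization` namespace, `original_save`, and `npu_save` are hypothetical stand-ins written for illustration; only the name `torch_save` and the six-argument call shape come from the log.

```python
import types

# Hypothetical stand-in for the torch.serialization module, so the sketch runs without torch.
serialization = types.SimpleNamespace()


def original_save(obj, f, pickle_module=None, pickle_protocol=2,
                  _use_new_zipfile_serialization=True,
                  _disable_byteorder_record=False):
    """Stand-in for the unpatched save function; accepts six positional arguments."""
    return "saved"


serialization.save = original_save


def torch_save(obj, f):
    """Assumed script-level replacement that only accepts two positional arguments."""
    return original_save(obj, f)


# The monkeypatch: every later caller of serialization.save now reaches torch_save.
serialization.save = torch_save


def npu_save(obj, f):
    """Mirrors the call shape in the traceback: six positional arguments."""
    return serialization.save(obj, f, None, 2, True, False)


message = None
try:
    npu_save({"weight": 1}, "model.bin")
except TypeError as exc:
    message = str(exc)

print(message)
```

If the patch really is the cause, one possible remedy is to have the replacement accept and forward `*args, **kwargs`, or to remove the patch before calling trainer.save_model.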

Comments (1)

wenyuan created this training issue 10 months ago

Please check whether torch_save is a custom function defined in your script that patches torch.serialization.save.
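
One way to run the check suggested above is to inspect the function object currently bound to the save entry point. This is a minimal sketch using only the standard library; `describe_callable` is a hypothetical helper, and the `torch_save` defined here is only a placeholder for whatever patch the training script might contain.

```python
import inspect


def describe_callable(fn):
    """Return the name, defining module, and signature of a callable."""
    try:
        signature = str(inspect.signature(fn))
    except (TypeError, ValueError):
        # Some builtins and C extensions expose no introspectable signature.
        signature = "<unavailable>"
    return {
        "name": getattr(fn, "__name__", repr(fn)),
        "module": getattr(fn, "__module__", None),
        "signature": signature,
    }


# Placeholder for a script-level patch (assumption, for illustration only).
def torch_save(obj, f):
    pass


info = describe_callable(torch_save)
print(info)
```

In the real environment, calling `describe_callable(torch.serialization.save)` just before trainer.save_model(output_dir) should show whether the bound function is still defined in torch.serialization or has been replaced by something from the training script or a third-party package. A reported name of torch_save with a two-parameter signature would match the error in the log.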

huangyunlong changed the task status from TODO to DONE 10 months ago
