diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/1.5_requirements.txt b/PyTorch/contrib/cv/classification/SqueezeNet1_1/1.5_requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2cad530ad373ad69ed745ec63caa894456afca41
--- /dev/null
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/1.5_requirements.txt
@@ -0,0 +1,2 @@
+torchvision==0.2.2.post3
+pillow==8.4.0
\ No newline at end of file
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/requirements.txt b/PyTorch/contrib/cv/classification/SqueezeNet1_1/1.8_requirements.txt
similarity index 39%
rename from PyTorch/contrib/cv/classification/SqueezeNet1_1/requirements.txt
rename to PyTorch/contrib/cv/classification/SqueezeNet1_1/1.8_requirements.txt
index 579afd421b85e2d75eeb41c701211659df4cc8ee..5c226da18bf1c5e74170a14f43ab41ef13b6e698 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/requirements.txt
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/1.8_requirements.txt
@@ -1,2 +1,2 @@
-torchvision
+torchvision==0.9.1
 pillow==9.1.0
\ No newline at end of file
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/README.md b/PyTorch/contrib/cv/classification/SqueezeNet1_1/README.md
index b0a223da98be48ed4ed5f5e96810943603329a18..49336ec82508754e3a4f3c30b4aa11d0a35d6880 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/README.md
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/README.md
@@ -1,41 +1,181 @@
-# Squeezenet1_1
+# SqueezeNet1_1 for PyTorch
 
-This implements training of Squeezenet1_1 on the ImageNet dataset, mainly modified from [pytorch/examples](https://github.com/pytorch/examples/tree/master/imagenet).
-## Squeezenet1_1 Detail
+- [Overview](#overview)
+- [Preparing the Training Environment](#preparing-the-training-environment)
+- [Starting Training](#starting-training)
+- [Training Results](#training-results)
+- [Release Notes](#release-notes)
 
-As of the current date, Ascend-Pytorch is still inefficient for contiguous operations.
-Therefore, Squeezenet1_1 is re-implemented using semantics such as custom OP. For details, see models/Squeezenet.py .
+# Overview
 
-## Requirements
+## Introduction
 
-- Install PyTorch ([pytorch.org](http://pytorch.org))
-- `pip install -r requirements.txt`
-  Note: pillow recommends installing a newer version. If the corresponding torchvision version cannot be installed directly, you can use the source code to install the corresponding version. The source code reference link: https://github.com/pytorch/vision,
-Suggestion the pillow is 9.1.0 and the torchvision is 0.6.0
-- Download the ImageNet dataset from http://www.image-net.org/
-    - Then, and move validation images to labeled subfolders, using [the following shell script](https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh)
+SqueezeNet, proposed by Han et al., is a lightweight and efficient CNN model. Its basic building block is a modular convolution called the Fire module. A Fire module consists mainly of two convolution layers: a squeeze layer that uses 1x1 kernels, and an expand layer that mixes 1x1 and 3x3 kernels. It has 50x fewer parameters than AlexNet, yet its model performance is close to AlexNet's.
 
-## Training
+- Reference implementation:
 
-To train a model, run `main.py or main_8p.py` with the desired model architecture and the path to the ImageNet dataset:
+  ```
+  url=https://github.com/pytorch/examples.git
+  commit_id=d1f1a5445dcbbd0d733dc38a32d9ae153337daae
+  ```
 
-```bash
+- Implementation adapted to the Ascend AI processor:
 
-# O2 training 1p
-bash scripts/run_1p.sh
+  ```
+  url=https://gitee.com/ascend/ModelZoo-PyTorch.git
+  code_path=PyTorch/contrib/cv/classification
+  ```
 
-# O2 training 8p
-bash scripts/run_8p.sh
-```
-## Squeezenet1_1 training result
+# Preparing the Training Environment
 
-| Acc@1 | FPS | Npu_nums | Epochs | AMP_Type |
-| :------: | :------: | :------: | :------: | :------: |
-| - | 384 | 1 | 240 | O2 |
-| 58.54 | 1963 | 8 | 240 | O2 |
+## Preparing the Environment
 
+- The PyTorch versions supported by this model and the known third-party library dependencies are listed in the table below.
 
+  **Table 1** Supported versions
 
+  | Torch_Version | Third-party dependency versions |
+  | :--------: | :----------------------------------------------------------: |
+  | PyTorch 1.5 | torchvision==0.2.2.post3;pillow==8.4.0 |
+  | PyTorch 1.8 | torchvision==0.9.1;pillow==9.1.0 |
 
+- Environment preparation guide.
+
+  See [PyTorch Framework Training Environment Preparation](https://www.hiascend.com/document/detail/zh/ModelZoo/pytorchframework/ptes).
+
+- Install the dependencies.
+
+  From the root directory of the model source package, run the command that installs the dependencies for your PyTorch version.
+
+  ```
+  pip install -r 1.5_requirements.txt  # PyTorch 1.5
+
+  pip install -r 1.8_requirements.txt  # PyTorch 1.8
+  ```
+
+  > **Note:**
+  > Run only the one install command that matches your PyTorch version.
+
+
+## Preparing the Dataset
+
+1. Obtain the dataset.
+
+   Obtain the original dataset yourself. Available open-source datasets include ImageNet2012. Upload the dataset to any path on the server and extract it.
+
+   Taking ImageNet2012 as an example, the dataset directory structure looks like this.
+
+   ```
+   ├── ImageNet2012
+         ├──train
+              ├──class_1
+                    │──image_1
+                    │──image_2
+                    │   ...
+              ├──class_2
+                    │──image_1
+                    │──image_2
+                    │   ...
+              ├──...
+         ├──val
+              ├──class_1
+                    │──image_1
+                    │──image_2
+                    │   ...
+              ├──class_2
+                    │──image_1
+                    │──image_2
+                    │   ...
+   ```
+
+   > **Note:**
+   > The training scripts for this dataset serve only as a reference example.
+
+
+# Starting Training
+
+## Training the Model
+
+1. Go to the root directory of the extracted source package.
+
+   ```
+   cd /${model_folder_name}
+   ```
+
+2. Run the training scripts.
+
+   The model supports single-node single-card training and single-node 8-card training.
+
+   - Single-node single-card training
+
+     Start single-card training.
+
+     ```
+     bash ./test/train_full_1p.sh --data_path=/data/xxx/  # single-card accuracy
+
+     bash ./test/train_performance_1p.sh --data_path=/data/xxx/  # single-card performance
+     ```
+
+   - Single-node 8-card training
+
+     Start 8-card training.
+
+     ```
+     bash ./test/train_full_8p.sh --data_path=/data/xxx/  # 8-card accuracy
+
+     bash ./test/train_performance_8p.sh --data_path=/data/xxx/  # 8-card performance
+     ```
+
+   - Single-node 8-card evaluation
+
+     Start 8-card evaluation.
+
+     ```
+     bash ./test/train_eval_8p.sh --data_path=/data/xxx/ --resume=real_pre_train_model_path
+     ```
+
+   Set --data_path to the dataset path; it must point to the dataset's top-level directory.
+
+   Set --resume to the path where the trained weights were generated.
+
+   The training script parameters are described below:
+
+   ```
+   Common parameters:
+   --addr                              //host address
+   --seed                              //random seed for training
+   --workers                           //number of data-loading processes
+   --lr                                //initial learning rate
+   --momentum                          //momentum
+   --weight-decay                      //weight decay
+   --print-freq                        //print interval
+   --device                            //whether to use npu or gpu
+   --dist-backend                      //communication backend
+   --epochs                            //number of training epochs
+   --batch-size                        //training batch size
+   --amp                               //whether to use mixed precision
+   ```
+
+   After training completes, the weight files are saved in the current path, and the model's training accuracy and performance information are printed.
+
+
+# Training Results
+
+**Table 2** Training results
+
+| NAME | Acc@1 | FPS | Epochs | AMP_Type | Torch_version |
+| :-------: | :-----: | :---: | :------: | :-------: | :-------: |
+| 1p-NPU | - | 2099.210 | 1 | O2 | 1.8 |
+| 8p-NPU | 57.634 | 10374.2 | 240 | O2 | 1.8 |
+
+# Release Notes
+
+## Changes
+
+2023.04.18: Updated to PyTorch 1.8 and released.
+
+## FAQ
+
+None.
\ No newline at end of file
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/main.py b/PyTorch/contrib/cv/classification/SqueezeNet1_1/main.py
index 3852a80149bfa320d4bf1d54f10c4af570924c6c..346d2bade7cb542e7963c6c9c36dd1a702864512 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/main.py
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/main.py
@@ -23,6 +23,8 @@ import time
 import warnings
 
 import torch
+if torch.__version__ >= '1.8':
+    import torch_npu
 import torch.nn as nn
 import torch.nn.parallel
 import torch.backends.cudnn as cudnn
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/main_8p.py b/PyTorch/contrib/cv/classification/SqueezeNet1_1/main_8p.py
index 1be42ba0cb79e7a454b9a701002d9d8e0f632f56..ab3c235c74b6eb3309748b20d44d5b4292da7f97 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/main_8p.py
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/main_8p.py
@@ -23,6 +23,8 @@ import time
 import warnings
 
 import torch
+if torch.__version__ >= '1.8':
+    import torch_npu
 import torch.nn as nn
 import torch.nn.parallel
 import torch.backends.cudnn as cudnn
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_eval_8p.sh b/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_eval_8p.sh
index f639f1f48d51b8a7c2c7cb6f12d4cf272654bacc..2f58dac836111e08ae1f02d47d9e088c021c112c 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_eval_8p.sh
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_eval_8p.sh
@@ -29,6 +29,8 @@ do
         batch_size=`echo ${para#*=}`
     elif [[ $para == --learning_rate* ]];then
         learning_rate=`echo ${para#*=}`
+    elif [[ $para == --resume* ]];then
+        resume=`echo ${para#*=}`
     fi
 done
 
diff --git a/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_full_8p.sh b/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_full_8p.sh
index 05c135bdc3efe12e94ff246afa3eafd92c69fddf..99e127eede9d76ba5b8e0bfc55f6e6690d9ded31 100644
--- a/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_full_8p.sh
+++ b/PyTorch/contrib/cv/classification/SqueezeNet1_1/test/train_full_8p.sh
@@ -145,4 +145,5 @@ echo "CaseName = ${CaseName}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${Cas
 echo "ActualFPS = ${ActualFPS}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${CaseName}.log
 echo "TrainingTime = ${TrainingTime}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${CaseName}.log
 echo "ActualLoss = ${ActualLoss}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "TrainAccuracy = ${train_accuracy}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${CaseName}.log
 echo "E2ETrainingTime = ${e2e_time}" >> ${test_path_dir}/output/$ASCEND_DEVICE_ID/${CaseName}.log
\ No newline at end of file
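
The version gate added to `main.py` and `main_8p.py` compares `torch.__version__` with `'1.8'` as strings, so versions are ordered lexicographically; `'1.11.0'`, for example, sorts before `'1.8'`. For the two versions this change targets (1.5 and 1.8) the string comparison gives the intended result. The following is a minimal sketch of a numeric comparison, assuming only the standard library. The helper names `parse_release` and `is_at_least` are hypothetical and are not part of this patch, PyTorch, or `torch_npu`.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.11.0+cpu' -> (1, 11, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_at_least(version: str, minimum: str) -> bool:
    """Compare two version strings numerically, component by component."""
    have, need = parse_release(version), parse_release(minimum)
    # Pad the shorter tuple with zeros so that '1.8' and '1.8.0' compare equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    # Print the plain string comparison next to the numeric one.
    for candidate in ("1.5.0", "1.8.1", "1.11.0"):
        print(candidate, candidate >= "1.8", is_at_least(candidate, "1.8"))
```

In the training scripts, such a helper would be called as `is_at_least(torch.__version__, "1.8")` before importing `torch_npu`; whether the gate should also cover releases after 1.8 is a decision for the model maintainers.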