
Contents

M2Det Description

M2Det (Multi-Level Multi-Scale Detector) is an end-to-end one-stage object detection model. It uses a Multi-Level Feature Pyramid Network (MLFPN) to extract features from the input image and then produces dense bounding boxes and category scores.

Paper: Q. Zhao, T. Sheng, Y. Wang, Z. Tang, Y. Chen, L. Cai, H. Ling. M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network.

Model Architecture

M2Det consists of several modules. The Feature Fusion Module (FFM) rescales and concatenates features from several backbone feature layers (VGG, ResNet, etc.) to produce the base feature for the subsequent modules. Thinned U-shape Modules (TUMs) use an encoder-decoder architecture to produce multi-level multi-scale features, which are then aggregated by the Scale-wise Feature Aggregation Module (SFAM). The resulting multi-level feature pyramid is used by the prediction layers to perform bounding box regression and classification.
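For intuition, the data flow can be sketched at the level of array shapes. This is an illustrative NumPy sketch only; the 8-TUM / 6-scale layout and channel counts follow the paper's M2Det-512 configuration, not necessarily src/model.py:

import numpy as np

def upsample2x(x):
    # Nearest-neighbour 2x upsampling of a (C, H, W) feature map.
    return x.repeat(2, axis=1).repeat(2, axis=2)

# FFMv1: fuse a shallow and a deep backbone feature into the base feature.
shallow = np.zeros((512, 64, 64), np.float32)   # e.g. a conv4_3-style map
deep = np.zeros((1024, 32, 32), np.float32)     # e.g. a conv5_3-style map
base_feature = np.concatenate([shallow, upsample2x(deep)], axis=0)

# Each TUM emits one feature map per scale; SFAM concatenates same-scale
# maps across all TUMs to build the multi-level feature pyramid.
num_tums, num_scales = 8, 6
sizes = [64 >> i for i in range(num_scales)]    # 64, 32, 16, 8, 4, 2
tum_outs = [[np.zeros((128, s, s), np.float32) for s in sizes]
            for _ in range(num_tums)]
pyramid = [np.concatenate([tum_outs[t][i] for t in range(num_tums)], axis=0)
           for i in range(num_scales)]
print([p.shape for p in pyramid])   # (1024, 64, 64) down to (1024, 2, 2)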

Dataset

Note that you can run the scripts with the dataset mentioned in the original paper or with one widely used in the relevant domain/network architecture. The following sections describe how to run the scripts using the dataset below.

Dataset used: MS-COCO

COCO is a large-scale object detection, segmentation, and captioning dataset. The COCO train, validation, and test sets contain more than 200,000 images across 80 object categories. All object instances are annotated with bounding boxes and detailed segmentation masks.

For training the M2Det model, download the following files:

  • 2014 Train images [83K / 13GB]
  • 2014 Val images [41K / 6GB]
  • 2015 Test images [81K / 12GB]
  • 2014 Train/Val annotations [241MB]
  • 2014 Testing Image info [1MB]
  • 2015 Testing Image info [2MB]

Dataset organization

.
└─ coco
   ├─ annotations
   │  ├─ captions_train2014.json
   │  ├─ captions_val2014.json
   │  ├─ image_info_test2014.json
   │  ├─ image_info_test2015.json
   │  ├─ image_info_test-dev2015.json
   │  ├─ instances_minival2014.json
   │  ├─ instances_train2014.json
   │  ├─ instances_val2014.json
   │  └─ instances_valminusminival2014.json
   └─ images
      ├─ test2015
      │  └─ COCO_test2015_*.jpg
      ├─ train2014
      │  └─ COCO_train2014_*.jpg
      └─ val2014
         └─ COCO_val2014_*.jpg

...

You can find instances_minival2014.json and instances_valminusminival2014.json here: http://datasets.d2.mpi-inf.mpg.de/hosang17cvpr/coco_minival2014.tar.gz
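With the files in place, the annotations can be sanity-checked with the standard pycocotools API; the paths below assume the layout shown above:

from pycocotools.coco import COCO

# Load the training annotations and inspect one image's object instances.
coco = COCO('coco/annotations/instances_train2014.json')
img_id = coco.getImgIds()[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
print(coco.loadImgs(img_id)[0]['file_name'], len(anns), 'instances')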

Environment Requirements

Quick Start

After installing MindSpore via the official website, specify the dataset location in the src/config.py file. Run the Soft-NMS build script with the following command:

bash ./scripts/make.sh
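The command compiles the Cython Soft-NMS routine in src/nms/cpu_nms.pyx. For reference, here is a minimal NumPy sketch of Gaussian Soft-NMS; the sigma and threshold values are illustrative defaults, and the compiled Cython code remains the authoritative implementation:

import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thr=0.1):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    discarding them. boxes is (N, 4) as x1, y1, x2, y2."""
    scores = scores.copy()
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep, idxs = [], np.arange(len(scores))
    while idxs.size > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(top)
        idxs = idxs[idxs != top]
        # IoU of the selected box against the remaining candidates.
        x1 = np.maximum(boxes[top, 0], boxes[idxs, 0])
        y1 = np.maximum(boxes[top, 1], boxes[idxs, 1])
        x2 = np.minimum(boxes[top, 2], boxes[idxs, 2])
        y2 = np.minimum(boxes[top, 3], boxes[idxs, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[top] + areas[idxs] - inter)
        scores[idxs] *= np.exp(-(iou ** 2) / sigma)   # Gaussian score decay
        idxs = idxs[scores[idxs] > score_thr]         # drop low-score boxes
    return keep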

Download the pretrained VGG-16 backbone from https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth and convert it to MindSpore format with the following command:

bash ./scripts/convert_vgg.sh /path/to/vgg16_reducedfc.pth

The converted checkpoint will be written to the same directory as the original file, but with a ".ckpt" extension.
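Internally the conversion amounts to reading the PyTorch state dict and re-saving the tensors as MindSpore parameters. A minimal sketch of the idea, assuming a direct name mapping (convert.py handles the actual mapping and may differ):

import torch
from mindspore import Tensor, save_checkpoint

# Read the PyTorch weights and re-save them in MindSpore checkpoint format.
state = torch.load('vgg16_reducedfc.pth', map_location='cpu')
params = [{'name': name, 'data': Tensor(tensor.numpy())}
          for name, tensor in state.items()]
save_checkpoint(params, 'vgg16_reducedfc.ckpt')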

You can start training and evaluation as follows:

  • Training

    For GPU training, set device = 'GPU' in src/config.py.

    # Single GPU training
    bash ./scripts/run_standalone_train.sh [DEVICE_ID] [PRETRAINED_BACKBONE] [DATASET_PATH]
    
    # Multi-GPU training
    bash ./scripts/run_distributed_train_gpu.sh [RANK_SIZE] [DEVICE_START] [PRETRAINED_BACKBONE] [DATASET_PATH]
    

    Example:

    # Single GPU training
    bash ./scripts/run_standalone_train.sh 0 /path/to/vgg16_reducedfc.ckpt /path/to/COCO/
    
    # Multi-GPU training
    bash ./scripts/run_distributed_train_gpu.sh 8 0 /path/to/vgg16_reducedfc.ckpt /path/to/COCO/
    
  • Evaluation:

    bash ./scripts/run_eval.sh [DEVICE_ID] [CHECKPOINT_PATH] [DATASET_PATH]
    

    Example:

    bash ./scripts/run_eval.sh 0 /path/to/checkpoint /path/to/COCO/
    

Script Description

Script and Sample Code

|-- README.md                                      # English README
|-- convert.py                                     # Script for pretrained VGG backbone conversion
|-- eval.py                                        # Evaluation
|-- export.py                                      # MINDIR model export
|-- requirements.txt                               # pip dependencies
|-- scripts
|   |-- convert_vgg.sh                             # Script for pretrained VGG backbone conversion
|   |-- make.sh                                    # Script for building Soft-NMS function
|   |-- run_distributed_train_gpu.sh               # GPU distributed training script
|   |-- run_eval.sh                                # Evaluation script
|   |-- run_export.sh                              # MINDIR model export script
|   `-- run_standalone_train.sh                    # Single-device training script
|-- src
|   |-- nms
|   |   `-- cpu_nms.pyx                        # Soft-NMS algorithm
|   |-- box_utils.py                               # Function for bounding boxes processing
|   |-- build.py                                   # Script for building Soft-NMS function
|   |-- callback.py                                # Custom callback functions
|   |-- coco_utils.py                              # COCO dataset functions
|   |-- config.py                                  # Configuration file
|   |-- dataset.py                                 # Dataset loader
|   |-- detector.py                                # Bounding box detector
|   |-- loss.py                                    # Multibox loss function
|   |-- lr_scheduler.py                            # Learning rate scheduler utilities
|   |-- model.py                                   # M2Det model architecture
|   |-- priors.py                                  # SSD prior boxes definition
|   `-- utils.py                                   # General utilities
`-- train.py                                       # Training
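export.py, driven by run_export.sh, produces the MINDIR model. Below is a minimal illustration of the underlying mindspore.export call with a trivial stand-in network; the real M2Det Cell is constructed in src/model.py:

import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, export

class TinyNet(nn.Cell):
    """Stand-in for the real M2Det Cell, for illustration only."""
    def construct(self, x):
        return x

# Trace the network with a dummy 512x512 input and write a MINDIR file.
dummy_input = Tensor(np.zeros((1, 3, 512, 512), np.float32))
export(TinyNet(), dummy_input, file_name='m2det', file_format='MINDIR')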

Parameter Configuration

Parameters for both training and evaluation can be set in src/config.py.

random_seed = 1
experiment_tag = 'm2det512_vgg16_lr_7.5e-4'

train_cfg = dict(
    lr = 7.5e-4,
    warmup = 5,
    per_batch_size = 7,
    gamma = [0.5, 0.2, 0.1, 0.1],
    lr_epochs = [90, 110, 130, 150, 160],
    total_epochs = 160,
    print_epochs = 10,
    num_workers = 3,
    )

test_cfg = dict(
    cuda = True,
    topk = 0,
    iou = 0.45,
    soft_nms = True,
    score_threshold = 0.1,
    keep_per_class = 50,
    save_folder = 'eval'
    )

optimizer = dict(
    type='SGD',
    momentum=0.9,
    weight_decay=0.00005,
    loss_scale=1,
    dampening=0.0,
    clip_grad_norm=5.)
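One plausible reading of the warmup, gamma, and lr_epochs fields is a linear warmup followed by multiplicative step decay. The sketch below illustrates that reading only; src/lr_scheduler.py is authoritative and may pair the milestones differently:

def lr_at_epoch(epoch, base_lr=7.5e-4, warmup=5,
                milestones=(90, 110, 130, 150, 160),
                gammas=(0.5, 0.2, 0.1, 0.1)):
    # Linear warmup from ~0 to base_lr over the first `warmup` epochs.
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    lr = base_lr
    # Multiply by the matching gamma at each passed milestone
    # (zip silently drops the unpaired trailing milestone).
    for milestone, gamma in zip(milestones, gammas):
        if epoch >= milestone:
            lr *= gamma
    return lr

for epoch in (0, 4, 89, 90, 110, 159):
    print(epoch, lr_at_epoch(epoch))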

Training Process

Training

Run M2Det on GPU

For GPU training, set device = 'GPU' in src/config.py.

  • Training using single device (1p)

    bash ./scripts/run_standalone_train.sh 0 /path/to/vgg16_reducedfc.ckpt /path/to/COCO/
    
  • Distributed Training (8p)

    bash ./scripts/run_distributed_train_gpu.sh 8 0 /path/to/vgg16_reducedfc.ckpt /path/to/COCO/
    

Checkpoints will be saved in the ./checkpoints/[EXPERIMENT_TAG] folder. Checkpoint filename format: [MODEL.M2DET_CONFIG.BACKBONE]_[MODEL.INPUT_SIZE]-[EPOCH]_[ITERATION].ckpt. The final checkpoint filename format is [MODEL.M2DET_CONFIG.BACKBONE]_[MODEL.INPUT_SIZE]-final.ckpt.

Evaluation Process

Evaluation

To start evaluation, run the following command:

bash ./scripts/run_eval.sh [DEVICE_ID] [CHECKPOINT_PATH] [DATASET_PATH]

# Example:
bash ./scripts/run_eval.sh 0 /path/to/checkpoint /path/to/COCO/
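The reported AP follows the standard COCO protocol. For reference, this is how the metric is computed with pycocotools once detections are written in COCO result format (the result-file name below is illustrative):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO('coco/annotations/instances_minival2014.json')
dt = gt.loadRes('detections.json')        # detector output, COCO result format
evaluator = COCOeval(gt, dt, iouType='bbox')
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()                     # prints AP@[0.50:0.95], AP50, ...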

Model Description

Performance

Training Performance

The training performance in the following table was obtained with the M2Det-512-VGG16 model trained on the COCO dataset:

Parameters                        M2Det-512-VGG16 (8 GPU)
Model Version                     M2Det-512-VGG16
Resource                          Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 8x V100-PCIE
Uploaded Date                     2022-06-29
MindSpore Version                 1.5.2
Dataset                           COCO
Training Parameters               seed=1; epoch=160; batch_size=7; lr=1e-3; weight_decay=5e-5; clip_by_global_norm=4.0
Optimizer                         SGD
Loss Function                     Multibox MSE loss
Outputs                           Bounding boxes and class scores
Loss Value                        2.299
Average Checkpoint (.ckpt) Size   507 MB
Speed                             707 ms/step, 1493 s/epoch
Total Time                        2 days 18 hours 16 minutes
Scripts                           M2Det training script

Evaluation Performance

The evaluation performance in the following table was obtained with the M2Det-512-VGG16 model evaluated on the COCO dataset:

Parameters                        M2Det-512-VGG16 (8 GPU)
Model Version                     M2Det-512-VGG16
Resource                          Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 8x V100-PCIE
Uploaded Date                     2022-06-29
MindSpore Version                 1.5.2
Dataset                           COCO
Loss Function                     Multibox MSE loss
AP                                36.2
Scripts                           M2Det evaluation script

Description of Random Situation

The global training random seed is fixed in src/config.py with the random_seed parameter. A value of None executes training without dataset shuffling.
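As a minimal illustration, assuming the seed is applied through MindSpore's global API (the repository's exact call sites may differ):

from mindspore import set_seed

set_seed(1)   # fixes MindSpore's global RNGs, matching random_seed = 1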

ModelZoo Homepage

Please check the official homepage.
