models: Models of MindSpore

Focus-DETR Description

Focus-DETR is a model that focuses attention on more informative tokens for a better trade-off between computational efficiency and model accuracy. Compared with state-of-the-art sparse transformer-based detectors under the same setting, our Focus-DETR has comparable complexity while achieving 50.4 AP (+2.2) on COCO.

Paper: Less is More: Focus Attention for Efficient DETR. Dehua Zheng*, Wenhui Dong*, Hailin Hu, Xinghao Chen, Yunhe Wang.

Model architecture

Our Focus-DETR comprises a backbone network, a Transformer encoder, and a Transformer decoder. We design a foreground token selector (FTS) based on top-down score modulations across multi-scale features. The foreground tokens, together with the tokens selected by a multi-category score predictor, then go through the Pyramid Encoder, which remedies the limitation of deformable attention in mixing distant information.

Focus-DETR
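The token selection idea described above can be illustrated with a small sketch. This is not the Focus-DETR implementation: it only shows a plain top-k selection over per-token foreground scores and leaves out the top-down score modulation across feature levels. The function name `select_foreground_tokens`, the `keep_ratio` parameter, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def select_foreground_tokens(scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Return the indices of the highest-scoring tokens, in their original order.

    Illustrative only: a plain top-k over foreground scores, without the
    multi-scale top-down modulation used by the real model.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    num_keep = max(1, int(len(scores) * keep_ratio))
    # Rank token indices by descending score; ties keep the earlier index.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    # Restore spatial order so downstream layers see tokens in sequence.
    return sorted(ranked[:num_keep])


# Hypothetical foreground scores for eight tokens of one feature level.
scores = [0.05, 0.90, 0.10, 0.75, 0.20, 0.60, 0.02, 0.30]
kept = select_foreground_tokens(scores, keep_ratio=0.5)
print(kept)  # [1, 3, 5, 7]
```

Only the kept tokens would be passed to the encoder, which is where the reduction in computation comes from.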

Dataset

Dataset used: COCO2017

Dataset size: ~19 GB
- Train - 18 GB, 118000 images
- Val - 1 GB, 5000 images
- Annotations - 241 MB, instances, captions, person_keypoints, etc.
Data format: image and JSON files
- The directory structure is as follows:

.
├── annotations  # annotation jsons
├── test2017  # test data
├── train2017  # train dataset
└── val2017  # val dataset
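Before launching evaluation, it may help to confirm that the dataset root matches the layout above. The following is a minimal sketch using only the standard library. The helper name `missing_coco_dirs` is hypothetical, and it assumes that only `annotations`, `train2017`, and `val2017` need to exist (`test2017` is not checked).

```python
import tempfile
from pathlib import Path
from typing import List

# Assumption: these are the sub-directories required for training/evaluation.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(coco_root: str) -> List[str]:
    """Return the expected sub-directories that are absent under coco_root."""
    root = Path(coco_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Toy demonstration on a temporary directory that lacks train2017.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("annotations", "val2017"):
        (Path(tmp) / name).mkdir()
    missing = missing_coco_dirs(tmp)
    print(missing)  # ['train2017']
```

An empty list means all the checked directories are present.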

Environment Requirements

Hardware (GPU)
- Prepare a hardware environment with a GPU.
Framework
- MindSpore
For more information, please check the resources below:
- MindSpore Tutorials
- MindSpore Python API

Eval process

Usage

After installing MindSpore via the official website, you can start evaluation as follows:

Launch

# Evaluation (inference) example
bash scripts/DINO_eval_ms_coco.sh /path/to/your/COCODIR /path/to/your/checkpoint
# For example:
# bash scripts/DINO_eval_ms_coco.sh coco2017 ./logs/best_ckpt.ckpt

The checkpoint can be downloaded from https://download.mindspore.cn/model_zoo/research/cv/Focus-DETR/.

Result

Results of Focus-DETR with a ResNet-50 backbone:
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.659
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.521
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.323
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.505
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.619
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.372
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.640
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.720
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.568
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.757
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.878
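To compare runs programmatically, the summary printed above can be parsed into a dictionary. This is a rough sketch that assumes the log lines follow exactly the format shown. The function name `parse_coco_summary` and the key layout `(metric, IoU, area, maxDets)` are choices made for this example, not part of the evaluation script.

```python
import re
from typing import Dict, Tuple

# Matches one summary line, e.g.
# "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.479"
_SUMMARY_LINE = re.compile(
    r"\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\| area=\s*(\w+)\s*"
    r"\| maxDets=\s*(\d+)\s*\] = (-?[\d.]+)"
)

MetricKey = Tuple[str, str, str, int]


def parse_coco_summary(text: str) -> Dict[MetricKey, float]:
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    results: Dict[MetricKey, float] = {}
    for match in _SUMMARY_LINE.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results


# Three lines copied from the result listing above.
sample = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.659
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.372
"""
metrics = parse_coco_summary(sample)
print(metrics[("AP", "0.50:0.95", "all", 100)])  # 0.479
```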

ModelZoo Homepage

Please check the official homepage.
