Multi-Scale Positive Sample Refinement for Few-Shot Object Detection (ECCV'2020)

Abstract

Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances, and is useful when manual annotation is time-consuming or data acquisition is limited. Unlike previous attempts that exploit few-shot classification techniques to facilitate FSOD, this work highlights the necessity of handling the problem of scale variations, which is challenging due to the unique sample distribution. To this end, we propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD. It generates multi-scale positive samples as object pyramids and refines the prediction at various scales. We demonstrate its advantage by integrating it as an auxiliary branch to the popular architecture of Faster R-CNN with FPN, delivering a strong FSOD solution. Several experiments are conducted on PASCAL VOC and MS COCO, and the proposed approach achieves state-of-the-art results and significantly outperforms other counterparts, which shows its effectiveness. Code is available at https://github.com/jiaxi-wu/MPSR.

Citation

```bibtex
@inproceedings{wu2020mpsr,
  title={Multi-Scale Positive Sample Refinement for Few-Shot Object Detection},
  author={Wu, Jiaxi and Liu, Songtao and Huang, Di and Wang, Yunhong},
  booktitle={European Conference on Computer Vision},
  year={2020}
}
```

Note: All the reported results use the data splits released by the TFA official repo. Currently, each setting is evaluated with only one fixed few shot dataset. Please refer to DATA Preparation for more details about the dataset and data preparation.

How to reproduce MPSR

Following the original implementation, reproducing MPSR consists of two steps:

  • Step 1: Base training
    • Use all the images and annotations of the base classes to train a base model.
  • Step 2: Few shot fine-tuning
    • Use the base model from step 1 as initialization and further fine-tune it on the few shot datasets.

An example of the VOC split1 1-shot setting with 8 GPUs:

```bash
# step1: base training for voc split1
bash ./tools/detection/dist_train.sh \
    configs/detection/mpsr/voc/split1/mpsr_r101_fpn_2xb2_voc-split1_base-training.py 8

# step2: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/mpsr/voc/split1/mpsr_r101_fpn_2xb2_voc-split1_1shot-fine-tuning.py 8
```
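The fine-tuned model can then be evaluated against the results reported below. The command here is only a sketch: it assumes mmfewshot follows the standard MMDetection-style `dist_test.sh`/`--eval` interface, and the checkpoint path is illustrative (it depends on your `work_dirs` layout).

```bash
# evaluate the fine-tuned model on the VOC split1 1-shot setting
# (checkpoint path is illustrative; adjust it to your own work_dirs)
bash ./tools/detection/dist_test.sh \
    configs/detection/mpsr/voc/split1/mpsr_r101_fpn_2xb2_voc-split1_1shot-fine-tuning.py \
    work_dirs/mpsr_r101_fpn_2xb2_voc-split1_1shot-fine-tuning/latest.pth 8 \
    --eval mAP
```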

Note:

  • The default output path of the reshaped base model is set to work_dirs/{BASE TRAINING CONFIG}/base_model_random_init_bbox_head.pth. If the model is saved to a different path, please update the argument load_from in the step 2 few shot fine-tuning configs accordingly, instead of using resume_from.
  • To use a pre-trained checkpoint, please set load_from to the path of the downloaded checkpoint (see the sketch after this list).
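For example, fine-tuning from a relocated or downloaded base checkpoint could look like the sketch below. This assumes `dist_train.sh` forwards extra arguments to `train.py` and that `train.py` supports the MMDetection-style `--cfg-options` override (both are assumptions here, not guarantees); the checkpoint path is illustrative. Editing `load_from` directly in the fine-tuning config works as well.

```bash
# sketch: override load_from on the command line instead of editing the config
# (assumes MMDetection-style --cfg-options support; path is illustrative)
bash ./tools/detection/dist_train.sh \
    configs/detection/mpsr/voc/split1/mpsr_r101_fpn_2xb2_voc-split1_1shot-fine-tuning.py 8 \
    --cfg-options load_from=work_dirs/mpsr_r101_fpn_2xb2_voc-split1_base-training/base_model_random_init_bbox_head.pth
```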

Results on VOC dataset

Note:

  • We follow the official implementation and use a batch size of 2x2 (2 GPUs, 2 images per GPU) for training.
  • The performance of the few shot settings can be unstable, even with the same random seed. To reproduce the reported few shot results, it is highly recommended to use the released models for few shot fine-tuning.
  • Difficult samples are not used in base training or in the few shot settings.

Base Training

| Arch | Split | Base AP50 | ckpt | log |
| :--- | :---: | :-------: | :--: | :-: |
| r101 fpn | 1 | 80.5 | ckpt | log |
| r101 fpn | 2 | 81.3 | ckpt | log |
| r101 fpn | 3 | 81.8 | ckpt | log |
| r101 fpn* | 1 | 77.8 | ckpt | - |
| r101 fpn* | 2 | 78.3 | ckpt | - |
| r101 fpn* | 3 | 77.8 | ckpt | - |

Note:

  • * means the model is converted from the official repo. We find that the base model trained with mmfewshot gets worse performance after fine-tuning, especially in the 1/2/3 shot settings, even though its base training performance is higher. We will continue to investigate and improve it.

Few Shot Fine-tuning

| Arch | Split | Shot | Base AP50 | Novel AP50 | ckpt | log |
| :--- | :---: | :--: | :-------: | :--------: | :--: | :-: |
| r101 fpn* | 1 | 1 | 60.6 | 38.5 | ckpt | log |
| r101 fpn* | 1 | 2 | 65.9 | 45.9 | ckpt | log |
| r101 fpn* | 1 | 3 | 68.1 | 49.2 | ckpt | log |
| r101 fpn* | 1 | 5 | 69.2 | 55.8 | ckpt | log |
| r101 fpn* | 1 | 10 | 71.2 | 58.7 | ckpt | log |
| r101 fpn* | 2 | 1 | 61.0 | 25.8 | ckpt | log |
| r101 fpn* | 2 | 2 | 66.9 | 29.0 | ckpt | log |
| r101 fpn* | 2 | 3 | 67.6 | 40.6 | ckpt | log |
| r101 fpn* | 2 | 5 | 70.4 | 41.5 | ckpt | log |
| r101 fpn* | 2 | 10 | 71.7 | 47.1 | ckpt | log |
| r101 fpn* | 3 | 1 | 57.9 | 34.6 | ckpt | log |
| r101 fpn* | 3 | 2 | 65.7 | 41.0 | ckpt | log |
| r101 fpn* | 3 | 3 | 69.1 | 44.1 | ckpt | log |
| r101 fpn* | 3 | 5 | 70.4 | 48.5 | ckpt | log |
| r101 fpn* | 3 | 10 | 72.5 | 51.7 | ckpt | log |

Results on COCO dataset

Note:

  • We follow the official implementation and use a batch size of 2x2 (2 GPUs, 2 images per GPU) for training.
  • The performance of both base training and the few shot settings can be unstable, even with the same random seed. To reproduce the reported few shot results, it is highly recommended to use the released models for few shot fine-tuning.

Base Training

| Arch | Base mAP | ckpt | log |
| :--- | :------: | :--: | :-: |
| r101 fpn | 34.6 | ckpt | log |

Few Shot Fine-tuning

| Arch | Shot | Base mAP | Novel mAP | ckpt | log |
| :--- | :--: | :------: | :-------: | :--: | :-: |
| r101 fpn | 10 | 23.2 | 12.6 | ckpt | log |
| r101 fpn | 30 | 25.2 | 18.1 | ckpt | log |