Resembling the rapid learning capability of humans, low-shot learning empowers vision systems to understand new concepts from only a few training samples. Leading approaches derive from meta-learning on images containing a single visual object. Confounded by complex backgrounds and multiple objects per image, they are hard to extend to low-shot object detection/segmentation. In this work, we present a flexible and general methodology for these tasks. Our work extends Faster/Mask R-CNN by performing meta-learning over RoI (Region-of-Interest) features instead of a full-image feature. This simple design disentangles multi-object information from the background and, without bells and whistles, turns Faster/Mask R-CNN into a meta-learner for these tasks. Specifically, we introduce a Predictor-head Remodeling Network (PRN) that shares its main backbone with Faster/Mask R-CNN. PRN receives images containing low-shot objects together with their bounding boxes or masks and infers class-attentive vectors. These vectors apply channel-wise soft attention to RoI features, remodeling the R-CNN predictor heads to detect or segment objects of the classes the vectors represent. In our experiments, Meta R-CNN sets a new state of the art in low-shot object detection and improves low-shot object segmentation over Mask R-CNN. Code: https://yanxp.github.io/metarcnn.html.
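To make the core mechanism concrete, here is a rough, plain-Python sketch of the channel-wise soft attention described above. It is not the authors' implementation (which operates on CNN feature tensors inside the R-CNN head); it only illustrates gating each channel of an RoI feature by the sigmoid of a class-attentive vector:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def remodel_roi_feature(roi_feat, class_vec):
    # Channel-wise soft attention: each channel of the RoI feature is
    # multiplied by a per-channel gate in (0, 1) derived from the
    # class-attentive vector inferred by the PRN.
    assert len(roi_feat) == len(class_vec)
    return [f * sigmoid(v) for f, v in zip(roi_feat, class_vec)]

# Toy example: a 4-channel RoI feature and one class-attentive vector.
roi = [2.0, -1.0, 0.5, 3.0]
vec = [0.0, 10.0, -10.0, 0.0]
out = remodel_roi_feature(roi, vec)
# A zero entry gives gate 0.5, a large positive entry passes the channel
# almost unchanged, and a large negative entry suppresses it.
```

The remodeled feature is then fed to the detection or segmentation predictor head, so the same head is specialized per class at inference time.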
```bibtex
@inproceedings{yan2019meta,
  title={Meta r-cnn: Towards general solver for instance-level low-shot learning},
  author={Yan, Xiaopeng and Chen, Ziliang and Xu, Anni and Wang, Xiaoxi and Liang, Xiaodan and Lin, Liang},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2019}
}
```
Note: All reported results use the data split released by the TFA official repo. Currently, each setting is evaluated with only one fixed few shot dataset. Please refer to DATA Preparation for more details about the dataset and data preparation.
Following the original implementation, training consists of 2 steps:

- Step 1: Base training
- Step 2: Few shot fine-tuning
```shell
# step1: base training for voc split1
bash ./tools/detection/dist_train.sh \
    configs/detection/meta_rcnn/voc/split1/meta-rcnn_r101_c4_8xb4_voc-split1_base-training.py 8

# step2: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/meta_rcnn/voc/split1/meta-rcnn_r101_c4_8xb4_voc-split1_1shot-fine-tuning.py 8
```
Note:

- The default path of the reshaped base model is `work_dirs/{BASE TRAINING CONFIG}/base_model_random_init_bbox_head.pth`.
- When the model is saved to a different path, please update the argument `load_from` in the step2 few shot fine-tuning configs instead of using `resume_from`.
- To use a checkpoint trained by others, set `load_from` to the downloaded checkpoint path.
Arch | Split | Base AP50 | ckpt | log |
---|---|---|---|---|
r101 c4 | 1 | 72.8 | ckpt | log |
r101 c4 | 2 | 73.3 | ckpt | log |
r101 c4 | 3 | 74.2 | ckpt | log |
Arch | Split | Shot | Base AP50 | Novel AP50 | ckpt | log |
---|---|---|---|---|---|---|
r101 c4 | 1 | 1 | 58.8 | 40.2 | ckpt | log |
r101 c4 | 1 | 2 | 67.7 | 49.9 | ckpt | log |
r101 c4 | 1 | 3 | 69.0 | 54.0 | ckpt | log |
r101 c4 | 1 | 5 | 70.8 | 55.0 | ckpt | log |
r101 c4 | 1 | 10 | 71.7 | 56.3 | ckpt | log |
r101 c4 | 2 | 1 | 61.0 | 27.3 | ckpt | log |
r101 c4 | 2 | 2 | 69.5 | 34.8 | ckpt | log |
r101 c4 | 2 | 3 | 71.0 | 39.0 | ckpt | log |
r101 c4 | 2 | 5 | 71.7 | 36.0 | ckpt | log |
r101 c4 | 2 | 10 | 72.6 | 40.1 | ckpt | log |
r101 c4 | 3 | 1 | 63.0 | 32.0 | ckpt | log |
r101 c4 | 3 | 2 | 70.1 | 37.9 | ckpt | log |
r101 c4 | 3 | 3 | 71.3 | 42.5 | ckpt | log |
r101 c4 | 3 | 5 | 72.3 | 49.6 | ckpt | log |
r101 c4 | 3 | 10 | 73.2 | 49.1 | ckpt | log |
Results on COCO dataset
Arch | Base mAP | ckpt | log |
---|---|---|---|
r50 c4 | 27.8 | ckpt | log |
Few Shot Fine-tuning
Arch | Shot | Base mAP | Novel AP50 | ckpt | log |
---|---|---|---|---|---|
r50 c4 | 10 | 25.1 | 9.4 | ckpt | log |
r50 c4 | 30 | 26.9 | 11.5 | ckpt | log |