# DAB-DETR

This is the official PyTorch implementation of our ICLR 2022 paper [DAB-DETR](https://arxiv.org/abs/2201.12329).

Authors: [Shilong Liu](https://www.lsl.zone/), [Feng Li](https://scholar.google.com/citations?hl=zh-CN&user=ybRe9GcAAAAJ), [Hao Zhang](https://scholar.google.com/citations?user=B8hPxMQAAAAJ&hl=zh-CN), [Xiao Yang](https://ml.cs.tsinghua.edu.cn/~xiaoyang/), [Xianbiao Qi](https://scholar.google.com/citations?user=odjSydQAAAAJ&hl=en), [Hang Su](https://www.suhangss.me/), [Jun Zhu](https://ml.cs.tsinghua.edu.cn/~jun/index.shtml), [Lei Zhang](https://www.leizhang.org/)

# News

[2022/9/22]: We release a toolbox, [**detrex**](https://github.com/IDEA-Research/detrex), that provides state-of-the-art Transformer-based detection algorithms. It includes DINO **with better performance**. Welcome to use it! \
[2022/7/12]: Code for [DINO](https://arxiv.org/abs/2203.03605) is available now! [[code for DINO](https://github.com/IDEACVR/DINO)] \
[2022/6]: We release a unified detection and segmentation model, [Mask DINO](https://arxiv.org/pdf/2206.02777.pdf), that achieves the best results on all three segmentation tasks (**54.5** AP on the [COCO instance leaderboard](https://paperswithcode.com/sota/instance-segmentation-on-coco-minival), **59.4** PQ on the [COCO panoptic leaderboard](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-minival), and **60.8** mIoU on the [ADE20K semantic leaderboard](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k))!
Code will be available [here](https://github.com/IDEACVR/MaskDINO). \
[2022/5/28]: Code for [DN-DETR](https://arxiv.org/abs/2203.01305) is available [here](https://github.com/IDEA-opensource/DN-DETR)! \
[2022/5/22]: We release a notebook for visualization in [inference_and_visualize.ipynb](inference_and_visualize.ipynb). \
[2022/4/14]: We release the [```.pptx``` file](resources/comparison_raleted_works_raw.pptx) of our [DETR-like models comparison figure](#comparison-of-detr-like-models) for those who want to draw model architecture figures in their papers. \
[2022/4/12]: We fix a bug in the file ```datasets/coco_eval.py```: the parameter ```useCats``` of ```CocoEvaluator``` should be ```True``` by default. \
[2022/4/9]: Our code is available! \
[2022/3/9]: We build a repo, [Awesome Detection Transformer](https://github.com/IDEACVR/awesome-detection-transformer), to collect papers on Transformers for detection and segmentation. Welcome to follow it! \
[2022/3/8]: Our new work [DINO](https://arxiv.org/abs/2203.03605) set a new record of **63.3 AP** on the MS-COCO leaderboard. [[code for DINO](https://github.com/IDEACVR/DINO)] \
[2022/3/8]: Our new work [DN-DETR](https://arxiv.org/abs/2203.01305) has been accepted by CVPR 2022! [[code for DN-DETR](https://github.com/IDEA-opensource/DN-DETR)] \
[2022/1/21]: Our work has been accepted to ICLR 2022.

# Abstract

We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. This new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer by layer. Using box coordinates not only provides explicit positional priors that improve query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information.
Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer by layer in a cascade manner. As a result, it leads to the best performance on the MS-COCO benchmark among DETR-like detection models under the same setting, e.g., 45.7% AP using a ResNet50-DC5 backbone trained for 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our methods.

# Model

# Model Zoo

We provide our models with an R50 backbone, including both **DAB-DETR** and **DAB-Deformable-DETR** (see Appendix C of [our paper](https://arxiv.org/abs/2201.12329) for more details).
| | name | backbone | box AP | Log/Config/Checkpoint | Where in Our Paper |
|---|---|---|---|---|---|
| 0 | DAB-DETR-R50 | R50 | 42.2 | Google Drive / Tsinghua Cloud | Table 2 |
| 1 | DAB-DETR-R50 (3 pat)<sup>1</sup> | R50 | 42.6 | Google Drive / Tsinghua Cloud | Table 2 |
| 2 | DAB-DETR-R50-DC5 | R50 | 44.5 | Google Drive / Tsinghua Cloud | Table 2 |
| 3 | DAB-DETR-R50-DC5-fixxy<sup>2</sup> | R50 | 44.7 | Google Drive / Tsinghua Cloud | Table 8, Appendix H |
| 4 | DAB-DETR-R50-DC5 (3 pat) | R50 | 45.7 | Google Drive / Tsinghua Cloud | Table 2 |
| 5 | DAB-Deformable-DETR (Deformable Encoder Only)<sup>3</sup> | R50 | 46.9 | | Baseline for DN-DETR |
| 6 | DAB-Deformable-DETR-R50-v2<sup>4</sup> | R50 | 48.7 | Google Drive / Tsinghua Cloud | Extended results for Table 5, Appendix C |
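The core idea from the abstract, updating anchor-box queries layer by layer, can be illustrated with a minimal plain-Python sketch. This is our own simplified illustration, not the repository's actual implementation (which operates on batched PyTorch tensors); the function names `refine_anchor` and `inverse_sigmoid` and the fixed offsets are assumptions for the example. Each decoder layer predicts offsets for a normalized ```(cx, cy, w, h)``` anchor; offsets are added in inverse-sigmoid (logit) space and squashed back to ```[0, 1]```:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def inverse_sigmoid(p, eps=1e-5):
    # Stable logit: maps a normalized coordinate to unbounded space
    # so that additive offsets can be applied there.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def refine_anchor(anchor, offsets):
    """One decoder layer's anchor update (illustrative).

    anchor:  (cx, cy, w, h), all normalized to [0, 1]
    offsets: per-coordinate deltas, standing in for a layer's box-head output
    """
    return tuple(sigmoid(inverse_sigmoid(a) + d)
                 for a, d in zip(anchor, offsets))

# Toy run: refine one anchor through a 6-layer "decoder" with fixed offsets
# in place of learned predictions. Coordinates always stay inside [0, 1].
anchor = (0.5, 0.5, 0.2, 0.2)
for _ in range(6):
    anchor = refine_anchor(anchor, (0.1, -0.1, 0.0, 0.05))
```

Because the update happens in logit space, the refined box can never leave the image, and with zero offsets a layer leaves the anchor unchanged, which is what makes the cascade of per-layer refinements stable.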
# Links

**DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection.** \
Hao Zhang\*, Feng Li\*, Shilong Liu\*, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum. \
arXiv 2022. \
[paper] [code]

**DN-DETR: Accelerate DETR Training by Introducing Query DeNoising.** \
Feng Li\*, Hao Zhang\*, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang. \
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022. \
[paper] [code]