
# DEIM: DETR with Improved Matching for Fast Convergence


DEIM is an advanced training framework designed to enhance the matching mechanism in DETRs, enabling faster convergence and improved accuracy. It serves as a robust foundation for future research and applications in the field of real-time object detection.

---
Shihua HuangΒΉ, Zhichao LuΒ², Xiaodong CunΒ³, Yongjun YuΒΉ, Xiao Zhou⁴, Xi ShenΒΉ\*

ΒΉ Intellindust AI Lab   Β² City University of Hong Kong   Β³ Great Bay University   ⁴ Hefei Normal University

**πŸ“§ Corresponding author:** shenxiluc@gmail.com


If you like our work, please give us a ⭐!


## πŸš€ Updates

- [x] **\[2025.03.05\]** The Nano DEIM model is released.
- [x] **\[2025.02.27\]** The DEIM paper is accepted to CVPR 2025. Thanks to all co-authors.
- [x] **\[2024.12.26\]** A more efficient implementation of Dense O2O, achieving nearly a 30% improvement in loading speed (see [the pull request](https://github.com/ShihuaHuang95/DEIM/pull/13) for details). Huge thanks to my colleague [Longfei Liu](https://github.com/capsule2077).
- [x] **\[2024.12.03\]** Released the DEIM series. This repo also supports re-implementations of [D-FINE](https://arxiv.org/abs/2410.13842) and [RT-DETR](https://arxiv.org/abs/2407.17140).

## Table of Contents

* [1. Model Zoo](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#1-model-zoo)
* [2. Quick start](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#2-quick-start)
* [3. Usage](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#3-usage)
* [4. Tools](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#4-tools)
* [5. Citation](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#5-citation)
* [6. Acknowledgement](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#6-acknowledgement)

## 1. Model Zoo

### DEIM-D-FINE

| Model | Dataset | AP<sup>D-FINE</sup> | AP<sup>DEIM</sup> | #Params | Latency | GFLOPs | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **N** | COCO | **42.8** | **43.0** | 4M | 2.12ms | 7 | [yml](./configs/deim_dfine/deim_hgnetv2_n_coco.yml) | [ckpt](https://drive.google.com/file/d/1ZPEhiU9nhW4M5jLnYOFwTSLQC1Ugf62e/view?usp=sharing) |
| **S** | COCO | **48.7** | **49.0** | 10M | 3.49ms | 25 | [yml](./configs/deim_dfine/deim_hgnetv2_s_coco.yml) | [ckpt](https://drive.google.com/file/d/1tB8gVJNrfb6dhFvoHJECKOF5VpkthhfC/view?usp=drive_link) |
| **M** | COCO | **52.3** | **52.7** | 19M | 5.62ms | 57 | [yml](./configs/deim_dfine/deim_hgnetv2_m_coco.yml) | [ckpt](https://drive.google.com/file/d/18Lj2a6UN6k_n_UzqnJyiaiLGpDzQQit8/view?usp=drive_link) |
| **L** | COCO | **54.0** | **54.7** | 31M | 8.07ms | 91 | [yml](./configs/deim_dfine/deim_hgnetv2_l_coco.yml) | [ckpt](https://drive.google.com/file/d/1PIRf02XkrA2xAD3wEiKE2FaamZgSGTAr/view?usp=drive_link) |
| **X** | COCO | **55.8** | **56.5** | 62M | 12.89ms | 202 | [yml](./configs/deim_dfine/deim_hgnetv2_x_coco.yml) | [ckpt](https://drive.google.com/file/d/1dPtbgtGgq1Oa7k_LgH1GXPelg1IVeu0j/view?usp=drive_link) |

### DEIM-RT-DETRv2

| Model | Dataset | AP<sup>RT-DETRv2</sup> | AP<sup>DEIM</sup> | #Params | Latency | GFLOPs | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **S** | COCO | **47.9** | **49.0** | 20M | 4.59ms | 60 | [yml](./configs/deim_rtdetrv2/deim_r18vd_120e_coco.yml) | [ckpt](https://drive.google.com/file/d/153_JKff6EpFgiLKaqkJsoDcLal_0ux_F/view?usp=drive_link) |
| **M** | COCO | **49.9** | **50.9** | 31M | 6.40ms | 92 | [yml](./configs/deim_rtdetrv2/deim_r34vd_120e_coco.yml) | [ckpt](https://drive.google.com/file/d/1O9RjZF6kdFWGv1Etn1Toml4r-YfdMDMM/view?usp=drive_link) |
| **M\*** | COCO | **51.9** | **53.2** | 33M | 6.90ms | 100 | [yml](./configs/deim_rtdetrv2/deim_r50vd_m_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/10dLuqdBZ6H5ip9BbBiE6S7ZcmHkRbD0E/view?usp=drive_link) |
| **L** | COCO | **53.4** | **54.3** | 42M | 9.15ms | 136 | [yml](./configs/deim_rtdetrv2/deim_r50vd_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/1mWknAXD5JYknUQ94WCEvPfXz13jcNOTI/view?usp=drive_link) |
| **X** | COCO | **54.3** | **55.5** | 76M | 13.66ms | 259 | [yml](./configs/deim_rtdetrv2/deim_r101vd_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/1BIevZijOcBO17llTyDX32F_pYppBfnzu/view?usp=drive_link) |

## 2. Quick start

### Setup

```shell
conda create -n deim python=3.11.9
conda activate deim
pip install -r requirements.txt
```
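To confirm the environment before moving on, here is a quick check of the PyTorch install (an optional snippet of my own, assuming `requirements.txt` installed a CUDA-enabled PyTorch):

```python
import torch

# Confirm the interpreter sees PyTorch and at least one CUDA device
# before launching distributed training.
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("gpus:", torch.cuda.device_count())
```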
### Data Preparation

#### COCO2017 Dataset

1. Download COCO2017 from [OpenDataLab](https://opendatalab.com/OpenDataLab/COCO_2017) or [COCO](https://cocodataset.org/#download).
2. Modify the paths in [coco_detection.yml](./configs/dataset/coco_detection.yml) (a quick sanity check of these paths follows below):

   ```yaml
   train_dataloader:
     img_folder: /data/COCO2017/train2017/
     ann_file: /data/COCO2017/annotations/instances_train2017.json
   val_dataloader:
     img_folder: /data/COCO2017/val2017/
     ann_file: /data/COCO2017/annotations/instances_val2017.json
   ```
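To verify the annotation paths before training, a small sanity check of my own (it assumes `pycocotools` is available, which COCO-style evaluation typically pulls in):

```python
from pycocotools.coco import COCO

# Load the validation annotations and report basic counts.
coco = COCO("/data/COCO2017/annotations/instances_val2017.json")
print(f"images: {len(coco.imgs)}, categories: {len(coco.cats)}")
# A correct COCO2017 val split has 5,000 images and 80 categories.
```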
#### Custom Dataset

To train on your custom dataset, you need to organize it in the COCO format. Follow the steps below to prepare your dataset:

1. **Set `remap_mscoco_category` to `False`:** this prevents the automatic remapping of category IDs to the MSCOCO categories.

   ```yaml
   remap_mscoco_category: False
   ```

2. **Organize images:** structure your dataset directories as follows:

   ```shell
   dataset/
   β”œβ”€β”€ images/
   β”‚   β”œβ”€β”€ train/
   β”‚   β”‚   β”œβ”€β”€ image1.jpg
   β”‚   β”‚   β”œβ”€β”€ image2.jpg
   β”‚   β”‚   └── ...
   β”‚   β”œβ”€β”€ val/
   β”‚   β”‚   β”œβ”€β”€ image1.jpg
   β”‚   β”‚   β”œβ”€β”€ image2.jpg
   β”‚   β”‚   └── ...
   └── annotations/
       β”œβ”€β”€ instances_train.json
       β”œβ”€β”€ instances_val.json
       └── ...
   ```

   - **`images/train/`**: contains all training images.
   - **`images/val/`**: contains all validation images.
   - **`annotations/`**: contains COCO-formatted annotation files.

3. **Convert annotations to COCO format:** if your annotations are not already in COCO format, you'll need to convert them. You can use the stub below as a reference or utilize existing tools (a fleshed-out sketch follows this list):

   ```python
   import json

   def convert_to_coco(input_annotations, output_annotations):
       # Implement conversion logic here
       pass

   if __name__ == "__main__":
       convert_to_coco('path/to/your_annotations.json', 'dataset/annotations/instances_train.json')
   ```

4. **Update configuration files:** modify your [custom_detection.yml](./configs/dataset/custom_detection.yml):

   ```yaml
   task: detection

   evaluator:
     type: CocoEvaluator
     iou_types: ['bbox', ]

   num_classes: 777  # your dataset classes
   remap_mscoco_category: False

   train_dataloader:
     type: DataLoader
     dataset:
       type: CocoDetection
       img_folder: /data/yourdataset/train
       ann_file: /data/yourdataset/train/train.json
       return_masks: False
       transforms:
         type: Compose
         ops: ~
     shuffle: True
     num_workers: 4
     drop_last: True
     collate_fn:
       type: BatchImageCollateFunction

   val_dataloader:
     type: DataLoader
     dataset:
       type: CocoDetection
       img_folder: /data/yourdataset/val
       ann_file: /data/yourdataset/val/ann.json
       return_masks: False
       transforms:
         type: Compose
         ops: ~
     shuffle: False
     num_workers: 4
     drop_last: False
     collate_fn:
       type: BatchImageCollateFunction
   ```
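As a concrete starting point for step 3, here is a minimal, hypothetical sketch of `convert_to_coco` for a simple input format: one record per object, with a file name, image size, label, and an `[x, y, w, h]` box. The input field names are assumptions about your source format and will likely need adapting:

```python
import json

def convert_to_coco(input_annotations, output_annotations):
    """Convert a simple per-object annotation list to COCO format.

    Assumes each input record looks like (hypothetical schema):
      {"file_name": ..., "width": ..., "height": ..., "label": ..., "bbox": [x, y, w, h]}
    """
    with open(input_annotations) as f:
        records = json.load(f)

    images, annotations, categories = {}, [], {}
    for ann_id, rec in enumerate(records, start=1):
        # Register each image once, keyed by file name.
        if rec["file_name"] not in images:
            images[rec["file_name"]] = {
                "id": len(images) + 1,
                "file_name": rec["file_name"],
                "width": rec["width"],
                "height": rec["height"],
            }
        # Register each category once, keyed by label name.
        if rec["label"] not in categories:
            categories[rec["label"]] = {"id": len(categories) + 1, "name": rec["label"]}
        x, y, w, h = rec["bbox"]
        annotations.append({
            "id": ann_id,
            "image_id": images[rec["file_name"]]["id"],
            "category_id": categories[rec["label"]]["id"],
            "bbox": [x, y, w, h],  # COCO uses [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })

    coco = {
        "images": list(images.values()),
        "annotations": annotations,
        "categories": list(categories.values()),
    }
    with open(output_annotations, "w") as f:
        json.dump(coco, f)

if __name__ == "__main__":
    convert_to_coco('path/to/your_annotations.json', 'dataset/annotations/instances_train.json')
```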
## 3. Usage
#### COCO2017

Replace `${model}` below with one of the model sizes from the model zoo (`n`, `s`, `m`, `l`, `x`).

1. Training

   ```shell
   CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0
   ```

2. Testing

   ```shell
   CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --test-only -r model.pth
   ```

3. Tuning

   ```shell
   CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0 -t model.pth
   ```
#### Customizing Batch Size

For example, to double the total batch size when training D-FINE-L on COCO2017, follow these steps:

1. **Modify your [dataloader.yml](./configs/base/dataloader.yml)** to increase the `total_batch_size`:

   ```yaml
   train_dataloader:
     total_batch_size: 64  # previously 32, now doubled
   ```

2. **Modify your [deim_hgnetv2_l_coco.yml](./configs/deim_dfine/deim_hgnetv2_l_coco.yml)**. Here is how the key parameters should be adjusted (the arithmetic behind these comments is sketched after this list):

   ```yaml
   optimizer:
     type: AdamW
     params:
       - params: '^(?=.*backbone)(?!.*norm|bn).*$'
         lr: 0.000025  # doubled, linear scaling law
       - params: '^(?=.*(?:encoder|decoder))(?=.*(?:norm|bn)).*$'
         weight_decay: 0.
     lr: 0.0005  # doubled, linear scaling law
     betas: [0.9, 0.999]
     weight_decay: 0.0001  # needs a grid search

   ema:  # added EMA settings
     decay: 0.9998  # adjusted by 1 - (1 - decay) * 2
     warmups: 500  # halved

   lr_warmup_scheduler:
     warmup_duration: 250  # halved
   ```
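All the "doubled" and "halved" comments above derive from the same batch-size ratio. Here is a small illustrative sketch of that arithmetic (my own snippet, not a repo utility; the base values are inferred from those comments):

```python
# How the hyperparameters above scale when the total batch size changes
# by a factor k (here k = 64 / 32 = 2).
def scale_hyperparams(lr, ema_decay, ema_warmups, warmup_duration, k):
    return {
        "lr": lr * k,                          # linear scaling law
        "ema_decay": 1 - (1 - ema_decay) * k,  # keep the EMA horizon fixed in samples
        "ema_warmups": ema_warmups / k,        # fewer steps per epoch -> shorter warmup
        "warmup_duration": warmup_duration / k,
    }

print(scale_hyperparams(lr=0.00025, ema_decay=0.9999, ema_warmups=1000,
                        warmup_duration=500, k=2))
# -> lr 0.0005, ema decay 0.9998, ema warmups 500, warmup duration 250
```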
#### Customizing Input Size

If you'd like to train **DEIM** on COCO2017 with an input size of 320x320, follow these steps:

1. **Modify your [dataloader.yml](./configs/base/dataloader.yml)** so that both the training and validation pipelines resize to 320x320:

   ```yaml
   train_dataloader:
     dataset:
       transforms:
         ops:
           - {type: Resize, size: [320, 320], }
     collate_fn:
       base_size: 320

   val_dataloader:
     dataset:
       transforms:
         ops:
           - {type: Resize, size: [320, 320], }
   ```

2. **Modify your [dfine_hgnetv2.yml](./configs/base/dfine_hgnetv2.yml)**:

   ```yaml
   eval_spatial_size: [320, 320]
   ```
## 4. Tools
#### Deployment

1. Setup

   ```shell
   pip install onnx onnxsim
   ```

2. Export ONNX (a standalone validation sketch follows this list)

   ```shell
   python tools/deployment/export_onnx.py --check -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
   ```

3. Export [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html)

   ```shell
   trtexec --onnx="model.onnx" --saveEngine="model.engine" --fp16
   ```
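Before building the TensorRT engine, it can be worth validating the exported graph. The `--check` flag of `export_onnx.py` presumably covers this already; here is a small standalone sketch of my own using the `onnx` checker:

```python
import onnx

# Load the exported graph and run ONNX's structural validity checks.
model = onnx.load("model.onnx")
onnx.checker.check_model(model)

# Print the declared inputs/outputs to confirm tensor names before
# passing the file to trtexec.
for tensor in model.graph.input:
    print("input:", tensor.name)
for tensor in model.graph.output:
    print("output:", tensor.name)
```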
#### Inference (Visualization)

1. Setup

   ```shell
   pip install -r tools/inference/requirements.txt
   ```

2. Inference (onnxruntime / tensorrt / torch)

   Inference on images and videos is supported.

   ```shell
   python tools/inference/onnx_inf.py --onnx model.onnx --input image.jpg  # or video.mp4
   python tools/inference/trt_inf.py --trt model.engine --input image.jpg
   python tools/inference/torch_inf.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth --input image.jpg --device cuda:0
   ```
#### Benchmark

1. Setup

   ```shell
   pip install -r tools/benchmark/requirements.txt
   ```

2. Model FLOPs, MACs, and Params

   ```shell
   python tools/benchmark/get_info.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml
   ```

3. TensorRT Latency

   ```shell
   python tools/benchmark/trt_benchmark.py --COCO_dir path/to/COCO2017 --engine_dir model.engine
   ```
#### FiftyOne Visualization

1. Setup

   ```shell
   pip install fiftyone
   ```

2. Voxel51 FiftyOne visualization ([fiftyone](https://github.com/voxel51/fiftyone))

   ```shell
   python tools/visualization/fiftyone_vis.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
   ```
#### Others

1. Auto Resume Training

   ```shell
   bash reference/safe_training.sh
   ```

2. Converting Model Weights

   ```shell
   python reference/convert_weight.py model.pth
   ```
## 5. Citation

If you use `DEIM` or its methods in your work, please cite the following BibTeX entry:
```bibtex
@misc{huang2024deim,
      title={DEIM: DETR with Improved Matching for Fast Convergence},
      author={Shihua Huang and Zhichao Lu and Xiaodong Cun and Yongjun Yu and Xiao Zhou and Xi Shen},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year={2025},
}
```
## 6. Acknowledgement

Our work is built upon [D-FINE](https://github.com/Peterande/D-FINE) and [RT-DETR](https://github.com/lyuwenyu/RT-DETR).

✨ Feel free to contribute and reach out if you have any questions! ✨