# DEIM
**Repository Path**: tjc4814/DEIM
## Basic Information
- **Project Name**: DEIM
- **Description**: Reference: https://blog.csdn.net/csdn_xmj/article/details/144686465
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-03-08
- **Last Updated**: 2025-03-08
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
DEIM: DETR with Improved Matching for Fast Convergence
DEIM is an advanced training framework designed to enhance the matching mechanism in DETRs, enabling faster convergence and improved accuracy. It serves as a robust foundation for future research and applications in the field of real-time object detection.
---
1. Intellindust AI Lab 2. City University of Hong Kong 3. Great Bay University 4. Hefei Normal University
**📧 Corresponding author:** shenxiluc@gmail.com
If you like our work, please give us a ⭐!
## 🔥 Updates
- [x] **\[2025.03.05\]** The Nano DEIM model is released.
- [x] **\[2025.02.27\]** The DEIM paper is accepted to CVPR 2025. Thanks to all co-authors.
- [x] **\[2024.12.26\]** A more efficient implementation of Dense O2O, achieving nearly a 30% improvement in loading speed (See [the pull request](https://github.com/ShihuaHuang95/DEIM/pull/13) for more details). Huge thanks to my colleague [Longfei Liu](https://github.com/capsule2077).
- [x] **\[2024.12.03\]** Release the DEIM series. In addition, this repo supports re-implementations of [D-FINE](https://arxiv.org/abs/2410.13842) and [RT-DETR](https://arxiv.org/abs/2407.17140).
## Table of Content
* [1. Model Zoo](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#1-model-zoo)
* [2. Quick start](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#2-quick-start)
* [3. Usage](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#3-usage)
* [4. Tools](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#4-tools)
* [5. Citation](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#5-citation)
* [6. Acknowledgement](https://github.com/ShihuaHuang95/DEIM?tab=readme-ov-file#6-acknowledgement)
## 1. Model Zoo
### DEIM-D-FINE
| Model | Dataset | AP (D-FINE) | AP (DEIM) | #Params | Latency | GFLOPs | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **N** | COCO | **42.8** | **43.0** | 4M | 2.12ms | 7 | [yml](./configs/deim_dfine/deim_hgnetv2_n_coco.yml) | [ckpt](https://drive.google.com/file/d/1ZPEhiU9nhW4M5jLnYOFwTSLQC1Ugf62e/view?usp=sharing) |
| **S** | COCO | **48.7** | **49.0** | 10M | 3.49ms | 25 | [yml](./configs/deim_dfine/deim_hgnetv2_s_coco.yml) | [ckpt](https://drive.google.com/file/d/1tB8gVJNrfb6dhFvoHJECKOF5VpkthhfC/view?usp=drive_link) |
| **M** | COCO | **52.3** | **52.7** | 19M | 5.62ms | 57 | [yml](./configs/deim_dfine/deim_hgnetv2_m_coco.yml) | [ckpt](https://drive.google.com/file/d/18Lj2a6UN6k_n_UzqnJyiaiLGpDzQQit8/view?usp=drive_link) |
| **L** | COCO | **54.0** | **54.7** | 31M | 8.07ms | 91 | [yml](./configs/deim_dfine/deim_hgnetv2_l_coco.yml) | [ckpt](https://drive.google.com/file/d/1PIRf02XkrA2xAD3wEiKE2FaamZgSGTAr/view?usp=drive_link) |
| **X** | COCO | **55.8** | **56.5** | 62M | 12.89ms | 202 | [yml](./configs/deim_dfine/deim_hgnetv2_x_coco.yml) | [ckpt](https://drive.google.com/file/d/1dPtbgtGgq1Oa7k_LgH1GXPelg1IVeu0j/view?usp=drive_link) |
### DEIM-RT-DETRv2
| Model | Dataset | AP (RT-DETRv2) | AP (DEIM) | #Params | Latency | GFLOPs | Config | Checkpoint |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **S** | COCO | **47.9** | **49.0** | 20M | 4.59ms | 60 | [yml](./configs/deim_rtdetrv2/deim_r18vd_120e_coco.yml) | [ckpt](https://drive.google.com/file/d/153_JKff6EpFgiLKaqkJsoDcLal_0ux_F/view?usp=drive_link) |
| **M** | COCO | **49.9** | **50.9** | 31M | 6.40ms | 92 | [yml](./configs/deim_rtdetrv2/deim_r34vd_120e_coco.yml) | [ckpt](https://drive.google.com/file/d/1O9RjZF6kdFWGv1Etn1Toml4r-YfdMDMM/view?usp=drive_link) |
| **M*** | COCO | **51.9** | **53.2** | 33M | 6.90ms | 100 | [yml](./configs/deim_rtdetrv2/deim_r50vd_m_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/10dLuqdBZ6H5ip9BbBiE6S7ZcmHkRbD0E/view?usp=drive_link) |
| **L** | COCO | **53.4** | **54.3** | 42M | 9.15ms | 136 | [yml](./configs/deim_rtdetrv2/deim_r50vd_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/1mWknAXD5JYknUQ94WCEvPfXz13jcNOTI/view?usp=drive_link) |
| **X** | COCO | **54.3** | **55.5** | 76M | 13.66ms | 259 | [yml](./configs/deim_rtdetrv2/deim_r101vd_60e_coco.yml) | [ckpt](https://drive.google.com/file/d/1BIevZijOcBO17llTyDX32F_pYppBfnzu/view?usp=drive_link) |
## 2. Quick start
### Setup
```shell
conda create -n deim python=3.11.9
conda activate deim
pip install -r requirements.txt
```
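Before preparing data, it can help to confirm that the environment actually sees your GPUs. The short check below only assumes that `requirements.txt` installs PyTorch; adjust if your setup differs.
```python
# Quick environment sanity check (assumes PyTorch was installed via requirements.txt).
import sys
import torch

print(f"Python  : {sys.version.split()[0]}")          # expected 3.11.x for this env
print(f"PyTorch : {torch.__version__}")
print(f"CUDA OK : {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPUs    : {torch.cuda.device_count()}")   # should match CUDA_VISIBLE_DEVICES later
```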
### Data Preparation
#### COCO2017 Dataset
1. Download COCO2017 from [OpenDataLab](https://opendatalab.com/OpenDataLab/COCO_2017) or [COCO](https://cocodataset.org/#download).
1. Modify paths in [coco_detection.yml](./configs/dataset/coco_detection.yml)
```yaml
train_dataloader:
  img_folder: /data/COCO2017/train2017/
  ann_file: /data/COCO2017/annotations/instances_train2017.json
val_dataloader:
  img_folder: /data/COCO2017/val2017/
  ann_file: /data/COCO2017/annotations/instances_val2017.json
```
#### Custom Dataset
To train on your custom dataset, you need to organize it in the COCO format. Follow the steps below to prepare your dataset:
1. **Set `remap_mscoco_category` to `False`:**
This prevents the automatic remapping of category IDs to match the MSCOCO categories.
```yaml
remap_mscoco_category: False
```
2. **Organize Images:**
Structure your dataset directories as follows:
```shell
dataset/
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── val/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── annotations/
    ├── instances_train.json
    ├── instances_val.json
    └── ...
```
- **`images/train/`**: Contains all training images.
- **`images/val/`**: Contains all validation images.
- **`annotations/`**: Contains COCO-formatted annotation files.
3. **Convert Annotations to COCO Format:**
If your annotations are not already in COCO format, you will need to convert them. You can use the Python stub below as a starting point or rely on existing tools; a hedged sketch of the expected JSON layout follows after this list:
```python
import json

def convert_to_coco(input_annotations, output_annotations):
    # Implement conversion logic here
    pass

if __name__ == "__main__":
    convert_to_coco('path/to/your_annotations.json', 'dataset/annotations/instances_train.json')
```
4. **Update Configuration Files:**
Modify your [custom_detection.yml](./configs/dataset/custom_detection.yml).
```yaml
task: detection

evaluator:
  type: CocoEvaluator
  iou_types: ['bbox', ]

num_classes: 777  # your dataset classes
remap_mscoco_category: False

train_dataloader:
  type: DataLoader
  dataset:
    type: CocoDetection
    img_folder: /data/yourdataset/train
    ann_file: /data/yourdataset/train/train.json
    return_masks: False
    transforms:
      type: Compose
      ops: ~
  shuffle: True
  num_workers: 4
  drop_last: True
  collate_fn:
    type: BatchImageCollateFunction

val_dataloader:
  type: DataLoader
  dataset:
    type: CocoDetection
    img_folder: /data/yourdataset/val
    ann_file: /data/yourdataset/val/ann.json
    return_masks: False
    transforms:
      type: Compose
      ops: ~
  shuffle: False
  num_workers: 4
  drop_last: False
  collate_fn:
    type: BatchImageCollateFunction
```
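If you are writing your own converter, the sketch below shows a minimal COCO-style JSON (hypothetical file names, IDs, and boxes) and then verifies it with `pycocotools`; it is only an illustration of the expected layout, not the repo's own tooling.
```python
# Hypothetical minimal COCO detection file, then a pycocotools round-trip check.
import json
from pathlib import Path
from pycocotools.coco import COCO

coco_dict = {
    "images": [
        {"id": 1, "file_name": "image1.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in absolute pixels; area and iscrowd are required.
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100.0, 120.0, 50.0, 80.0], "area": 4000.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "your_class", "supercategory": "none"},
    ],
}

out_path = Path("dataset/annotations/instances_train.json")
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_text(json.dumps(coco_dict))

# If this parses and the counts look right, the file should load as a CocoDetection dataset.
api = COCO(str(out_path))
print(len(api.getImgIds()), "images,", len(api.getAnnIds()), "annotations")
```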
## 3. Usage
### COCO2017
1. Training
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0
```
2. Testing
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --test-only -r model.pth
```
3. Tuning
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0 -t model.pth
```
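When resuming (`-r`) or fine-tuning (`-t`) from `model.pth`, it can be useful to peek at what the checkpoint actually contains. The snippet below makes no assumption about key names; it simply lists whatever is stored.
```python
# Inspect a checkpoint before passing it to -r / -t (key names vary by repo version).
import torch

ckpt = torch.load("model.pth", map_location="cpu")
if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        detail = len(value) if hasattr(value, "__len__") else value
        print(f"{key}: {type(value).__name__} ({detail})")
else:
    print(type(ckpt).__name__)
```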
### Customizing Batch Size
For example, if you want to double the total batch size when training D-FINE-L on COCO2017, here are the steps you should follow:
1. **Modify your [dataloader.yml](./configs/base/dataloader.yml)** to increase the `total_batch_size`:
```yaml
train_dataloader:
  total_batch_size: 64  # Previously it was 32, now doubled
```
2. **Modify your [deim_hgnetv2_l_coco.yml](./configs/deim_dfine/deim_hgnetv2_l_coco.yml)**. Hereβs how the key parameters should be adjusted:
```yaml
optimizer:
  type: AdamW
  params:
    -
      params: '^(?=.*backbone)(?!.*norm|bn).*$'
      lr: 0.000025  # doubled, linear scaling law
    -
      params: '^(?=.*(?:encoder|decoder))(?=.*(?:norm|bn)).*$'
      weight_decay: 0.

  lr: 0.0005  # doubled, linear scaling law
  betas: [0.9, 0.999]
  weight_decay: 0.0001  # need a grid search

ema:  # added EMA settings
  decay: 0.9998  # adjusted by 1 - (1 - decay) * 2
  warmups: 500  # halved

lr_warmup_scheduler:
  warmup_duration: 250  # halved
```
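The numbers in the example above follow directly from the linear scaling law and the EMA adjustment noted in the comments. A small helper (base values inferred from the example; verify against the config you actually start from) makes the arithmetic explicit:
```python
# Worked arithmetic for doubling the total batch size from 32 to 64.
def scale_lr(base_lr: float, base_bs: int, new_bs: int) -> float:
    """Linear scaling law: learning rate grows in proportion to total batch size."""
    return base_lr * new_bs / base_bs

def scale_ema_decay(base_decay: float, bs_ratio: float) -> float:
    """Keep the EMA averaging window roughly constant: 1 - (1 - decay) * ratio."""
    return 1 - (1 - base_decay) * bs_ratio

print(scale_lr(0.0000125, 32, 64))        # 2.5e-05  -> backbone lr above
print(scale_lr(0.00025, 32, 64))          # 0.0005   -> global lr above
print(scale_ema_decay(0.9999, 64 / 32))   # 0.9998   -> ema decay above
```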
### Customizing Input Size
If you'd like to train **DEIM** on COCO2017 with an input size of 320x320, follow these steps:
1. **Modify your [dataloader.yml](./configs/base/dataloader.yml)**:
```yaml
train_dataloader:
  dataset:
    transforms:
      ops:
        - {type: Resize, size: [320, 320], }
  collate_fn:
    base_size: 320

val_dataloader:
  dataset:
    transforms:
      ops:
        - {type: Resize, size: [320, 320], }
```
2. **Modify your [dfine_hgnetv2.yml](./configs/base/dfine_hgnetv2.yml)**:
```yaml
eval_spatial_size: [320, 320]
```
## 4. Tools
### Deployment
1. Setup
```shell
pip install onnx onnxsim
```
2. Export onnx
```shell
python tools/deployment/export_onnx.py --check -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
```
3. Export [tensorrt](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html)
```shell
trtexec --onnx="model.onnx" --saveEngine="model.engine" --fp16
```
### Inference (Visualization)
1. Setup
```shell
pip install -r tools/inference/requirements.txt
```
2. Inference (onnxruntime / tensorrt / torch)
Inference on images and videos is now supported.
```shell
python tools/inference/onnx_inf.py --onnx model.onnx --input image.jpg # video.mp4
python tools/inference/trt_inf.py --trt model.engine --input image.jpg
python tools/inference/torch_inf.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth --input image.jpg --device cuda:0
```
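If you want to call the exported ONNX model directly rather than through `tools/inference/onnx_inf.py`, the rough sketch below shows one way with `onnxruntime`. Input names, resolution, and preprocessing are assumptions here; inspect `get_inputs()` and treat the bundled script as the reference.
```python
# Rough ONNX Runtime sketch (input names/preprocessing are assumptions, not the repo's spec).
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
for inp in sess.get_inputs():
    print(inp.name, inp.shape, inp.type)   # see what the export actually expects

# Assumed preprocessing: resize to 640x640, scale to [0, 1], NCHW layout.
img = Image.open("image.jpg").convert("RGB").resize((640, 640))
tensor = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0
orig_size = np.array([[640, 640]], dtype=np.int64)

# DETR-style exports often take the image plus its original size; adapt the feed
# dict keys to whatever get_inputs() printed above.
outputs = sess.run(None, {"images": tensor, "orig_target_sizes": orig_size})
print([o.shape for o in outputs])
```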
### Benchmark
1. Setup
```shell
pip install -r tools/benchmark/requirements.txt
```
2. Model FLOPs, MACs, and Params
```shell
python tools/benchmark/get_info.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml
```
3. TensorRT Latency
```shell
python tools/benchmark/trt_benchmark.py --COCO_dir path/to/COCO2017 --engine_dir model.engine
```
### FiftyOne Visualization
1. Setup
```shell
pip install fiftyone
```
2. Voxel51 FiftyOne Visualization ([fiftyone](https://github.com/voxel51/fiftyone))
```shell
python tools/visualization/fiftyone_vis.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
```
### Others
1. Auto Resume Training
```shell
bash reference/safe_training.sh
```
2. Converting Model Weights
```shell
python reference/convert_weight.py model.pth
```
## 5. Citation
If you use `DEIM` or its methods in your work, please cite the following BibTeX entry:
```bibtex
@misc{huang2024deim,
title={DEIM: DETR with Improved Matching for Fast Convergence},
author={Shihua Huang and Zhichao Lu and Xiaodong Cun and Yongjun Yu and Xiao Zhou and Xi Shen},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025},
}
```
## 6. Acknowledgement
Our work is built upon [D-FINE](https://github.com/Peterande/D-FINE) and [RT-DETR](https://github.com/lyuwenyu/RT-DETR).
β¨ Feel free to contribute and reach out if you have any questions! β¨