# SuperYOLO
**Repository Path**: PolarisF/SuperYOLO
## Basic Information
- **Project Name**: SuperYOLO
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-03-23
- **Last Updated**: 2024-04-25
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery
⭐ This code has been completely released ⭐
⭐ Our [article](https://ieeexplore.ieee.org/abstract/document/10075555) is available ⭐
⭐ We have also finished follow-up work on **quantization** based on SuperYOLO:
[Guided Hybrid Quantization for Object Detection in Multimodal Remote Sensing Imagery via One-to-one Self-teaching](https://github.com/icey-zhang/GHOST)! ⭐
If our code is helpful to you, please cite:
```
@ARTICLE{10075555,
author={Zhang, Jiaqing and Lei, Jie and Xie, Weiying and Fang, Zhenman and Li, Yunsong and Du, Qian},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery},
year={2023},
volume={61},
number={},
pages={1-15},
doi={10.1109/TGRS.2023.3258666}}
@article{zhang2023guided,
title={Guided Hybrid Quantization for Object Detection in Remote Sensing Imagery via One-to-one Self-teaching},
author={Zhang, Jiaqing and Lei, Jie and Xie, Weiying and Li, Yunsong and Yang, Geng and Jia, Xiuping},
journal={IEEE Transactions on Geoscience and Remote Sensing},
year={2023},
publisher={IEEE}
}
```
## Requirements
```bash
pip install -r requirements.txt
pip install numba
pip install timm
```
## Train
### 1. Prepare training data
- 1.1 To drive the SR-assisted branch, the network's input images are downsampled from 1024 x 1024 to 512 x 512 during training. At test time, the input size is 512 x 512, consistent with the inputs used by the compared algorithms.
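The downsampling step can be sketched as simple 2x average pooling. This is an illustrative stand-in (the repository's actual resize may use a different interpolation); the point is that the low-resolution tensor feeds the detector while the original high-resolution image remains available as supervision for the SR branch.

```python
import numpy as np

def downsample_2x(img: np.ndarray) -> np.ndarray:
    """Average-pool an (H, W, C) image by a factor of 2 in each spatial dim."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

# A toy 1024 x 1024 sample stands in for a VEDAI training image
# (RGB + IR stacked pixel-wise into 4 channels).
hr = np.random.rand(1024, 1024, 4)   # kept as the SR supervision target
lr = downsample_2x(hr)               # 512 x 512 input to the detector
print(lr.shape)                      # (512, 512, 4)
```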
- 1.2 Download the VEDAI data used in our experiments from [baiduyun](https://pan.baidu.com/s/1L0SWi5AQA6ZK9jDIWRY7Fg) (code: hvi4) or [Google Drive](https://drive.google.com/file/d/1Fz0VVlBS924pM3RQvcTsD_qaGjxzIv3y/view?usp=sharing). The dataset directory should be organized as follows:
```
SuperYOLO
├── dataset
│ ├── VEDAI
│ │ ├── images
│ │ ├── labels
│ │ ├── fold01.txt
│ │ ├── fold01test.txt
│ │ ├── fold02.txt
│ │ ├── .....
│ ├── VEDAI_1024
│ │ ├── images
│ │ ├── labels
```
- 1.3 Note that we transform the dataset labels into horizontal boxes with the [transform code](data/transform.py). You should run `transform.py` before training the model: change **PATH = './dataset/'** and then run the script.
(Note: it seems the labels have already been transformed?)
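The label transform can be sketched as follows: VEDAI annotations describe oriented boxes by their corner points, and a horizontal box is just the axis-aligned bounding box of those corners, then normalized into YOLO `(cx, cy, w, h)` format. This is a minimal sketch of the idea, not the actual code in `data/transform.py`.

```python
def oriented_to_horizontal(corners):
    """Four (x, y) corner points of an oriented box ->
    axis-aligned (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return min(xs), min(ys), max(xs), max(ys)

def to_yolo(box, img_w, img_h):
    """Axis-aligned box -> normalized YOLO (cx, cy, w, h)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
            (x1 - x0) / img_w, (y1 - y0) / img_h)

corners = [(100, 50), (150, 60), (140, 110), (90, 100)]
hbox = oriented_to_horizontal(corners)
print(hbox)                    # (90, 50, 150, 110)
print(to_yolo(hbox, 512, 512))
```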
TODO
- [x] Understand the meaning of `--ch`: it is the number of input channels of the very first module. With pixel-level RGB+IR fusion it is 4; with feature-level fusion it may be something like 64 or 128.
  Because different feature-fusion schemes need to be tested, each fusion method is wrapped up as the first module (MF), so only the number of output feature channels of that first module has to be specified manually.
  Without the MF structure, i.e. plain RGB or pixel-level RGB+IR fusion, `ch` is the number of input feature channels of the first module.
  In the original YOLOv5, `ch=3` is a fixed default parameter.
- [ ] What is the difference between `ch_steam` and `ch`?
- [x] `SRyolo_noFocus_small.yaml` contains only one detection head, corresponding to P3.
- [x] Why is 1024 used during training (with `--super`; 512 without it), and how does super resolution provide supervision?
  The 1024 input is downsampled to 512 by the pipeline itself before training, and the original 1024 input serves as the supervision signal for the SR module.
- [x] Why does the new MF method use `--ch 64`? The model structure diagram shows that the module named `models.common.MF` takes the argument `[3]` and does not specify the number of output channels.
  (Same reason as the first item.)
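The channel-count reasoning in the notes above can be checked numerically. Pixel-level fusion just stacks raw channels; feature-level fusion runs each input through a learned module first. Here `mf_stub` is a hypothetical stand-in, not the real `models.common.MF` module.

```python
import numpy as np

# Pixel-level fusion: stack raw RGB (3) and IR (1) channels, so the first
# convolution sees ch = 4 -> matches `--ch 4` for `--input_mode RGB+IR`.
rgb = np.zeros((512, 512, 3))
ir = np.zeros((512, 512, 1))
pixel_fused = np.concatenate([rgb, ir], axis=-1)
print(pixel_fused.shape[-1])  # 4

# Feature-level fusion: a learned module (MF) first projects the inputs to a
# feature map, so the first backbone block sees, e.g., ch = 64
# -> matches `--ch 64` for `--input_mode RGB+IR+MF`.
def mf_stub(x, out_ch=64):
    """Hypothetical stand-in for models.common.MF: a toy 1x1 projection."""
    w = np.zeros((x.shape[-1], out_ch))
    return x @ w

feat_fused = mf_stub(pixel_fused)
print(feat_fused.shape[-1])   # 64
```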
### 2. Begin to train on multimodal images
```bash
# started 2024-03-29 13:10:24, PID 375159
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 \
--hr_input --data data/SRvedai.yaml --name SRyolo_noFocus_small \
--ch 4 --input_mode RGB+IR --device 0 --batch-size 16 \
> logs/SRyolo_noFocus_small_RGB+IR_32.log 2>&1 \
& tail -f logs/SRyolo_noFocus_small_RGB+IR_32.log
```
Train with the new fusion method (MF):
```bash
nohup python train.py --cfg models/SRyolo_MF.yaml \
--super --train_img_size 1024 \
--hr_input --data data/SRvedai.yaml --name SRyolo_MF \
--ch 64 --input_mode RGB+IR+MF --device 1 --batch-size 16 \
> logs/SRyolo_MF_RGB+IR+MF.log 2>&1 \
& tail -f logs/SRyolo_MF_RGB+IR+MF.log
```
### 3. Begin to train on RGB or IR images
```bash
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 --name SRyolo_RGB \
--device 1 --batch-size 16 \
--hr_input --data data/SRvedai.yaml --ch 3 --input_mode RGB \
> logs/SRyolo_RGB.log 2>&1 \
& tail -f logs/SRyolo_RGB.log
```
```bash
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 --name SRyolo_IR \
--device 0 --batch-size 16 \
--hr_input --data data/SRvedai.yaml --ch 3 --input_mode IR \
> logs/SRyolo_IR.log 2>&1 \
& tail -f logs/SRyolo_IR.log
```
### 4. Begin to train on multimodal images without the SR branch
```bash
# training started 2024-04-01 14:55:00
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--train_img_size 512 --data data/SRvedai.yaml \
--ch 4 --input_mode RGB+IR \
--device 0 --batch-size 16 \
> logs/SRyolo_RGB+IR_noSuper.log 2>&1 \
& tail -f logs/SRyolo_RGB+IR_noSuper.log
```
Train with the new fusion method (MF):
```bash
nohup python train.py --cfg models/SRyolo_MF.yaml \
--train_img_size 512 --data data/SRvedai.yaml \
--ch 64 --input_mode RGB+IR+MF \
--device 1 --batch-size 16 \
> logs/SRyolo_RGB+IR+MF_noSuper.log 2>&1 \
& tail -f logs/SRyolo_RGB+IR+MF_noSuper.log
```
### 5. Begin to train on RGB or IR images without the SR branch
```bash
python train.py --cfg models/SRyolo_noFocus_small.yaml --train_img_size 512 --data data/SRvedai.yaml --ch 3 --input_mode RGB
```
```bash
python train.py --cfg models/SRyolo_noFocus_small.yaml --train_img_size 512 --data data/SRvedai.yaml --ch 3 --input_mode IR
```
## Test
### 1. Pretrained Checkpoints
You can use our pretrained checkpoints for testing.
Download a pretrained model and put it in the [weights directory](https://github.com/icey-zhang/SuperYOLO/tree/main/weights).
### 2. Begin to test
```bash
python test.py --weights runs/train/exp/best.pt --input_mode RGB+IR+MF
```
## Results
| Method | Modality | **Car** | **Pickup** | **Camping** | **Truck** | **Other** | **Tractor** | **Boat** | **Van** | **mAP50** | **Params.** $\downarrow$ | **GFLOPs** $\downarrow$ |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| **YOLOv3** | IR | 80.21 | 67.03 | 65.55 | 47.78 | 25.86 | 40.11 | 32.67 | 53.33 | 51.54 | **61.5351M** | 49.55 |
| **YOLOv3** | RGB | 83.06 | 71.54 | **69.14** | 59.30 | **48.93** | **67.34** | 33.48 | 55.67 | 61.06 | **61.5351M** | 49.55 |
| **YOLOv3** | Multi | **84.57** | **72.68** | 67.13 | **61.96** | 43.04 | 65.24 | **37.10** | **58.29** | **61.26** | 61.5354M | 49.68 |
| **YOLOv4** | IR | 80.45 | 67.88 | 68.84 | 53.66 | 30.02 | 44.23 | 25.40 | 51.41 | 52.75 | **52.5082M** | 38.16 |
| **YOLOv4** | RGB | 83.73 | **73.43** | 71.17 | 59.09 | **51.66** | 65.86 | **34.28** | **60.32** | 62.43 | **52.5082M** | 38.16 |
| **YOLOv4** | Multi | **85.46** | 72.84 | **72.38** | **62.82** | 48.94 | **68.99** | **34.28** | 54.66 | **62.55** | 52.5085M | 38.23 |
| **YOLOv5s** | IR | 77.31 | 65.27 | 66.47 | 51.56 | 25.87 | 42.36 | 21.88 | 48.88 | 49.94 | **7.0728M** | 5.24 |
| **YOLOv5s** | RGB | 80.07 | 68.01 | 66.12 | 51.52 | 45.76 | **64.38** | 21.62 | 40.93 | 54.82 | **7.0728M** | 5.24 |
| **YOLOv5s** | Multi | 80.81 | **68.48** | **69.06** | **54.71** | **46.76** | 64.29 | **24.25** | **45.96** | **56.79** | 7.0739M | 5.32 |
| **YOLOv5m** | IR | 79.23 | 67.32 | 65.43 | 51.75 | 26.66 | 44.28 | 26.64 | 56.14 | 52.19 | **21.0659M** | 16.13 |
| **YOLOv5m** | RGB | 81.14 | 70.26 | 65.53 | 53.98 | **46.78** | **66.69** | **36.24** | 49.87 | 58.80 | **21.0659M** | 16.13 |
| **YOLOv5m** | Multi | **82.53** | **72.32** | **68.41** | **59.25** | 46.20 | 66.23 | 33.51 | **57.11** | **60.69** | 21.0677M | 16.24 |
| **YOLOv5l** | IR | 80.14 | 68.57 | 65.37 | 53.45 | 30.33 | 45.59 | 27.24 | **61.87** | 54.06 | **46.6383M** | 36.55 |
| **YOLOv5l** | RGB | 81.36 | 71.70 | 68.25 | 57.45 | 45.77 | **70.68** | 35.89 | 55.42 | 60.81 | **46.6383M** | 36.55 |
| **YOLOv5l** | Multi | **82.83** | **72.32** | **69.92** | **63.94** | **48.48** | 63.07 | **40.12** | 56.46 | **62.16** | 46.6046M | 36.70 |
| **YOLOv5x** | IR | 79.01 | 66.72 | 65.93 | 58.49 | 31.39 | 41.38 | 31.58 | 58.98 | 54.18 | **87.2458M** | 69.52 |
| **YOLOv5x** | RGB | 81.66 | 72.23 | 68.29 | 59.07 | 48.47 | 66.01 | **39.15** | **61.85** | 62.09 | **87.2458M** | 69.52 |
| **YOLOv5x** | Multi | **84.33** | **72.95** | **70.09** | **61.15** | **49.94** | **67.35** | 38.71 | 56.65 | **62.65** | 87.2487M | 69.71 |
| **SuperYOLO** | IR | 87.90 | 81.39 | 76.90 | 61.56 | 39.39 | 60.56 | 46.08 | 71.00 | 65.60 | **4.8256M** | 16.61 |
| **SuperYOLO** | RGB | 90.30 | 82.66 | 76.69 | 68.55 | 53.86 | 79.48 | 58.08 | 70.30 | 72.49 | **4.8256M** | 16.61 |
| **SuperYOLO** | Multi | **91.13** | **85.66** | **79.30** | **70.18** | **57.33** | **80.41** | **60.24** | **76.50** | **75.09** | 4.8451M | 17.98 |
## Time
- 2023.2.14: released `train.py`
- 2023.2.14: added the new fusion method (MF)
- 2023.2.16: updated `test.py` for visualization of detection results
## Visualization of results
## Acknowledgements
This code is built on [YOLOv5 (PyTorch)](https://github.com/ultralytics/yolov5). We thank the authors for sharing the codes.
## Licensing
Copyright (C) 2020 Jiaqing Zhang
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
## Contact
If you have any questions, please contact me by email (jq.zhangcn@foxmail.com).
