# SuperYOLO
**Repository Path**: PolarisF/SuperYOLO
## Basic Information
- **Project Name**: SuperYOLO
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-03-23
- **Last Updated**: 2024-04-25
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery
⭐ This code has been completely released ⭐
⭐ Our [article](https://ieeexplore.ieee.org/abstract/document/10075555) is available ⭐
⭐ We have also finished follow-up work on **quantization** based on SuperYOLO:
[Guided Hybrid Quantization for Object Detection in Multimodal Remote Sensing Imagery via One-to-one Self-teaching](https://github.com/icey-zhang/GHOST)! ⭐
If our code is helpful to you, please cite:
```
@ARTICLE{10075555,
author={Zhang, Jiaqing and Lei, Jie and Xie, Weiying and Fang, Zhenman and Li, Yunsong and Du, Qian},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery},
year={2023},
volume={61},
number={},
pages={1-15},
doi={10.1109/TGRS.2023.3258666}}
@article{zhang2023guided,
title={Guided Hybrid Quantization for Object Detection in Remote Sensing Imagery via One-to-one Self-teaching},
author={Zhang, Jiaqing and Lei, Jie and Xie, Weiying and Li, Yunsong and Yang, Geng and Jia, Xiuping},
journal={IEEE Transactions on Geoscience and Remote Sensing},
year={2023},
publisher={IEEE}
}
```
## Requirements
```bash
pip install -r requirements.txt
pip install numba
pip install timm
```
## Train
### 1. Prepare training data
- 1.1 To drive the SR-assisted branch, the network's input images are downsampled from 1024 x 1024 to 512 x 512 during training. At test time, the input size is 512 x 512, consistent with the inputs used by the compared algorithms.
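The downsampling step can be sketched as simple 2x average pooling. This is an illustrative stand-in (the repository's actual resize may use a different interpolation); the point is that the low-resolution tensor feeds the detector while the original high-resolution image remains available as supervision for the SR branch.

```python
import numpy as np

def downsample_2x(img: np.ndarray) -> np.ndarray:
    """Average-pool an (H, W, C) image by a factor of 2 in each spatial dim."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

# A toy 1024 x 1024 sample stands in for a VEDAI training image
# (RGB + IR stacked pixel-wise into 4 channels).
hr = np.random.rand(1024, 1024, 4)   # kept as the SR supervision target
lr = downsample_2x(hr)               # 512 x 512 input to the detector
print(lr.shape)                      # (512, 512, 4)
```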
- 1.2 Download the VEDAI data used in our experiments from [baiduyun](https://pan.baidu.com/s/1L0SWi5AQA6ZK9jDIWRY7Fg) (code: hvi4) or [Google Drive](https://drive.google.com/file/d/1Fz0VVlBS924pM3RQvcTsD_qaGjxzIv3y/view?usp=sharing). The dataset directory should be organized as follows:
```
SuperYOLO
├── dataset
│ ├── VEDAI
│ │ ├── images
│ │ ├── labels
│ │ ├── fold01.txt
│ │ ├── fold01test.txt
│ │ ├── fold02.txt
│ │ ├── .....
│ ├── VEDAI_1024
│ │ ├── images
│ │ ├── labels
```
- 1.3 Note that we transform the dataset labels into horizontal boxes with the [transform code](data/transform.py). You should run `transform.py` before training the model: change **PATH = './dataset/'** and then run the script.
(Note: it seems the labels have already been transformed?)
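The label transform can be sketched as follows: VEDAI annotations describe oriented boxes by their corner points, and a horizontal box is just the axis-aligned bounding box of those corners, then normalized into YOLO `(cx, cy, w, h)` format. This is a minimal sketch of the idea, not the actual code in `data/transform.py`.

```python
def oriented_to_horizontal(corners):
    """Four (x, y) corner points of an oriented box ->
    axis-aligned (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return min(xs), min(ys), max(xs), max(ys)

def to_yolo(box, img_w, img_h):
    """Axis-aligned box -> normalized YOLO (cx, cy, w, h)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
            (x1 - x0) / img_w, (y1 - y0) / img_h)

corners = [(100, 50), (150, 60), (140, 110), (90, 100)]
hbox = oriented_to_horizontal(corners)
print(hbox)                    # (90, 50, 150, 110)
print(to_yolo(hbox, 512, 512))
```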
TODO
- [x] Understand the meaning of `--ch`: it is the number of input channels of the very first module. With pixel-level RGB+IR fusion it is 4; with feature-level fusion it may be something like 64 or 128.
  Because different feature-fusion schemes need to be tested, each fusion method is wrapped up as the first module (MF), so only the number of output feature channels of that first module has to be specified manually.
  Without the MF structure, i.e. plain RGB or pixel-level RGB+IR fusion, `ch` is the number of input feature channels of the first module.
  In the original YOLOv5, `ch=3` is a fixed default parameter.
- [ ] What is the difference between `ch_steam` and `ch`?
- [x] `SRyolo_noFocus_small.yaml` contains only one detection head, corresponding to P3.
- [x] Why is 1024 used during training (with `--super`; 512 without it), and how does super resolution provide supervision?
  The 1024 input is downsampled to 512 by the pipeline itself before training, and the original 1024 input serves as the supervision signal for the SR module.
- [x] Why does the new MF method use `--ch 64`? The model structure diagram shows that the module named `models.common.MF` takes the argument `[3]` and does not specify the number of output channels.
  (Same reason as the first item.)
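The channel-count reasoning in the notes above can be checked numerically. Pixel-level fusion just stacks raw channels; feature-level fusion runs each input through a learned module first. Here `mf_stub` is a hypothetical stand-in, not the real `models.common.MF` module.

```python
import numpy as np

# Pixel-level fusion: stack raw RGB (3) and IR (1) channels, so the first
# convolution sees ch = 4 -> matches `--ch 4` for `--input_mode RGB+IR`.
rgb = np.zeros((512, 512, 3))
ir = np.zeros((512, 512, 1))
pixel_fused = np.concatenate([rgb, ir], axis=-1)
print(pixel_fused.shape[-1])  # 4

# Feature-level fusion: a learned module (MF) first projects the inputs to a
# feature map, so the first backbone block sees, e.g., ch = 64
# -> matches `--ch 64` for `--input_mode RGB+IR+MF`.
def mf_stub(x, out_ch=64):
    """Hypothetical stand-in for models.common.MF: a toy 1x1 projection."""
    w = np.zeros((x.shape[-1], out_ch))
    return x @ w

feat_fused = mf_stub(pixel_fused)
print(feat_fused.shape[-1])   # 64
```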
### 2. Begin to train on multimodal images
```bash
# started 2024-03-29 13:10:24, PID 375159
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 \
--hr_input --data data/SRvedai.yaml --name SRyolo_noFocus_small \
--ch 4 --input_mode RGB+IR --device 0 --batch-size 16 \
> logs/SRyolo_noFocus_small_RGB+IR_32.log 2>&1 \
& tail -f logs/SRyolo_noFocus_small_RGB+IR_32.log
```
Train with the new fusion method (MF):
```bash
nohup python train.py --cfg models/SRyolo_MF.yaml \
--super --train_img_size 1024 \
--hr_input --data data/SRvedai.yaml --name SRyolo_MF \
--ch 64 --input_mode RGB+IR+MF --device 1 --batch-size 16 \
> logs/SRyolo_MF_RGB+IR+MF.log 2>&1 \
& tail -f logs/SRyolo_MF_RGB+IR+MF.log
```
### 3. Begin to train on RGB or IR images
```bash
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 --name SRyolo_RGB \
--device 1 --batch-size 16 \
--hr_input --data data/SRvedai.yaml --ch 3 --input_mode RGB \
> logs/SRyolo_RGB.log 2>&1 \
& tail -f logs/SRyolo_RGB.log
```
```bash
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--super --train_img_size 1024 --name SRyolo_IR \
--device 0 --batch-size 16 \
--hr_input --data data/SRvedai.yaml --ch 3 --input_mode IR \
> logs/SRyolo_IR.log 2>&1 \
& tail -f logs/SRyolo_IR.log
```
### 4. Begin to train on multimodal images without the SR branch
```bash
# training started 2024-04-01 14:55:00
nohup python train.py --cfg models/SRyolo_noFocus_small.yaml \
--train_img_size 512 --data data/SRvedai.yaml \
--ch 4 --input_mode RGB+IR \
--device 0 --batch-size 16 \
> logs/SRyolo_RGB+IR_noSuper.log 2>&1 \
& tail -f logs/SRyolo_RGB+IR_noSuper.log
```
Train with the new fusion method (MF):
```bash
nohup python train.py --cfg models/SRyolo_MF.yaml \
--train_img_size 512 --data data/SRvedai.yaml \
--ch 64 --input_mode RGB+IR+MF \
--device 1 --batch-size 16 \
> logs/SRyolo_RGB+IR+MF_noSuper.log 2>&1 \
& tail -f logs/SRyolo_RGB+IR+MF_noSuper.log
```
### 5. Begin to train on RGB or IR images without the SR branch
```bash
python train.py --cfg models/SRyolo_noFocus_small.yaml --train_img_size 512 --data data/SRvedai.yaml --ch 3 --input_mode RGB
```
```bash
python train.py --cfg models/SRyolo_noFocus_small.yaml --train_img_size 512 --data data/SRvedai.yaml --ch 3 --input_mode IR
```
## Test
### 1. Pretrained Checkpoints
You can use our pretrained checkpoints for testing.
Download a pretrained model and put it in the [weights directory](https://github.com/icey-zhang/SuperYOLO/tree/main/weights).
### 2. Begin to test
```bash
python test.py --weights runs/train/exp/best.pt --input_mode RGB+IR+MF
```
## Results
| Method | Modality | **Car** | **Pickup** | **Camping** | **Truck** | **Other** | **Tractor** | **Boat** | **Van** | **mAP50** | **Params.** $\downarrow$ | **GFLOPs** $\downarrow$ |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| **YOLOv3** | IR | 80.21 | 67.03 | 65.55 | 47.78 | 25.86 | 40.11 | 32.67 | 53.33 | 51.54 | **61.5351M** | 49.55 |
| **YOLOv3** | RGB | 83.06 | 71.54 | **69.14** | 59.30 | **48.93** | **67.34** | 33.48 | 55.67 | 61.06 | **61.5351M** | 49.55 |
| **YOLOv3** | Multi | **84.57** | **72.68** | 67.13 | **61.96** | 43.04 | 65.24 | **37.10** | **58.29** | **61.26** | 61.5354M | 49.68 |
| **YOLOv4** | IR | 80.45 | 67.88 | 68.84 | 53.66 | 30.02 | 44.23 | 25.40 | 51.41 | 52.75 | **52.5082M** | 38.16 |
| **YOLOv4** | RGB | 83.73 | **73.43** | 71.17 | 59.09 | **51.66** | 65.86 | **34.28** | **60.32** | 62.43 | **52.5082M** | 38.16 |
| **YOLOv4** | Multi | **85.46** | 72.84 | **72.38** | **62.82** | 48.94 | **68.99** | **34.28** | 54.66 | **62.55** | 52.5085M | 38.23 |
| **YOLOv5s** | IR | 77.31 | 65.27 | 66.47 | 51.56 | 25.87 | 42.36 | 21.88 | 48.88 | 49.94 | **7.0728M** | 5.24 |
| **YOLOv5s** | RGB | 80.07 | 68.01 | 66.12 | 51.52 | 45.76 | **64.38** | 21.62 | 40.93 | 54.82 | **7.0728M** | 5.24 |
| **YOLOv5s** | Multi | 80.81 | **68.48** | **69.06** | **54.71** | **46.76** | 64.29 | **24.25** | **45.96** | **56.79** | 7.0739M | 5.32 |
| **YOLOv5m** | IR | 79.23 | 67.32 | 65.43 | 51.75 | 26.66 | 44.28 | 26.64 | 56.14 | 52.19 | **21.0659M** | 16.13 |
| **YOLOv5m** | RGB | 81.14 | 70.26 | 65.53 | 53.98 | **46.78** | **66.69** | **36.24** | 49.87 | 58.80 | **21.0659M** | 16.13 |
| **YOLOv5m** | Multi | **82.53** | **72.32** | **68.41** | **59.25** | 46.20 | 66.23 | 33.51 | **57.11** | **60.69** | 21.0677M | 16.24 |
| **YOLOv5l** | IR | 80.14 | 68.57 | 65.37 | 53.45 | 30.33 | 45.59 | 27.24 | **61.87** | 54.06 | **46.6383M** | 36.55 |
| **YOLOv5l** | RGB | 81.36 | 71.70 | 68.25 | 57.45 | 45.77 | **70.68** | 35.89 | 55.42 | 60.81 | **46.6383M** | 36.55 |
| **YOLOv5l** | Multi | **82.83** | **72.32** | **69.92** | **63.94** | **48.48** | 63.07 | **40.12** | 56.46 | **62.16** | 46.6046M | 36.70 |
| **YOLOv5x** | IR | 79.01 | 66.72 | 65.93 | 58.49 | 31.39 | 41.38 | 31.58 | 58.98 | 54.18 | **87.2458M** | 69.52 |
| **YOLOv5x** | RGB | 81.66 | 72.23 | 68.29 | 59.07 | 48.47 | 66.01 | **39.15** | **61.85** | 62.09 | **87.2458M** | 69.52 |
| **YOLOv5x** | Multi | **84.33** | **72.95** | **70.09** | **61.15** | **49.94** | **67.35** | 38.71 | 56.65 | **62.65** | 87.2487M | 69.71 |
| **SuperYOLO** | IR | 87.90 | 81.39 | 76.90 | 61.56 | 39.39 | 60.56 | 46.08 | 71.00 | 65.60 | **4.8256M** | 16.61 |
| **SuperYOLO** | RGB | 90.30 | 82.66 | 76.69 | 68.55 | 53.86 | 79.48 | 58.08 | 70.30 | 72.49 | **4.8256M** | 16.61 |
| **SuperYOLO** | Multi | **91.13** | **85.66** | **79.30** | **70.18** | **57.33** | **80.41** | **60.24** | **76.50** | **75.09** | 4.8451M | 17.98 |
## Time
- 2023.2.14: released `train.py`
- 2023.2.14: added the new fusion method (MF)
- 2023.2.16: updated `test.py` for visualization of detection results
## Visualization of results
## Acknowledgements
This code is built on [YOLOv5 (PyTorch)](https://github.com/ultralytics/yolov5). We thank the authors for sharing the codes.
## Licensing
Copyright (C) 2020 Jiaqing Zhang
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
## Contact
If you have any questions, please contact me by email (jq.zhangcn@foxmail.com).
