# DDRNet

**Repository Path**: ai-models-cn/DDRNet

## Basic Information

- **Project Name**: DDRNet
- **Description**: No description available
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-10-24
- **Last Updated**: 2024-10-24

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# The official implementation of "Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes"

![avatar](./figs/performance.png)

**Achieves a state-of-the-art trade-off between accuracy and speed on Cityscapes and CamVid, without inference acceleration (such as TensorRT) or extra data (such as Mapillary)!**

![avatar](./figs/DDRNet_seg.png)

The overall architecture of our method.

![avatar](./figs/DAPPM.png)

The details of the Deep Aggregation Pyramid Pooling Module (DAPPM).

## Usage

Currently, this repo contains the model code and pretrained models for classification and semantic segmentation. Our models are trained using the code base [HRNet-Semantic-Segmentation-pytorch-v1.1](https://github.com/HRNet/HRNet-Semantic-Segmentation/tree/pytorch-v1.1). For training and testing DDRNet, you can refer to [DDRNet.pytorch](https://github.com/chenjun2hao/DDRNet.pytorch), [Segmentation-Pytorch](https://github.com/Deeachain/Segmentation-Pytorch), [semantic-segmentation](https://github.com/sithu31296/semantic-segmentation), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), and [deci.ai](https://github.com/Deci-AI/super-gradients).

## Notice

There are some basic training tricks you should employ to reproduce our results, including class-balanced sampling, OHEM, and a crop size of 1024x1024. More details can be found in the [paper](https://arxiv.org/abs/2101.06085). There is usually some variation in Cityscapes val results for the same model, roughly 1% mIoU. Keep `align_corners=False` in all places if you want to use our pretrained models for evaluation directly (see the loading sketch below).

## Pretrained models

### ImageNet

DDRNet_23_slim (top-1 error: 29.8): [googledrive](https://drive.google.com/file/d/1mg5tMX7TJ9ZVcAiGSB4PEihPtrJyalB4/view?usp=sharing)

DDRNet_23_slim trained with the timm library, which may be helpful for training on your own datasets (top-1 error: 26.3; trained with a batch size of 256, warmup, a cosine learning rate schedule, 300 epochs, and label smoothing): [googledrive](https://drive.google.com/file/d/17sgZ8mRJFhsItmdTrifI1rloVq5K1WiC/view?usp=sharing)

DDRNet_23 (top-1 error: 24.1): [googledrive](https://drive.google.com/file/d/1VoUsERBeuCaiuQJufu8PqpKKtGvCTdug/view?usp=sharing)

DDRNet_39 (top-1 error: 22.7): [googledrive](https://drive.google.com/file/d/122CMx6DZBaRRf-dOHYwuDY9vG0_UQ10i/view?usp=sharing)

### Cityscapes

DDRNet_23_slim (val mIoU: 77.8): [googledrive](https://drive.google.com/file/d/1d_K3Af5fKHYwxSo8HkxpnhiekhwovmiP/view?usp=sharing)

DDRNet_23 (val mIoU: 79.5): [googledrive](https://drive.google.com/file/d/16viDZhbmuc3y7OSsUo2vhA7V6kYO0KX6/view?usp=sharing)
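As a minimal illustration of how one of the segmentation checkpoints above might be used for single-scale evaluation with `align_corners=False`, here is a hedged sketch. The import path `ddrnet_23_slim`, the constructor `get_seg_model`, and the checkpoint filename are placeholders, not this repo's exact API; adapt them to the model file and weights you actually use (for example from [DDRNet.pytorch](https://github.com/chenjun2hao/DDRNet.pytorch)).

```python
# A minimal single-scale evaluation sketch, not the repo's official script.
# `ddrnet_23_slim` / `get_seg_model` are placeholders for the model file and
# constructor you actually use; the checkpoint is assumed to be a plain state_dict.
import torch
import torch.nn.functional as F

from ddrnet_23_slim import get_seg_model  # hypothetical import

device = "cuda" if torch.cuda.is_available() else "cpu"

model = get_seg_model(num_classes=19)  # 19 Cityscapes classes; assumed signature
state = torch.load("ddrnet23_slim_cityscapes.pth", map_location="cpu")
model.load_state_dict(state, strict=False)
model.to(device).eval()

image = torch.randn(1, 3, 1024, 2048, device=device)  # Cityscapes-sized input
with torch.no_grad():
    out = model(image)
    out = out[0] if isinstance(out, (list, tuple)) else out  # drop aux output if present
    # DDRNet predicts at reduced resolution; upsample to the input size and keep
    # align_corners=False, matching how the pretrained weights were trained.
    out = F.interpolate(out, size=image.shape[2:], mode="bilinear", align_corners=False)
pred = out.argmax(dim=1)  # per-pixel class indices
```

For full training and multi-scale/flip testing, use one of the code bases listed under Usage.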
### CamVid

The dataset can be downloaded from [this link](https://paddleseg.bj.bcebos.com/dataset/camvid.tar).

DDRNet_23_slim: [googledrive](https://drive.google.com/file/d/1sh71nLdFKq1l89X3xyVO2J0d_3qBZui8/view?usp=sharing)

### [Comma10K](https://github.com/commaai/comma10k)

| Methods | Val loss | FPS |
| :--: | :--: | :--: |
| UNet-EfficientNetB0 | 0.0495 | 35.6 |
| UNet-EfficientNetB4 | 0.0462 | 18.0 |
| STDC1-Seg | 0.0482 | 92.0 |
| STDC2-Seg | 0.0448 | 73.0 |
| DDRNet_23_slim | 0.0448 | 166.8 |
| DDRNet_23 | 0.0433 | 62.7 |
| DDRNet_39 | 0.0428 | 36.3 |

Please refer to [comma10k-baseline](https://github.com/YassineYousfi/comma10k-baseline) for training and testing details. FPS is measured with the method from our paper, under the same conditions.

## Results on Cityscapes server

DDRNet_23_slim: [77.4](https://www.cityscapes-dataset.com/anonymous-results/?id=552a0548931fb49759bde6216f8472f60c470f768ac78b4cd08bf30a3a161e82)

DDRNet_23: [79.4](https://www.cityscapes-dataset.com/anonymous-results/?id=5766a6aff8efa27239e2f1d1085052cdb0a2351a66ef00d1610c9ea226e6770b)

DDRNet_39: [80.4](https://www.cityscapes-dataset.com/anonymous-results/?id=c9a859907b83426a71dcdcb08a7c0ad5b69111a45e61e3fdef5df1ddc680268c), [81.9](https://www.cityscapes-dataset.com/anonymous-results/?id=594e60787c8af8203cd37e5094c764a93b5a0c35e1e699d89ce4a64cb9da447b) (multi-scale and flip)

DDRNet_39 1.5x: [82.4](https://www.cityscapes-dataset.com/anonymous-results/?id=3515d66c1dc86c6daf42800c85a2937205658c6a8e5880904f350d8af234db01) (multi-scale and flip)

## Test Speed

Evaluate the inference speed on the Cityscapes dataset (a generic timing sketch is also included at the end of this README):

```
python3 DDRNet_23_slim_eval_speed.py
```

DDRNet-23-slim can achieve above 130 FPS using [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt).

## Citation

If you find this repo useful for your research, please consider citing our paper:

```
@article{hong2021deep,
  title={Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes},
  author={Hong, Yuanduo and Pan, Huihui and Sun, Weichao and Jia, Yisong},
  journal={arXiv preprint arXiv:2101.06085},
  year={2021}
}
@article{pan2022deep,
  title={Deep Dual-Resolution Networks for Real-Time and Accurate Semantic Segmentation of Traffic Scenes},
  author={Pan, Huihui and Hong, Yuanduo and Sun, Weichao and Jia, Yisong},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  year={2022},
  publisher={IEEE}
}
```
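As referenced in the Test Speed section above, the repo's `DDRNet_23_slim_eval_speed.py` is the authoritative speed-test script; the sketch below only illustrates one common way to time a segmentation model on a Cityscapes-sized input (warmup iterations, `torch.cuda.synchronize()`, averaged wall-clock time). The `build_model()` call is a placeholder for however you construct DDRNet-23-slim.

```python
# A hedged FPS-measurement sketch; the repo's DDRNet_23_slim_eval_speed.py is the
# reference implementation. `build_model` is a placeholder for your DDRNet constructor.
import time
import torch

def measure_fps(model, size=(1, 3, 1024, 2048), warmup=10, iters=100):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(*size, device=device)
    with torch.no_grad():
        for _ in range(warmup):          # warm up kernels / cuDNN autotuning
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()     # wait for all GPU work before stopping the clock
    return iters / (time.perf_counter() - start)

# fps = measure_fps(build_model())       # hypothetical constructor
# print(f"{fps:.1f} FPS")
```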