# RGBD_Semantic_Segmentation_PyTorch

**Repository Path**: remvs/RGBD_Semantic_Segmentation_PyTorch

## Basic Information

- **Project Name**: RGBD_Semantic_Segmentation_PyTorch
- **Description**: Scene analysis and parsing
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-10-24
- **Last Updated**: 2022-01-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Scene Analysis and Parsing

This project is supported by the **National Key R&D Program of China, "Cloud Computing and Big Data" key special project "Multi-source Data-Driven Intelligent and Efficient Scene Modeling and Rendering Engine" (2017YFB1002601)**, as a sub-task of Task 1 of that project, "Analysis, Fusion and Joint Optimization of Multi-source Data for Complex Scenes".

## Algorithm Pipeline

- **RGB-D scene analysis and parsing pipeline** [[paper](https://arxiv.org/abs/2007.09183)]
- **RGB scene analysis and parsing pipeline**

## Main Results

#### Cityscapes test set

| Pixel Accuracy | mIoU |
| :------------: | :------: |
| -              | **82.8** |

#### Cityscapes validation set

| Pixel Accuracy | mIoU |
| :------------: | :-------: |
| **96.7**       | **82.14** |

#### Camvid test set

| Pixel Accuracy | mIoU |
| :------------: | :------: |
| **95.5**       | **82.4** |

#### Camvid validation set

| Pixel Accuracy | mIoU |
| :------------: | :------: |
| **96.2**       | **82.1** |

## Directory Structure

```
./
|-- furnace
|-- model
|-- DATA
|   |-- pytorch-weight
|   |-- Cityscapes
|   |   |-- ColoredLabel
|   |   |-- Depth
|   |   |-- HHA
|   |   |-- Label
|   |   |-- RGB
|   |   |-- test.txt
|   |   |-- train.txt
```

## Installation

The code is developed with Python 3.6 and PyTorch 1.0.0, and has been developed and tested on 4 or 8 NVIDIA TITAN V GPU cards. You can change the input size (`image_height` and `image_width`) or `batch_size` in `config.py` according to your available resources.

1. **Clone this repo.**

   ```shell
   $ git clone https://github.com/charlesCXK/RGBD_Semantic_Segmentation_PyTorch.git
   $ cd RGBD_Semantic_Segmentation_PyTorch
   ```

2. **Install dependencies.**

   **(1) Create a conda environment:**

   ```shell
   $ conda env create -f rgbd.yaml
   $ conda activate rgbd
   ```

   **(2) Install apex 0.1 (requires CUDA):**

   ```shell
   $ cd ./furnace/apex
   $ python setup.py install --cpp_ext --cuda_ext
   ```

## Data Preparation

#### Pretrained ResNet-101

Download the pretrained ResNet-101 from one of the links below and put it in `./DATA/pytorch-weight`.

| Source | Link |
| :----------: | :--------------------------------------: |
| BaiDu Cloud | Link: https://pan.baidu.com/s/1Zc_ed9zdgzHiIkARp2tCcw Password: f3ew |
| Google Drive | https://drive.google.com/drive/folders/1_1HpmoCsshNCMQdXhSNOq8Y-deIDcbKS?usp=sharing |

#### Cityscapes

| Source | Link |
| :------: | :--------------------------------------: |
| OneDrive | https://pkueducn-my.sharepoint.com/personal/pkucxk_pku_edu_cn/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fpkucxk%5Fpku%5Fedu%5Fcn%2FDocuments%2FDATA |

#### Generating HHA images

For the code that converts depth maps to HHA images, please refer to [https://github.com/charlesCXK/Depth2HHA-python](https://github.com/charlesCXK/Depth2HHA-python).

## Training and Inference

### Training

Training on NYU Depth V2:

```shell
$ cd ./model/SA-Gate.nyu
$ export NGPUS=8
$ python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py
```

If you only have 4 GPU cards, you can run:

```shell
$ cd ./model/SA-Gate.nyu.432
$ export NGPUS=4
$ python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py
```

- Note that the only difference between `SA-Gate.nyu/` and `SA-Gate.nyu.432/` is the training/inference image crop size.
- The tensorboard file is saved in the `log/tb/` directory.

### Inference

Inference on NYU Depth V2:

```shell
$ cd ./model/SA-Gate.nyu
$ python eval.py -e 300-400 -d 0-7 --save_path results
```

- Here, `300-400` means we evaluate all checkpoints whose IDs fall in [300, 400], such as `epoch-300.pth`, `epoch-310.pth`, etc.
- The segmentation predictions are saved in `results/` and `results_color/`; the former stores the raw predictions and the latter stores colored versions. Performance in mIoU is written to `log/*.log`. You can expect ~51.4% mIoU with `SA-Gate.nyu` and ~51.5% mIoU with `SA-Gate.nyu.432` (single-scale inference without flipping).
- For **multi-scale and flip inference**, set `C.eval_flip = True` and `C.eval_scale_array = [1, 0.75, 1.25]` in `config.py`. Different `eval_scale_array` settings may yield different performance. A hedged configuration sketch is given below.
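The snippet below is a minimal, hypothetical sketch of the `config.py` entries referenced above. Only the names `image_height`, `image_width`, `batch_size`, `eval_flip`, and `eval_scale_array` come from this README; the `EasyDict` container and all default values are assumptions and may differ from the file actually shipped in `SA-Gate.nyu/` or `SA-Gate.nyu.432/`.

```python
# Hypothetical config.py sketch. Only the field names mentioned in this README
# (image_height, image_width, batch_size, eval_flip, eval_scale_array) come from
# the original text; the EasyDict container and all default values are assumptions.
from easydict import EasyDict

C = EasyDict()

# Training crop size and batch size; reduce these if GPU memory is limited.
C.image_height = 480   # assumed crop height
C.image_width = 640    # assumed crop width
C.batch_size = 8       # assumed batch size

# Default evaluation: single scale, no flipping.
C.eval_flip = False
C.eval_scale_array = [1]

# For multi-scale and flip inference, switch to:
# C.eval_flip = True
# C.eval_scale_array = [1, 0.75, 1.25]
```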
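To quickly inspect the colored predictions written to `results_color/`, one option is to blend them over the corresponding RGB inputs. This is only an illustrative sketch: the file names, paths, and the 0.5 blending weight are assumptions, not part of the repository.

```python
# Illustrative sketch: overlay a colored prediction on its RGB input.
# All paths below are hypothetical placeholders.
import cv2

rgb = cv2.imread("DATA/Cityscapes/RGB/example.png")                  # input image
pred = cv2.imread("model/SA-Gate.nyu/results_color/example.png")     # colored prediction

# Resize the prediction to the input resolution and blend the two images 50/50.
pred = cv2.resize(pred, (rgb.shape[1], rgb.shape[0]))
overlay = cv2.addWeighted(rgb, 0.5, pred, 0.5, 0.0)
cv2.imwrite("overlay_example.png", overlay)
```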
## Outdoor RGB-D Scene Results

**RGB:**

**HHA:**

**Segmentation results:**

## Outdoor RGB Scene Results

**RGB:**

**Segmentation results:**

## Citation

Please consider citing this project in your publications if it helps your research.

```tex
@inproceedings{chen2020-SAGate,
  title={Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation},
  author={Chen, Xiaokang and Lin, Kwan-Yee and Wang, Jingbo and Wu, Wayne and Qian, Chen and Li, Hongsheng and Zeng, Gang},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2020}
}
```