# RGBD_Semantic_Segmentation_PyTorch
**Repository Path**: remvs/RGBD_Semantic_Segmentation_PyTorch
## Basic Information
- **Project Name**: RGBD_Semantic_Segmentation_PyTorch
- **Description**: Scene analysis and parsing
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-10-24
- **Last Updated**: 2022-01-01
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Scene Analysis and Parsing
This project is funded by the **National Key R&D Program of China, "Cloud Computing and Big Data" key special project "Multi-Source Data-Driven Intelligent and Efficient Scene Modeling and Rendering Engine" (2017YFB1002601)**, as a sub-task of Topic 1 of that project, "Analysis, Fusion and Joint Optimization of Multi-Source Data for Complex Scenes".
## Algorithm Pipelines
- **RGB-D scene analysis and parsing pipeline** [[paper](https://arxiv.org/abs/2007.09183)]
- **RGB scene analysis and parsing pipeline**
## Main Results
#### Cityscapes test set
| Pixel Accuracy (%) | mIoU (%) |
| :---: | :---------: |
| - | **82.8** |
#### Cityscapes validation set
| Pixel Accuracy (%) | mIoU (%) |
| :------: | :---------: |
| **96.7** | **82.14** |
#### CamVid test set
| Pixel Accuracy (%) | mIoU (%) |
| :------: | :---------: |
| **95.5** | **82.4** |
#### CamVid validation set
| Pixel Accuracy (%) | mIoU (%) |
| :------: | :---------: |
| **96.2** | **82.1** |
## Directory Structure
```
./
|-- furnace
|-- model
|-- DATA
|   |-- pytorch-weight
|   |-- Cityscapes
|   |   |-- ColoredLabel
|   |   |-- Depth
|   |   |-- HHA
|   |   |-- Label
|   |   |-- RGB
|   |   |-- test.txt
|   |   |-- train.txt
```
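To catch path problems early, the following is a minimal sketch that checks whether the layout above is in place. The paths simply mirror the tree shown here and can be adjusted to your own setup.

```python
# Minimal sketch: check that the data layout shown above exists.
# Paths follow the directory tree; adjust them to your own setup.
import os

expected = [
    'DATA/pytorch-weight',
    'DATA/Cityscapes/ColoredLabel',
    'DATA/Cityscapes/Depth',
    'DATA/Cityscapes/HHA',
    'DATA/Cityscapes/Label',
    'DATA/Cityscapes/RGB',
    'DATA/Cityscapes/test.txt',
    'DATA/Cityscapes/train.txt',
]

for path in expected:
    status = 'ok' if os.path.exists(path) else 'MISSING'
    print(f'{status:>7}  {path}')
```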
## Installation
The code is developed with Python 3.6 and PyTorch 1.0.0, and is developed and tested on 4 or 8 NVIDIA TITAN V GPUs. You can change the input size (`image_height` and `image_width`) or `batch_size` in `config.py` according to your available resources (see the sketch after the installation steps).
1. **Clone this repo.**
```shell
$ git clone https://github.com/charlesCXK/RGBD_Semantic_Segmentation_PyTorch.git
$ cd RGBD_Semantic_Segmentation_PyTorch
```
2. **Install dependencies.**
**(1) Create a conda environment:**
```shell
$ conda env create -f rgbd.yaml
$ conda activate rgbd
```
**(2) Install apex 0.1 (requires CUDA):**
```shell
$ cd ./furnace/apex
$ python setup.py install --cpp_ext --cuda_ext
```
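As noted above, the input size and batch size can be tuned in `config.py` to fit your GPUs. The snippet below is only a sketch of what such fields look like: the field names `image_height`, `image_width`, and `batch_size` come from the note above, the values are placeholders, and the real `config.py` contains many more settings.

```python
# Sketch of the resource-related fields in config.py (illustrative only;
# the actual file defines a full config object with many more fields).
from types import SimpleNamespace

C = SimpleNamespace()
C.image_height = 480   # training/inference crop height
C.image_width = 640    # training/inference crop width
C.batch_size = 8       # lower this if you run out of GPU memory
```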
## Data Preparation
#### Pretrained ResNet-101
Download the pretrained ResNet-101 from one of the links below and place it in `./DATA/pytorch-weight`.
| Source | Link |
| :----------: | :--------------------------------------: |
| BaiDu Cloud | Link: https://pan.baidu.com/s/1Zc_ed9zdgzHiIkARp2tCcw Password: f3ew |
| Google Drive | https://drive.google.com/drive/folders/1_1HpmoCsshNCMQdXhSNOq8Y-deIDcbKS?usp=sharing |
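Once downloaded, you can quickly sanity-check the weight file with PyTorch. The filename below is a placeholder; replace it with the actual name of the downloaded file.

```python
# Sketch: inspect the downloaded ResNet-101 weights.
# 'resnet101.pth' is a placeholder filename; use the real file name.
import torch

state_dict = torch.load('./DATA/pytorch-weight/resnet101.pth', map_location='cpu')
print(len(state_dict), 'entries')
print(list(state_dict.keys())[:5])  # peek at the first few parameter names
```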
#### Cityscapes
| Source | Link |
| :------: | :--------------------------------------: |
| OneDrive | https://pkueducn-my.sharepoint.com/personal/pkucxk_pku_edu_cn/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fpkucxk%5Fpku%5Fedu%5Fcn%2FDocuments%2FDATA |
#### Generating HHA Images
For the code that converts depth maps into HHA images, please refer to [https://github.com/charlesCXK/Depth2HHA-python](https://github.com/charlesCXK/Depth2HHA-python).
## Model Training and Testing
### Training
Training on NYU Depth V2:
```shell
$ cd ./model/SA-Gate.nyu
$ export NGPUS=8
$ python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py
```
If you only have 4 GPU cards, you can run:
```shell
$ cd ./model/SA-Gate.nyu.432
$ export NGPUS=4
$ python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py
```
- Note that the only difference between `SA-Gate.nyu/` and `SA-Gate.nyu.432/` is the training/inference image crop size.
- The TensorBoard logs are saved in the `log/tb/` directory.
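For reference, `torch.distributed.launch` starts `NGPUS` copies of `train.py` and passes each one a `--local_rank` argument. The sketch below shows the usual initialization pattern for such a script; it illustrates the launch mechanism only and is not the repository's actual `train.py`.

```python
# Minimal sketch of a script driven by torch.distributed.launch
# (illustrative only; not the repository's train.py).
import argparse
import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument('--local_rank', type=int, default=0)  # injected by the launcher
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)                     # one GPU per process
dist.init_process_group(backend='nccl', init_method='env://')

# ... build the model, wrap it for distributed data parallelism,
# and run the training loop on this process's shard of the data.
```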
### Inference
Inference on NYU Depth V2:
```shell
$ cd ./model/SA-Gate.nyu
$ python eval.py -e 300-400 -d 0-7 --save_path results
```
- Here, 300-400 means we evaluate on checkpoints whose ID is in [300, 400], such as epoch-300.pth, epoch-310.pth, etc.
- The segmentation predictions will be saved in `results/` and `results_color/`; the former stores the raw predictions and the latter stores the colored versions. Performance in mIoU will be written to `log/*.log`. You can expect ~51.4% mIoU with SA-Gate.nyu and ~51.5% mIoU with SA-Gate.nyu.432 (single-scale inference without flipping).
- For **multi-scale and flip inference**, set `C.eval_flip = True` and `C.eval_scale_array = [1, 0.75, 1.25]` in `config.py`. Different `eval_scale_array` settings may yield different performance.
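For reference, the evaluation flags mentioned above might be set as follows. `C` stands in for the config object defined in `config.py`, and only the two flags named above are shown.

```python
# Sketch of the multi-scale / flip evaluation flags (illustrative only;
# in the repository these live on the config object in config.py).
from types import SimpleNamespace

C = SimpleNamespace()
C.eval_flip = True                    # also average predictions over horizontal flips
C.eval_scale_array = [1, 0.75, 1.25]  # input scales averaged at test time
```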
## Outdoor RGB-D Scene Results
**RGB:**
**HHA:**
**Segmentation:**
## Outdoor RGB Scene Results
**RGB:**
**Segmentation:**
## Citation
Please consider citing this project in your publications if it helps your research.
```tex
@inproceedings{chen2020-SAGate,
title={Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation},
author={Chen, Xiaokang and Lin, Kwan-Yee and Wang, Jingbo and Wu, Wayne and Qian, Chen and Li, Hongsheng and Zeng, Gang},
booktitle={European Conference on Computer Vision (ECCV)},
year={2020}
}
```