# MinkOcc

**Repository Path**: xupeng-fu/MinkOcc

## Basic Information

- **Project Name**: MinkOcc
- **Description**: Corrects the environment-setup tutorial and the loading-warning code; the main program is identical to the original author's
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: dev3.0
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-04-12
- **Last Updated**: 2025-04-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# MinkOcc

Fixes the tangled environment setup and part of the loading code; the main body of the code is identical to MinkOcc. Uploaded as a backup.

## Get Started

#### Installation and Data Preparation

Make sure you have a working CUDA installation. All steps below happen inside the conda environment.

a. Create a conda virtual environment and activate it.

```shell script
conda create -n minkocc python=3.8 -y
conda activate minkocc
```

b. Install PyTorch and torchvision following the official instructions.

```shell script
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7 -c pytorch -c nvidia
```

c. Install MinkowskiEngine. (Please refer to the official installation guide for specifics: https://github.com/NVIDIA/MinkowskiEngine/wiki/Installation)

```shell script
sudo apt install libopenblas-dev
git clone https://github.com/NVIDIA/MinkowskiEngine
cd MinkowskiEngine
export MAX_JOBS=1; python setup.py install --user
```

d. Install mmcv-full.

```shell script
pip install mmcv-full==1.5.2 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html
```

e. Install mmdet and mmseg.

```shell script
pip install mmdet==2.24.0
pip install mmsegmentation==0.24.0
```

f. Install the remaining dependencies: spconv (pick the build matching your CUDA version; for the CUDA 11.7 toolchain installed above, use spconv-cu117) and yapf.

```shell script
pip install spconv-cu117
pip install yapf==0.31.0
```

g. Prepare the MinkOcc repo.

```shell script
git clone https://github.com/venti-sam/MinkOcc.git
cd MinkOcc
pip install -v -e .
```

h. Download the nuScenes Mini dataset from https://www.nuscenes.org/nuscenes#download

i.
For the Occupancy Prediction task, download the mini split and (only) the `gts` folder from [CVPR2023-3D-Occupancy-Prediction](https://github.com/CVPR2023-3D-Occupancy-Prediction/CVPR2023-3D-Occupancy-Prediction) and arrange the folders as:

```shell script
└── nuscenes
    ├── v1.0-mini (existing)
    ├── sweeps (existing)
    ├── samples (existing)
    └── gts (new)
```

j. Prepare the nuScenes dataset as introduced in [nuscenes_det.md](docs/en/datasets/nuscenes_det.md) and create the pkl for MinkOcc by running:

```shell
python tools/create_data_bevdet.py
```

If you encounter a terminal error related to numba and the nuScenes converter, install numpy==1.23.5.

#### Train model

```shell script
# single gpu
python tools/train.py configs/bevdet_occ/bevdet_minkocc.py
```

#### Test model

```shell script
python tools/test.py $config $checkpoint --eval mAP
```

#### Robotcycle

First prepare the dataset in the same format as Waymo / KITTI, or download it from: https://drive.google.com/file/d/18x55NXeblCSuBNCjsUjsoHUeEIk34rN5/view?usp=drive_link . If you download the zip file, you can skip the two steps below.

Step 1. Arrange the dataset as:

```shell
- ImageSets: txt files listing the train/val/test split filenames
- Training folder
  - calib in KITTI txt format
  - images in .jpg / .png format
  - labels in .txt format
  - velodyne in .bin format
  - poses in .txt format
  - etc (depends)
- Testing folder
```

Step 2. Create the .pkl file for data loading:

```shell
python tools/create_data.py kitti --root-path ./data/robotcycle --out-dir ./data/robotcycle --extra-tag robotcycle
```

Step 3. Run the training script:

```shell
python tools/train.py configs/bevdet_occ/robotcycle.py
```

## Acknowledgement

This project would not be possible without multiple great open-sourced code bases. We list some notable examples below.
- [open-mmlab](https://github.com/open-mmlab)
- [CenterPoint](https://github.com/tianweiy/CenterPoint)
- [Lift-Splat-Shoot](https://github.com/nv-tlabs/lift-splat-shoot)
- [Swin Transformer](https://github.com/microsoft/Swin-Transformer)
- [BEVFusion](https://github.com/mit-han-lab/bevfusion)
- [BEVDepth](https://github.com/Megvii-BaseDetection/BEVDepth)

Besides, there are other attractive works that extend the boundary of BEVDet.

- [BEVerse](https://github.com/zhangyp15/BEVerse) for multi-task learning.
- [BEVStereo](https://github.com/Megvii-BaseDetection/BEVStereo) for stereo depth estimation.

## Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entries.

```
@article{huang2023dal,
  title={Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection},
  author={Huang, Junjie and Ye, Yun and Liang, Zhujin and Shan, Yi and Du, Dalong},
  journal={arXiv preprint arXiv:2311.07152},
  year={2023}
}
@article{huang2022bevpoolv2,
  title={BEVPoolv2: A Cutting-edge Implementation of BEVDet Toward Deployment},
  author={Huang, Junjie and Huang, Guan},
  journal={arXiv preprint arXiv:2211.17111},
  year={2022}
}
@article{huang2022bevdet4d,
  title={BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection},
  author={Huang, Junjie and Huang, Guan},
  journal={arXiv preprint arXiv:2203.17054},
  year={2022}
}
@article{huang2021bevdet,
  title={BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View},
  author={Huang, Junjie and Huang, Guan and Zhu, Zheng and Yun, Ye and Du, Dalong},
  journal={arXiv preprint arXiv:2112.11790},
  year={2021}
}
```
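Since this fork exists mainly to fix the environment setup, a quick sanity check of the installation steps in Get Started can save debugging time. The sketch below is not part of the repo: the `check_packages` helper and the `REQUIRED` list are illustrative, with the distribution names taken from the pip/conda commands above.

```python
# Environment sanity check (illustrative, not part of the MinkOcc repo):
# reports which of the packages installed in Get Started are present,
# and their versions, without importing any of them.
from importlib import util
from importlib.metadata import PackageNotFoundError, version

# Distribution names as installed by the commands in the README above.
REQUIRED = ["torch", "torchvision", "mmcv-full", "mmdet", "mmsegmentation", "yapf"]


def check_packages(names):
    """Return {distribution name: version string, or None if not installed}."""
    status = {}
    for name in names:
        try:
            status[name] = version(name)
        except PackageNotFoundError:
            status[name] = None
    return status


if __name__ == "__main__":
    for name, ver in check_packages(REQUIRED).items():
        print(f"{name:16s} {'MISSING' if ver is None else ver}")
    # MinkowskiEngine is built from source, so check importability instead.
    print("MinkowskiEngine ", "found" if util.find_spec("MinkowskiEngine") else "MISSING")
```

Run it inside the `minkocc` conda environment; anything reported MISSING points back to the corresponding install step.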