# MinkOcc

**Repository Path**: xupeng-fu/MinkOcc

## Basic Information

- **Project Name**: MinkOcc
- **Description**: Corrects the environment-setup tutorial and the loading-warning code; the main program is identical to the original author's
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: dev3.0
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-04-12
- **Last Updated**: 2025-04-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# MinkOcc

Fixes the tangled environment setup and part of the loading code; the main body of the code is identical to MinkOcc. Uploaded as a backup.

## Get Started

#### Installation and Data Preparation

Make sure you have a working CUDA installation. All steps below happen inside the conda environment.

a. Create a conda virtual environment and activate it.

```shell script
conda create -n minkocc python=3.8 -y
conda activate minkocc
```

b. Install PyTorch and torchvision following the official instructions.

```shell script
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7 -c pytorch -c nvidia
```

c. Install MinkowskiEngine. (Please refer to the official installation guide for specifics: https://github.com/NVIDIA/MinkowskiEngine/wiki/Installation)

```shell script
sudo apt install libopenblas-dev
git clone https://github.com/NVIDIA/MinkowskiEngine
cd MinkowskiEngine
export MAX_JOBS=1; python setup.py install --user
```

d. Install mmcv-full.

```shell script
pip install mmcv-full==1.5.2 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html
```

e. Install mmdet and mmseg.

```shell script
pip install mmdet==2.24.0
pip install mmsegmentation==0.24.0
```

f. Install the remaining dependencies: spconv (pick the build matching your CUDA version; for the CUDA 11.7 toolchain installed above, use spconv-cu117) and yapf.

```shell script
pip install spconv-cu117
pip install yapf==0.31.0
```

g. Prepare the MinkOcc repo.

```shell script
git clone https://github.com/venti-sam/MinkOcc.git
cd MinkOcc
pip install -v -e .
```

h. Download the nuScenes Mini dataset from https://www.nuscenes.org/nuscenes#download

i.
For the Occupancy Prediction task, download the mini split and (only) the `gts` folder from [CVPR2023-3D-Occupancy-Prediction](https://github.com/CVPR2023-3D-Occupancy-Prediction/CVPR2023-3D-Occupancy-Prediction) and arrange the folders as:

```shell script
└── nuscenes
    ├── v1.0-mini (existing)
    ├── sweeps (existing)
    ├── samples (existing)
    └── gts (new)
```

j. Prepare the nuScenes dataset as introduced in [nuscenes_det.md](docs/en/datasets/nuscenes_det.md) and create the pkl for MinkOcc by running:

```shell
python tools/create_data_bevdet.py
```

If you encounter a terminal error related to numba and the nuScenes converter, install numpy==1.23.5.

#### Train model

```shell script
# single gpu
python tools/train.py configs/bevdet_occ/bevdet_minkocc.py
```

#### Test model

```shell script
python tools/test.py $config $checkpoint --eval mAP
```

#### Robotcycle

First prepare the dataset in the same format as Waymo / KITTI, or download it from: https://drive.google.com/file/d/18x55NXeblCSuBNCjsUjsoHUeEIk34rN5/view?usp=drive_link . If you download the zip file, you can skip the two steps below.

Step 1. Arrange the dataset as:

```shell
- ImageSets: txt files listing the train/val/test split filenames
- Training folder
  - calib in KITTI txt format
  - images in .jpg / .png format
  - labels in .txt format
  - velodyne in .bin format
  - poses in .txt format
  - etc (depends)
- Testing folder
```

Step 2. Create the .pkl file for data loading:

```shell
python tools/create_data.py kitti --root-path ./data/robotcycle --out-dir ./data/robotcycle --extra-tag robotcycle
```

Step 3. Run the training script:

```shell
python tools/train.py configs/bevdet_occ/robotcycle.py
```

## Acknowledgement

This project would not be possible without multiple great open-sourced code bases. We list some notable examples below.
- [open-mmlab](https://github.com/open-mmlab)
- [CenterPoint](https://github.com/tianweiy/CenterPoint)
- [Lift-Splat-Shoot](https://github.com/nv-tlabs/lift-splat-shoot)
- [Swin Transformer](https://github.com/microsoft/Swin-Transformer)
- [BEVFusion](https://github.com/mit-han-lab/bevfusion)
- [BEVDepth](https://github.com/Megvii-BaseDetection/BEVDepth)

Besides, there are other attractive works that extend the boundary of BEVDet.

- [BEVerse](https://github.com/zhangyp15/BEVerse) for multi-task learning.
- [BEVStereo](https://github.com/Megvii-BaseDetection/BEVStereo) for stereo depth estimation.

## Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entries.

```
@article{huang2023dal,
  title={Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection},
  author={Huang, Junjie and Ye, Yun and Liang, Zhujin and Shan, Yi and Du, Dalong},
  journal={arXiv preprint arXiv:2311.07152},
  year={2023}
}
@article{huang2022bevpoolv2,
  title={BEVPoolv2: A Cutting-edge Implementation of BEVDet Toward Deployment},
  author={Huang, Junjie and Huang, Guan},
  journal={arXiv preprint arXiv:2211.17111},
  year={2022}
}
@article{huang2022bevdet4d,
  title={BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection},
  author={Huang, Junjie and Huang, Guan},
  journal={arXiv preprint arXiv:2203.17054},
  year={2022}
}
@article{huang2021bevdet,
  title={BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View},
  author={Huang, Junjie and Huang, Guan and Zhu, Zheng and Yun, Ye and Du, Dalong},
  journal={arXiv preprint arXiv:2112.11790},
  year={2021}
}
```
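Since this fork exists mainly to fix the environment setup, a quick sanity check of the installation steps in Get Started can save debugging time. The sketch below is not part of the repo: the `check_packages` helper and the `REQUIRED` list are illustrative, with the distribution names taken from the pip/conda commands above.

```python
# Environment sanity check (illustrative, not part of the MinkOcc repo):
# reports which of the packages installed in Get Started are present,
# and their versions, without importing any of them.
from importlib import util
from importlib.metadata import PackageNotFoundError, version

# Distribution names as installed by the commands in the README above.
REQUIRED = ["torch", "torchvision", "mmcv-full", "mmdet", "mmsegmentation", "yapf"]


def check_packages(names):
    """Return {distribution name: version string, or None if not installed}."""
    status = {}
    for name in names:
        try:
            status[name] = version(name)
        except PackageNotFoundError:
            status[name] = None
    return status


if __name__ == "__main__":
    for name, ver in check_packages(REQUIRED).items():
        print(f"{name:16s} {'MISSING' if ver is None else ver}")
    # MinkowskiEngine is built from source, so check importability instead.
    print("MinkowskiEngine ", "found" if util.find_spec("MinkowskiEngine") else "MISSING")
```

Run it inside the `minkocc` conda environment; anything reported MISSING points back to the corresponding install step.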