# CenterPoint

**Repository Path**: hchouse/CenterPoint

## Basic Information

- **Project Name**: CenterPoint
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-12-28
- **Last Updated**: 2022-05-23

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Center-based 3D Object Detection and Tracking

3D object detection and tracking using center points in the bird's-eye view.

> [**Center-based 3D Object Detection and Tracking**](https://arxiv.org/abs/2006.11275),
> Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl,
> *arXiv technical report ([arXiv 2006.11275](https://arxiv.org/abs/2006.11275))*

```
@article{yin2020center,
  title={Center-based 3D Object Detection and Tracking},
  author={Yin, Tianwei and Zhou, Xingyi and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:2006.11275},
  year={2020},
}
```

## Updates

[2020-12-11] **NEW:** 3 out of the top 4 entries in the recent NeurIPS 2020 [nuScenes 3D Detection challenge](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Any) used CenterPoint. Congratulations to the other participants, and please stay tuned for more updates on nuScenes and Waymo soon.

[2020-08-10] We now support vehicle detection on [Waymo](docs/WAYMO.md) with SOTA performance.

## Contact

Any questions or discussion are welcome!

Tianwei Yin [yintianwei@utexas.edu](mailto:yintianwei@utexas.edu)
Xingyi Zhou [zhouxy@cs.utexas.edu](mailto:zhouxy@cs.utexas.edu)

## Abstract

Three-dimensional objects are commonly represented as 3D boxes in a point cloud. This representation mimics the well-studied image-based 2D bounding-box detection, but comes with additional challenges. Objects in a 3D world do not follow any particular orientation, and box-based detectors have difficulty enumerating all orientations or fitting an axis-aligned bounding box to rotated objects. In this paper, we instead propose to represent, detect, and track 3D objects as points. We use a keypoint detector to find centers of objects and simply regress to other attributes, including 3D size, 3D orientation, and velocity. In our center-based framework, 3D object tracking simplifies to greedy closest-point matching. The resulting detection and tracking algorithm is simple, efficient, and effective. On the nuScenes dataset, our point-based representation performs 3-4 mAP higher than the box-based counterparts for 3D detection, and 6 AMOTA higher for 3D tracking. Our real-time model runs end-to-end 3D detection and tracking at 30 FPS with 54.2 AMOTA and 48.3 mAP, while the best single model achieves 60.3 mAP for 3D detection and 63.8 AMOTA for 3D tracking.

# Highlights

- **Simple:** Two-sentence method summary: we use a standard 3D point-cloud encoder with a few convolutional layers in the head to produce a bird's-eye-view heatmap and other dense regression outputs, including the offset to centers in the previous frame. Detection is a simple local peak extraction, and tracking is a closest-distance matching (see the sketch after this list).
- **Fast:** Our [PointPillars model](configs/centerpoint/nusc_centerpoint_pp_02voxel_circle_nms.py) runs at *30* FPS with *48.3* mAP and *59.1* NDS for 3D detection, and *54.2* AMOTA for simultaneous 3D tracking, on the nuScenes dataset.
- **Accurate**: Our [best single model](configs/centerpoint/nusc_centerpoint_voxelnet_dcn_0075voxel_flip_testset.py) achieves *60.3* mAP and *67.3* NDS on the nuScenes detection test set.
- **Extensible**: A simple baseline for switching in your own backbone and novel algorithms.
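
For readers who want a concrete picture of the two steps in the **Simple** bullet, below is a minimal, self-contained sketch of CenterNet-style local peak extraction on a bird's-eye-view heatmap and of greedy closest-point matching between frames. This is *not* the code in this repository; the function names, tensor shapes, and the 2 m matching radius are illustrative assumptions.

```python
# Illustrative sketch only (assumed shapes and names, not the repository's API).
import numpy as np
import torch
import torch.nn.functional as F


def extract_peaks(heatmap: torch.Tensor, k: int = 100):
    """Local peak extraction on a class heatmap of shape (C, H, W).

    A cell is kept only if it equals the maximum of its 3x3 neighborhood
    (the max-pooling trick from CenterNet); the top-k surviving cells over
    all classes are returned as (score, class, row, col) tuples.
    """
    c, h, w = heatmap.shape
    pooled = F.max_pool2d(heatmap[None], kernel_size=3, stride=1, padding=1)[0]
    peaks = heatmap * (pooled == heatmap).float()        # zero out non-maxima
    k = min(k, peaks.numel())
    scores, idx = peaks.flatten().topk(k)                # global top-k peaks
    cls, rows, cols = idx // (h * w), (idx % (h * w)) // w, idx % w
    return [(s.item(), c_.item(), r.item(), col.item())
            for s, c_, r, col in zip(scores, cls, rows, cols)]


def greedy_closest_point_match(curr_centers, prev_centers, max_dist=2.0):
    """Greedy closest-point tracking between two frames.

    curr_centers: (N, 2) BEV centers of the current frame, already shifted by
        the predicted offset/velocity so they line up with the previous frame,
        and sorted by detection confidence (highest first).
    prev_centers: (M, 2) BEV centers of the tracks from the previous frame.
    Returns (curr_idx, prev_idx) pairs; unmatched detections start new tracks.
    """
    matches, used = [], set()
    if len(prev_centers) == 0:
        return matches
    for i, c in enumerate(curr_centers):
        d = np.linalg.norm(prev_centers - c, axis=1)     # distance to every track
        d[list(used)] = np.inf                           # hide already matched tracks
        j = int(np.argmin(d))
        if d[j] < max_dist:                              # the 2 m radius is an assumption
            matches.append((i, j))
            used.add(j)
    return matches
```

In the actual pipeline the extracted peaks are further refined by the regressed dense outputs and decoded into metric box parameters before matching; see the configs and [MODEL_ZOO.md](docs/MODEL_ZOO.md) referenced below for the real implementations.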
## Main results

#### 3D detection

|                        | Split | mAP  | NDS  | FPS  |
|------------------------|-------|------|------|------|
| PointPillars-512       | Val   | 48.3 | 59.1 | 30.3 |
| VoxelNet-1024          | Val   | 55.4 | 63.8 | 14.5 |
| VoxelNet-1440_dcn_flip | Val   | 59.1 | 67.1 | 2.2  |
| VoxelNet-1440_dcn_flip | Test  | 60.3 | 67.3 | 2.2  |

#### 3D Tracking

|                                 | Split | Tracking time | Total time | AMOTA ↑ | AMOTP ↓ |
|---------------------------------|-------|---------------|------------|---------|---------|
| CenterPoint_pillar_512          | val   | 1ms           | 34ms       | 54.2    | 0.680   |
| CenterPoint_voxel_1024          | val   | 1ms           | 70ms       | 62.6    | 0.630   |
| CenterPoint_voxel_1440_dcn_flip | val   | 1ms           | 451ms      | 65.9    | 0.567   |
| CenterPoint_voxel_1440_dcn_flip | test  | 1ms           | 451ms      | 63.8    | 0.555   |

All results are tested on a Titan Xp GPU with batch size 1. More models and details can be found in [MODEL_ZOO.md](docs/MODEL_ZOO.md).

## Third-party resources

- [AFDet](https://arxiv.org/abs/2006.12671): another work inspired by CenterNet that achieves good performance on the KITTI/Waymo datasets.

## Use CenterPoint

We provide a demo with the PointPillars model for 3D object detection on the nuScenes dataset.

### Basic Installation

```bash
# basic python libraries
conda create --name centerpoint python=3.6
conda activate centerpoint
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
git clone https://github.com/tianweiy/CenterPoint.git
cd CenterPoint
pip install -r requirements.txt

# add CenterPoint to PYTHONPATH by adding the following line to ~/.bashrc (change the path accordingly)
export PYTHONPATH="${PYTHONPATH}:PATH_TO_CENTERPOINT"
```

First download the model (by default, [centerpoint_pillar_512](https://drive.google.com/file/d/1ubWKx3Jg1AqF93qqWIZxgGXTycQ77qM3/view?usp=sharing)) from the [Model Zoo](docs/MODEL_ZOO.md) and put it in `work_dirs/centerpoint_pillar_512_demo`.

We provide a driving sequence clip from the [nuScenes dataset](https://www.nuscenes.org). Download the [folder](https://drive.google.com/file/d/1bK-xeq5UwJzpPfVDhICDJeKiU1QVZwtI/view?usp=sharing) and put it in the main directory. Then run the demo with `python tools/demo.py`. If set up correctly, you will see an output video in which ground-truth objects are shown in red and predictions in blue.
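
Before running the demo, it can help to confirm that PyTorch sees the GPU and that the repository is importable from `PYTHONPATH`. The snippet below is a hypothetical sanity check, not part of the repository; the top-level package name `det3d` is an assumption based on the det3d codebase this project is forked from (see the License section), so adjust it to your checkout.

```python
# Hypothetical sanity check (not part of the repository).
import importlib
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# The package name `det3d` is assumed from the upstream det3d codebase;
# change it if your checkout exposes a different top-level package.
for pkg in ("det3d",):
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: importable")
    except ImportError as err:
        print(f"{pkg}: NOT importable -> check PYTHONPATH ({err})")
```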

## Advanced Installation

For more advanced usage, please refer to [INSTALL](docs/INSTALL.md) to set up the additional libraries needed for distributed training and sparse convolution.

## Benchmark Evaluation and Training

Please refer to [GETTING_START](docs/GETTING_START.md) to prepare the data, then follow the instructions there to reproduce our detection and tracking results. All detection configurations are included in [configs](configs), and we provide the scripts for all tracking experiments in [tracking_scripts](tracking_scripts). The pretrained models, logs, and each model's prediction files are provided in [MODEL_ZOO.md](docs/MODEL_ZOO.md).

## License

CenterPoint is released under the MIT license (see [LICENSE](LICENSE)). It is developed based on a forked version of [det3d](https://github.com/poodarchu/Det3D/tree/56402d4761a5b73acd23080f537599b0888cce07). We also incorporate a large amount of code from [CenterNet](https://github.com/xingyizhou/CenterNet) and [CenterTrack](https://github.com/xingyizhou/CenterTrack). See the [NOTICE](docs/NOTICE) for details. Note that the nuScenes dataset is free of charge for non-commercial activities. Please contact the [nuScenes team](https://www.nuscenes.org) for commercial usage.

## Acknowledgement

This project would not be possible without multiple great open-sourced codebases. We list some notable examples below.

* [det3d](https://github.com/poodarchu/det3d)
* [second.pytorch](https://github.com/traveller59/second.pytorch)
* [CenterTrack](https://github.com/xingyizhou/CenterTrack)
* [CenterNet](https://github.com/xingyizhou/CenterNet)
* [mmcv](https://github.com/open-mmlab/mmcv)
* [mmdetection](https://github.com/open-mmlab/mmdetection)
* [maskrcnn_benchmark](https://github.com/facebookresearch/maskrcnn-benchmark)
* [PCDet](https://github.com/sshaoshuai/PCDet)

**CenterPoint is deeply influenced by the following projects. Please consider citing the relevant papers.**

```
@article{zhu2019classbalanced,
  title={Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection},
  author={Zhu, Benjin and Jiang, Zhengkai and Zhou, Xiangxin and Li, Zeming and Yu, Gang},
  journal={arXiv:1908.09492},
  year={2019}
}

@article{lang2019pillar,
  title={PointPillars: Fast Encoders for Object Detection From Point Clouds},
  journal={CVPR},
  author={Lang, Alex H. and Vora, Sourabh and Caesar, Holger and Zhou, Lubing and Yang, Jiong and Beijbom, Oscar},
  year={2019},
}

@article{zhou2018voxelnet,
  title={VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection},
  journal={CVPR},
  author={Zhou, Yin and Tuzel, Oncel},
  year={2018},
}

@article{yan2018second,
  title={SECOND: Sparsely Embedded Convolutional Detection},
  author={Yan, Yan and Mao, Yuxing and Li, Bo},
  journal={Sensors},
  year={2018},
}

@article{zhou2019objects,
  title={Objects as Points},
  author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:1904.07850},
  year={2019}
}

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:2004.01177},
  year={2020}
}
```