# Segment-Any-Point-Cloud
**Repository Path**: chenchunguang/Segment-Any-Point-Cloud
## Basic Information
- **Project Name**: Segment-Any-Point-Cloud
- **Description**: No description available
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-09-07
- **Last Updated**: 2024-10-13
## Categories & Tags
- **Categories**: Uncategorized
- **Tags**: None
## README
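Youquan Liu<sup>1,\*</sup>, Lingdong Kong<sup>1,2,\*</sup>, Jun Cen<sup>3</sup>, Runnan Chen<sup>4</sup>, Wenwei Zhang<sup>1,5</sup>, Liang Pan<sup>5</sup>, Kai Chen<sup>1</sup>, Ziwei Liu<sup>5</sup>
- <sup>1</sup>Shanghai AI Laboratory
- <sup>2</sup>National University of Singapore
- <sup>3</sup>The Hong Kong University of Science and Technology
- <sup>4</sup>The University of Hong Kong
- <sup>5</sup>S-Lab, Nanyang Technological University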
| Video Demo 1 | Video Demo 2 | Video Demo 3 |
| :-: | :-: | :-: |
| [Link](https://youtu.be/S0q2-nQdwSs) :arrow_heading_up: | [Link](https://youtu.be/yoon3uiRnY8) :arrow_heading_up: | [Link]() :arrow_heading_up: |
## Updates
- \[2023.12\] - We are hosting [The RoboDrive Challenge](https://robodrive-24.github.io/) at [ICRA 2024](https://2024.ieee-icra.org/). :blue_car:
- \[2023.09\] - `Seal` was selected as a :sparkles: spotlight :sparkles: at [NeurIPS 2023](https://neurips.cc/).
- \[2023.09\] - `Seal` was accepted to [NeurIPS 2023](https://neurips.cc/)! :tada:
- \[2023.07\] - We released the [code](docs/document/SUPERPOINT.md) for generating semantic superpixels & superpoints with [SLIC](https://scikit-image.org/docs/stable/api/skimage.segmentation.html#skimage.segmentation.slic), [SAM](https://github.com/facebookresearch/segment-anything), and [SEEM](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once). More VFMs are on the way!
- \[2023.06\] - Our paper is available on arXiv; click [here](https://arxiv.org/abs/2306.09347) to check it out. Code will be released later!
## Outline
- [Installation](#installation)
- [Data Preparation](#data-preparation)
- [Superpoint Generation](#superpoint-generation)
- [Getting Started](#getting-started)
- [Main Result](#main-result)
- [TODO List](#todo-list)
- [License](#license)
- [Acknowledgement](#acknowledgement)
- [Citation](#citation)
## Installation
Please refer to [INSTALL.md](docs/document/INSTALL.md) for the installation details.
## Data Preparation
| [**nuScenes**](https://www.nuscenes.org/nuscenes) | [**SemanticKITTI**](http://semantic-kitti.org/) | [**Waymo Open**](https://waymo.com/open) | [**ScribbleKITTI**](https://github.com/ouenal/scribblekitti) |
| :-: | :-: | :-: | :-: |
| [**RELLIS-3D**](http://www.unmannedlab.org/research/RELLIS-3D) | [**SemanticPOSS**](http://www.poss.pku.edu.cn/semanticposs.html) | [**SemanticSTF**](https://github.com/xiaoaoran/SemanticSTF) | [**DAPS-3D**](https://github.com/subake/DAPS3D) |
| [**SynLiDAR**](https://github.com/xiaoaoran/SynLiDAR) | [**Synth4D**](https://github.com/saltoricristiano/gipso-sfouda) | [**nuScenes-C**](https://github.com/ldkong1205/Robo3D) | |
Please refer to [DATA_PREPARE.md](docs/document/DATA_PREPARE.md) for details on preparing these datasets.
## Superpoint Generation
Figure: side-by-side comparison of the raw point cloud, the semantic superpoints, and the groundtruth.
Please refer to [SUPERPOINT.md](docs/document/SUPERPOINT.md) for details on generating semantic superpixels & superpoints with vision foundation models.
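To give a feel for how superpixels can turn into superpoints, the sketch below lets each LiDAR point inherit the ID of the superpixel it projects onto. It is a minimal NumPy illustration and not the code documented in SUPERPOINT.md: the function name `superpoints_from_superpixels`, the pinhole intrinsics matrix `K`, and the premise that points are already expressed in the camera frame are all assumptions.

```python
import numpy as np

def superpoints_from_superpixels(points_cam, K, superpixels):
    """Assign each 3D point the ID of the superpixel it projects onto.

    points_cam:  (N, 3) points already in the camera frame (assumption).
    K:           (3, 3) pinhole camera intrinsics (assumption).
    superpixels: (H, W) integer superpixel ID map, e.g. from SLIC or a VFM.
    Returns (N,) superpoint IDs; -1 marks points that are behind the camera
    or fall outside the image.
    """
    H, W = superpixels.shape
    ids = np.full(len(points_cam), -1, dtype=np.int64)

    # Keep only points in front of the camera before dividing by depth.
    in_front = points_cam[:, 2] > 1e-6
    uvw = points_cam[in_front] @ K.T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)  # pixel column
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)  # pixel row

    # Discard projections that land outside the image bounds.
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = np.flatnonzero(in_front)[inside]
    ids[idx] = superpixels[v[inside], u[inside]]
    return ids

# Toy example: a 4x4 image whose left half is superpixel 0 and right half is 1.
K = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
superpixels = np.array([[0, 0, 1, 1]] * 4)
points = np.array([[-1.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 0.0, -1.0], [10.0, 0.0, 1.0]])
print(superpoints_from_superpixels(points, K, superpixels))  # [ 0  1 -1 -1]
```

In a real pipeline the points would first be transformed from the LiDAR frame to the camera frame with the dataset's calibration, which this sketch leaves out.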
## Getting Started
Please refer to [GET_STARTED.md](docs/document/GET_STARTED.md) to learn more about using this codebase.
## Main Result
### :unicorn: Framework Overview
Overview of the **Seal :seal:** framework. For each {LiDAR, camera} pair at timestamp t and another LiDAR frame at timestamp t + n, we generate semantic superpixels and superpoints with VFMs. Two pretraining objectives are then formed: *spatial contrastive learning* between paired LiDAR and camera features, and *temporal consistency regularization* between segments at different timestamps.
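As a rough illustration of the spatial objective described in the overview, the sketch below averages features within each segment and applies an InfoNCE-style contrastive loss over matched superpoint and superpixel pairs. It is a minimal NumPy sketch under stated assumptions, not the authors' training code: the function names, the temperature of 0.07, and the choice of InfoNCE as the contrastive form are all assumptions made for illustration. The temporal consistency term is not covered here.

```python
import numpy as np

def pool_by_segment(features, segment_ids, num_segments):
    """Average (N, D) features within each segment -> (num_segments, D)."""
    sums = np.zeros((num_segments, features.shape[1]))
    np.add.at(sums, segment_ids, features)  # unbuffered scatter-add
    counts = np.bincount(segment_ids, minlength=num_segments)
    return sums / np.maximum(counts, 1)[:, None]  # empty segments stay zero

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def superpoint_superpixel_nce(point_feats, pixel_feats, temperature=0.07):
    """InfoNCE loss over S matched (superpoint, superpixel) pairs.

    Row i of each (S, D) array describes the same segment, so the diagonal
    of the similarity matrix holds the positives and every other entry in
    a row acts as a negative. The temperature is a hypothetical default.
    """
    q = l2_normalize(point_feats)
    k = l2_normalize(pixel_feats)
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy check: aligned pairs should give a much lower loss than mismatched ones.
eye = np.eye(3)
loss_aligned = superpoint_superpixel_nce(eye, eye)
loss_shuffled = superpoint_superpixel_nce(eye, eye[[1, 2, 0]])
print(loss_aligned < loss_shuffled)  # True
```

Pooling before the loss means each segment contributes one feature vector, so large segments do not dominate the objective simply by containing more points.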
### :car: Cosine Similarity
The cosine similarity between a query point (red dot) and the features learned with SLIC and different VFMs in our **Seal :seal:** framework. From top to bottom, the queried semantic classes are “car”, “manmade”, and “truck”. Colors range from violet (low similarity) to yellow (high similarity).
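The similarity maps described above rest on a simple computation, sketched here for concreteness: normalize every per-point feature and take its dot product with the query point's feature. The function name `cosine_similarity_to_query` and the `(N, D)` feature layout are assumptions for illustration, not part of this codebase.

```python
import numpy as np

def cosine_similarity_to_query(features, query_index, eps=1e-12):
    """Cosine similarity between one query point and every point.

    features:    (N, D) per-point features.
    query_index: row index of the query point.
    Returns (N,) scores in [-1, 1]; the query itself scores 1.
    """
    norms = np.maximum(np.linalg.norm(features, axis=1, keepdims=True), eps)
    unit = features / norms
    return unit @ unit[query_index]

# Toy example: same direction, same direction, orthogonal, opposite.
features = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
scores = cosine_similarity_to_query(features, query_index=0)
print(scores)
```

The scores could then be mapped to a violet-to-yellow colormap for display. Matplotlib's `viridis` has that range, though which colormap the figure actually uses is an assumption.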
### :blue_car: Benchmark
| Method | nuScenes (LP) | nuScenes (1%) | nuScenes (5%) | nuScenes (10%) | nuScenes (25%) | nuScenes (Full) | KITTI (1%) | Waymo (1%) | Synth4D (1%) |
|---|---|---|---|---|---|---|---|---|---|
| Random | 8.10 | 30.30 | 47.84 | 56.15 | 65.48 | 74.66 | 39.50 | 39.41 | 20.22 |
| PointContrast | 21.90 | 32.50 | - | - | - | - | 41.10 | - | - |
| DepthContrast | 22.10 | 31.70 | - | - | - | - | 41.50 | - | - |
| PPKT | 35.90 | 37.80 | 53.74 | 60.25 | 67.14 | 74.52 | 44.00 | 47.60 | 61.10 |
| SLidR | 38.80 | 38.30 | 52.49 | 59.84 | 66.91 | 74.79 | 44.60 | 47.12 | 63.10 |
| ST-SLidR | 40.48 | 40.75 | 54.69 | 60.75 | 67.70 | 75.14 | 44.72 | 44.93 | - |
| Seal :seal: | 44.95 | 45.84 | 55.64 | 62.97 | 68.41 | 75.60 | 46.63 | 49.34 | 64.50 |
The qualitative results of our **Seal :seal:** framework pretrained on nuScenes (without using groundtruth labels) and linear probed with a frozen backbone and a linear classification head. To highlight the differences, the correct / incorrect predictions are painted in gray / red, respectively.
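The gray / red error maps mentioned in the caption can be reproduced with a per-point comparison of predictions and labels. The sketch below is one possible way to do it; the function name `error_map_colors` and the exact RGB values are hypothetical, not taken from this codebase.

```python
import numpy as np

GRAY = np.array([128, 128, 128], dtype=np.uint8)  # hypothetical RGB for correct points
RED = np.array([255, 0, 0], dtype=np.uint8)       # hypothetical RGB for incorrect points

def error_map_colors(pred, gt):
    """Return (N, 3) RGB colors: gray where pred == gt, red elsewhere."""
    correct = np.asarray(pred) == np.asarray(gt)
    # Broadcast the (N, 1) mask against the two (3,) color vectors.
    return np.where(correct[:, None], GRAY, RED)

# Toy example: the second point is misclassified.
colors = error_map_colors(pred=[0, 1, 2], gt=[0, 2, 2])
print(colors)
```

The resulting array can be attached to the point cloud as per-point colors in whichever viewer is used.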
### :articulated_lorry: Downstream Generalization
| Method | ScribbleKITTI (1%) | ScribbleKITTI (10%) | RELLIS-3D (1%) | RELLIS-3D (10%) | SemanticPOSS (Half) | SemanticPOSS (Full) | SemanticSTF (Half) | SemanticSTF (Full) | SynLiDAR (1%) | SynLiDAR (10%) | DAPS-3D (Half) | DAPS-3D (Full) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Random | 23.81 | 47.60 | 38.46 | 53.60 | 46.26 | 54.12 | 48.03 | 48.15 | 19.89 | 44.74 | 74.32 | 79.38 |
| PPKT | 36.50 | 51.67 | 49.71 | 54.33 | 50.18 | 56.00 | 50.92 | 54.69 | 37.57 | 46.48 | 78.90 | 84.00 |
| SLidR | 39.60 | 50.45 | 49.75 | 54.57 | 51.56 | 55.36 | 52.01 | 54.35 | 42.05 | 47.84 | 81.00 | 85.40 |
| Seal :seal: | 40.64 | 52.77 | 51.09 | 55.03 | 53.26 | 56.89 | 53.46 | 55.36 | 43.58 | 49.26 | 81.88 | 85.90 |
The qualitative results of **Seal :seal:** and prior methods pretrained on nuScenes (without using groundtruth labels) and fine-tuned with 1% labeled data. To highlight the differences, the correct / incorrect predictions are painted in gray / red, respectively.
## TODO List
- [x] Initial release. :rocket:
- [x] Add license. See [here](#license) for more details.
- [x] Add video demos :movie_camera:
- [x] Add installation details.
- [x] Add data preparation details.
- [x] Support semantic superpixel generation.
- [x] Support semantic superpoint generation.
- [ ] Add evaluation details.
- [ ] Add training details.
## Citation
If you find this work helpful, please kindly consider citing our paper:
```bibtex
@inproceedings{liu2023segment,
title = {Segment Any Point Cloud Sequences by Distilling Vision Foundation Models},
author = {Liu, Youquan and Kong, Lingdong and Cen, Jun and Chen, Runnan and Zhang, Wenwei and Pan, Liang and Chen, Kai and Liu, Ziwei},
booktitle = {Advances in Neural Information Processing Systems},
year = {2023},
}
```
```bibtex
@misc{liu2023segment_any_point_cloud,
title = {The Segment Any Point Cloud Codebase},
author = {Liu, Youquan and Kong, Lingdong and Cen, Jun and Chen, Runnan and Zhang, Wenwei and Pan, Liang and Chen, Kai and Liu, Ziwei},
howpublished = {\url{https://github.com/youquanl/Segment-Any-Point-Cloud}},
year = {2023},
}
```
## License
