# SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos


Yuzheng Liu* · Siyan Dong* · Shuzhe Wang · Yanchao Yang · Qingnan Fan · Baoquan Chen

[Paper](https://arxiv.org/abs/2412.09401) | Online Demo (Coming Soon)

SLAM3R is a real-time dense scene reconstruction system that regresses 3D points from video frames using feed-forward neural networks, without explicitly estimating camera parameters.

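To make the idea above concrete, here is a minimal, purely illustrative sketch of such a feed-forward pipeline. All names (`sliding_windows`, `image_to_points`, `local_to_world`) are placeholders chosen for this example and are **not** SLAM3R's actual modules or call signatures; the sketch only shows how per-clip local points could be regressed and then registered into one global scene without estimating camera parameters.

```python
# Conceptual sketch only: the callables below are stand-ins for SLAM3R's
# Image-to-Points and Local-to-World networks, not its real API.
import numpy as np

def sliding_windows(frames, size=5, stride=2):
    """Yield short overlapping clips of consecutive frames."""
    for start in range(0, max(len(frames) - size + 1, 1), stride):
        yield frames[start:start + size]

def reconstruct(frames, image_to_points, local_to_world):
    """Feed-forward dense reconstruction: map each clip to local 3D points,
    then register them into a single world-frame point cloud."""
    scene = []
    for clip in sliding_windows(frames):
        local_pts = image_to_points(clip)             # per-clip points, local frame
        world_pts = local_to_world(local_pts, scene)  # align to the growing scene
        scene.append(world_pts)
    return np.concatenate(scene) if scene else np.empty((0, 3))

# Toy run with dummy "networks" so the sketch executes end to end.
frames = [np.zeros((224, 224, 3)) for _ in range(10)]
dummy_i2p = lambda clip: np.random.rand(len(clip) * 100, 3)
dummy_l2w = lambda pts, scene: pts   # identity "registration" for illustration
print(reconstruct(frames, dummy_i2p, dummy_l2w).shape)   # (1500, 3)
```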
## TODO List

- [x] Release pre-trained weights and inference code.
- [ ] Release Gradio demo.
- [ ] Release evaluation code.
- [ ] Release training code and data.

## Installation

1. Clone SLAM3R

   ```bash
   git clone https://github.com/PKU-VCL-3DV/SLAM3R.git
   cd SLAM3R
   ```

2. Prepare the environment

   ```bash
   conda create -n slam3r python=3.11 cmake=3.14.0
   conda activate slam3r
   # install torch according to your CUDA version
   pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu118
   pip install -r requirements.txt
   # optional: install XFormers according to your PyTorch version, see https://github.com/facebookresearch/xformers
   pip install xformers==0.0.28.post2
   ```

3. Optional: compile the CUDA kernels for RoPE

   ```bash
   cd slam3r/pos_embed/curope/
   python setup.py build_ext --inplace
   cd ../../../
   ```

4. Download the SLAM3R checkpoints for the Image-to-Points and Local-to-World models from Hugging Face

   ```python
   from huggingface_hub import hf_hub_download
   filepath = hf_hub_download(repo_id='siyan824/slam3r_i2p', filename='slam3r_i2p.pth', local_dir='./checkpoints')
   filepath = hf_hub_download(repo_id='siyan824/slam3r_l2w', filename='slam3r_l2w.pth', local_dir='./checkpoints')
   ```

   or from Google Drive: [Image-to-Points model](https://drive.google.com/file/d/1DhBxEmUlo9a6brf5_Z21EWzpX3iKhVce/view?usp=drive_link) and [Local-to-World model](https://drive.google.com/file/d/1LkPZBNz8WlMwxdGvvb1ZS4rKrWO-_aqQ/view?usp=drive_link). Place them under `./checkpoints/`. (A small sanity-check sketch for the downloaded weights is included at the end of this README.)

## Demo

### Replica dataset

To run our demo on the Replica dataset, download the sample scene [here](https://drive.google.com/file/d/1NmBtJ2A30qEzdwM0kluXJOp2d1Y4cRcO/view?usp=drive_link) and unzip it to `./data/Replica/`. Then run the following command to reconstruct the scene from the video images:

```bash
bash scripts/demo_replica.sh
```

The results will be stored in `./visualization/` by default.

### Self-captured outdoor data

We also provide a set of images extracted from an in-the-wild captured video. Download it [here](https://drive.google.com/file/d/1FVLFXgepsqZGkIwg4RdeR5ko_xorKyGt/view?usp=drive_link) and unzip it to `./data/wild/`. Set the required parameters in this [script](./scripts/demo_wild.sh), then run SLAM3R with the following command:

```bash
bash scripts/demo_wild.sh
```

You can run SLAM3R on your own captured videos with the steps above. Here are [some tips](./docs/recon_tips.md) for doing so.

## Citation

If you find our work helpful in your research, please consider citing:

```
@article{slam3r,
  title={SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos},
  author={Liu, Yuzheng and Dong, Siyan and Wang, Shuzhe and Yang, Yanchao and Fan, Qingnan and Chen, Baoquan},
  journal={arXiv preprint arXiv:2412.09401},
  year={2024}
}
```

## Acknowledgments

Our implementation is based on several awesome repositories:

- [CroCo](https://github.com/naver/croco)
- [DUSt3R](https://github.com/naver/dust3r)
- [NICER-SLAM](https://github.com/cvg/nicer-slam)
- [Spann3R](https://github.com/HengyiWang/spann3r)

We thank the respective authors for open-sourcing their code.
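After downloading the weights (Installation, step 4), the following minimal sketch checks that both checkpoint files are in place and loadable. It uses only standard PyTorch and the file names from the download commands above; it is a generic sanity check, not part of the SLAM3R codebase.

```python
# Sanity check for the downloaded SLAM3R weights; plain PyTorch, CPU only.
import os
import torch

for name in ("slam3r_i2p.pth", "slam3r_l2w.pth"):
    path = os.path.join("checkpoints", name)
    assert os.path.isfile(path), f"missing checkpoint: {path}"
    ckpt = torch.load(path, map_location="cpu")  # keep the GPU out of the check
    # Checkpoints are usually dictionaries; list the top-level keys for a quick look.
    top = list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt).__name__
    print(f"{name}: {top}")
```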