# ROMP
**Repository Path**: bzzhang/ROMP
## Basic Information
- **Project Name**: ROMP
- **Description**: No description available
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 0
- **Created**: 2021-07-30
- **Last Updated**: 2025-08-01
## README
# Monocular, One-stage, Regression of Multiple 3D People
[Google Colab demo](https://colab.research.google.com/drive/1oz9E6uIbj4udOPZvA1Zi9pFx0SWH_UXg)
[arXiv paper](https://arxiv.org/pdf/2008.12272v3.pdf)
[Papers with Code: 3D human pose estimation on 3DPW](https://paperswithcode.com/sota/3d-human-pose-estimation-on-3dpw?p=centerhmr-a-bottom-up-single-shot-method-for)
[Papers with Code: 3D human pose estimation on 3D Poses in the Wild](https://paperswithcode.com/sota/3d-human-pose-estimation-on-3d-poses-in-the?p=centerhmr-a-bottom-up-single-shot-method-for)
ROMP is a one-stage network for multi-person 3D mesh recovery from a single image.
> [**Monocular, One-stage, Regression of Multiple 3D People**](https://arxiv.org/abs/2008.12272),
> Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei,
> *arXiv paper ([arXiv 2008.12272](https://arxiv.org/abs/2008.12272))*
Contact: [yusun@stu.hit.edu.cn](mailto:yusun@stu.hit.edu.cn). Feel free to contact me for related questions or discussions!
- **Simple:** ROMP simultaneously predicts the body center locations and the corresponding 3D body mesh parameters for all people at each pixel (a conceptual sketch follows this list).
- **Fast:** The ResNet-50 version of ROMP runs at over *30* FPS on a 1070Ti GPU.
- **Strong:** ROMP achieves superior performance on multiple challenging multi-person/occlusion benchmarks, including 3DPW, CMU Panoptic, and 3DOH50K.
- **Easy to use:** We provide a user-friendly testing API and webcam demos.
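To make the "Simple" point concrete, here is a minimal, hypothetical sketch of center-based sampling, not ROMP's actual implementation: detect peaks on a body-center heatmap, then read out the mesh parameters regressed at those same pixels. The function name, threshold, and tensor shapes are illustrative assumptions.
```python
import torch
import torch.nn.functional as F

def sample_params_at_centers(center_map, param_map, threshold=0.3):
    """center_map: (H, W) body-center confidence heatmap.
    param_map: (C, H, W) per-pixel mesh-parameter predictions.
    Returns a (num_people, C) tensor, one parameter vector per detection."""
    # A pixel counts as a center if it is the maximum of its 3x3
    # neighborhood and its confidence exceeds the threshold.
    pooled = F.max_pool2d(center_map[None, None], kernel_size=3,
                          stride=1, padding=1)[0, 0]
    peaks = (center_map == pooled) & (center_map > threshold)
    ys, xs = torch.nonzero(peaks, as_tuple=True)
    # Gather the parameter vector predicted at each detected center.
    return param_map[:, ys, xs].T
```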
### News
*2021/7/15: Adding support for an elegant context manager to run code in a notebook.* See [Colab demo](https://colab.research.google.com/drive/1oz9E6uIbj4udOPZvA1Zi9pFx0SWH_UXg) for the details.
*2021/4/19: Adding support for textured SMPL mesh using [vedo](https://github.com/marcomusy/vedo).* See [visualization.md](docs/visualization.md) for the details.
*2021/3/30: 1.0 version.* Rebuilding the code. Releasing the ResNet-50 version and the evaluation on 3DPW.
*2020/11/26: Optimization for person-person occlusion.* Small changes for video support.
*2020/9/11: Real-time webcam demo using local/remote server.* Please refer to [config_guide.md](docs/config_guide.md) for details.
*2020/9/4: Google Colab demo.* Saving an .npy file per image. Please refer to [config_guide.md](docs/config_guide.md) for details.
### Try on Google Colab
Before installing anything, you can give the prepared [Google Colab demo](https://colab.research.google.com/drive/1oz9E6uIbj4udOPZvA1Zi9pFx0SWH_UXg) a try.
It lets you run the project in the cloud, free of charge.
Please refer to [bugs.md](docs/bugs.md) for known issues, and feel free to open an issue for any bugs you encounter.
### Installation
Please refer to [install.md](docs/install.md) for installation.
### Demo
Currently, the released code reproduces the demo results and needs only 1-2 GB of GPU memory.
To do this, just run
```bash
cd ROMP/src
sh run.sh
# if the shell script has any problems, run the following command directly instead:
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/single_image.yml
```
Results will be saved in `ROMP/demo/images_results`.
#### Internet images
You can also run the code on arbitrary internet images by putting them under `ROMP/demo/images`.
Please refer to [config_guide.md](docs/config_guide.md) for **saving the estimated mesh/Center maps/parameters dict**.
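As a hypothetical illustration of working with the saved outputs, a parameters dict written as an `.npz` archive can be inspected with NumPy as below; the file name `results.npz` and the `results` key are placeholder assumptions, so check [config_guide.md](docs/config_guide.md) for what your configuration actually writes.
```python
import numpy as np

# Placeholder path and key; substitute the names your config writes.
data = np.load('../demo/images_results/results.npz', allow_pickle=True)
params = data['results'][()]  # unwrap the pickled dict stored in the archive
for name, value in params.items():
    print(name, getattr(value, 'shape', value))
```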
#### Internet videos
You can also run the code on internet videos.
To do this, first change the `input_video_path` in `src/configs/video.yml` to the path of your video. For example, set
```yaml
video_or_frame: True
input_video_path: '../demo/videos/sample_video.mp4' # None
output_dir: '../demo/videos/sample_video_results/'
```
then run
```bash
cd ROMP/src
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/video.yml
```
Results will be saved to `../demo/videos/sample_video_results`.
#### Batch Videos
You can also batch process a directory of videos.
Please refer to [batch_videos.md](docs/batch_videos.md) for more info.
###### Unix
```shell
python lib/utils/batch_videos.py --input=/home/user/Animations/mocap/cleaned --output=/home/user/Animations/mocap/cleaned/processed --extension mp4 --run_conversion --yaml_template=configs/video-batch.yml
```
###### Windows
```sh
python lib/utils/batch_videos.py --input=M:/Animations/mocap/cleaned --output=M:/Animations/mocap/cleaned/processed --extension mp4 --windows --run_conversion --yaml_template=configs/video-batch.yml
```
#### Webcam
We also provide a webcam demo, which runs in real time on a 1070Ti GPU / remote server.
Currently, limited by the visualization pipeline, the webcam demo only supports single-person mesh visualization.
To run it:
```bash
cd ROMP/src
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/webcam.yml
# or try the model with a ResNet-50 backbone:
CUDA_VISIBLE_DEVICES=0 python core/test.py --gpu=0 --configs_yml=configs/webcam_resnet.yml
```
Press Up/Down to end the demo. Please refer to [config_guide.md](docs/config_guide.md) for running the webcam demo on a remote server, setting the mesh color, or choosing the camera id. A stripped-down sketch of the loop's structure follows.
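For orientation only, the webcam demo boils down to a capture-infer-render loop like the sketch below; the real pipeline lives in `core/test.py` and is configured by `webcam.yml`, so the camera id, exit key, and the `model(frame)` call here are assumptions.
```python
import cv2

cap = cv2.VideoCapture(0)  # camera id; the real demo reads this from the yml
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # outputs = model(frame)  # ROMP would regress and render the mesh here
    cv2.imshow('ROMP webcam demo', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # the actual demo exits on Up/Down
        break
cap.release()
cv2.destroyAllWindows()
```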
### Blender
##### Export to Blender FBX
Please refer to [export.md](docs/export.md) to export the results to FBX files for Blender. Currently, this function only supports single-person videos, so please test it with `../demo/videos/sample_video2_results/sample_video2.mp4`, whose results will be saved to `../demo/videos/sample_video2_results`.
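Once an FBX has been exported, it can be pulled into Blender through Blender's Python API; the sketch below assumes a hypothetical output file name, so adjust the path to whatever [export.md](docs/export.md) actually produces. Run it from Blender's scripting tab or via `blender --python <script>.py`.
```python
import bpy  # only available inside Blender's bundled Python

# Hypothetical output path; substitute the file the export step actually wrote.
fbx_path = '../demo/videos/sample_video2_results/sample_video2.fbx'
bpy.ops.import_scene.fbx(filepath=fbx_path)
print('Imported:', [obj.name for obj in bpy.context.selected_objects])
```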
##### Blender Addons
- [vltmedia/QuickMocap-BlenderAddon](https://github.com/vltmedia/QuickMocap-BlenderAddon): a Blender add-on for importing and cleaning mocap pose data from `.npz` or `.pkl` files created by Numpy, ROMP, or other motion-capture processes that package their files accordingly.
  - Reads the `.npz` file created by ROMP, then cleans and smooths the resulting keyframes.
### Evaluation
Please refer to [evaluation.md](docs/evaluation.md) for evaluation on benchmarks.
## TODO LIST
The code will be gradually open sourced according to the following schedule:
- [x] demo code for internet images / videos / webcam
- [x] runtime optimization
- [x] benchmark evaluation
- [ ] training
## Citation
Please consider citing:
```bibtex
@InProceedings{ROMP,
  author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao},
  title = {Monocular, One-stage, Regression of Multiple 3D People},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2021}
}
```
## Acknowledgement
We thank [Peng Cheng](https://github.com/CPFLAME) for his constructive comments on Center map training.
Thanks to [Marco Musy](https://github.com/marcomusy) for his help in [the textured SMPL visualization](https://github.com/marcomusy/vedo/issues/371).
Thanks to [Gavin Gray](https://github.com/gngdb) for adding support for an elegant context manager to run code in a notebook via [this pull](https://github.com/Arthur151/ROMP/pull/58).
Thanks to [VLT Media](https://github.com/vltmedia) for adding support for running on Windows & batch_videos.py.
Here are some great resources we benefit from:
- The SMPL model and layers are borrowed from the MPII [SMPL-X model](https://github.com/vchoutas/smplx).
- Webcam pipeline is borrowed from [minimal-hand](https://github.com/CalciferZh/minimal-hand).
- Some functions are borrowed from [HMR-pytorch](https://github.com/MandyMo/pytorch_HMR).
- Some functions for data augmentation are borrowed from [SPIN](https://github.com/nkolot/SPIN).
- Synthetic occlusion is borrowed from [synthetic-occlusion](https://github.com/isarandi/synthetic-occlusion).
- The evaluation code for the 3DPW dataset comes from [3dpw-eval](https://github.com/aymenmir1/3dpw-eval).
- For fair comparison, the GT annotations of the 3DPW dataset come from [VIBE](https://github.com/mkocabas/VIBE).
- 3D mesh visualization is supported by [vedo](https://github.com/marcomusy/vedo) and [Open3D](https://github.com/intel-isl/Open3D).