# BANMo
#### [[Webpage]](https://banmo-www.github.io/) [[Latest preprint (02/14/2022)]](https://banmo-www.github.io/banmo-2-14.pdf) [[Arxiv]](https://arxiv.org/abs/2112.12761)
### Changelog
- **02/15**: Add motion-retargeting, quantitative evaluation and synthetic data generation/eval.
- **02/17**: Add adaptation to a new video, optimization with known root poses, and pose code visualization.
- **02/23**: Improve NVS with fourier light code, improve uncertainty MLP, add long schedule, minor speed up.
## Install
### Build with conda
We provide two versions.
**A. torch1.10+cu113 (1.4x faster on V100)**
```
# clone repo
git clone git@github.com:facebookresearch/banmo.git --recursive
cd banmo
# install conda env
conda env create -f misc/banmo-cu113.yml
conda activate banmo-cu113
# install pytorch3d (takes minutes), kmeans-pytorch
pip install -e third_party/pytorch3d
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
```
**B. torch1.7+cu110**
```
# clone repo
git clone git@github.com:facebookresearch/banmo.git --recursive
cd banmo
# install conda env
conda env create -f misc/banmo.yml
conda activate banmo
# install kmeans-pytorch
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html
```
### Data
We provide two ways to obtain data.
The easiest way is to download and unzip the pre-processed data as follows.
**Download pre-processed data**
We provide pre-processed data for cat and human.
Download the pre-processed `rgb/mask/flow/densepose` images as follows:
```
# (~8G for each)
bash misc/processed/download.sh cat-pikachiu
bash misc/processed/download.sh human-cap
```
**Download raw videos**
Download the raw videos to the `./raw/` folder:
```
bash misc/vid/download.sh cat-pikachiu
bash misc/vid/download.sh human-cap
bash misc/vid/download.sh dog-tetres
bash misc/vid/download.sh cat-coco
```
**To use your own videos, or to pre-process raw videos into the BANMo format,
please follow the instructions [here](./preprocess).**
### PoseNet weights
Download the pre-trained PoseNet weights for humans and quadrupeds:
```
mkdir -p mesh_material/posenet && cd "$_"
wget $(cat ../../misc/posenet.txt); cd ../../
```
## Demo
This example shows how to reconstruct a cat from 11 videos and a human from 10 videos.
For more examples, see [here](./scripts/README.md).
**Hardware/time for running the demo**
By default, it takes 8 hours on 2 V100 GPUs and 15 hours on 1 V100 GPU.
We provide a [script](./scripts/template-accu.sh) that uses gradient accumulation
to support experiments on fewer GPUs or GPUs with less memory.
**Setting good hyper-parameters for your own videos**
When optimizing your own videos, a rule of thumb is to set
"num gpus" x "batch size" x "accu steps" ~= num frames (the default of 512 suits cat-pikachiu and human-cap).
**Try pre-optimized models**
We provide pre-optimized models and scripts to run mesh extraction and novel view synthesis.
|seqname | download link |
|---|---|
|cat-pikachiu|[.npy](https://www.dropbox.com/s/nc2aawnwrmil8jr/cat-pikachiu.npy), [.pth](https://www.dropbox.com/s/i8sjlgbom5eoy0j/cat-pikachiu.pth)|
|cat-coco|[.npy](https://www.dropbox.com/s/fwf8il8bt9c812f/cat-coco.npy), [.pth](https://www.dropbox.com/s/4g0w6z4xec4f88g/cat-coco.pth)|
```
# download pre-optimized models
mkdir -p tmp && cd "$_"
wget https://www.dropbox.com/s/nc2aawnwrmil8jr/cat-pikachiu.npy
wget https://www.dropbox.com/s/i8sjlgbom5eoy0j/cat-pikachiu.pth
cd ../
seqname=cat-pikachiu
# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 $seqname tmp/cat-pikachiu.pth \
"0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video id separated by space
# argv[5]: resolution of running marching cubes (256 by default)
# render novel views
bash scripts/render_nvs.sh 0 $seqname tmp/cat-pikachiu.pth 5 0
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: path to the weights
# argv[4]: video id used for pose traj
# argv[5]: video id used for root traj
```
#### 1. Optimization
**cat-pikachiu**
```
seqname=cat-pikachiu
# To speed up data loading, we store images as lines of pixels.
# This only needs to be run once per sequence; the data are stored on disk.
python preprocess/img2lines.py --seqname $seqname
# Optimization
bash scripts/template.sh 0,1 $seqname 10001 "no" "no"
# argv[1]: gpu ids separated by comma
# argv[2]: sequence name
# argv[3]: port for distributed training
# argv[4]: use_human, pass "" for human cse, "no" for quadruped cse
# argv[5]: use_symm, pass "" to force x-symmetric shape
# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 $seqname logdir/$seqname-e120-b256-ft3/params_latest.pth \
"0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video id separated by space
# argv[5]: resolution of running marching cubes (256 by default)
```
https://user-images.githubusercontent.com/13134872/154554031-332e2355-3303-43e3-851c-b5812699184b.mp4
**human-cap**
```
seqname=adult7
python preprocess/img2lines.py --seqname $seqname
bash scripts/template.sh 0,1 $seqname 10001 "" ""
bash scripts/render_mgpu.sh 0 $seqname logdir/$seqname-e120-b256-ft3/params_latest.pth \
"0 1 2 3 4 5 6 7 8 9" 256
```
https://user-images.githubusercontent.com/13134872/154554210-3bb0a439-fe46-4ea3-a058-acecf5f8dbb5.mp4
Use more iterations for better color rendering and novel view synthesis results; see `scripts/template-long.sh`.
#### 2. Visualization tools
**Tensorboard**
```
# You may need to set up ssh tunneling to view the tensorboard monitor locally.
screen -dmS "tensorboard" bash -c "tensorboard --logdir=logdir --bind_all"
```
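The comment above only mentions ssh tunneling; a minimal sketch of one way to set it up (the user and host names are placeholders, and 6006 is TensorBoard's default port):
```
# forward the remote TensorBoard port to your local machine,
# then open http://localhost:6006 in a local browser
ssh -N -L 6006:localhost:6006 user@remote-training-machine
```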
**Root pose, rest mesh, bones**
To draw root pose trajectories (+rest shape) over epochs:
```
# logdir
logdir=logdir/$seqname-e120-b256-init/
# first_idx and last_idx specify which frames to draw
python scripts/visualize/render_root.py --testdir $logdir --first_idx 0 --last_idx 120
```
Find the output at `$logdir/mesh-cam.gif`.
During optimization, the rest mesh and bones at each epoch are saved at `$logdir/*rest.obj`.
https://user-images.githubusercontent.com/13134872/154553887-1871fdea-24f4-4a79-8689-86ff6af7fa52.mp4
**Correspondence/pose code**
To visualize 2d-2d and 2d-3d matchings of the latest epoch weights:
```
# 2d matches between frame 0 and 100 via 2d->feature matching->3d->geometric warp->2d
bash scripts/render_match.sh $logdir/params_latest.pth "0 100" "--render_size 128"
```
- 2d-2d matches will be saved to `tmp/match_%03d.jpg`.
- 2d-3d feature matches of frame 0 will be saved to `tmp/match_line_pred.obj`.
- 2d-3d geometric warps of frame 0 will be saved to `tmp/match_line_exp.obj`.
- The near plane of frame 0 will be saved to `tmp/match_plane.obj`.
- The pose code visualization will be saved to `tmp/code.mp4`.
https://user-images.githubusercontent.com/13134872/154553652-c93834db-cce2-4158-a30a-21680ab46a63.mp4
**Render novel views**
Render novel views in the canonical camera coordinate frame:
```
bash scripts/render_nvs.sh 0 $seqname logdir/$seqname-e120-b256-ft3/params_latest.pth 5 0
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: path to the weights
# argv[4]: video id used for pose traj
# argv[5]: video id used for root traj
```
Results will be saved at `logdir/$seqname-e120-b256-ft3/nvs*.mp4`.
https://user-images.githubusercontent.com/13134872/155441493-38bf7a02-a6ee-4f2f-9dc5-0cf98a4c7c45.mp4
### Common install issues
* Q: pyrender reports `ImportError: Library "GLU" not found.`
  * A: install it with `sudo apt install freeglut3-dev`.
* Q: ffmpeg reports `libopenh264.so.5` not found.
  * A: install the system ffmpeg with `sudo apt-get install ffmpeg` and remove `~/anaconda/envs/banmo/bin/ffmpeg` so the system binary is picked up, as shown below.
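The ffmpeg workaround above written out as commands (the anaconda path is the one from the note; adjust it to match your own conda installation):
```
# install the system ffmpeg
sudo apt-get install ffmpeg
# remove the conda-bundled ffmpeg so the system binary is picked up instead
# (adjust the path to your conda installation)
rm ~/anaconda/envs/banmo/bin/ffmpeg
```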
### Note on arguments
- use `--use_human` for human reconstruction, otherwise it assumes quadruped animals
- use `--full_mesh` at mesh extraction time to extract a complete surface (disable visibility check)
- use `--queryfw` at mesh extraction time to extract forward articulated meshes, which only needs to run marching cubes once.
- use `--use_cc` to keep the largest connected component of the rest mesh when setting the object bounds and near-far plane (turned on by default); use `--nouse_cc` to turn it off for disconnected objects such as hands.
- use `--debug` to print out the rough time each component takes.
### Acknowledgement
Volume rendering code is borrowed from [Nerf_pl](https://github.com/kwea123/nerf_pl).
Flow estimation code is adapted from [VCN-robust](https://github.com/gengshan-y/rigidmask).
Other external repos:
- [Detectron2](https://github.com/facebookresearch/detectron2) (modified)
- [SoftRas](https://github.com/ShichenLiu/SoftRas) (modified, for synthetic data generation)
- [Chamfer3D](https://github.com/ThibaultGROUEIX/ChamferDistancePytorch) (for evaluation)
### License
- [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
See the [LICENSE](LICENSE) file.