# TripoSG
**Repository Path**: mirrors/TripoSG
## Basic Information
- **Project Name**: TripoSG
- **Description**: TripoSG 是一个基于 RF 的 MoE Transformer 3D 生成基础模型
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: https://www.oschina.net/p/triposg
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 2
- **Created**: 2025-04-02
- **Last Updated**: 2025-08-30
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
[](https://yg256li.github.io/TripoSG-Page/)
[](https://arxiv.org/abs/2502.06608)
[](https://huggingface.co/VAST-AI/TripoSG)
[](https://huggingface.co/spaces/VAST-AI/TripoSG)
[](https://huggingface.co/spaces/VAST-AI/TripoSG-scribble)
**By [Tripo](https://www.tripo3d.ai)**

TripoSG is an advanced high-fidelity, high-quality and high-generalizability image-to-3D generation foundation model. It leverages large-scale rectified flow transformers, hybrid supervised training, and a high-quality dataset to achieve state-of-the-art performance in 3D shape generation.
## ✨ Key Features
- **High-Fidelity Generation**: Produces meshes with sharp geometric features, fine surface details, and complex structures
- **Semantic Consistency**: Generated shapes accurately reflect input image semantics and appearance
- **Strong Generalization**: Handles diverse input styles including photorealistic images, cartoons, and sketches
- **Robust Performance**: Creates coherent shapes even for challenging inputs with complex topology
## 🔬 Technical Highlights
- **Large-Scale Rectified Flow Transformer**: Combines RF's linear trajectory modeling with transformer architecture for stable, efficient training
- **Advanced VAE Architecture**: Uses Signed Distance Functions (SDFs) with hybrid supervision combining SDF loss, surface normal guidance, and eikonal loss
- **High-Quality Dataset**: Trained on 2 million meticulously curated Image-SDF pairs, ensuring superior output quality
- **Efficient Scaling**: Implements architecture optimizations for high performance even at smaller model scales
## 🔥 Updates
* [2025-04] Release TripoSG-scribble, a CFG-distilled, 512 token model for fast shape prototyping from scribble+prompt! Try the online demo [here](https://huggingface.co/spaces/VAST-AI/TripoSG-scribble).
* [2025-03] Release of TripoSG 1.5B parameter rectified flow model and VAE trained on 2048 latent tokens, along with inference code and interactive demo
## 🔨 Installation
Clone the repo:
```bash
git clone https://github.com/VAST-AI-Research/TripoSG.git
cd TripoSG
```
Create a conda environment (optional):
```bash
conda create -n tripoSG python=3.10
conda activate tripoSG
```
Install dependencies:
```bash
# pytorch (select correct CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/{your-cuda-version}
# other dependencies
pip install -r requirements.txt
```
## 💡 Quick Start
Generate a 3D mesh from an image:
```bash
python -m scripts.inference_triposg --image-input assets/example_data/hjswed.png --output-path ./output.glb
```
Limiting the number of faces:
```bash
python -m scripts.inference_triposg --image-input assets/example_data/hjswed.png --faces 5000 --output-path ./output.glb
```
or from scribble+prompt:
```bash
python -m scripts.inference_triposg_scribble --image-input assets/example_scribble_data/cat_with_wings.png --prompt "a cat with wings" --scribble-conf 0.3 --output-path output.glb
```
The required model weights will be automatically downloaded:
- TripoSG (image condition) model from [VAST-AI/TripoSG](https://huggingface.co/VAST-AI/TripoSG) → `pretrained_weights/TripoSG`
= TripoSG-scribble (scribble+prompt condition) model from [VAST-AI/TripoSG-scribble](https://huggingface.co/VAST-AI/TripoSG-scribble) → `pretrained_weights/TripoSG-scribble`
- RMBG model from [briaai/RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) → `pretrained_weights/RMBG-1.4`
## 💻 System Requirements
- CUDA-enabled GPU with at least 8GB VRAM
## 📝 Tips
- If you want to use the full VAE module (including the encoder part), you need to uncomment the Line-15 in `triposg/models/autoencoders/autoencoder_kl_triposg.py` and install `torch-cluster`. and run:
```
python -m scripts.inference_vae --surface-input assets/example_data_point/surface_point_demo.npy
```
## 🤝 Community & Support
- **Issues & Discussions**: Use GitHub Issues for bug reports and feature requests.
- **Contributing**: We welcome contributions!
## 📚 Citation
```
@article{li2025triposg,
title={TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models},
author={Li, Yangguang and Zou, Zi-Xin and Liu, Zexiang and Wang, Dehu and Liang, Yuan and Yu, Zhipeng and Liu, Xingchao and Guo, Yuan-Chen and Liang, Ding and Ouyang, Wanli and others},
journal={arXiv preprint arXiv:2502.06608},
year={2025}
}
```
## ⭐ Acknowledgements
We would like to thank the following open-source projects and research works that made TripoSG possible:
- [DINOv2](https://github.com/facebookresearch/dinov2) for their powerful visual features
- [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) for background removal
- [🤗 Diffusers](https://github.com/huggingface/diffusers) for their excellent diffusion model framework
- [HunyuanDiT](https://github.com/Tencent/HunyuanDiT) for DiT
- [FlashVDM](https://github.com/Tencent/FlashVDM) for their lightning vecset decoder
- [3DShape2VecSet](https://github.com/1zb/3DShape2VecSet) for 3D shape representation
We are grateful to the broader research community for their open exploration and contributions to the field of 3D generation.