VIDIT Dataset

dataset_info

features
name	dtype
scene	string
image	image
depth_map	image
direction	string
temprature	int32
caption	string

splits
name	num_bytes	num_examples
train	20575644792	12000

download_size	20108431280
dataset_size	20575644792

This is a version of the VIDIT dataset prepared for training ControlNet with depth-map conditioning. VIDIT contains 390 different Unreal Engine scenes, each captured under 40 illumination settings, for a total of 15,600 images. The illumination settings are all combinations of 5 color temperatures (2500K, 3500K, 4500K, 5500K and 6500K) and 8 light directions (N, NE, E, SE, S, SW, W, NW). The original image resolution is 1024x1024. This version includes only the training split, which covers 300 scenes (12,000 images). Captions were generated using BLIP-2 with the Flan-T5-XXL language model. Depth maps were generated using a GLPN model fine-tuned on NYUv2.
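The counts in the description can be checked with a few lines of Python. This is a minimal sketch: the constant and function names are illustrative and not part of the dataset, and the values are copied from the text above.

```python
from itertools import product

# Values taken from the dataset description; the names are hypothetical.
COLOR_TEMPERATURES_K = (2500, 3500, 4500, 5500, 6500)
LIGHT_DIRECTIONS = ("N", "NE", "E", "SE", "S", "SW", "W", "NW")
TOTAL_SCENES = 390   # scenes in the full VIDIT dataset
TRAIN_SCENES = 300   # scenes in the training split included here


def illumination_settings():
    """Return every (temperature, direction) pair, one per illumination setting."""
    return list(product(COLOR_TEMPERATURES_K, LIGHT_DIRECTIONS))


settings = illumination_settings()
images_total = TOTAL_SCENES * len(settings)
images_train = TRAIN_SCENES * len(settings)

print(len(settings), images_total, images_train)  # prints: 40 15600 12000
```

The training-split figure matches the num_examples value of 12000 listed in the split metadata.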

Examples with varying direction

[Image: varying direction]

Examples with varying color temperature

[Image: varying color temperature]

Disclaimer

I do not own any of this data.
