# DR-I2V

**Repository Path**: deng-gang/DR-I2V

## Basic Information

- **Project Name**: DR-I2V
- **Description**: ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation"
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-09-22
- **Last Updated**: 2023-10-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# I2V-GAN

This repository is the official PyTorch implementation of the ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation".

[[arXiv]](https://arxiv.org/abs/2108.00913) [[ACM DL]](https://dl.acm.org/doi/10.1145/3474085.3475445)

#### Traffic I2V Example:

Download a pretrained model from [Baidu Netdisk](https://pan.baidu.com/s/1tKpsENwnUEaSdsCvnBzm8Q?pwd=Traf) [Access code: `Traf`] or [Google Drive](https://drive.google.com/file/d/1jpSmMvAqjffEnWzPLD1OR8aODoOmG4vy/view?usp=sharing).

#### Monitoring I2V Example:

#### Flower Translation Example:

## Introduction

### Abstract

Human vision is often adversely affected by complex environmental factors, especially in night vision scenarios. Infrared cameras, which detect infrared radiation in the surrounding environment, are therefore often used to enhance visibility, but infrared videos are undesirable because they lack detailed semantic information. In such cases, an effective video-to-video translation method from the infrared domain to the visible domain is strongly needed, one that overcomes the large intrinsic gap between the two. Our work proposes an infrared-to-visible (I2V) video translation method, I2V-GAN, which generates fine-grained and spatio-temporally consistent visible-light videos from unpaired infrared videos. The backbone network follows Cycle-GAN and Recycle-GAN.
Technically, our model capitalizes on three types of constraints: an adversarial constraint to generate synthetic frames that are similar to real ones; a cyclic consistency constraint, with the introduced perceptual loss, for effective content conversion as well as style preservation; and a similarity constraint across and within domains to enhance content and motion consistency in both the spatial and temporal spaces at a fine-grained level.

### IRVI Dataset

Download from [Baidu Netdisk](https://pan.baidu.com/s/1og7bcuVDModuBJhEQXWPxg?pwd=IRVI) [Access code: `IRVI`] or [Google Drive](https://drive.google.com/file/d/1ZcJ0EfF5n_uqtsLc7-8hJgTcr2zHSXY3/view?usp=sharing).

#### Data Structure

| Subset | Scene | Train | Test | Total frames | Subset total |
| --- | --- | --- | --- | --- | --- |
| Traffic | - | 17000 | 1000 | 18000 | 18000 |
| Monitoring | sub-1 | 1384 | 347 | 1731 | 6352 |
| | sub-2 | 1040 | 260 | 1300 | |
| | sub-3 | 1232 | 308 | 1540 | |
| | sub-4 | 672 | 169 | 841 | |
| | sub-5 | 752 | 188 | 940 | |
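
As a quick sanity check on the table above, the per-scene train and test counts can be summed to confirm the listed totals. This is only a minimal sketch: the counts are copied from the table, while the names `IRVI_SPLITS` and `subset_totals` are hypothetical helpers for illustration and are not part of this repository.

```python
# Frame counts copied from the table above, as (train, test) per scene.
# IRVI_SPLITS and subset_totals are illustrative names, not repository code.
IRVI_SPLITS = {
    "Traffic": {"all": (17000, 1000)},
    "Monitoring": {
        "sub-1": (1384, 347),
        "sub-2": (1040, 260),
        "sub-3": (1232, 308),
        "sub-4": (672, 169),
        "sub-5": (752, 188),
    },
}


def subset_totals(splits):
    """Return {subset: total frames} by summing train and test counts."""
    return {
        subset: sum(train + test for train, test in scenes.values())
        for subset, scenes in splits.items()
    }


totals = subset_totals(IRVI_SPLITS)
for subset, total in totals.items():
    print(f"{subset}: {total} frames")
print(f"All listed subsets: {sum(totals.values())} frames")
```

Running it should reproduce the subset totals listed in the table (18000 for Traffic and 6352 for Monitoring).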