# FF-GAN

**Repository Path**: Junjiagit/FF-GAN

## Basic Information

- **Project Name**: FF-GAN
- **Description**: Fine-grained Cross-modal Fusion based Refinement for Text-to-Image Synthesis
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-10
- **Last Updated**: 2025-10-16

## Categories & Tags

**Categories**: cv

**Tags**: None

## README

## FF-GAN

This repository is the official code for the paper "Fine-grained Cross-modal Fusion based Refinement for Text-to-Image Synthesis" 

### Introduction

### How to use

0. **Requirements** 

```
Python >= 3.6
PyTorch >= 1.0
NVIDIA GPU + CUDA cuDNN
```

1. **Data** 
   1. Download metadata for [birds](https://drive.google.com/open?id=1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ) [coco](https://drive.google.com/open?id=1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9) and save them to your path.
   2. Download the [birds](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) image data.
   3. Download [coco](http://cocodataset.org/#download) dataset.
   4. extract the source data to your path.

2. **Pretrained Models**
   1. the [models file](https://pan.baidu.com/s/1-V2Mp0wmX_tQxl6mOtnKpw) of our FF-GAN which obtain the best performance. CODE:zrE2.
   2. later experiments found that good results often occur between 550 and 650 epoches, we suggest you choose the model in this scope.
3. Training 
   * Modify the parameters in parse to your local path
   * You can modify some parameters in the cfg file

```python
python /FF_GAN/code/main.py --cfg cfg/bird_DMGAN.yml --gpu 0
python /FF_GAN/code/main.py --cfg cfg/coco_DMGAN.yml --gpu 0
```


4. Validation

```python
# image genaration:
python /FF_GAN/code/main.py --cfg cfg/eval_bird.yml --gpu 0
python /FF_GAN/code/main.py --cfg cfg/eval_coco.yml --gpu 0

# FID
python fid_score.py --gpu 0 --batch-size 50 --path1 bird_val.npz --path2 your generated picture path
# R-precision
# You will get the result of R-precision when 30k pictures are generated
python /FF_GAN/code/main.py --cfg cfg/eval_bird.yml --gpu 0
python /FF_GAN/code/main.py --cfg cfg/eval_coco.yml --gpu 0

```
### Example results

* Images of different stages generated by our FF-GAN on CUB-200 and COCO datasets
![Figure_resolution](https://github.com/haoranhfut/FF-GAN/blob/main/code/fig/bird_resolution_box_final.png?raw=true)
* Qualitative results on CUB-200
![Figure_bird](https://github.com/haoranhfut/FF-GAN/blob/main/code/fig/figure_bird_coco.png?raw=true)
* Qualitative results on COCO
![Figure_coco](https://github.com/haoranhfut/FF-GAN/blob/main/code/fig/figure_coco_box.png?raw=true)
* Attention visualization on CUB-200 test sets
![Atten](https://github.com/haoranhfut/FF-GAN/blob/main/code/fig/attention_map.png?raw=true)


### Acknowlegement
The pre-process data and code borrows heavily from [Attn-GAN](https://github.com/taoxugit/AttnGAN) and [DM-GAN](https://github.com/MinfengZhu/DM-GAN), we apprecite the authors for sharing their codes and data.