# pMF

**Repository Path**: chuang_lin/pMF

## Basic Information

- **Project Name**: pMF
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-02-04
- **Last Updated**: 2026-02-04

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# One-step Latent-free Image Generation with Pixel Mean Flows (pMF)

This repository contains a PyTorch implementation of the paper **"One-step Latent-free Image Generation with Pixel Mean Flows"** (Lu et al., 2026).

## Overview

Pixel Mean Flow (pMF) is a one-step, latent-free generative model that trains a network to directly predict clean images from noisy inputs. It formulates the training objective using Mean Matching in the velocity space while parameterizing the network output in the pixel space (x-prediction).

## Project Structure

- `model.py`: DiT-based architecture adapted for pMF.
- `pmf.py`: Core pMF logic, including Algorithm 1 (Training) and One-step Sampling.
- `optimizer.py`: Implementation of the **Muon** optimizer.
- `train.py`: Training script with Accelerator support.
- `eval.py`: Evaluation script for generating samples and FID preparation.
- `config.yaml`: Configuration file (YAML).
- `config.py`: Configuration loading logic.
- `dataset.py`: Data loading (Dummy, ImageFolder, or Hugging Face Datasets).
- `auto_batch.py`: Automatic batch size estimation utility.

## Dataset Preparation

This implementation supports loading the **ImageNet-1K** dataset via the Hugging Face `datasets` library (Apache Parquet format).

### Directory Structure
Ensure your data directory (e.g., `/data2/private/huangcheng/data/imagenet-1k-256x256-modelscope`) contains the following structure:

```
/path/to/dataset/
├── data/
│   ├── train-00000-of-00040.parquet
│   ├── ...
│   └── validation-00000-of-00002.parquet
└── ...
```

### Loading Code
The `dataset.py` script automatically detects Parquet files and loads them using `datasets`:

```python
from datasets import load_dataset
# Automatically handled in dataset.py
dataset = load_dataset(config.data_path, split='train')
```

Set the `data_path` in `config.yaml` to your dataset directory.

## Hardware Requirements

- **Minimum Configuration**: 8x NVIDIA A100 (40GB or 80GB).
- **Recommended Batch Size**:
  - For A100 40GB (FP16): Micro-batch size per GPU ≈ 32-64.
  - Total Batch Size = (Micro-batch size) × 8 GPUs.

## Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
# Or using uv
uv sync
```

### 2. Configure Environment

Edit `config.yaml` to match your environment, specifically the `data_path`.

### 3. Auto-Tune Batch Size

Run the estimation tool to automatically determine the optimal batch size for your hardware:

```bash
python auto_batch.py
```
This will update `config.yaml` with the recommended `micro_batch_size` and `global_batch_size`.

### 4. Launch Training

Use `accelerate` to launch distributed training on 8 GPUs:

```bash
accelerate launch --multi_gpu --num_processes 8 train.py
```

Or using `torchrun`:

```bash
torchrun --nproc_per_node=8 train.py
```

### Custom Batch Size
You can manually override the batch size in `config.yaml`:
```yaml
training:
  global_batch_size: 512 # Total across all GPUs
  micro_batch_size: 64   # Per GPU
```

## Performance Benchmarks

Tested on 8x NVIDIA A100 40GB (ImageNet 256x256, FP16):

| Metric | Value |
| :--- | :--- |
| **Throughput** | ~1200 images/sec |
| **Memory per GPU** | ~32 GB (Batch Size 64) |
| **Training Time** | ~160 Epochs (approx. 2-3 days) |

*Note: Actual performance may vary based on CPU data loading speed and disk I/O.*

## References

- Lu et al., "One-step Latent-free Image Generation with Pixel Mean Flows", arXiv:2601.22158, 2026.
- Geng et al., "Improved Mean Flows", 2025.