# Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

**[[paper (NeurIPS 2025)](#)]** **[[paper (arXiv)](https://arxiv.org/abs/2509.15591)]** **[[website](https://zinanlin.me/blogs/latent_zoning_networks.html)]** **[[code](https://github.com/microsoft/latent-zoning-networks)]**

**Authors:** [Zinan Lin](https://zinanlin.me), [Enshu Liu](https://scholar.google.com/citations?user=0LUhWzoAAAAJ), [Xuefei Ning](https://nics-effalg.com/ningxuefei/), [Junyi Zhu](https://junyizhu-ai.github.io/), [Wenyu Wang](#), [Sergey Yekhanin](https://www.microsoft.com/en-us/research/people/yekhanin/)

**Correspondence to:** [Zinan Lin](https://zinanlin.me) (zinanlin AT microsoft DOT com)

**Abstract:** **Generative modeling, representation learning, and classification** are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain largely disjoint. In this paper, we ask: **Can a unified principle address all three?** Such unification could simplify ML pipelines and foster greater synergy across tasks. We introduce Latent Zoning Network (LZN) as a step toward this goal. At its core, LZN creates a shared Gaussian latent space that encodes information across all tasks. Each data type (e.g., images, text, labels) is equipped with an encoder that maps samples to disjoint latent zones, and a decoder that maps latents back to data. ML tasks are expressed as compositions of these encoders and decoders: for example, label-conditional image generation uses a label encoder and image decoder; image embedding uses an image encoder; classification uses an image encoder and label decoder. We demonstrate the promise of LZN in three increasingly complex scenarios: **(1) LZN can enhance existing models (image generation)**: When combined with the SoTA Rectified Flow model, LZN improves FID on CIFAR10 from 2.76 to 2.59—without modifying the training objective. **(2) LZN can solve tasks independently (representation learning)**: LZN can implement unsupervised representation learning without auxiliary loss functions, outperforming the seminal MoCo and SimCLR methods by 9.3% and 0.2%, respectively, on downstream linear classification on ImageNet. **(3) LZN can solve multiple tasks simultaneously (joint generation and classification)**: With image and label encoders/decoders, LZN performs both tasks jointly by design, improving FID and achieving SoTA classification accuracy on CIFAR10.

## News

* `9/21/2025`: 🚀 The models and training/inference code for **image generation on AFHQ-Cat** and **image embedding trained on ImageNet** have been released! Due to the sensitive nature of the datasets, the remaining models and code are undergoing an internal review process and will be released at a later date. Stay tuned!
  * Code: https://github.com/microsoft/latent-zoning-networks
  * Models: https://huggingface.co/microsoft/latent-zoning-networks
* `9/21/2025`: The paper is released [here](https://arxiv.org/abs/2509.15591).
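To make the composition idea from the abstract concrete, here is a minimal, purely illustrative Python sketch. All names, shapes, and functions below are hypothetical and are **not** part of this repository; in LZN the encoders and decoders are learned networks over a shared Gaussian latent space, whereas the stubs here only mimic the interfaces.

```python
# Conceptual sketch (not repository code): tasks as compositions of per-modality
# encoders and decoders over one shared latent space. All names are hypothetical.
import numpy as np

LATENT_DIM = 8  # dimensionality of the shared latent space (arbitrary here)

def image_encoder(image: np.ndarray) -> np.ndarray:
    """Map an image to a point in the shared latent space (stub)."""
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    return rng.standard_normal(LATENT_DIM)

def label_encoder(label: int) -> np.ndarray:
    """Map a class label to the center of its latent zone (stub)."""
    return np.random.default_rng(label).standard_normal(LATENT_DIM)

def image_decoder(latent: np.ndarray) -> np.ndarray:
    """Map a latent back to image space (stub: a constant-valued image)."""
    return np.full((32, 32, 3), latent.mean(), dtype=np.float32)

def label_decoder(latent: np.ndarray) -> int:
    """Map a latent to the label whose zone center is closest (stub)."""
    centers = np.stack([label_encoder(c) for c in range(10)])
    return int(np.argmin(np.linalg.norm(centers - latent, axis=1)))

# Each task is a composition of the pieces above:
image = np.zeros((32, 32, 3), dtype=np.float32)
embedding = image_encoder(image)                       # representation learning
predicted_label = label_decoder(image_encoder(image))  # classification
generated_image = image_decoder(label_encoder(3))      # label-conditional generation
```

The only point of the sketch is that each task (embedding, classification, conditional generation) is obtained by chaining the appropriate encoder and decoder, rather than by training a separate task-specific model.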
## Environment Setup

We provide a Docker file with the necessary dependencies in [`docker/Dockerfile`](docker/Dockerfile). Alternatively, you can install PyTorch as well as the libraries in [`docker/requirements.txt`](docker/requirements.txt) to set up the environment.

## Distributed Training Configuration

The training code uses Distributed Data Parallelism (DDP) for multi-GPU and multi-node training. In the configuration files, specify the number of GPUs per node using `config.distributed.num_gpus_per_node`.

For **single-node training**, this is all you need—the training script will automatically launch processes on each GPU.

For **multi-node training**, the training command must be executed once on each node, and the following environment variables need to be set:

- `MASTER_ADDR`: The IP address of the master node.
- `MASTER_PORT`: The port number on the master node for inter-node communication.
- `WORLD_SIZE`: The total number of nodes.
- `NODE_RANK`: The rank of the current node, starting from 0 on the master node.

## Unconditional Generative Modeling (Case Study 1)

### AFHQ-Cat

#### Data Preparation

Download the [AFHQ-Cat dataset](https://github.com/clovaai/stargan-v2/blob/master/README.md#animal-faces-hq-dataset-afhq) and put the images in the following folder structure:

```
📦/tmp/data/AFHQ
 ┣ 📂train
 ┃ ┣ 📂cat
 ┃ ┃ ┣ 📜flickr_cat_000002.jpg
 ┃ ┃ ┗ ... (more images)
 ┗ 📂val
 ┃ ┣ 📂cat
 ┃ ┃ ┣ 📜flickr_cat_000008.jpg
 ┃ ┃ ┗ ... (more images)
```

Note that the root folder `/tmp/data/AFHQ` can be moved to any other location, as long as the path matches `config.data.params.afhq_root` in the configuration file [configs/lzn1/case_study_1_afhqcat.py](configs/lzn1/case_study_1_afhqcat.py).

#### Model Training

```
python train.py --config=configs/lzn1/case_study_1_afhqcat.py
```

The results (checkpoints, logs, metrics, generated images, etc.) will be saved in the folder `results/case_study_1_afhqcat`. For example, the images generated with the RK45 sampler, saved in NumPy format, can be found in `results/case_study_1_afhqcat/ema-rk45-random_samples_array/` and `results/case_study_1_afhqcat/ema-rk45-random_samples_more_array`.

#### Model Inference

```
python evaluate.py --config=configs/lzn1/case_study_1_afhqcat.py --config.checkpoint.load_checkpoint=manual --config.checkpoint.path=""
```

where the checkpoint `lzn1/case_study_1_afhqcat/000003000-000060000.pt` can be downloaded from [here](https://huggingface.co/microsoft/latent-zoning-networks/resolve/main/lzn1/case_study_1_afhqcat/000003000-000060000.pt). Similarly to model training, the results (logs, metrics, generated images, etc.) will be saved in the folder `results/case_study_1_afhqcat`.

## Unsupervised Representation Learning (Case Study 2)

### ImageNet

#### Data Preparation

Please download the [ImageNet dataset](http://www.image-net.org/) and place `ILSVRC2012_devkit_t12.tar.gz`, `ILSVRC2012_img_train.tar`, and `ILSVRC2012_img_val.tar` in `/tmp/data/ImageNet`. Note that the root folder `/tmp/data/ImageNet` can be placed elsewhere, as long as the path matches `config.data.params.root` in the configuration file [configs/lzn1/case_study_2.py](configs/lzn1/case_study_2.py).

#### Model Training

```
python train.py --config=configs/lzn1/case_study_2.py
```

The results (checkpoints, logs, image representations, linear classification accuracies, etc.) will be saved in the folder `results/case_study_2`. For example, the image representations of the validation set can be found in `results/case_study_2/ema-representation_no_head_representations/`.
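The saved validation-set representations can also be probed with a simple linear classifier outside the training pipeline. Below is a minimal, hypothetical sketch assuming the features and labels have been exported as NumPy arrays; `features.npy` and `labels.npy` are made-up file names, and the actual layout of the files under `results/case_study_2/ema-representation_no_head_representations/` may differ.

```python
# Hypothetical linear-probe evaluation of exported representations.
# Assumes features/labels were saved as NumPy arrays; adapt paths and
# loading logic to the actual output format of the training script.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

features = np.load("features.npy")  # shape: (num_images, feature_dim) -- hypothetical file
labels = np.load("labels.npy")      # shape: (num_images,)             -- hypothetical file

# Hold out part of the data to measure linear separability of the features.
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000)
probe.fit(x_train, y_train)
print("linear-probe accuracy:", probe.score(x_test, y_test))
```

Note that the training script already reports linear classification accuracies; this sketch is only useful if you want to rerun or customize the probe yourself.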
#### Model Inference

```
python evaluate.py --config=configs/lzn1/case_study_2.py --config.checkpoint.load_checkpoint=manual --config.checkpoint.path=""
```

where the checkpoint `lzn1/case_study_2/000032051-005000000.pt` can be downloaded from [here](https://huggingface.co/microsoft/latent-zoning-networks/resolve/main/lzn1/case_study_2/000032051-005000000.pt). Similarly to model training, the results (logs, image representations, linear classification accuracies, etc.) will be saved in the folder `results/case_study_2`.

## Citation

Please cite the paper if you use this code:

```
@article{lin2025latent,
  title={Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification},
  author={Lin, Zinan and Liu, Enshu and Ning, Xuefei and Zhu, Junyi and Wang, Wenyu and Yekhanin, Sergey},
  journal={arXiv preprint arXiv:2509.15591},
  year={2025}
}
```

## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.