# ICLR 2026: EvoQuality

### Self-Evolving Vision-Language Models for Image Quality Assessment via Voting and Ranking
[![arXiv](https://img.shields.io/badge/arXiv-2509.25787-b31b1b.svg)](https://arxiv.org/pdf/2509.25787) [![ICLR 2026](https://img.shields.io/badge/ICLR-2026-blue.svg)](https://openreview.net/forum?id=INOi0YqI8p) [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-ByteDance%2FEvoQuality-yellow)](https://huggingface.co/ByteDance/EvoQuality)
***

## Abstract

**EvoQuality** is a self-supervised framework that enables a Vision-Language Model (VLM) to autonomously refine its image quality perception capability via pairwise majority voting and **GRPO**-based iterative evolution, without requiring any ground-truth labels.
Figure 1: Overview of the EvoQuality Framework
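
To make the pairwise majority voting idea concrete, here is a minimal sketch, not the repository's implementation. It assumes the model has been sampled several times on each image pair and that every sample is reduced to a preference string (`"A"` or `"B"`). The function names `majority_vote` and `pseudo_labels`, the file names, and the input layout are hypothetical. The actual reward logic lives in `verl/verl/workers/reward_manager/ttrl_pair.py` and may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def majority_vote(votes: List[str]) -> Tuple[str, float]:
    """Return the most frequent answer and the fraction of votes it received."""
    if not votes:
        raise ValueError("votes must be non-empty")
    winner, count = Counter(votes).most_common(1)[0]
    return winner, count / len(votes)


def pseudo_labels(samples: Dict[Pair, List[str]]) -> Dict[Pair, Tuple[str, float]]:
    """Turn repeated pairwise preferences into one pseudo-label per image pair."""
    return {pair: majority_vote(votes) for pair, votes in samples.items()}


# Hypothetical sampled preferences: "A" means the first image looks better.
samples = {
    ("img_a.png", "img_b.png"): ["A", "A", "B", "A"],
    ("img_a.png", "img_c.png"): ["B", "B", "B", "A"],
}
labels = pseudo_labels(samples)
```

Under these assumptions, the voted preference serves as a label-free training target, and the agreement fraction gives a rough sense of how consistent the model's own judgments are for that pair.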
***

## Model

The pre-trained EvoQuality model weights are publicly available on Hugging Face:

- 🤗 **[ByteDance/EvoQuality](https://huggingface.co/ByteDance/EvoQuality)**

You can download the model directly via `git lfs` or the `huggingface_hub` Python API:

```bash
git lfs install
git clone https://huggingface.co/ByteDance/EvoQuality
```

```python
from huggingface_hub import snapshot_download

snapshot_download(repo_id="ByteDance/EvoQuality", local_dir="./EvoQuality")
```

***

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/your-organization/evoquality.git
cd evoquality/verl
```

### 2. Create a virtual environment

```bash
conda create -n evoquality python=3.9 -y
conda activate evoquality
```

### 3. Install dependencies

We provide a setup script:

```bash
bash setup.sh
```

***

## Quick Start

### 1. Prepare Data

Place your dataset in the `data/` directory. The expected structure is:

```
data/
└── YourDataset/
    ├── train.parquet
    ├── test.parquet
    └── data.csv
```

### 2. Training

To train EvoQuality:

```bash
cd examples/ttrl/Qwen-Pair/

# Set environment variables or modify the script directly
export DATA_LOCAL_DIR="./data"
export BACKBONE_PATH="./model/Qwen2.5-VL-7B-Instruct"
export WANDB_PROJECT="EvoQuality"

bash train.sh
```

### 3. Merge Checkpoint

After training, merge the FSDP checkpoint into Hugging Face format:

```bash
cd ../../..  # Go back to verl/ directory

python3 -m verl.model_merger merge \
    --backend fsdp \
    --local_dir /path/to/your/checkpoint/global_step_XXX/actor \
    --target_dir /path/to/merged/hf/model
```

Example:

```bash
python3 -m verl.model_merger merge \
    --backend fsdp \
    --local_dir ./ckpts/EvoQuality/0318/Experiment-grpo-000000/global_step_300/actor \
    --target_dir ./merged_hf_model/Qwen25-Finetuned-Pair-300
```

### 4. Inference

#### Batch Inference for Single Image Quality Assessment

```bash
cd scripts/

python batch_image_inference.py \
    --root_checkpoint /path/to/merged/hf/model \
    --root_data /path/to/image/directory \
    --data /path/to/data.csv \
    --save_result_root ./results
```

#### Batch Inference for Pairwise Comparison with Voting

```bash
python batch_image_pair_inference_voting.py \
    --root_checkpoint /path/to/merged/hf/model \
    --root_data /path/to/image/directory \
    --data /path/to/pair_data.csv \
    --save_result_root ./results \
    --n_prompt 64
```

***

## Project Structure

```
EvoQuality/
├── verl/
│   ├── examples/
│   │   └── ttrl/
│   │       └── Qwen-Pair/
│   │           └── train.sh                        # Training script
│   ├── scripts/
│   │   ├── batch_image_inference.py                # Single image inference
│   │   └── batch_image_pair_inference_voting.py    # Pairwise inference
│   ├── verl/
│   │   ├── workers/
│   │   │   └── reward_manager/
│   │   │       └── ttrl_pair.py                    # Pairwise reward manager
│   │   └── utils/
│   │       └── reward_score/
│   │           └── ttrl/                           # TTRL metrics and utils
│   ├── setup.sh                                    # Main setup script
│   └── setup.py                                    # Package setup
└── README.md
```

***

## Citation

If you find EvoQuality useful for your research, please cite our paper:

```bibtex
@article{wen2025selfevolving,
  title={Self-Evolving Vision-Language Models for Image Quality Assessment via Voting and Ranking},
  author={Wen, Wen and Zhi, Tianwu and Fan, Kanglong and Li, Yang and Peng, Xinge and Zhang, Yabin and Liao, Yiting and Li, Junlin and Zhang, Li},
  journal={arXiv preprint arXiv:2509.25787},
  year={2025}
}
```

***

## License Agreement

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated content, granting you the freedom to use it while ensuring that your usage complies with the provisions of this license.
You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the [license](https://www.apache.org/licenses/LICENSE-2.0).

***

## Acknowledgements

We would like to thank the contributors to the [verl](https://github.com/verl-project/verl), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [transformers](https://github.com/huggingface/transformers), [vLLM](https://github.com/vllm-project/vllm), [PEFT](https://github.com/huggingface/peft) and [HuggingFace](https://huggingface.co) repositories, for their open research.