# TimeMaster
**Repository Path**: ring24/TimeMaster
## Basic Information
- **Project Name**: TimeMaster
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-06-27
- **Last Updated**: 2025-06-27
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning
`TimeMaster` is a reinforcement‑learning‑enhanced framework for training **time‑series multimodal large language models (MLLMs)**. It enables **structured, interpretable reasoning** over visualized time‑series signals and has been evaluated on real‑world tasks such as EMG, ECG and Human Activity Recognition (HAR) using Qwen2.5‑VL‑3B‑Instruct.
# News
- [2025.06.21] SFT model released. See [link](https://huggingface.co/collections/langfeng01/timemaster-68554b6ec27ee539d07a6e40).
- [2025.06.21] Code released.
- [2025.06.16] Our paper on `TimeMaster` released. See [link](https://arxiv.org/abs/2506.13705).
# Table of Contents
- [Overview](#overview)
- [Installation](#installation)
- [1. Set Up Conda Environment](#1-set-up-conda-environment)
- [2. Data Preprocessing](#2-data-preprocessing)
- [3. Model Preparation](#3-model-preparation)
- [RL Training](#rl-training)
- [1. Training](#1-training)
- [2. Evaluation](#2-evaluation)
- [Usage Tips](#usage-tips)
- [Reasoning Example](#reasoning-example)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)
# Overview
`TimeMaster` performs structured reasoning on time series images using reinforcement learning with composite rewards. The framework integrates format, hard, and soft rewards to improve classification, interpretability, and clinical insight generation.

# Installation
## 1. Set Up Conda Environment
```bash
conda create -n timemaster python=3.11 -y
conda activate timemaster
pip3 install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn==2.7.4.post1 --no-build-isolation
pip3 install -e .
pip3 install vllm==0.8.2
pip3 install -r requirements_timemaster.txt
```
## 2. Data Preprocessing
> Currently, we provide the CTU dataset. Additional datasets will be released soon.
To preprocess the dataset, simply run the following script:
```bash
bash example/data_preprocess/ctu.sh
```
After successful execution, the following preprocessed data will be generated:
```
data/ctu_image/
├── images/
├── test/
├── train/
├── dataset_dict.json
├── test.parquet
└── train.parquet
```
## 3. Model Preparation
Download the SFT model from our [TimeMaster's HuggingFace](https://huggingface.co/collections/langfeng01/timemaster-68554b6ec27ee539d07a6e40) using the command below:
```
huggingface-cli download langfeng01/TimeMaster-SFT-Qwen2.5-VL-3B-CTU --local-dir ./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/
```
This will download all model files into the `./checkpoints/` directory.
# RL Training
## 1. Training
We offer two types of training:
1. `TimeMaster` **(SFT + RL)**: RL training initialized from a supervised fine-tuned (SFT) checkpoint. To use this, set `MODEL_PATH=./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU` in the script: [./example/grpo_trainer/run_ctu.sh](./example/grpo_trainer/run_ctu.sh)
2. `TimeMaster` **(RL)**: RL training from scratch using the base model. To use this, set `MODEL_PATH=Qwen/Qwen2.5-VL-3B-Instruct` in the script: [./example/grpo_trainer/run_ctu.sh](./example/grpo_trainer/run_ctu.sh)
After setting the appropriate `MODEL_PATH`, start the RL training by running:
```bash
bash example/grpo_trainer/run_ctu.sh
```
After training, the model checkpoint will be saved in: `./checkpoints/`
## 2. Evaluation
To start evaluation, set `EVAL=True` in the script: [./example/grpo_trainer/run_ctu.sh](./example/grpo_trainer/run_ctu.sh). Then, run the following command:
```bash
bash example/grpo_trainer/run_ctu.sh
```
# Usage Tips
## 1. Additional datasets
`TimeMaster` supports additional datasets beyond CTU, including **EMG**, **ECG**, **HAR**, **RCW**, and **TEE**.
To process these datasets, follow the same data preparation pipeline demonstrated in [example/data_preprocess/ctu.sh](./example/data_preprocess/ctu.sh).
## 2. Reward design
The core reward functions are located in [./verl/utils/reward_score/](./verl/utils/reward_score):
- `ctu.py`: Implements *format* and *accuracy* rewards for the CTU dataset.
- `emg_soft.py`: Demonstrates a composite reward setup with three components — **format**, **accuracy**, and **extension** (the latter using the OpenAI API for soft evaluation).
# Reasoning Example

# Citation
If `TimeMaster` helps your research, we would appreciate it if you could cite our work:
```bibtex
@article{zhang2025timemaster,
title={TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning},
author={Zhang, Junru and Feng, Lang and Guo, Xu and Wu, Yuhan and Dong, Yabo and Xu, Duanqing},
journal={arXiv preprint arXiv:2506.13705},
year={2025}
}
```
# Acknowledgements
We thank the [veRL](https://github.com/volcengine/verl) project for foundational RL infrastructure and [Qwen2-VL-Finetune](https://github.com/2U1/Qwen2-VL-Finetune) project for support in SFT.