# conrft_jjy **Repository Path**: xibeisiber/conrft_jjy ## Basic Information - **Project Name**: conrft_jjy - **Description**: fork of https://github.com/cccedric/conrft - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-10-16 - **Last Updated**: 2025-10-21 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![Static Badge](https://img.shields.io/badge/Project-Page-a)](https://cccedric.github.io/conrft/) We provide examples to fine-tune Octo, on the top of [HIL-SERL](https://github.com/rail-berkeley/hil-serl) that provides the base environment to perform robotic manipulation tasks with human interventions. The following sections describe how to use our code. **Table of Contents** - [ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy](#conrft-a-reinforced-fine-tuning-method-for-vla-models-via-consistency-policy) - [🛠️ Installation Instructions](#️-installation-instructions) - [💻 Overview and Code Structure](#-overview-and-code-structure) - [✉️ Contact](#️-contact) - [🙏 Acknowledgement](#-acknowledgement) - [📝 Citation](#-citation) ## 🛠️ Installation Instructions 1. **Setup Conda Environment:** create an environment with ```bash conda create -n conrft python=3.10 ``` 2. **Install Jax as follows:** - For CPU (not recommended): ```bash pip install --upgrade "jax[cpu]" ``` - For GPU: ```bash pip install --upgrade "jax[cuda11_pip]==0.4.20" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html ``` - See the [Jax Github page](https://github.com/google/jax) for more details on installing Jax. 3. **Install the Octo** ```bash git clone git@github.com:cccedric/octo.git cd octo pip install -e . pip install -r requirements.txt ``` **Note**: This is a personalized fork of Octo, adding custom functions while preserving its core capabilities for general-purpose robotic manipulation. 4. **Install the serl_launcher_conrft** ```bash cd serl_launcher_conrft pip install -e . pip install -r requirements.txt ``` 5. **Install for serl_robot_infra** Please refer to the [README](./serl_robot_infra/README.md) in the `serl_robot_infra` directory for installation instructions and details on operating the Franka robot arm. This document includes guidance on setting up the impedance-based [serl_franka_controllers](https://github.com/rail-berkeley/serl_franka_controllers). After completing the installation, you should be able to start the robot server and interact with the `franka_env` gym for hardware control. ## 💻 Overview and Code Structure We offers a set of code for fine-tuning Octo in robotic manipulation tasks. The approach's pipeline consists of an actor thread and a learner thread, both of which interact with the robot gym environment. These two threads operate asynchronously, with data transmitted from the actor to the learner node over the network using [agentlace](https://github.com/youliangtan/agentlace). The learner thread periodically updates the policy and syncs it with the actor. **Table for code structure** | Code Directory | Description | | --- | --- | | examples | Scripts for policy training, demonstration data collection, reward classifier training | | serl_launcher_conrft | Main code for Agent Training | | serl_launcher_conrft.agents | Agent Policies (e.g. SAC, BC) | | serl_launcher_conrft.wrappers | Gym env wrappers | | serl_launcher_conrft.data | Replay buffer and data store | | serl_launcher_conrft.vision | Vision related models and utils | | serl_robot_infra | Robot infra for running with real robots | | serl_robot_infra.robot_servers | Flask server for sending commands to robot via ROS | | serl_robot_infra.franka_env | Gym env for Franka robot | We provide a step-by-step guide in [franka_walkthrough](/docs/franka_walkthrough.md) to fine-tune VLA with ConRFT on a Franka robot. ## ✉️ Contact For any questions, please feel free to email [chenyuhui2022@ia.ac.cn](mailto:chenyuhui2022@ia.ac.cn). ## 🙏 Acknowledgement Our code is built upon [CPQL](https://github.com/cccedric/cpql/), [Octo](https://github.com/octo-models/octo), [HIL-SERL](https://github.com/rail-berkeley/hil-serl). We thank all these authors for their nicely open sourced code and their great contributions to the community. ## 📝 Citation If you find our research helpful and would like to reference it in your work, please consider the following citations: ```bibtex @article{chen2025conrft, title={ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy}, author={Chen, Yuhui and Tian, Shuai and Liu, Shugao and Zhou, Yingting and Li, Haoran and Zhao, Dongbin}, journal={arXiv preprint arXiv:2502.05450}, year={2025} } ```