# DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving
We are currently cleaning and organizing the code; the data preprocessing section is publicly available now. Thank you for your patience while we prepare the training and inference code for release.
[Project Page](https://thinklab-sjtu.github.io/DriveMoE/), [Paper](https://arxiv.org/abs/2505.16278)
## Installation
Before you begin, ensure that your CUDA version is >= 12.1.
Clone this repository into a directory of your choice and run `pip install -e .` to install the environment.
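A minimal installation sketch follows; the clone URL is an assumption based on the project page, so substitute your own mirror or fork if it differs:
```console
# Check the CUDA toolkit version (must be >= 12.1)
nvcc --version

# Clone URL assumed from the project page; adjust to your mirror or fork
git clone https://github.com/Thinklab-SJTU/DriveMoE.git
cd DriveMoE
pip install -e .
```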
Download the [PaliGemma](https://huggingface.co/blog/paligemma) weights into your directory:
```console
git clone https://huggingface.co/google/paligemma-3b-pt-224
```
If you wish to train DrivePi0 and DriveMoE with this code, or to run open-loop testing with the provided checkpoints, you will need the [Bench2Drive Dataset](https://huggingface.co/datasets/rethinklab/Bench2Drive) and our [Camera Labels](https://huggingface.co/rethinklab/DriveMoE).
Set the environment variables `DATA_DIR` (needed if you download the dataset for training), `CAMERA_LABEL_DIR`, `LOG_DIR`, and `WANDB_ENTITY` by running `bash scripts/set_path.sh`.
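For reference, here is a sketch of the variables `scripts/set_path.sh` is expected to export; all values below are placeholders, so point them at your own directories and account:
```console
# Placeholder values — replace with your own paths and W&B entity
export DATA_DIR=/path/to/Bench2Drive            # dataset root (if downloading for training)
export CAMERA_LABEL_DIR=/path/to/camera_labels  # our camera label files
export LOG_DIR=/path/to/logs                    # training/inference logs
export WANDB_ENTITY=your_wandb_entity           # Weights & Biases entity for logging
```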
## Data Processing
Run these two scripts to preprocess the training data:
```console
bash scripts/generate_data.sh && bash scripts/window.sh
```
We provide dataset statistics for normalizing the data during training; you can also run `bash get_statistics.sh` to regenerate them.
## Acknowledgments
This project builds on the following pioneering open-source works. We express our profound gratitude for these foundational resources:
- https://github.com/allenzren/open-pi-zero
- https://github.com/Physical-Intelligence/openpi
## Citation
```bibtex
@misc{yang2025drivemoe,
  title={DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving},
  author={Zhenjie Yang and Yilin Chai and Xiaosong Jia and Qifeng Li and Yuqian Shao and Xuekai Zhu and Haisheng Su and Junchi Yan},
  year={2025},
  eprint={2505.16278},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
}
```