# OpenDriveVLA
**Repository Path**: platinum-into/OpenDriveVLA
## Basic Information
- **Project Name**: OpenDriveVLA
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-26
- **Last Updated**: 2025-09-26
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model

## Overview ✨
- [Todo List](#todo-list-)
- [News](#news-)
- [Getting Started](#getting-started-)
- [Citation](#citation)
- [Related Resources](#related-resources)
## TODO List 📅
We will release the model code and checkpoints soon. Stay tuned! 🔥
- [x] Release environment setup
- [x] Release inference code
- [ ] Release checkpoints
## News 📢
- **`2025/08/10`** OpenDriveVLA model & inference code released. 🔥
- **`2025/04/01`** OpenDriveVLA [paper](https://arxiv.org/abs/2503.23463) is available on arXiv.
- **`2025/03/28`** We release the environment setup of OpenDriveVLA.
- To make the dependencies of our OpenDriveVLA model [[mmcv](https://github.com/open-mmlab/mmcv) & [mmdet3d](https://github.com/open-mmlab/mmdetection3d)] compatible with [PyTorch 2.1.2](https://pytorch.org/) and support [Transformers](https://github.com/huggingface/transformers) and [Deepspeed](https://github.com/deepspeedai/DeepSpeed), we selected specific versions and enhanced the source code accordingly. The resulting customized libraries are available in the `third_party` folder.
## Getting Started 🌟
1. [Environment Installation](docs/1_INSTALL.md)
2. [Data Preparation](docs/2_DATA_PREP.md)
3. [Inference & Evaluatation](docs/3_EVAL.md)
## Citation 📝
If you find our project useful for your research, please consider citing our paper and codebase with the following BibTeX:
```bibtex
@misc{zhou2025opendrivevlaendtoendautonomousdriving,
title={OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model},
author={Xingcheng Zhou and Xuyuan Han and Feng Yang and Yunpu Ma and Alois C. Knoll},
year={2025},
eprint={2503.23463},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.23463},
}
```
## Acknowledgement 🤝
- [Transformers](https://github.com/huggingface/transformers)
- [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT)
- [Qwen2.5](https://github.com/QwenLM/Qwen2.5)
- [UniAD](https://github.com/OpenDriveLab/UniAD)
- [mmcv](https://github.com/open-mmlab/mmcv)
- [mmdet3d](https://github.com/open-mmlab/mmdetection3d)
- [GPT-Driver](https://github.com/PointsCoder/GPT-Driver)
- [Hint-AD](https://github.com/Robot-K/Hint-AD)
- [TOD3Cap](https://github.com/jxbbb/TOD3Cap)