# FlightGPT

**Repository Path**: frontcold/FlightGPT

## Basic Information

- **Project Name**: FlightGPT
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-24
- **Last Updated**: 2025-05-24

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# 🚀 FlightGPT: A vision-language-model-based agent for UAV navigation

## 🛠️ Environment Setup

This project depends on multiple models and tool libraries; it is recommended to use Conda to create an isolated environment.

### Install the Conda Environment

```bash
conda create -n flightgpt python=3.10
conda activate flightgpt
pip install -r requirements.txt
```

---

## 🛠️ Model and Data Preparation

* Download the model weights to `./model_weight/`.

  Note: change the value of `max_pixels` in `preprocessor_config.json` to `16032016`.

* Download the data to `./data/`.

* For SFT, download `cleaned_final.json` to `./LLaMA-Factory/data`.

### 📦 Project Structure

```
├── model_weight/         # Directory for model weights (download manually)
├── experiment/
├── R1PhotoData/
├── data/
│   ├── citynav/          # Data annotation directory
│   ├── rgbd-new/         # Raw image files
│   ├── training_data/    # Training data directory
│   └── ...
├── data_examples/        # Examples of some training data
├── eval.py               # Model inference and evaluation script
├── open-r1-multimodal/   # GRPO training directory
├── LLaMA-Factory/        # SFT training directory
├── requirements.txt      # Combined environment dependency file
├── README.md             # This document
└── ...
```

---

## 🚀 Inference

1. Start the vLLM service:

   ```bash
   CUDA_VISIBLE_DEVICES=0,1,2,3 vllm serve path/to/your/model \
       --dtype auto \
       --trust-remote-code \
       --served-model-name qwen_2_5_vl_7b \
       --host 0.0.0.0 \
       -tp 4 \
       --uvicorn-log-level debug \
       --port your_port \
       --limit-mm-per-prompt image=2,video=0 \
       --max-model-len=32000
   ```

2. Run the inference script (the running service can also be queried directly; see the example sketch at the end of this README):

   ```bash
   python eval_by_qwen.py
   ```

3. Result visualization

   You can use the `visualize_prediction` function to visualize the predicted target coordinates and landmark bounding boxes alongside the ground-truth target coordinates and landmark bounding boxes. A minimal sketch of such a helper is given at the end of this README.

---

## 🚀 Training

1. SFT:

   ```bash
   cd LLaMA-Factory
   llamafactory-cli train examples/train_lora/qwen2vl_lora_sft.yaml
   llamafactory-cli export examples/merge_lora/qwen2vl_lora_sft.yaml
   ```

2. GRPO:

   ```bash
   sh ./open-r1-multimodal/run_scripts/run_grpo_rec_lora.sh
   ```

---
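
## 📎 Appendix: Example Snippets

### Querying the vLLM service directly

vLLM exposes an OpenAI-compatible API, so the service started in the Inference section can also be queried from Python. The sketch below is a minimal example, not part of this repository: the port, image path, and prompt are placeholder assumptions, and the model name must match `--served-model-name`.

```python
# Minimal sketch: query the vLLM server through its OpenAI-compatible API.
# The port (8000) and image path are placeholders; substitute the values
# used when launching `vllm serve`.
import base64

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # replace 8000 with your_port
    api_key="EMPTY",                      # vLLM ignores the key by default
)

# Encode a local aerial image as a base64 data URL so it can be sent inline.
with open("data/rgbd-new/example.png", "rb") as f:  # hypothetical image path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="qwen_2_5_vl_7b",  # must match --served-model-name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Locate the landmark and give the target coordinates."},
        ],
    }],
    max_tokens=512,
)
print(response.choices[0].message.content)
```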
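
### A `visualize_prediction` sketch

The repository does not show `visualize_prediction` itself, so the helper below is only a plausible sketch: it assumes target coordinates are `(x, y)` pixel points and landmark boxes are `[x1, y1, x2, y2]` in pixel space, drawing predictions in red and ground truth in green with matplotlib.

```python
# Hypothetical sketch of a `visualize_prediction`-style helper; the signature
# and data layout are assumptions, not the repository's actual function.
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

def visualize_prediction(image_path, pred_point, gt_point,
                         pred_box, gt_box, out_path="vis.png"):
    image = Image.open(image_path)
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(image)

    # Predicted vs. ground-truth target coordinates.
    ax.scatter(*pred_point, c="red", marker="x", s=80, label="predicted target")
    ax.scatter(*gt_point, c="lime", marker="o", s=80, label="ground-truth target")

    # Predicted vs. ground-truth landmark bounding boxes.
    for box, color, name in [(pred_box, "red", "predicted landmark"),
                             (gt_box, "lime", "ground-truth landmark")]:
        x1, y1, x2, y2 = box
        ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1,
                                       fill=False, edgecolor=color,
                                       linewidth=2, label=name))

    ax.legend(loc="upper right")
    ax.axis("off")
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
```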