# DroneSim

**Repository Path**: zheng_zhg/DroneSim

## Basic Information

- **Project Name**: DroneSim
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-01-01
- **Last Updated**: 2025-01-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# DroneSim
### Reinforcement learning for path following with an AirSim quadrotor in a Unity city environment
### Drone in training:
|![Start](Images/Training_Started.gif)      |![4,000](Images/4,000_time_steps.gif)         |   ![12,000](Images/12,000_time_steps.gif)        |
| -----------                               | -----------                                  | ---------                                        |
| *Figure 1. Training started*    | *Figure 2. 4,000 time steps*    |*Figure 3. 12,000 time steps*   |
## Getting Started
- ### Download the [Unity package](https://github.com/RealBrandonChen/DroneSim/releases/download/unity/Path_following_quadrotor.unitypackage) containing the customized city environment and the AirSim drone.
- ### Add the Python files in `Path_Following` to the `AirSim/PythonClient/reinforcement_learning` directory of your AirSim clone.
- ### Then either:
  - ### Load the [pre-trained model](https://github.com/RealBrandonChen/DroneSim/releases/download/unity/path_following_model.zip) directly by running `python Model_Load.py` in a terminal; you should see the drone follow the city road.
  - ### Train your own model with `python dqn_drone.py`; the trained model is saved as `best_model.zip`.
## Implementation Explanation
### Code snippet credit: AirSim/PythonClient/reinforcement_learning/drone_env. The reward function is as follows:
```python
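def _compute_reward(self):
    thresh_dist = 7  # largest allowed distance from the path
    beta = 1
    x = -240
    y = 10
    z = 200
    # Waypoints that trace the center line of the city road
    pts = [
        np.array([x, y, z]),
        np.array([-350, y, z]),
        np.array([-350, y, 150]),
        np.array([-350, y, z-100]),
        np.array([-350, y, z-200]),
    ]
    ...
    ...
    if self.state["collision"]:
        reward = -100
    else:
        # Start from a large value so the first min() always replaces it
        dist = 10000000
        for i in range(0, len(pts) - 1):
            # Perpendicular distance from the drone to the line through
            # pts[i] and pts[i + 1]; keep the smallest one found so far
            dist = min(
                dist,
                np.linalg.norm(np.cross((quad_pt - pts[i]), (quad_pt - pts[i + 1])))
                / np.linalg.norm(pts[i] - pts[i + 1]),
            )

        if dist > thresh_dist:
            reward = -10
    ...
    ...
    # End the episode whenever the reward is -10 or lower
    # (for example after a collision or when the drone is off the path)
    done = 0
    if reward <= -10:
        done = 1

    return reward, done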
```
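The list of coordinates `pts` traces the center line of the city road. For each pair of consecutive waypoints, the reward function computes the perpendicular distance from the drone's current position to the line through those two waypoints, and `dist` is the minimum over all pairs. The magnitude of the cross product equals twice the area of the triangle formed by the drone and the two waypoints, so dividing it by the length of the base (the distance between the two waypoints) gives the height of the triangle, which is the distance from the drone to the line. The computation is illustrated in the following picture:

![Explanation of the distance computation](Images/Explanation.png)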
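
The distance computation can also be checked in isolation. The following is a minimal sketch, not code from this repository: the helper names `point_to_line_distance` and `min_distance_to_path` and the drone position `quad_pt` are hypothetical, chosen only for illustration, while the waypoints are copied from the reward function above.

```python
import numpy as np


def point_to_line_distance(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    # |(p - a) x (p - b)| is twice the area of the triangle (p, a, b);
    # dividing by the base length |a - b| leaves the height of the triangle.
    return np.linalg.norm(np.cross(p - a, p - b)) / np.linalg.norm(a - b)


def min_distance_to_path(p, pts):
    """Smallest point-to-line distance over all consecutive waypoint pairs."""
    return min(
        point_to_line_distance(p, pts[i], pts[i + 1])
        for i in range(len(pts) - 1)
    )


# Waypoints copied from the reward function (x = -240, y = 10, z = 200).
pts = [
    np.array([-240.0, 10.0, 200.0]),
    np.array([-350.0, 10.0, 200.0]),
    np.array([-350.0, 10.0, 150.0]),
    np.array([-350.0, 10.0, 100.0]),
    np.array([-350.0, 10.0, 0.0]),
]

# Hypothetical drone position: offset (0, 3, 4) from the first segment.
quad_pt = np.array([-300.0, 13.0, 204.0])
dist = min_distance_to_path(quad_pt, pts)
print(dist)  # 5.0 up to floating-point rounding, below thresh_dist = 7
```

Note that this formula measures the distance to the infinite line through each pair of waypoints, not to the finite segment between them, so a position that lies on the extension of a segment beyond its endpoints still yields a distance of zero for that pair.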
## Future Work
- ### Generate image data with imitation learning
  [Rogerio Bonatti](https://github.com/microsoft/AirSim-Drone-Racing-VAE-Imitation) has trained a policy using cross-modal representations. In that work, the imitation learning data is generated for passing through drone racing obstacles. The path following task should also work when trained on imitation learning data generated in the same way.
- ### Deploy the trained model on a real drone
  Transfer the algorithm from simulation to a real-world platform.