The RLlib integration connects the Ray/RLlib library with CARLA, making it easy to use the CARLA environment for training and inference. Ray is an open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
The integration lets users set up CARLA as a Ray environment and use that environment for training and inference. It is ready to use both locally and in the cloud using AWS.
In this guide we will outline the requirements needed for running the RLlib integration both locally and on AWS, the structure of the integration repository, an overview of how to use the library and then an example of how to set up a Ray experiment using CARLA as an environment.
Before you begin, clone the RLlib integration repository from GitHub:
git clone https://github.com/carla-simulator/rllib-integration.git
Requirements for running locally
- Install a package version of CARLA and import the additional assets. The recommended version is CARLA 0.9.11 as the integration was designed and tested with this version. Other versions may be compatible but have not been fully tested, so use these at your own discretion.
- Navigate into the root folder of the RLlib integration repository and install the Python requirements:
pip3 install -r requirements.txt
- Set an environment variable to locate the CARLA package by running the command below, or add `CARLA_ROOT=path/to/carla` to your `.bashrc` file:
export CARLA_ROOT=path/to/carla
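Before launching an experiment it can help to confirm that `CARLA_ROOT` is set and points to a real directory. The following is a minimal sketch of such a check; the function name `resolve_carla_root` is hypothetical and is not part of the integration.

```python
import os
from pathlib import Path


def resolve_carla_root(env=None):
    """Return the CARLA package directory named by CARLA_ROOT.

    Hypothetical helper: raises RuntimeError if the variable is unset
    or does not point to an existing directory.
    """
    env = os.environ if env is None else env
    raw = env.get("CARLA_ROOT", "").strip()
    if not raw:
        raise RuntimeError("CARLA_ROOT is not set; run `export CARLA_ROOT=path/to/carla`")
    root = Path(raw).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"CARLA_ROOT points to {root}, which is not a directory")
    return root
```

Calling `resolve_carla_root()` at the top of a training script turns a missing or mistyped path into an immediate, readable error.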
Requirements for running on AWS Cloud
- The requirements for running on AWS are handled automatically by an install script found in the RLlib integration repository. Find more details in the section "Running on AWS".
The repository is divided into three directories:
- `rllib_integration` contains all the infrastructure related to CARLA and how to set up the CARLA server, clients and actors. This provides the basic structure that all training and testing experiments must follow.
- `aws` has the files needed to run in an AWS instance. `aws_helper.py` provides several functionalities that ease the management of EC2 instances, including instance creation and sending and receiving data.
- `dqn_example` and the `dqn_*` files in the root directory provide an easy-to-understand example of how to set up a Ray experiment using CARLA as its environment.

This section provides a general overview of how to create your own experiment. For a more specific example, see the next section "DQN example".
You will need to create at least four files: an experiment class, an environment configuration file, and the training and inference scripts.
To use the CARLA environment you need to define a training experiment. Ray requires environments to return a series of specific information. You can see details on the CARLA environment in `rllib-integration/rllib_integration/carla_env.py`.
The information required by Ray depends on your specific experiment, so all experiments should inherit from `BaseExperiment`. This class contains all the functions that need to be overridden for your own experiment. These are all functions related to the actions, observations and rewards of the training.
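The overall shape of such an experiment can be sketched as follows. This is an illustration only: `ExperimentSketch` is a hypothetical stand-in for `BaseExperiment`, and the method names, the action table and the `target_speed` setting are assumptions, not the integration's actual interface. Check `BaseExperiment` in the repository for the real methods to override.

```python
from abc import ABC, abstractmethod


class ExperimentSketch(ABC):
    """Hypothetical stand-in for BaseExperiment; real method names may differ."""

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def compute_action(self, action_index):
        """Map a policy output to a vehicle control."""

    @abstractmethod
    def get_observation(self, sensor_data):
        """Turn raw sensor data into the observation given to the policy."""

    @abstractmethod
    def compute_reward(self, observation):
        """Return the scalar reward for the current step."""


class SpeedKeepingExperiment(ExperimentSketch):
    """Toy experiment: reward driving close to a target speed."""

    # Discrete actions as (throttle, steer) pairs; values are illustrative only.
    ACTIONS = {0: (0.0, 0.0), 1: (0.5, 0.0), 2: (0.5, -0.3), 3: (0.5, 0.3)}

    def compute_action(self, action_index):
        throttle, steer = self.ACTIONS[action_index]
        return {"throttle": throttle, "steer": steer}

    def get_observation(self, sensor_data):
        return {"speed": float(sensor_data["speed"])}

    def compute_reward(self, observation):
        target = self.config["target_speed"]
        # Reward is 1 at the target speed and falls linearly to 0.
        return max(0.0, 1.0 - abs(observation["speed"] - target) / target)
```

The point of the split is that the environment stays generic, while everything specific to one task (actions, observations, rewards) lives in the subclass.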
The experiment should be configured through a `.yaml` file. Any settings passed through the configuration file will override the default settings. The locations of the different default settings are explained below.
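The override behaviour can be pictured as a merge of two nested dictionaries. The sketch below assumes a recursive, key-by-key merge; whether the integration merges nested sections this way or replaces them whole is an assumption, and the setting names in `DEFAULTS` and `USER` are hypothetical.

```python
import copy


def merge_settings(defaults, overrides):
    """Return a new dict where values in `overrides` replace those in `defaults`.

    Nested dicts are merged key by key; any other value is replaced wholesale.
    Neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults and user settings, for illustration only.
DEFAULTS = {
    "carla": {"timeout": 10.0, "quality_level": "Low"},
    "experiment": {"town": "Town01"},
}
USER = {"carla": {"timeout": 30.0}}

config = merge_settings(DEFAULTS, USER)
```

Here `config["carla"]["timeout"]` takes the user's value, while `quality_level` and the whole `experiment` section keep their defaults.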
The configuration file has three main uses:
The last step is to create your own training and inference scripts. This part is completely up to you and is dependent on the Ray API. If you want to create your own specific model, check out Ray's custom model documentation.
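As a starting point, the command-line entry of a training script might look like the sketch below. It mirrors the invocation used later in this guide (a configuration file plus a `--name` option), but the argument names, defaults and help texts are assumptions, and the Ray-specific part is left as a comment because it depends on your experiment.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical training script."""
    parser = argparse.ArgumentParser(
        description="Train an agent with CARLA as the environment"
    )
    parser.add_argument("configuration_file", help="path to the experiment .yaml file")
    parser.add_argument("--name", default="experiment", help="label for this training run")
    return parser.parse_args(argv)


# Example: the arguments used by the DQN example's training command.
args = parse_args(["dqn_example/dqn_config.yaml", "--name", "dqn"])

# A real script would now load the configuration file, register the CARLA
# environment with Ray and start the training run.
print(f"Would train '{args.name}' using {args.configuration_file}")
```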
This section builds upon the previous section to show a specific example on how to work with the RLlib integration using the BirdView pseudosensor and Ray's DQNTrainer.
The structure of the DQN example is as follows:
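- The experiment class: `DQNExperiment`, which overrides the methods of the `BaseExperiment` class.
- The configuration file: `dqn_example/dqn_config.yaml`
- The training file: `dqn_train.py`
- The inference files: `dqn_inference_ray.py` and `dqn_inference.py`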
To run the example locally:
Install PyTorch:
pip3 install -r dqn_example/dqn_requirements.txt
Run the training file:
python3 dqn_train.py dqn_example/dqn_config.yaml --name dqn
!!! Note
    The default configuration uses 1 GPU and 12 CPUs, so if your local machine doesn't have that capacity, lower the numbers in the configuration file.

    If you experience out of memory problems, consider reducing the `buffer_size` parameter.
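One way to apply this advice is to clamp the resource settings before training starts. The sketch below assumes the configuration uses the keys `num_workers`, `num_gpus` and `buffer_size`, and that one CPU is reserved for the trainer process. Both are assumptions, so check them against your own configuration file. The helper `fit_to_machine` is hypothetical.

```python
import os


def fit_to_machine(config, available_cpus=None, available_gpus=0, max_buffer_size=None):
    """Return a copy of `config` whose resource requests fit the local machine."""
    cpus = available_cpus if available_cpus is not None else (os.cpu_count() or 1)
    fitted = dict(config)
    # Assumption: one CPU is kept for the trainer, the rest go to workers.
    fitted["num_workers"] = max(0, min(config.get("num_workers", 0), cpus - 1))
    fitted["num_gpus"] = min(config.get("num_gpus", 0), available_gpus)
    # Optionally cap the replay buffer to reduce memory use.
    if max_buffer_size is not None and "buffer_size" in config:
        fitted["buffer_size"] = min(config["buffer_size"], max_buffer_size)
    return fitted


# Illustrative values only; they are not the defaults of the DQN example.
requested = {"num_workers": 11, "num_gpus": 1, "buffer_size": 100000}
local = fit_to_machine(requested, available_cpus=4, available_gpus=0, max_buffer_size=20000)
```

With these illustrative inputs, `local` asks for 3 workers, no GPU and a buffer of 20000 entries, while `requested` is left unchanged.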
This section explains how to use the RLlib integration to automatically run training and inference on AWS EC2 instances. To handle the scaling of instances we use the Ray autoscaler API.
You will need to configure your boto3 environment correctly. See the boto3 documentation on credentials and configuration for more information.
Use the provided `aws_helper.py` script to automatically create the image needed for training by running the command below, passing in the name of the base image and the installation script `install.sh` found in `rllib-integration/aws/install`:
python3 aws_helper.py create-image --name <AMI-name> --installation-scripts <installation-scripts> --instance-type <instance-type> --volume-size <volume-size>
Once the image is created, there will be an output with image information. To use the Ray autoscaler, update the `<ImageId>` and `<SecurityGroupIds>` settings in your autoscaler configuration file with the information from the output.
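If you prefer to script this update instead of editing the file by hand, a plain text substitution is enough. The sketch below assumes the configuration file contains the literal placeholders `<ImageId>` and `<SecurityGroupIds>`; the helper name, the template fragment and the IDs are hypothetical.

```python
def fill_autoscaler_placeholders(text, image_id, security_group_ids):
    """Replace the <ImageId> and <SecurityGroupIds> placeholders in a config text."""
    for placeholder in ("<ImageId>", "<SecurityGroupIds>"):
        if placeholder not in text:
            raise ValueError(f"placeholder {placeholder} not found")
    groups = ", ".join(security_group_ids)
    return text.replace("<ImageId>", image_id).replace("<SecurityGroupIds>", groups)


# Hypothetical fragment and IDs, for illustration only.
template = "ImageId: <ImageId>\nSecurityGroupIds: [<SecurityGroupIds>]\n"
filled = fill_autoscaler_placeholders(
    template, "ami-0123456789abcdef0", ["sg-0123456789abcdef0"]
)
```

Raising an error when a placeholder is missing guards against silently reusing a file that was already filled in for a different image.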
With the image created, you can use Ray's API to run the training on the cluster:
Initialize the cluster:
ray up <autoscaler_configuration_file>
(Optional) If the local code has been modified after the cluster initialization, run this command to update it:
ray rsync-up <autoscaler_configuration_file> <path_to_local_folder> <path_to_remote_folder>
Run the training:
ray submit <autoscaler_configuration_file> <training_file>
(Optional) Monitor the cluster status:
ray attach <autoscaler_configuration_file>
watch -n 1 ray status
Shut down the cluster:
ray down <autoscaler_configuration_file>
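The command sequence above can also be assembled from Python, for example to drive it from a larger automation script. The sketch below only builds and prints the commands; the helper `cluster_commands` and the file names are hypothetical, and the interactive monitoring step is left out.

```python
import shlex


def cluster_commands(config_file, training_file, sync=None):
    """Build the `ray` CLI calls for one training run, in execution order.

    `sync` is an optional (local_folder, remote_folder) pair to push
    to the cluster before the training is submitted.
    """
    commands = [["ray", "up", config_file]]
    if sync is not None:
        local_folder, remote_folder = sync
        commands.append(["ray", "rsync-up", config_file, local_folder, remote_folder])
    commands.append(["ray", "submit", config_file, training_file])
    commands.append(["ray", "down", config_file])
    return commands


# Print the commands instead of running them (a dry run).
for argv in cluster_commands("autoscaler.yaml", "train.py", sync=("my_code", ".")):
    print(shlex.join(argv))
```

To execute the commands rather than print them, each `argv` list could be passed to `subprocess.run(argv, check=True)`, which stops the sequence at the first failing step.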
To run the DQN example on AWS:
Create the image by passing the `dqn_example/dqn_autoscaler.yaml` configuration to the following command:
python3 aws_helper.py create-image --name <AMI-name> --installation-scripts install/install.sh --instance-type <instance-type> --volume-size <volume-size>
Update the `<ImageId>` and `<SecurityGroupIds>` settings in `dqn_autoscaler.yaml` with the information provided by the previous command.
Initialize the cluster:
ray up dqn_example/dqn_autoscaler.yaml
(Optional) Update remote files with local changes:
ray rsync-up dqn_example/dqn_autoscaler.yaml dqn_example .
ray rsync-up dqn_example/dqn_autoscaler.yaml rllib_integration .
Run the training:
ray submit dqn_example/dqn_autoscaler.yaml dqn_train.py -- dqn_example/dqn_config.yaml --auto
(Optional) Monitor the cluster status:
ray attach dqn_example/dqn_autoscaler.yaml
watch -n 1 ray status
Shut down the cluster:
ray down dqn_example/dqn_autoscaler.yaml
This guide has outlined how to install and run the RLlib integration on AWS and on a local machine. If you have any questions or run into any issues while working through the guide, feel free to post in the forum or raise an issue on GitHub.