# D4RL: Datasets for Deep Data-Driven Reinforcement Learning

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![License](https://licensebuttons.net/l/by/3.0/88x31.png)](https://creativecommons.org/licenses/by/4.0/)

D4RL is an open-source benchmark for offline reinforcement learning. It provides standardized environments and datasets for training and benchmarking algorithms. A supplementary [whitepaper](https://arxiv.org/abs/2004.07219) and [website](https://sites.google.com/view/d4rl/home) are also available.

## Setup

D4RL can be installed by cloning the repository as follows:

```
git clone https://github.com/rail-berkeley/d4rl.git
cd d4rl
pip install -e .
```

Or, alternatively:

```
pip install git+https://github.com/rail-berkeley/d4rl@master#egg=d4rl
```

The control environments require MuJoCo as a dependency. You may need to obtain a [license](https://www.roboti.us/license.html) and follow the setup instructions for mujoco_py. This mostly involves copying the key to your MuJoCo installation folder.

The Flow and CARLA tasks also require additional installation steps:

- Instructions for installing CARLA can be found [here](https://github.com/rail-berkeley/d4rl/wiki/CARLA-Setup).
- Instructions for installing Flow can be found [here](https://flow.readthedocs.io/en/latest/flow_setup.html). Make sure to install using the SUMO simulator, and add the flow repository to your PYTHONPATH once finished.

## Using d4rl

d4rl uses the [OpenAI Gym](https://github.com/openai/gym) API. Tasks are created via the `gym.make` function. A full list of all tasks is [available here](https://github.com/rail-berkeley/d4rl/wiki/Tasks).

Each task is associated with a fixed offline dataset, which can be obtained with the `env.get_dataset()` method. This method returns a dictionary with `observations`, `actions`, `rewards`, `terminals`, and `infos` as keys. You can also load data using `d4rl.qlearning_dataset(env)`, which formats the data for use by typical Q-learning algorithms by adding a `next_observations` key.

```python
import gym
import d4rl  # Import required to register environments

# Create the environment
env = gym.make('maze2d-umaze-v1')

# d4rl abides by the OpenAI gym interface
env.reset()
env.step(env.action_space.sample())

# Each task is associated with a dataset
# dataset contains observations, actions, rewards, terminals, and infos
dataset = env.get_dataset()
print(dataset['observations'])  # An N x dim_observation Numpy array of observations

# Alternatively, use d4rl.qlearning_dataset which
# also adds next_observations.
dataset = d4rl.qlearning_dataset(env)
```

Datasets are automatically downloaded to the `~/.d4rl/datasets` directory when `get_dataset()` is called. If you would like to change the location of this directory, set the `$D4RL_DATASET_DIR` environment variable to the directory of your choosing, or pass the dataset filepath directly to the `get_dataset` method.
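As a concrete illustration of the two points above, the sketch below redirects dataset downloads by setting `D4RL_DATASET_DIR` from Python (set before `d4rl` is imported, mirroring an `export` in the shell) and then samples a random transition minibatch from the arrays returned by `d4rl.qlearning_dataset`. The `~/offline_rl_data` path and the batch size of 256 are arbitrary choices for illustration, not defaults.

```python
import os
import numpy as np

# Redirect dataset downloads; equivalent to `export D4RL_DATASET_DIR=...`
# in the shell. Set it before importing d4rl so the path is picked up.
os.environ['D4RL_DATASET_DIR'] = os.path.expanduser('~/offline_rl_data')

import gym
import d4rl  # noqa: F401 -- registers the offline environments

env = gym.make('maze2d-umaze-v1')

# qlearning_dataset returns flat numpy arrays keyed by 'observations',
# 'actions', 'next_observations', 'rewards', and 'terminals'.
dataset = d4rl.qlearning_dataset(env)
num_transitions = dataset['rewards'].shape[0]

# Sample a random (s, a, r, s', done) minibatch, as a typical
# Q-learning algorithm would during training.
batch_size = 256
idx = np.random.randint(0, num_transitions, size=batch_size)
batch = {key: dataset[key][idx] for key in
         ('observations', 'actions', 'rewards', 'next_observations', 'terminals')}
print({key: value.shape for key, value in batch.items()})
```

Because each dataset is a set of flat arrays rather than a replay-buffer object, indexing every field with the same `idx` array keeps the five components of each transition aligned.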
## Algorithm Implementations

We have aggregated implementations of various offline RL algorithms in a [separate repository](https://github.com/rail-berkeley/d4rl_evaluations).

## Off-Policy Evaluations

D4RL currently has limited support for off-policy evaluation methods, on a select few locomotion tasks. We provide trained reference policies and a set of performance metrics. Additional details can be found in the [wiki](https://github.com/rail-berkeley/d4rl/wiki/Off-Policy-Evaluation).

## Acknowledgements

D4RL builds on top of several excellent domains and environments built by various researchers. We would like to thank the authors of:

- [hand_dapg](https://github.com/aravindr93/hand_dapg)
- [gym-minigrid](https://github.com/maximecb/gym-minigrid)
- [carla](https://github.com/carla-simulator/carla)
- [flow](https://github.com/flow-project/flow)
- [adept_envs](https://github.com/google-research/relay-policy-learning)

## Citation

Please use the following bibtex for citations:

```
@misc{fu2020d4rl,
    title={D4RL: Datasets for Deep Data-Driven Reinforcement Learning},
    author={Justin Fu and Aviral Kumar and Ofir Nachum and George Tucker and Sergey Levine},
    year={2020},
    eprint={2004.07219},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

## Licenses

Unless otherwise noted, all datasets are licensed under the [Creative Commons Attribution 4.0 License (CC BY)](https://creativecommons.org/licenses/by/4.0/), and code is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.html).