# ElegantRL
**Repository Path**: cmy_program/ElegantRL
## Basic Information
- **Project Name**: ElegantRL
- **Description**: Lightweight and scalable deep reinforcement learning using PyTorch. 🔥
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-12-10
- **Last Updated**: 2024-05-29
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## ElegantRL “小雅”: Scalable and Elastic Deep Reinforcement Learning
[Downloads](https://pepy.tech/project/elegantrl) · [Python 3.6+](https://www.python.org/downloads/release/python-360/) · [PyPI](https://pypi.org/project/elegantrl/)
[ElegantRL](https://elegantrl.readthedocs.io/en/latest/index.html) is developed for researchers and practitioners, with the following advantages:
+ **Lightweight**: The core code is fewer than 1,000 lines (see elegantrl/tutorial), built on PyTorch (training), OpenAI Gym (environments), NumPy, and Matplotlib (plotting).
+ **Efficient**: In many of our test cases, it is more efficient than [Ray RLlib](https://github.com/ray-project/ray).
+ **Stable**: More stable than [Stable Baselines 3](https://github.com/DLR-RM/stable-baselines3). Stable Baselines 3 can only use a single GPU, while ElegantRL can use 1~8 GPUs for stable training.
ElegantRL implements the following model-free deep reinforcement learning (DRL) algorithms:
+ **DDPG, TD3, SAC, PPO, PPO (GAE)** for continuous actions
+ **DQN, DoubleDQN, D3QN** for discrete actions
For the details of DRL algorithms, please check out the educational webpage [OpenAI Spinning Up](https://spinningup.openai.com/en/latest/).
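As a quick orientation, each algorithm above is implemented as an agent class. The import path and class names in the sketch below (elegantrl.agent, AgentPPO, AgentDQN) follow this README's naming but are assumptions that may differ from the current codebase.
```python
# A minimal sketch of picking an algorithm by action-space type, assuming agent
# classes such as AgentPPO (continuous actions) and AgentDQN (discrete actions)
# live in elegantrl/agent.py; the exact class names may differ.
from elegantrl.agent import AgentPPO, AgentDQN

def choose_agent_class(continuous_actions: bool):
    """Return an agent class (not an instance) suited to the action space."""
    return AgentPPO if continuous_actions else AgentDQN
```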
## Contents
+ [News](#news)
+ [Framework](#framework)
+ [Code Structure](#code-structure)
+ [Start to Train](#start-to-train)
+ [Experiment](#experiment)
+ [Experimental Demos](#experimental-demos)
+ [Requirements](#requirements)
+ [Citation](#citation)
## News
+ [Towardsdatascience] [ElegantRL: A Lightweight and Stable Deep Reinforcement Learning Library](https://towardsdatascience.com/elegantrl-a-lightweight-and-stable-deep-reinforcement-learning-library-95cef5f3460b)
+ [Towardsdatascience] [ElegantRL: Mastering PPO Algorithms](https://medium.com/@elegantrl/elegantrl-mastering-the-ppo-algorithm-part-i-9f36bc47b791)
+ [MLearning.ai] [ElegantRL Demo: Stock Trading Using DDPG (Part I)](https://elegantrl.medium.com/elegantrl-demo-stock-trading-using-ddpg-part-i-e77d7dc9d208)
+ [MLearning.ai] [ElegantRL Demo: Stock Trading Using DDPG (Part II)](https://medium.com/mlearning-ai/elegantrl-demo-stock-trading-using-ddpg-part-ii-d3d97e01999f)
## Framework

An agent (**agent.py**) with Actor-Critic networks (**net.py**) is trained (**run.py**) by interacting with an environment (**env.py**).
A high-level overview:
+ 1). Instantiate an environment in **Env.py**, and an agent in **Agent.py** with an Actor network and a Critic network in **Net.py**;
+ 2). In each training step in **Run.py**, the agent interacts with the environment, generating transitions that are stored into a Replay Buffer;
+ 3). The agent fetches a batch of transitions from the Replay Buffer to train its networks;
+ 4). After each update, an evaluator evaluates the agent's performance (e.g., fitness score or cumulative return) and saves the agent if the performance is good.
## Code Structure
### Core Codes
+ **elegantrl/net.py** # Neural networks (a minimal sketch follows this list).
  + Q-Net,
  + Actor network,
  + Critic network.
+ **elegantrl/agent.py** # RL algorithms.
  + AgentBase.
+ **elegantrl/run.py** # Runs Demos 1~4.
  + Parameter initialization,
  + Training loop,
  + Evaluator.
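For intuition, the Actor and Critic networks mentioned above can be as simple as two small multilayer perceptrons. The sketch below uses plain PyTorch; the real classes in elegantrl/net.py (Q-Net, Actor, Critic) are more elaborate, and their exact layer sizes and names may differ.
```python
# A minimal sketch of the Actor/Critic network style described above, written
# with plain PyTorch; the actual classes in elegantrl/net.py are more elaborate.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, mid_dim: int, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, mid_dim), nn.ReLU(),
                                 nn.Linear(mid_dim, action_dim), nn.Tanh())

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # action in [-1, 1]

class Critic(nn.Module):
    def __init__(self, mid_dim: int, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, mid_dim), nn.ReLU(),
                                 nn.Linear(mid_dim, 1))

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat((state, action), dim=1))  # Q-value estimate
```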
### Utility Codes
+ **elegantrl/envs/** # Gym envs or custom envs, including FinanceStockEnv.
+ **gym_utils.py**: A PreprocessEnv class for Gym environment modification (a minimal wrapper sketch follows this list).
+ **Stock_Trading_Env**: A self-built stock trading environment as an example of user customization.
+ **eRL_demo_BipedalWalker.ipynb** # BipedalWalker-v2 in a Jupyter notebook.
+ **eRL_demos.ipynb** # Demos 1~4 in Jupyter notebooks, showing how to use the tutorial version and the advanced version.
+ **eRL_demo_SingleFilePPO.py** # Trains PPO in a single file, simpler than the tutorial version.
+ **eRL_demo_StockTrading.py** # Stock trading application.
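As an illustration of what an environment-preprocessing wrapper can look like, here is a minimal Gym wrapper that casts observations to float32 and clips continuous actions. The class name and the specific modifications are assumptions for illustration; the actual PreprocessEnv in gym_utils.py performs its own preprocessing.
```python
# A minimal Gym wrapper sketch (gym 0.17-style API); not the actual PreprocessEnv.
import gym
import numpy as np

class SimplePreprocessEnv(gym.Wrapper):  # hypothetical name for illustration
    def reset(self):
        state = self.env.reset()
        return np.asarray(state, dtype=np.float32)

    def step(self, action):
        # Clip continuous actions to the valid range before stepping the env.
        action = np.clip(action, self.action_space.low, self.action_space.high)
        state, reward, done, info = self.env.step(action)
        return np.asarray(state, dtype=np.float32), float(reward), done, info
```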
## Start to Train
### Initialization:
+ `args`: the hyper-parameters.
+ `env = PreprocessEnv()`: creates an environment (in the OpenAI Gym format).
+ `agent = agent.XXX()`: creates an agent for a DRL algorithm.
+ `buffer = ReplayBuffer()`: stores the transitions.
+ `evaluator = Evaluator()`: evaluates and stores the trained model.
### Training (a while-loop):
+ `agent.explore_env(…)`: the agent explores the environment within target steps, generates transitions, and stores them into the ReplayBuffer.
+ `agent.update_net(…)`: the agent uses a batch from the ReplayBuffer to update the network parameters.
+ `evaluator.evaluate_save(…)`: evaluates the agent's performance and keeps the trained model with the highest score.
The while-loop terminates when a stop condition is met, e.g., reaching a target score, hitting the maximum number of steps, or a manual break. A simplified sketch of this pipeline follows below.
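As a rough illustration, the pipeline above can be sketched as follows. This is a heavily simplified, self-contained stand-in: the toy classes below only mirror the explore/update/evaluate structure described in this README and do not reproduce the actual APIs of ElegantRL's PreprocessEnv, ReplayBuffer, Evaluator, or agent classes.
```python
# A runnable, heavily simplified stand-in for the training pipeline above.
import gym
import numpy as np

class ToyAgent:
    """Random policy; a real agent holds Actor/Critic networks (see net.py)."""
    def explore_env(self, env, buffer, target_steps):
        state = env.reset()
        for _ in range(target_steps):
            action = env.action_space.sample()
            next_state, reward, done, _ = env.step(action)
            buffer.append((state, action, reward, done))
            state = env.reset() if done else next_state

    def update_net(self, buffer, batch_size):
        batch = buffer.sample(batch_size)  # a real agent runs gradient steps here
        return len(batch)

class ToyBuffer(list):
    """Stand-in for ReplayBuffer: a plain list with uniform sampling."""
    def sample(self, batch_size):
        idx = np.random.randint(0, len(self), size=min(batch_size, len(self)))
        return [self[i] for i in idx]

def evaluate(env, episodes=2):
    """Average episode return of a random policy (stand-in for Evaluator)."""
    returns = []
    for _ in range(episodes):
        env.reset()
        done, episode_return = False, 0.0
        while not done:
            _, reward, done, _ = env.step(env.action_space.sample())
            episode_return += reward
        returns.append(episode_return)
    return float(np.mean(returns))

if __name__ == "__main__":
    env = gym.make('Pendulum-v0')          # gym 0.17 environment id
    agent, buffer = ToyAgent(), ToyBuffer()
    best_return = -np.inf
    for step in range(3):                  # real training loops until a stop condition is met
        agent.explore_env(env, buffer, target_steps=200)   # 1) explore
        agent.update_net(buffer, batch_size=64)            # 2) update
        avg_return = evaluate(env)                         # 3) evaluate
        best_return = max(best_return, avg_return)
        print(f"step {step}: avg_return={avg_return:.1f}  best={best_return:.1f}")
```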
## Experiment
### Experiment 1: Comparisons with benchmark algorithms
### Experiment 2: Self-comparisons
It is easy to run the
## Experimental Demos
[LunarLanderContinuous-v2](https://gym.openai.com/envs/LunarLanderContinuous-v2/)

[BipedalWalkerHardcore-v2](https://gym.openai.com/envs/BipedalWalkerHardcore-v2/)
Note: BipedalWalkerHardcore is a difficult task in a continuous action space, and only a few RL implementations can reach the target reward. Check out an experiment video: [Crack the BipedalWalkerHardcore-v2 with total reward 310 using IntelAC](https://www.bilibili.com/video/BV1wi4y187tC).
## Requirements
Necessary:

| Package | Version | Notes |
| --- | --- | --- |
| Python | 3.6+ | |
| PyTorch | 1.6+ | |

Not necessary:

| Package | Version | Notes |
| --- | --- | --- |
| NumPy | 1.18+ | For ReplayBuffer. NumPy is installed along with PyTorch. |
| gym | 0.17.0 | For envs. Gym provides tutorial envs for DRL training. (env.render() has a bug with gym==0.18 and pyglet==1.6; use gym==0.17.0 with pyglet==1.5.) |
| pybullet | 2.7+ | For envs. We use PyBullet (free) as an alternative to MuJoCo (not free). |
| box2d-py | 2.3.8 | For gym. Use `pip install Box2D` (instead of box2d-py). |
| matplotlib | 3.2 | For plots. |
`pip3 install gym==0.17.0 pybullet Box2D matplotlib`
## Citation:
To cite this repository:
```
@misc{erl,
  author       = {Liu, Xiao-Yang and Li, Zechu and Wang, Zhaoran and Zheng, Jiahao},
  title        = {{ElegantRL}: A Scalable and Elastic Deep Reinforcement Learning Library},
  year         = {2021},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/AI4Finance-Foundation/ElegantRL}},
}
```