# batch-ppo
**Repository Path**: mirrors_google-research/batch-ppo
## Basic Information
- **Project Name**: batch-ppo
- **Description**: Efficient Batched Reinforcement Learning in TensorFlow
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-24
- **Last Updated**: 2026-04-04
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
Batch PPO
=========
This project provides optimized infrastructure for reinforcement learning. It
extends the [OpenAI gym interface][post-gym] to multiple parallel environments
and allows agents to be implemented in TensorFlow and perform batched
computation. As a starting point, we provide BatchPPO, an optimized
implementation of [Proximal Policy Optimization][post-ppo].
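The core idea, stepping many environments behind one batched interface, can be sketched in plain Python. The `ToyEnv` and `BatchEnv` classes below are illustrative stand-ins, not the project's actual classes; the real implementation runs environments in separate processes and integrates with the TensorFlow graph.

```python
class ToyEnv:
    """Trivial counting environment; stands in for a Gym environment."""

    def __init__(self):
        self._step = 0

    def reset(self):
        self._step = 0
        return 0.0  # initial observation

    def step(self, action):
        self._step += 1
        obs = float(self._step)
        reward = 1.0
        done = self._step >= 3
        return obs, reward, done


class BatchEnv:
    """Exposes a list of environments behind a single batched interface."""

    def __init__(self, envs):
        self._envs = envs

    def reset(self):
        return [env.reset() for env in self._envs]

    def step(self, actions):
        # One call advances every environment; results come back batched.
        results = [env.step(a) for env, a in zip(self._envs, actions)]
        obs, rewards, dones = zip(*results)
        return list(obs), list(rewards), list(dones)


batch = BatchEnv([ToyEnv() for _ in range(4)])
batch.reset()
obs, rewards, dones = batch.step([0, 0, 0, 0])
```

Batching this way lets the agent compute actions for all environments in a single forward pass instead of once per environment.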
Please cite the [TensorFlow Agents paper][paper-agents] if you use code from
this project in your research:
```bibtex
@article{hafner2017agents,
  title={TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow},
  author={Hafner, Danijar and Davidson, James and Vanhoucke, Vincent},
  journal={arXiv preprint arXiv:1709.02878},
  year={2017}
}
```
Dependencies: Python 2/3, TensorFlow 1.3+, Gym, ruamel.yaml
[paper-agents]: https://arxiv.org/pdf/1709.02878.pdf
[post-gym]: https://blog.openai.com/openai-gym-beta/
[post-ppo]: https://blog.openai.com/openai-baselines-ppo/
Instructions
------------
Clone the repository and run the PPO algorithm by typing:
```shell
python3 -m agents.scripts.train --logdir=/path/to/logdir --config=pendulum
```
The algorithm to use is defined in the configuration; the `pendulum`
configuration used here runs the included PPO implementation. More
pre-defined configurations are available in `agents/scripts/configs.py`.
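As a rough illustration of how a named configuration could be defined and selected, the snippet below uses a function returning its local variables as a dictionary. The function name, keys, and values here are hypothetical examples; consult `agents/scripts/configs.py` for the configurations the project actually ships.

```python
def pendulum():
    """Hypothetical example configuration for a pendulum task."""
    env = 'Pendulum-v0'   # Gym environment name
    max_length = 200      # maximum episode length
    steps = 1e6           # total training steps
    return locals()       # expose the settings as a dict


# Selecting a configuration by name, as the --config flag might do.
config = pendulum()
```

Keeping each configuration in one small function makes it easy to list, compare, and extend the available setups.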
If you want to resume a previously started run, add the `--timestamp=