Batch PPO
=========

This project provides optimized infrastructure for reinforcement learning. It extends the [OpenAI Gym interface][post-gym] to multiple parallel environments and allows agents to be implemented in TensorFlow, performing batched computation. As a starting point, we provide BatchPPO, an optimized implementation of [Proximal Policy Optimization][post-ppo].

Please cite the [TensorFlow Agents paper][paper-agents] if you use code from this project in your research:

```bibtex
@article{hafner2017agents,
  title={TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow},
  author={Hafner, Danijar and Davidson, James and Vanhoucke, Vincent},
  journal={arXiv preprint arXiv:1709.02878},
  year={2017}
}
```

Dependencies: Python 2/3, TensorFlow 1.3+, Gym, ruamel.yaml

[paper-agents]: https://arxiv.org/pdf/1709.02878.pdf
[post-gym]: https://blog.openai.com/openai-gym-beta/
[post-ppo]: https://blog.openai.com/openai-baselines-ppo/

Instructions
------------

Clone the repository and run the PPO algorithm by typing:

```shell
python3 -m agents.scripts.train --logdir=/path/to/logdir --config=pendulum
```

The algorithm to use is defined in the configuration; the `pendulum` configuration started here uses the included PPO implementation. More pre-defined configurations are available in `agents/scripts/configs.py`. If you want to resume a previously started run, add the `--timestamp=
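To illustrate the batched-environment idea described above, here is a minimal, framework-free sketch of applying the Gym `reset`/`step` interface to several environments at once. The `BatchEnv` and `DummyEnv` classes below are hypothetical stand-ins for illustration, not this project's actual API:

```python
# Sketch of batching the Gym step interface across several environments.
# BatchEnv and DummyEnv are hypothetical illustrations, not the classes
# provided by this repository.

class DummyEnv:
    """Toy environment: the observation counts the steps taken."""

    def __init__(self):
        self._steps = 0

    def reset(self):
        self._steps = 0
        return self._steps

    def step(self, action):
        self._steps += 1
        reward = float(action)
        done = self._steps >= 3
        return self._steps, reward, done, {}


class BatchEnv:
    """Applies reset/step to a list of environments in one call."""

    def __init__(self, envs):
        self._envs = envs

    def __len__(self):
        return len(self._envs)

    def reset(self):
        return [env.reset() for env in self._envs]

    def step(self, actions):
        # Step every environment with its own action and regroup the
        # per-environment tuples into batched lists.
        results = [env.step(a) for a, env in zip(actions, self._envs)]
        obs, rewards, dones, infos = zip(*results)
        return list(obs), list(rewards), list(dones), list(infos)


batch = BatchEnv([DummyEnv() for _ in range(4)])
observations = batch.reset()          # [0, 0, 0, 0]
obs, rewards, dones, infos = batch.step([1, 1, 1, 1])
```

An agent implemented in TensorFlow can then consume these batched observations in a single forward pass instead of looping over environments in Python, which is the performance point of the batched design.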
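For reference, PPO's central update maximizes a clipped surrogate objective. The NumPy sketch below shows that standard objective from the PPO paper; it is a hedged illustration of the algorithm, not this repository's TensorFlow implementation, and the function name is our own:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    """Standard PPO clipped surrogate objective (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for the sampled actions.
    advantage: advantage estimates for those actions.
    epsilon: clipping range around 1.0.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # Taking the elementwise minimum removes the incentive to move the
    # policy ratio outside [1 - epsilon, 1 + epsilon].
    return np.minimum(unclipped, clipped).mean()

# Example: the second ratio is clipped from 0.5 up to 0.8.
value = ppo_clip_objective(np.array([1.5, 0.5]), np.array([1.0, 1.0]))
# value == 0.85: mean of min(1.5, 1.2) and min(0.5, 0.8)
```

The clipping is what makes the update "proximal": large policy changes gain no extra objective value, so each batch of experience can safely be reused for several gradient steps.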