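# easy-rl

**Repository Path**: huang_jingwei/easy-rl

## Basic Information

- **Project Name**: easy-rl
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 2
- **Created**: 2022-02-13
- **Last Updated**: 2024-10-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Mushroom Book EasyRL (蘑菇书)

Professor Hung-yi Lee's "Deep Reinforcement Learning" (《深度强化学习》) is one of the classic Chinese-language video courses on reinforcement learning. His humorous, engaging teaching style makes obscure reinforcement learning theory easy to follow, and he explains the theory through many interesting examples; for instance, he often uses Atari games to illustrate reinforcement learning algorithms. To make the tutorial complete, we have also compiled Professor Bolei Zhou's "Introduction to Reinforcement Learning" (《强化学习纲要》), Kejiao Li's "World Champion Takes You Through Reinforcement Learning Practice from Scratch" (《世界冠军带你从零实践强化学习》), and several other classic reinforcement learning materials as supplements. We highly recommend this tutorial to anyone who wants to get started with reinforcement learning and prefers explanations in Chinese. The tutorial is also known as the "Mushroom Book": we hope it energizes readers so that, after "eating" this mushroom, they explore reinforcement learning with enthusiasm, grow ever stronger like Mario, and go on to find unexpected rewards in the field of artificial intelligence.

## Usage Notes

* Chapters 4 through 11 cover [Hung-yi Lee's "Deep Reinforcement Learning"](http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS18.html);
* Chapters 1 and 2 are compiled from ["Introduction to Reinforcement Learning"](https://github.com/zhoubolei/introRL);
* Chapters 3 and 12 are compiled from ["World Champion Takes You Through Reinforcement Learning Practice from Scratch"](https://aistudio.baidu.com/aistudio/education/group/info/1335).

## Print Edition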

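Purchase links: [JD.com](https://item.jd.com/13075567.html) | [Dangdang](http://product.dangdang.com/29374163.html)

Errata: https://datawhalechina.github.io/easy-rl/#/errata

Douban rating: https://book.douban.com/subject/35781275/

## Read Online (content updated in real time)

URL: https://datawhalechina.github.io/easy-rl/

## Latest PDF

Download URL: https://github.com/datawhalechina/easy-rl/releases

Mirror in mainland China (recommended for readers in China): https://pan.baidu.com/s/1y6WLaLM5ChMhK1zZ9RoceQ (extraction code: tyxb)

Compressed version (recommended for readers with slow connections; smaller file, lower image resolution): https://pan.baidu.com/s/1DM84K1ckN16jwHU3-3oxGw (extraction code: an48)

## Differences Between the Print and PDF Editions

The PDF is the first draft of the book. The editors at Posts & Telecom Press (人民邮电出版社) revised that draft repeatedly, and the print edition is the result. We sincerely thank them for their care and rigor. (Attached: proofread sample pages)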

## Contents

| Chapter | Exercises | Related Project |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| [Chapter 1: Overview of Reinforcement Learning](https://datawhalechina.github.io/easy-rl/#/chapter1/chapter1) | [Chapter 1 exercises](https://datawhalechina.github.io/easy-rl/#/chapter1/chapter1_questions&keywords) | |
| [Chapter 2: Markov Decision Process (MDP)](https://datawhalechina.github.io/easy-rl/#/chapter2/chapter2) | [Chapter 2 exercises](https://datawhalechina.github.io/easy-rl/#/chapter2/chapter2_questions&keywords) | |
| [Chapter 3: Tabular Methods](https://datawhalechina.github.io/easy-rl/#/chapter3/chapter3) | [Chapter 3 exercises](https://datawhalechina.github.io/easy-rl/#/chapter3/chapter3_questions&keywords) | [Q-learning in practice](https://datawhalechina.github.io/easy-rl/#/chapter3/project1) |
| [Chapter 4: Policy Gradient](https://datawhalechina.github.io/easy-rl/#/chapter4/chapter4) | [Chapter 4 exercises](https://datawhalechina.github.io/easy-rl/#/chapter4/chapter4_questions&keywords) | |
| [Chapter 5: Proximal Policy Optimization (PPO)](https://datawhalechina.github.io/easy-rl/#/chapter5/chapter5) | [Chapter 5 exercises](https://datawhalechina.github.io/easy-rl/#/chapter5/chapter5_questions&keywords) | |
| [Chapter 6: DQN (Basic Concepts)](https://datawhalechina.github.io/easy-rl/#/chapter6/chapter6) | [Chapter 6 exercises](https://datawhalechina.github.io/easy-rl/#/chapter6/chapter6_questions&keywords) | |
| [Chapter 7: DQN (Advanced Techniques)](https://datawhalechina.github.io/easy-rl/#/chapter7/chapter7) | [Chapter 7 exercises](https://datawhalechina.github.io/easy-rl/#/chapter7/chapter7_questions&keywords) | [DQN in practice](https://datawhalechina.github.io/easy-rl/#/chapter7/project2) |
| [Chapter 8: DQN (Continuous Actions)](https://datawhalechina.github.io/easy-rl/#/chapter8/chapter8) | [Chapter 8 exercises](https://datawhalechina.github.io/easy-rl/#/chapter8/chapter8_questions&keywords) | |
| [Chapter 9: Actor-Critic Algorithms](https://datawhalechina.github.io/easy-rl/#/chapter9/chapter9) | [Chapter 9 exercises](https://datawhalechina.github.io/easy-rl/#/chapter9/chapter9_questions&keywords) | |
| [Chapter 10: Sparse Reward](https://datawhalechina.github.io/easy-rl/#/chapter10/chapter10) | [Chapter 10 exercises](https://datawhalechina.github.io/easy-rl/#/chapter10/chapter10_questions&keywords) | |
| [Chapter 11: Imitation Learning](https://datawhalechina.github.io/easy-rl/#/chapter11/chapter11) | [Chapter 11 exercises](https://datawhalechina.github.io/easy-rl/#/chapter11/chapter11_questions&keywords) | |
| [Chapter 12: Deep Deterministic Policy Gradient (DDPG)](https://datawhalechina.github.io/easy-rl/#/chapter12/chapter12) | [Chapter 12 exercises](https://datawhalechina.github.io/easy-rl/#/chapter12/chapter12_questions&keywords) | [DDPG in practice](https://datawhalechina.github.io/easy-rl/#/chapter12/project3) |
| [Chapter 13: AlphaStar Paper Walkthrough](https://datawhalechina.github.io/easy-rl/#/chapter13/chapter13) | | |

## Algorithm Practice

[Click here](https://github.com/datawhalechina/easy-rl/tree/master/codes) or open the `codes` folder to find the hands-on algorithm implementations.

## Contributors

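| Name | Contribution | Affiliation |
| ------------ | ------------------------------------ | ----------------------------------------- |
| Qi Wang | Tutorial design (Chapters 1 to 12) | University of Chinese Academy of Sciences |
| Yiyuan Yang | Exercise design and Chapter 13 | Tsinghua University |
| John Jim | Algorithm practice | Peking University |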

## Acknowledgments

Special thanks to [@Sm1les](https://github.com/Sm1les) and [@LSGOMYP](https://github.com/LSGOMYP) for their help and support with this project.

## Follow Us

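Datawhale is an open-source organization focused on AI. With the vision "for the learner, growing together with learners", it is building the open-source learning community that is most valuable to learners. Follow us to learn and grow together.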

## LICENSE

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.