# maze_policygradient

**Repository Path**: AngryPanda_XYZ/maze_policygradient

## Basic Information

- **Project Name**: maze_policygradient
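- **Description**: Modified code for the policy iteration method on the maze task, from Chapter 2 of 《深度强化学习——边做边学》 (Deep Reinforcement Learning: Learning by Doing)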
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2020-07-15
- **Last Updated**: 2021-09-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: Pending projects

## README

# maze_random

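#### Introduction
Modified code for the policy iteration method on the maze task, from Chapter 2 of 《深度强化学习——边做边学》 (Deep Reinforcement Learning: Learning by Doing).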

[https://www.cnblogs.com/devilmaycry812839668/p/13305933.html](https://www.cnblogs.com/devilmaycry812839668/p/13305933.html)


#### Runtime Environment

- Python 3.6.5
- numpy module
- matplotlib module


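#### Usage

1.  main.py runs a single experiment with the policy iteration method and shows its result.
2.  main_hist.py runs 10000 experiments with the policy iteration method and plots the results, with the number of iterations on the horizontal axis and the number of experiments on the vertical axis.
3.  The Picture_results folder contains the experiment results of main.py and main_hist.py.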
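
The histogram layout described for main_hist.py (iterations on the horizontal axis, number of experiments on the vertical axis) can be illustrated with a minimal sketch. This is not the repository's code: `run_experiment` is a hypothetical stand-in (a random walk on a short corridor) used only so the example runs on its own, and `collect_iterations` and `plot_histogram` are likewise names assumed for illustration. In the real script, each run would presumably be one training run on the maze.

```python
import random
from collections import Counter


def run_experiment(rng, n_cells=4):
    """Hypothetical stand-in for one training run (NOT the repository's code).

    A random walk on a corridor of n_cells cells, starting at cell 0 and
    ending at the last cell. The number of steps taken plays the role of
    the 'number of iterations' of a real run.
    """
    position, steps = 0, 0
    while position != n_cells - 1:
        # Step left or right with equal probability; cell 0 has a wall on its left.
        position = max(0, position + rng.choice((-1, 1)))
        steps += 1
    return steps


def collect_iterations(n_experiments, seed=0):
    """Repeat the experiment and return the iteration count of every run."""
    rng = random.Random(seed)
    return [run_experiment(rng) for _ in range(n_experiments)]


def plot_histogram(iterations):
    """Plot iterations (x-axis) against number of experiments (y-axis)."""
    import matplotlib.pyplot as plt  # imported here so the tally works without it

    plt.hist(iterations, bins=50)
    plt.xlabel("Number of iterations")
    plt.ylabel("Number of experiments")
    plt.show()


iterations = collect_iterations(1000)
tally = Counter(iterations)  # maps iteration count -> number of experiments
```

Calling `plot_histogram(iterations)` would display the figure; the sketch uses 1000 runs rather than 10000 only to keep it quick.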