马婧/Q-Learning-Algorithm-Implementation-in-MATLAB

README
GPL-2.0

This Q-Learning code for MATLAB has been written by Ioannis Makris and Andrew Chalikiopoulos. It trains an agent to find the shortest way through a 25x25 maze. Following convergence of the algorithm, MATLAB will print out the shortest path to the goal and will also create three graphs to measure the performance of the agent. These are:

  1. Figure 1: Double y plot of Cumulative Reward/Steps taken vs Episode and Steps Taken vs Episode.
  2. Figure 2: Standard plot of Cumulative Reward/Steps taken vs Episode.
  3. Figure 3: Standard plot of Steps Taken vs Episode.
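
The quantities behind these plots can be sketched as follows. This is a minimal Python illustration, not the repository's MATLAB code. It assumes that "Cumulative Reward/Steps taken" means the total reward collected in an episode divided by the number of steps in that episode. The function name `episode_metrics` and the sample log are hypothetical.

```python
def episode_metrics(episode_rewards):
    """Return (steps, reward_per_step), one entry per episode.

    episode_rewards: list of episodes, each a list of per-step rewards.
    Assumption: the plotted ratio is total episode reward / steps taken.
    """
    steps = [len(rewards) for rewards in episode_rewards]
    reward_per_step = [
        sum(rewards) / len(rewards) if rewards else 0.0
        for rewards in episode_rewards
    ]
    return steps, reward_per_step


# Hypothetical log: episode 1 reaches the goal in 3 steps,
# episode 2 hits a wall (-50) and then reaches the goal (1000).
log = [[0, 0, 1000], [-50, 1000]]
steps, ratio = episode_metrics(log)
```

Plotting `steps` and `ratio` against the episode index would give curves of the kind described for Figures 2 and 3.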

How do I get set up?

This repository contains 8 files. These are:

  1. RandomPermutation.m
  2. ReinforcementLearning_RandomPol.m
  3. ReinforcementLearningGreedy.m
  4. ReinforcementLearningUpdateR.m
  5. RewardMatrix25.m
  6. RewardMatrix100.m
  7. RewardMatrixNoPunishment.csv
  8. RewardMatrixPunishment.csv

The following file was used for testing only and should not be used:

  1. RewardMatrix25.m

The following scenarios can be run with this code:

  1. Q-Learning with no punishment, applying the epsilon-greedy selection policy. The agent cannot enter a wall state in this scenario, and the only reward given is 1000 when the goal is found. To set this up, make sure that MATLAB reads RewardMatrixNoPunishment.csv; this is configured in the RewardMatrix100.m file. You can then adjust the learning rate and the discount factor in the ReinforcementLearningGreedy.m file and run the code from there.

  2. Q-Learning with no punishment, applying the random selection policy. The agent cannot enter a wall state in this scenario, and the only reward given is 1000 when the goal is found. To set this up, make sure that MATLAB reads RewardMatrixNoPunishment.csv; this is configured in the RewardMatrix100.m file. You can then adjust the learning rate and the discount factor in the ReinforcementLearning_RandomPol.m file and run the code from there.

  3. Q-Learning with punishment for entering a forbidden state, applying the epsilon-greedy selection policy. The agent is allowed to enter a wall state and receives a punishment of -50 for doing so. Again, the only reward given is 1000 when the goal is found. To set this up, make sure that MATLAB reads RewardMatrixPunishment.csv; this is configured in the RewardMatrix100.m file. You can then adjust the learning rate and the discount factor in the ReinforcementLearningGreedy.m file and run the code from there.

  4. Q-Learning with punishment for entering a forbidden state, applying the epsilon-greedy selection policy, in a dynamic environment. Once the agent has knocked down a wall, the reward for that wall changes from -50 to 0, because the wall is no longer there, and it remains 0 in every subsequent episode. Again, the only reward given is 1000 when the goal is found. To set this up, make sure that MATLAB reads RewardMatrixPunishment.csv; this is configured in the RewardMatrix100.m file. You can then adjust the learning rate and the discount factor in the ReinforcementLearningUpdateR.m file and run the code from there.
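
The four scenarios differ only in the reward matrix, the selection policy, and whether the reward matrix is updated during training. The sketch below illustrates this in Python; it is not the repository's MATLAB code. It assumes a state-by-state reward matrix in which `None` marks a move that is not allowed. The values of `alpha`, `gamma`, `epsilon`, and `episodes`, the 4-state demo matrix, and the names `q_learning` and `greedy_path` are all hypothetical. Only the goal reward of 1000 and the wall punishment of -50 come from the description above.

```python
import random

GOAL_REWARD = 1000   # reward for reaching the goal (from the README)
WALL_PENALTY = -50   # punishment for entering a wall state (from the README)


def q_learning(R, goal, alpha=0.5, gamma=0.8, epsilon=0.1,
               episodes=200, dynamic=False, seed=0, max_steps=1000):
    """Tabular Q-learning on a state-by-state reward matrix.

    R[s][a] is the reward for moving from state s to state a,
    or None if that move is not allowed (assumed layout).
    epsilon=1.0 gives the purely random selection policy.
    dynamic=True sets a wall's reward to 0 once it has been hit.
    Returns (Q, R_used), where R_used is the possibly updated copy of R.
    """
    rng = random.Random(seed)
    R = [row[:] for row in R]            # work on a copy of the caller's matrix
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        s = rng.randrange(n)             # random start state
        for _ in range(max_steps):
            if s == goal:
                break
            actions = [a for a in range(n) if R[s][a] is not None]
            if rng.random() < epsilon:   # explore
                a = rng.choice(actions)
            else:                        # exploit, breaking ties at random
                best = max(Q[s][x] for x in actions)
                a = rng.choice([x for x in actions if Q[s][x] == best])
            r = R[s][a]
            future = max((Q[a][x] for x in range(n) if R[a][x] is not None),
                         default=0.0)
            Q[s][a] += alpha * (r + gamma * future - Q[s][a])
            if dynamic and r == WALL_PENALTY:
                R[s][a] = 0              # wall knocked down: no further punishment
            s = a
    return Q, R


def greedy_path(Q, R, start, goal):
    """Follow the highest-valued allowed move from start until goal."""
    path, s = [start], start
    while s != goal and len(path) <= len(Q):
        actions = [a for a in range(len(Q)) if R[s][a] is not None]
        s = max(actions, key=lambda a: Q[s][a])
        path.append(s)
    return path


# Hypothetical 4-state corridor 0 - 1 - 2 - 3, with state 3 as the goal.
R_demo = [
    [None, 0, None, None],
    [0, None, 0, None],
    [None, 0, None, GOAL_REWARD],
    [None, None, 0, None],
]
Q_demo, _ = q_learning(R_demo, goal=3)
print(greedy_path(Q_demo, R_demo, start=0, goal=3))
```

In this sketch, scenario 2 corresponds to `epsilon=1.0`, scenario 3 to a matrix that contains -50 entries, and scenario 4 to the same matrix with `dynamic=True`.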

Contribution guidelines

This code is built on the Q-Learning example code by Kardi Teknomo, which can be found at http://people.revoledu.com/kardi

Who do I talk to?

If you have any questions about the code, please contact either of the two authors below:

Ioannis Makris - ioannis@makris.com

Andrew Chalikiopoulos - andreas.chalikio@hotmail.com
