# reinforcejs **Repository Path**: zhang-zhiyang/reinforcejs ## Basic Information - **Project Name**: reinforcejs - **Description**: No description available - **Primary Language**: JavaScript - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-01-06 - **Last Updated**: 2025-01-06 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # REINFORCEjs **REINFORCEjs** is a Reinforcement Learning library that implements several common RL algorithms, all with web demos. In particular, the library currently includes: - **Dynamic Programming** methods - (Tabular) **Temporal Difference Learning** (SARSA/Q-Learning) - **Deep Q-Learning** for Q-Learning with function approximation with Neural Networks - **Stochastic/Deterministic Policy Gradients** and Actor Critic architectures for dealing with continuous action spaces. (*very alpha, likely buggy or at the very least finicky and inconsistent*) See the [main webpage](http://cs.stanford.edu/people/karpathy/reinforcejs) for many more details, documentation and demos. # Code Sketch The library exports two global variables: `R`, and `RL`. The former contains various kinds of utilities for building expression graphs (e.g. LSTMs) and performing automatic backpropagation, and is a fork of my other project [recurrentjs](https://github.com/karpathy/recurrentjs). The `RL` object contains the current implementations: - `RL.DPAgent` for finite state/action spaces with environment dynamics - `RL.TDAgent` for finite state/action spaces - `RL.DQNAgent` for continuous state features but discrete actions A typical usage might look something like: ```javascript // create an environment object var env = {}; env.getNumStates = function() { return 8; } env.getMaxNumActions = function() { return 4; } // create the DQN agent var spec = { alpha: 0.01 } // see full options on DQN page agent = new RL.DQNAgent(env, spec); setInterval(function(){ // start the learning loop var action = agent.act(s); // s is an array of length 8 //... execute action in environment and get the reward agent.learn(reward); // the agent improves its Q,policy,model, etc. reward is a float }, 0); ``` The full documentation and demos are on the [main webpage](http://cs.stanford.edu/people/karpathy/reinforcejs). # License MIT.