JavaScript implementation of a temporal-difference (TD) reinforcement learning agent that learns optimal paths on a gridworld. Inspired by the Reinforcement Learning Specialization.
To generate a new random gridworld, click "apply".
The number of bombs scales with the size of the gridworld.
To train the agent, click "run RL". This might take some time on slower devices or with bigger grid sizes.
- implemented a gridworld problem with some obstacles (bombs are bad for the agent)
- implemented a SARSA agent with these parameters:
  - ε-greedy policy (ε starts at 0.5 and decays over time)
  - 1000 episodes
  - no reward discounting (γ = 1)
  - step size of 0.1
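
The SARSA setup listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names (`step`, `chooseAction`, `trainSarsa`), the world layout, the reward values (-1 per step, -100 for a bomb, 0 at the goal), the multiplicative ε decay of 0.995 per episode, and the `maxSteps` cap are all assumptions made for the example. Only the starting ε of 0.5, the 1000 episodes, the step size of 0.1, and the lack of discounting come from the list.

```javascript
// Hypothetical sketch of tabular SARSA on a small gridworld.
const ACTIONS = [[-1, 0], [1, 0], [0, -1], [0, 1]]; // up, down, left, right

// Moves the agent one cell, clamping at the grid border.
// Assumed rewards: -1 per step, -100 for a bomb, 0 for the goal.
function step(world, state, action) {
  const [dr, dc] = ACTIONS[action];
  const r = Math.min(world.size - 1, Math.max(0, state[0] + dr));
  const c = Math.min(world.size - 1, Math.max(0, state[1] + dc));
  const key = `${r},${c}`;
  if (world.bombs.has(key)) return { next: [r, c], reward: -100, done: true };
  if (key === world.goal) return { next: [r, c], reward: 0, done: true };
  return { next: [r, c], reward: -1, done: false };
}

// Returns the action-value array for a state, creating it on first use.
function qValues(q, state) {
  const key = state.join(',');
  if (!q.has(key)) q.set(key, new Array(ACTIONS.length).fill(0));
  return q.get(key);
}

// ε-greedy: explore with probability epsilon, otherwise pick the best known action.
function chooseAction(q, state, epsilon, rng) {
  if (rng() < epsilon) return Math.floor(rng() * ACTIONS.length);
  const values = qValues(q, state);
  return values.indexOf(Math.max(...values));
}

function trainSarsa(world, {
  episodes = 1000,  // from the parameter list
  alpha = 0.1,      // step size, from the parameter list
  gamma = 1,        // no discounting, from the parameter list
  epsilon = 0.5,    // starting exploration rate, from the parameter list
  decay = 0.995,    // assumed decay schedule
  maxSteps = 1000,  // assumed safety cap per episode
  rng = Math.random,
} = {}) {
  const q = new Map();
  for (let ep = 0; ep < episodes; ep++) {
    let state = world.start;
    let action = chooseAction(q, state, epsilon, rng);
    for (let t = 0; t < maxSteps; t++) {
      const { next, reward, done } = step(world, state, action);
      const nextAction = chooseAction(q, next, epsilon, rng);
      // On-policy target: uses the action actually selected in the next state.
      const target = reward + (done ? 0 : gamma * qValues(q, next)[nextAction]);
      const values = qValues(q, state);
      values[action] += alpha * (target - values[action]);
      if (done) break;
      state = next;
      action = nextAction;
    }
    epsilon *= decay;
  }
  return q;
}

// Usage: a 4x4 world with one bomb (layout is made up for the example).
const world = { size: 4, start: [0, 0], goal: '3,3', bombs: new Set(['1,1']) };
const q = trainSarsa(world);
console.log(q.get('0,0')); // learned action values for the start state
```

Because SARSA bootstraps from the action the ε-greedy policy actually takes next, the learned values account for the risk of exploratory moves, which tends to keep the agent away from cells next to bombs while ε is still large.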
The Open Web Components library is used for the frontend. No specific reason for it, just wanted to give it a try :)
- `start` runs your app for development, reloading on file changes
- `start:build` runs your app after it has been built using the build command
- `build` builds your app and outputs it in your `dist` directory
- `test` runs your test suite with Web Test Runner
- `lint` runs the linter for your project