The post discusses the Temporal Difference (TD) family of iterative techniques for solving Markov Decision Processes (MDPs). It reviews the limitations of earlier methods, namely Dynamic Programming and Monte Carlo, and introduces Temporal Difference techniques as a solution for situations where complete knowledge of the environment's dynamics is unavailable. To read the full blog, visit Medium.
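As a rough illustration of what a Temporal Difference update looks like when the environment's dynamics are unknown, here is a minimal tabular sketch of the two methods named in the post's title, SARSA and Q-Learning. It is not taken from the full post: the function names, the list-of-lists table `q`, and the default step size `alpha` and discount `gamma` are all assumptions chosen for illustration.

```python
# Minimal sketch (assumed names and defaults, not the post's own code).
# q is a table of action values: q[state][action].

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    td_target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (td_target - q[s][a])


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the best action in the next state."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


# Toy table with two states and two actions each.
q = [[0.0, 0.0], [1.0, 2.0]]

# One observed transition (state 0, action 0, reward 1.0, next state 1, next action 0).
sarsa_update(q, s=0, a=0, r=1.0, s_next=1, a_next=0)

# One observed transition (state 0, action 1, reward 1.0, next state 1).
q_learning_update(q, s=0, a=1, r=1.0, s_next=1)
```

Both updates use only a sampled transition and the current value estimates, so neither needs the transition probabilities that Dynamic Programming relies on, and neither has to wait for the end of an episode as Monte Carlo does.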
source update: SARSA and Q-Learning — Part 3 – Towards AI