Reinforcement learning is about learning good control policies given only weak performance feedback: occasional scalar rewards that may be delayed relative to the events that earned them. Reinforcement learning inherently deals with feedback systems rather than (data, class) samples, providing a more flexible, control-like framework than many standard machine learning algorithms. These lectures will summarise reinforcement learning along three axes:
- Learning with or without knowledge of the system dynamics.
- Using state values as an intermediate solution, or learning a policy directly.
- Learning with or without fully observable system states.
Author: Douglas Aberdeen, National ICT Australia