Introduction to Reinforcement Learning
The tutorial will introduce reinforcement learning, that is, learning what actions to take, and when to take them, so as to optimize long-term performance. This may involve sacrificing immediate reward to obtain greater reward in the long term, or simply to obtain more information about the environment. The first part of the tutorial will cover the basics: Markov decision processes, dynamic programming, temporal-difference learning, Monte Carlo methods, eligibility traces, and the role of function approximation. The second part will cover some recent developments, namely policy gradient methods and second-order methods such as LSPI and the modified Bellman residual minimization algorithm.
Author: Csaba Szepesvari, Department of Computing Science, University of Alberta