Reward-based Learning and Decision Making under Risk
Reinforcement learning provides a framework in which agents learn policies through feedback signals (“rewards”) that indicate whether their actions or action sequences were successful. It also offers a framework for understanding how humans learn and decide when given only reward information. Standard reinforcement learning assumes that good decisions, actions, and policies are those that maximize expected reward as a proxy for success. Humans and animals, however, often do not behave this way, and there is ample evidence that multiple factors influence their learning and decision making. In my talk I will focus on the interaction between risk and reward. I will first present a mathematical framework for incorporating outcome-induced risk into reinforcement learning on Markov decision processes, and I will derive a risk-sensitive variant of model-free Q-learning that is useful for quantifying human behavior. I will then discuss extensions of this framework to the partially observable case and show preliminary results for settings in which risk is induced by perceptual uncertainty.
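The abstract does not specify which risk-sensitive Q-learning variant the talk derives. As a rough illustration of the general idea, one common formulation (in the spirit of Mihatsch and Neuneier, 2002) weights the temporal-difference error asymmetrically with a risk parameter κ, so that negative surprises count more than positive ones for a risk-averse agent (κ > 0). The one-state "bandit" sketch below, including the action names and the fair ±1 lottery, is an illustrative assumption, not the speaker's model:

```python
import random

def risk_sensitive_q_update(q, action, reward, alpha=0.1, kappa=0.5):
    # Risk-sensitive TD update: the TD error is scaled asymmetrically.
    # For kappa > 0, positive errors are down-weighted by (1 - kappa)
    # and negative errors up-weighted by (1 + kappa), yielding risk aversion.
    # kappa = 0 recovers standard (one-state) Q-learning.
    delta = reward - q[action]
    weight = (1 - kappa) if delta > 0 else (1 + kappa)
    q[action] += alpha * weight * delta
    return q

random.seed(0)
q = {"safe": 0.0, "risky": 0.0}
for _ in range(5000):
    # "safe" always pays 0; "risky" is a fair lottery over +1 and -1.
    risk_sensitive_q_update(q, "safe", 0.0)
    risk_sensitive_q_update(q, "risky", random.choice([1.0, -1.0]))

# Although both actions have the same expected reward (zero), the
# risk-averse agent values the risky lottery below the safe action:
# q["risky"] settles near -kappa, i.e. around -0.5 here.
```

In this formulation the learned value of the risky action converges to roughly −κ rather than its expected reward of 0, which is one way such a model can capture human risk-averse choices between outcomes of equal expected value.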
***Want to know more about this lecture? Contact us at communication@scioi.de***