DATE: Wed, May 14, 2014
TIME: 2:30 pm
PLACE: Council Room (SITE 5-084)
TITLE: Reinforcement Learning: Turning Big Data into Actions
PRESENTER: Doina Precup
McGill University
ABSTRACT:

Reinforcement learning is an approach for learning how to make optimal sequential decisions, in real time, by interacting with a stochastic environment. The main idea is that if an action results in an improved situation, then the tendency to produce that action is strengthened, i.e., reinforced. Reinforcement learning methods have developed at the confluence of control theory, neuroscience, animal learning, machine learning and operations research. Notable success stories include world-best computer game players for Backgammon and Go, helicopter and robotic control, computer network design, cell phone routing, transportation and logistics.
A key problem in reinforcement learning is dealing with big data, in terms of a very large or infinite number of environment configurations, many possible actions, or a very fast sampling and decision rate. In this talk, I will present three approaches to these problems. First, I will present temporal abstraction, an approach in which extended actions are used to control the environment. I will describe the theoretical framework of "options", which provides well-founded and very efficient learning and planning algorithms, and has been used extensively by other research groups in big data applications. Second, I will discuss "off-policy" learning methods, which compute improved action choices from data gathered under pre-existing control strategies, such as observational studies in medical applications, or existing advertising policies in e-commerce. I will describe the theory behind such methods and an application to building a world-class computer network design system (joint work with Nortel). Finally, I will discuss constructive function approximation methods, which incrementally build a representation of the value function (expected long-term return), based on a stream of data. These methods offer an automated way of providing appropriate regularization for the amount of data available.

Short bio: Dr. Doina Precup is an Associate Professor in the School of Computer Science at McGill University. She earned her B.Sc. degree from the Technical University of Cluj-Napoca, Romania (1994) and her M.Sc. (1997) and Ph.D. (2000) degrees from the University of Massachusetts Amherst, where she was a Fulbright fellow. Her research interests lie in the area of machine learning, with an emphasis on reinforcement learning and time series data, as well as applications of machine learning and artificial intelligence to activity recognition, medicine, electronic commerce and robotics. She currently serves as chair of the NSERC Discovery Evaluation Group for Computer Science.