TITLE:  Learning to optimize via efficient experimentation

SPEAKER:  Daniel Russo

ABSTRACT:

The information revolution is spawning systems that require very frequent decisions and provide high volumes of data concerning past outcomes. Fueling the design of algorithms used in such systems is a vibrant research area at the intersection of sequential decision-making and machine learning that addresses how to balance exploration and exploitation and learn over time to make increasingly effective decisions. In this talk, I will formulate a broad family of such problems that greatly extends the classical multi-armed bandit problem by allowing samples of one action to inform the decision-maker's assessment of other actions. I'll describe the rising importance of this problem class, and then discuss two recent methodological advances. One advance is Thompson sampling, a simple and tractable approach that is provably efficient for many relevant problem classes. The other is information-directed sampling, a new algorithm we propose that is inspired by an information-theoretic perspective and can offer greatly superior statistical efficiency. We provide new insight into both algorithms and establish general theoretical guarantees.