EE 277: Bandit Learning: Behaviors and Applications (MS&E 237A)
Reinforcement learning addresses the design of agents that improve their decisions over time while operating in complex and uncertain environments. This first course of the sequence restricts attention to the special case of bandit learning, which focuses on environments in which all consequences of an action are realized immediately. The course covers desired agent behaviors and principled, scalable approaches to realizing them. Topics include learning from trial and error, exploration, contextualization, generalization, and representation learning. Motivating examples will be drawn from recommendation systems, crowdsourcing, education, and generative artificial intelligence. Homework assignments primarily involve programming exercises carried out in Colab, using the Python programming language and standard libraries for numerical computation and machine learning. Prerequisites: programming (e.g., CS 106B), probability (e.g., MS&E 121, EE 178, or CS 109), machine learning (e.g., EE 104/CME 107, MS&E 226, or CS 229).
Terms: Aut | Units: 3