## CS 224D: Deep Learning for Natural Language Processing

Deep learning approaches have obtained very high performance across many different natural language processing tasks. In this class, students will learn to understand, implement, train, debug, visualize and potentially invent their own neural network models for a variety of language understanding tasks. The course provides a deep excursion from early models to cutting-edge research. Applications will range across a broad spectrum: from simple tasks like part-of-speech tagging, through sentiment analysis, to question answering and machine translation. The final project will involve implementing a complex neural network model and applying it to a large-scale NLP problem. We will introduce a common programming framework for deep learning for the problem sets. Prerequisites: programming abilities (Python), linear algebra, Math 21 or equivalent, machine learning background (CS 229 or similar). Recommended: CS 224N, EE 364A (convex optimization), CS 231N.

Last offered: Spring 2016

## CS 229: Machine Learning (STATS 229)

Topics: statistical pattern recognition, linear and non-linear regression, non-parametric methods, exponential family, GLMs, support vector machines, kernel methods, model/feature selection, learning theory, VC dimension, clustering, density estimation, EM, dimensionality reduction, ICA, PCA, reinforcement learning and adaptive control, Markov decision processes, approximate dynamic programming, and policy search. Prerequisites: linear algebra, and basic probability and statistics.

Terms: Aut | Units: 3-4

Instructors: Duchi, J. (PI); Ng, A. (PI); Agrawal, P. (TA); Bhargava, R. (TA); Desai, N. (TA); Dixit, K. (TA); Germain, F. (TA); Ji, J. (TA); Katanforoosh, K. (TA); Kumar, P. (TA); Levy, D. (TA); Meng, C. (TA); Pai, S. (TA); Ruban, T. (TA); Seshadri, A. (TA); Sheng, H. (TA); Tian, Y. (TA); Wang, B. (TA); Wang, D. (TA); Xie, Z. (TA); Yin, Z. (TA); Zhou, B. (TA); Zhu, M. (TA)

## CS 229T: Statistical Learning Theory (STATS 231)

How do we formalize what it means for an algorithm to learn from data? This course focuses on developing mathematical tools for answering this question. We will present various common learning algorithms and prove theoretical guarantees about them. Topics include classical asymptotics, method of moments, generalization bounds via uniform convergence, kernel methods, online learning, and multi-armed bandits. Prerequisites: A solid background in linear algebra and probability theory, statistics and machine learning (STATS 315A or CS 229). Convex optimization (EE 364A) is helpful but not required.

Terms: Spr | Units: 3

Instructors: Duchi, J. (PI)

## CS 329M: Topics in Artificial Intelligence: Algorithms of Advanced Machine Learning

This advanced graduate course explores in depth several important classes of algorithms in modern machine learning. We will focus on understanding the mathematical properties of these algorithms in order to gain deeper insight into when and why they perform well. We will also study applications of each algorithm in interesting, real-world settings. Topics include: spectral clustering, tensor decomposition, Hamiltonian Monte Carlo, adversarial training, and variational approximation. Students will learn mathematical techniques for analyzing these algorithms and gain hands-on experience in using them. We will supplement the lectures with the latest papers, and there will be a significant research project component to the class. Prerequisites: Probability (CS 109), linear algebra (Math 113), machine learning (CS 229), and some coding experience.

Terms: Spr | Units: 3

Instructors: Zou, J. (PI)

## CS 345S: Data-intensive Systems for the Next 1000x

The last decade saw enormous shifts in the design of large-scale data-intensive systems due to the rise of Internet services, cloud computing, and Big Data processing. Where will we see the next 1000x increases in scale and data volume, and how should data-intensive systems accordingly evolve? This course will critically examine a range of trends, including the Internet of Things, drones, smart cities, and emerging hardware capabilities, through the lens of software systems research and design. Students will perform a comparative analysis by reading and discussing cutting-edge research while performing their own original research. Prerequisites: Strong background in software systems, especially databases (CS 245) and distributed systems (CS 244B), and/or machine learning (CS 229). Undergraduates who have completed CS 245 are strongly encouraged to attend.

Terms: Aut | Units: 3-4

Instructors: Bailis, P. (PI); Wang, F. (TA)

## EE 392K: Big Data and Learning Systems for Large-Scale Networks

The course will consider data from sensors in large-scale networks such as Cloud Computing Systems and the Internet of Moving Things. Methods for sensing, denoising the sensed data, and reconstructing the evolution of the network in fine detail from snapshot observations will be discussed. Techniques for synchronizing clocks across a large data center, and using machine learning to perform real-time inference at scale will be presented. The principles of creating an interactive database for detecting anomalies, raising alerts, and serving insights to the user will be discussed. The course will involve a team-based project. Required prerequisites: basic computer networks as in CS 144 or CS 244. Recommended: basic statistics or machine learning, as in EE 278 or CS 229.

Terms: Win | Units: 3

Instructors: Prabhakar, B. (PI); Yin, Z. (TA)

## STATS 231: Statistical Learning Theory (CS 229T)

How do we formalize what it means for an algorithm to learn from data? This course focuses on developing mathematical tools for answering this question. We will present various common learning algorithms and prove theoretical guarantees about them. Topics include classical asymptotics, method of moments, generalization bounds via uniform convergence, kernel methods, online learning, and multi-armed bandits. Prerequisites: A solid background in linear algebra and probability theory, statistics and machine learning (STATS 315A or CS 229). Convex optimization (EE 364A) is helpful but not required.

Terms: Spr | Units: 3

Instructors: Duchi, J. (PI)
