CS 25: Transformers United V4
Since their introduction in 2017, Transformers have taken the world by storm and are finding applications across Deep Learning. They have enabled powerful language models like ChatGPT and Gemini, and are a critical component of other ML applications such as text-to-image and video generation (e.g. DALL-E and Sora). They have significantly elevated the capabilities and impact of Artificial Intelligence. In CS 25, which has become one of Stanford's hottest and most exciting seminars, we examine the details of how Transformers work and dive deep into the different kinds of Transformers and how they are applied across fields and applications. We do this through a combination of instructor lectures, guest lectures, and classroom discussions. Potential topics include LLM architectures, creative use cases (e.g. art and music), healthcare/biology and neuroscience applications, robotics and RL (e.g. physical tasks, simulations, or games), and so forth. We invite folks at the forefront of Transformers research for talks, which are also livestreamed and recorded via YouTube/Zoom. Past speakers have included Andrej Karpathy, Geoffrey Hinton, Jim Fan, Ashish Vaswani, and folks from OpenAI, Google DeepMind, NVIDIA, etc. Our class includes social events and networking sessions and has a broad audience within and outside Stanford, with around 1 million total views on YouTube. This is a 1-unit S/NC course, where attendance is the only homework! Please enroll on Axess or audit by joining the livestream (or in person if seats are available). Prerequisites: basic knowledge of Deep Learning (you should understand attention) or CS224N/CS231N/CS230. Course website: https://web.stanford.edu/class/cs25/
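Since attention is the one concept the prerequisite calls out, here is a minimal, illustrative sketch of scaled dot-product attention in NumPy (our own example, not course code):

```python
# A minimal sketch of scaled dot-product attention (illustrative, not course code).
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays; returns the attention-weighted values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # scaled pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 tokens, dimension 8
print(attention(x, x, x).shape)      # self-attention output: (4, 8)
```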
Terms: Spr | Units: 1 | Instructors: Feng, S. (PI); Garg, D. (PI)
CS 230: Deep Learning
Deep Learning is one of the most highly sought-after skills in AI. We will help you become good at Deep Learning. In this course, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about convolutional networks, RNNs, LSTMs, Adam, Dropout, BatchNorm, Xavier/He initialization, and more. You will work on case studies from healthcare, autonomous driving, sign language reading, music generation, and natural language processing. You will master not only the theory, but also see how it is applied in industry. You will practice all these ideas in Python and in TensorFlow, which we will teach. AI is transforming multiple industries. After this course, you will likely find creative ways to apply it to your work. This class is taught in the flipped-classroom format. You will watch videos and complete in-depth programming assignments and online quizzes at home, then come to class for advanced discussions and to work on projects. This class will culminate in an open-ended final project, which the teaching team will help you with. Prerequisites: familiarity with programming in Python and linear algebra (matrix/vector multiplications). CS 229 may be taken concurrently.
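As a taste of several of the techniques listed above, here is a minimal, illustrative Keras model (our sketch, not course code) combining He initialization, BatchNorm, Dropout, and the Adam optimizer:

```python
# Illustrative sketch (not course code): He init, BatchNorm, Dropout, Adam.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),                     # e.g. flattened 28x28 images
    tf.keras.layers.Dense(128, kernel_initializer="he_normal"),
    tf.keras.layers.BatchNormalization(),             # normalize pre-activations
    tf.keras.layers.ReLU(),
    tf.keras.layers.Dropout(0.5),                     # regularization
    tf.keras.layers.Dense(10, activation="softmax"),  # 10-way classifier
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```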
Terms: Aut | Units: 3-4 | UG Reqs: WAY-AQR, WAY-FR | Instructors: Katanforoosh, K. (PI); Ng, A. (PI); Chang, J. (TA); Ganesh, R. (TA); Mousavi, S. (TA)
CS 236G: Generative Adversarial Networks
Generative Adversarial Networks (GANs) have rapidly emerged as the state-of-the-art technique in realistic image generation. This course presents theoretical intuition and practical knowledge on GANs, from their simplest to their state-of-the-art forms. Their applications span realistic image editing (now omnipresent in popular app filters), tumor classification under low-data regimes in medicine, and visualization of realistic climate-change scenarios. This course also examines key challenges of GANs today, including reliable evaluation, inherent biases, and training stability. After this course, students should be familiar with GANs and the broader generative-modeling and machine learning contexts in which these models are situated. Prerequisites: linear algebra, statistics, CS106B, plus a graduate-level AI course such as CS230, CS229 (or CS129), or CS221.
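For a feel of the adversarial setup at the heart of the course, here is a minimal, illustrative TensorFlow sketch (our assumption, not course material; all layer sizes are arbitrary) of a generator and discriminator trained with the standard non-saturating GAN losses:

```python
# Illustrative GAN sketch (not course code): generator vs. discriminator.
import tensorflow as tf

latent_dim = 64
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(784, activation="tanh"),  # e.g. a flattened 28x28 image
])
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1),  # real/fake logit
])
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake, training=True)
        # Discriminator: push real logits toward 1, fake logits toward 0.
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        # Generator (non-saturating loss): make fakes look real.
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```

Training stability, one of the challenges the course examines, shows up directly in this loop: the two optimizers pull against each other, and neither loss alone tells you whether samples are improving.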
Last offered: Winter 2022
CS 329S: Machine Learning Systems Design
This project-based course covers the iterative process for designing, developing, and deploying machine learning systems. It focuses on systems that require massive datasets and compute resources, such as large neural networks. Students will learn about data management, data engineering, approaches to model selection, training, and scaling, and how to continually monitor and deploy changes to ML systems, as well as the human side of ML projects. In the process, students will learn about important issues including privacy, fairness, and security. Prerequisites: at least one of the following: CS229, CS230, CS231N, CS224N, or equivalent. Students should have a good understanding of machine learning algorithms and should be familiar with at least one framework such as TensorFlow, PyTorch, or JAX.
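As one illustrative take on the monitoring theme above (our own example, not course material; the function name and threshold are assumptions), a deployed system might flag input drift by comparing a feature's training distribution against live traffic with a two-sample Kolmogorov-Smirnov test:

```python
# Illustrative drift-detection sketch (not course code).
import numpy as np
from scipy.stats import ks_2samp

def drifted(train_feature, live_feature, alpha=0.01):
    """Return True if the two samples are unlikely to share a distribution."""
    _, p_value = ks_2samp(train_feature, live_feature)
    return p_value < alpha

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=5000)
live = rng.normal(0.5, 1.0, size=5000)   # mean shift simulates drift
print(drifted(train, live))              # True: trigger an alert or retraining
```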
Last offered: Winter 2022
CS 329T: Trustworthy Machine Learning
This course will provide an introduction to state-of-the-art ML methods designed to make AI more trustworthy. The course focuses on four concepts: explanations, fairness, privacy, and robustness. We first discuss how to explain and interpret ML model outputs and inner workings. Then, we examine how bias and unfairness can arise in ML models and learn strategies to mitigate this problem. Next, we look at differential privacy and membership inference in the context of models leaking sensitive information when they are not supposed to. Finally, we look at adversarial attacks and methods for imparting robustness against adversarial manipulation. Students will gain an understanding of a set of methods and tools for deploying transparent, ethically sound, and robust machine learning solutions. Students will complete labs and homework assignments and discuss weekly readings. Prerequisites: CS229 or a similar introductory Python-based ML class; knowledge of deep learning such as CS230 or CS231N; familiarity with ML frameworks in Python (scikit-learn, Keras) assumed.
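As a small taste of the privacy unit, here is an illustrative sketch (our example, not course code) of the classic Laplace mechanism for differential privacy, which perturbs a query result with noise scaled to sensitivity/epsilon:

```python
# Illustrative Laplace-mechanism sketch (not course code): perturb a query
# result so that no single record can dominate the released output.
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with epsilon-differential privacy."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Counting queries have sensitivity 1: adding or removing one person changes
# the count by at most 1.
print(laplace_mechanism(true_value=412, sensitivity=1.0, epsilon=0.5))
```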
Terms: Aut | Units: 3
CS 375: Large-Scale Neural Network Modeling for Neuroscience (PSYCH 249)
The last ten years have seen a watershed in the development of large-scale neural networks in artificial intelligence. At the same time, computational neuroscientists have discovered a surprisingly robust mapping between the internal components of these networks and real neural structures in the human brain. In this class we will discuss a panoply of examples of such "convergent man-machine evolution", including: feedforward models of sensory systems (vision, audition, somatosensation); recurrent neural networks for dynamics and motor control; integrated models of attention, memory, and navigation; transformer models of language areas; self-supervised models of learning; and deep RL models of decision and planning. We will also delve into the methods and metrics for comparing such models to real-world neural data, and address how unsolved open problems in AI (that you can work on!) will drive forward novel neural models. Some meaningful background in modern neural networks is highly advised (e.g. CS229, CS230, CS231N, CS234, CS236, or CS330), but formal preparation in cognitive science or neuroscience is not needed (we will provide this).
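As one illustrative example of such a comparison metric (our sketch with random placeholder data, not course material), the snippet below computes a representational similarity analysis (RSA) score: it correlates the pairwise dissimilarity structure of model features with that of neural recordings for the same stimuli.

```python
# Illustrative RSA sketch (not course code; data is random placeholder).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
model_features = rng.normal(size=(50, 512))    # 50 stimuli x 512 model units
neural_responses = rng.normal(size=(50, 100))  # 50 stimuli x 100 neurons

# Representational dissimilarity matrices (condensed upper triangles).
model_rdm = pdist(model_features, metric="correlation")
neural_rdm = pdist(neural_responses, metric="correlation")

rho, _ = spearmanr(model_rdm, neural_rdm)
print(f"RSA score (Spearman rho): {rho:.3f}")
```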
Terms: Win | Units: 3 | Instructors: Yamins, D. (PI)
PSYCH 249: Large-Scale Neural Network Modeling for Neuroscience (CS 375)
Cross-listed with CS 375 above; the description, prerequisites, and staffing are identical.
Terms: Win | Units: 3 | Instructors: Yamins, D. (PI)