CS 120: Introduction to AI Safety
What is safe AI, and how do we make it? CS 120 explores this question, focusing on the technical challenges of creating reliable, ethical, and aligned AI systems. We distinguish between model-specific and systemic safety issues, ranging from fairness and data limitations to adversarial vulnerabilities and embedding desired behavior in AI. While the course primarily examines current solutions and their limitations through CS publications, we will also discuss socio-technical concerns of modern AI deployment, what oversight of intelligence could look like, and what future risks we might face. Topics will span reinforcement learning, computer vision, and natural language processing, with an emphasis on interpretability, robustness, and evaluations. Through lectures, readings, quizzes, and a final project, you will gain insight into why ensuring AI safety and reliability is challenging. This course aims to prepare you to critically assess and contribute to safe AI development, equipping you with knowledge of cutting-edge research and ongoing debates in the field. This course has no official prerequisites, although we recommend some knowledge of machine learning and statistics. For more details, see the course website:
https://web.stanford.edu/class/cs120/
Terms: Aut, Spr | Units: 3