STATS 32: Introduction to R for Undergraduates

This short course runs for weeks one through five of the quarter. It is recommended for undergraduate students who want to use R in the humanities or social sciences and for students who want to learn the basics of R programming. The goal of the short course is to familiarize students with R's tools for data analysis. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven. No prior programming experience is needed. Topics covered include basic data structures, file I/O, data transformation and visualization, simple statistical tests, and some useful R packages. Prerequisite: undergraduate student. Priority given to non-engineering students. Laptops necessary for use in class.
Terms: Aut, Spr | Units: 1

STATS 60: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 160)

Techniques for organizing data and for computing and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 110: Statistical Methods in Engineering and the Physical Sciences

Introduction to statistics for engineers and physical scientists. Topics: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, elementary experimental design. Prerequisite: one year of calculus. Please note that students must enroll in one section in addition to the main lecture.
Terms: Aut | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 116: Theory of Probability

Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. Introduction to the laws of large numbers and central limit theorem. Prerequisites: MATH 52 and familiarity with infinite series, or equivalent. Undergraduate students enroll for 5 units, graduate students enroll for 4 units. Undergraduate students must enroll in one section in addition to the main lecture. Sections are optional for graduate students. Note: Autumn 2023-24 is the last time this course will be offered. It will be replaced by STATS 117 and STATS 118 in 2024-25.
Terms: Aut | Units: 4-5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 160: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 60)

Techniques for organizing data and for computing and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5

STATS 199: Independent Study

For undergraduates.
Terms: Aut, Win, Spr, Sum | Units: 1-15 | Repeatable for credit

STATS 200: Introduction to Statistical Inference

Modern statistical concepts and procedures derived from a mathematical framework. Statistical inference, decision theory; point and interval estimation, tests of hypotheses; Neyman-Pearson theory. Bayesian analysis; maximum likelihood, large sample theory. Prerequisite: STATS 116. Please note that students must enroll in one section in addition to the main lecture.
Terms: Aut, Win, Sum | Units: 4

STATS 202: Data Mining and Analysis

Data mining is used to discover patterns and relationships in data. Emphasis is on large complex data sets such as those in very large databases or through web mining. Topics: decision trees, association rules, clustering, case based methods, and data visualization. Prereqs: Introductory courses in statistics or probability (e.g., Stats 60), linear algebra (e.g., Math 51), and computer programming (e.g., CS 105). May not be taken for credit by students with credit in STATS 216 or 216V.
Terms: Aut, Sum | Units: 3

STATS 206: Applied Multivariate Analysis (BIODS 206)

Introduction to the statistical analysis of several quantitative measurements on each observational unit. Emphasis is on concepts, computer-intensive methods. Examples from economics, education, geology, psychology. Topics: multiple regression, multivariate analysis of variance, principal components, factor analysis, canonical correlations, multidimensional scaling, clustering. Pre- or corequisite: 200.
Terms: Aut | Units: 3
Instructors: Owen, A. (PI); Li, H. (TA)

STATS 209: Introduction to Causal Inference

This course introduces the fundamental ideas and methods in causal inference, with examples drawn from education, economics, medicine, and digital marketing. Topics include potential outcomes, randomization, observational studies, matching, covariate adjustment, AIPW, heterogeneous treatment effects, instrumental variables, regression discontinuity, and synthetic controls. Prerequisites: basic probability and statistics, familiarity with R.
Terms: Aut | Units: 3

STATS 214: Machine Learning Theory (CS 229M)

How do we use mathematical thinking to design better machine learning methods? This course focuses on developing mathematical tools for answering this question. This course will cover fundamental concepts and principled algorithms in machine learning, particularly those that are related to modern large-scale non-linear models. The topics include concentration inequalities, generalization bounds via uniform convergence, non-convex optimization, implicit regularization effect in deep learning, and unsupervised learning and domain adaptations. Prerequisites: linear algebra (MATH 51 or CS 205), probability theory (STATS 116, MATH 151 or CS 109), and machine learning (CS 229, STATS 229, or STATS 315A).
Terms: Aut | Units: 3

STATS 223: Sequential Analysis (STATS 323)

This course will survey the history of sequential analysis from its origin in the 1940s via its continuing role in clinical trials to current activity in machine learning. Subject to the limitations of time, the following topics will be discussed: parametric and semi-parametric hypothesis testing from Wald to sequential clinical trials; fixed precision estimation; change-point detection and estimation; iterative stochastic algorithms and machine learning; anytime-valid inference; optimal stopping, dynamic programming, and stochastic control; multi-armed bandits; applications. Prerequisites: for 223, Stats 200 or equivalent; for 323, Stats 300A and 310A.
Terms: Aut | Units: 3

STATS 229: Machine Learning (CS 229)

Topics: statistical pattern recognition, linear and non-linear regression, non-parametric methods, exponential family, GLMs, support vector machines, kernel methods, deep learning, model/feature selection, learning theory, ML advice, clustering, density estimation, EM, dimensionality reduction, ICA, PCA, reinforcement learning and adaptive control, Markov decision processes, approximate dynamic programming, and policy search. Prerequisites: knowledge of basic computer science principles and skills at a level sufficient to write a reasonably non-trivial computer program in Python/NumPy to the equivalency of CS106A, CS106B, or CS106X, familiarity with probability theory to the equivalency of CS 109, MATH151, or STATS 116, and familiarity with multivariable calculus and linear algebra to the equivalency of MATH51 or CS205.
Terms: Aut, Win, Sum | Units: 3-4

STATS 232: Machine Learning for Sequence Modeling (CS 229B)

Sequence data and time series are becoming increasingly ubiquitous in fields as diverse as bioinformatics, neuroscience, health, environmental monitoring, finance, speech recognition/generation, video processing, and natural language processing. Machine learning has become an indispensable tool for analyzing such data; in fact, sequence models lie at the heart of recent progress in AI like GPT3. This class integrates foundational concepts in time series analysis with modern machine learning methods for sequence modeling. Connections and key differences will be highlighted, as well as how grounding modern neural network approaches with traditional interpretations can enable powerful leaps forward. You will learn theoretical fundamentals, but the focus will be on gaining practical, hands-on experience with modern methods through real-world case studies. You will walk away with a broad and deep perspective of sequence modeling and key ways in which such data are not just 1D images.
Terms: Aut | Units: 3-4
Instructors: Fox, E. (PI)

STATS 242: NeuroTech Training Seminar (NSUR 239)

This is a required course for students in the NeuroTech training program, and is also open to other graduate students interested in learning the skills necessary for neurotechnology careers in academia or industry. Over the academic year, topics will include: emerging research in neurotechnology, communication skills, team science, leadership and management, intellectual property, entrepreneurship and more.
Terms: Aut, Win, Spr | Units: 1 | Repeatable 9 times (up to 9 units total)

STATS 249: Experimental Immersion in Neuroscience (NSUR 249)

This course provides students from technical backgrounds (e.g., physics, applied physics, electrical or chemical engineering, bioengineering, computer science, statistics) the opportunity to learn how they can apply their expertise to advancing experimental research in the neurosciences. Students will visit one neuroscience lab per week to watch experiments, understand the technical apparatus and animal models being used, discuss the questions being addressed, and interact with students and others conducting the research. This course is strongly encouraged for students who wish to apply to the NeuroTech graduate training program. Enrollment is limited; if you are interested in registering, please complete the form at https://forms.gle/QXmkVfCqeS4zHmwB7 beforehand, and someone will follow up with you with a permission code.
Terms: Aut | Units: 1

STATS 256: Modern Statistics for Modern Biology (BIOS 221, STATS 366)

Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research, and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables), and nonparametric testing (bootstrap, permutation, and Monte Carlo methods). Hands-on; uses R and covers many Bioconductor packages. Prerequisite: Working knowledge of R and two core Biology courses. Note that the 155 offering is a writing intensive course for undergraduates only and requires instructor consent. (WIM). See https://web.stanford.edu/class/bios221/index.html
Terms: Aut | Units: 3

STATS 260A: Workshop in Biostatistics (BIODS 260A)

Applications of data science techniques to current problems in biology, medicine and healthcare. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write a two page critical summary of one of the workshops, with the choice made by the student.
Terms: Aut | Units: 1-2 | Repeatable for credit

STATS 264: Foundations of Statistical and Scientific Inference (EPI 264)

The course will consist of readings and discussion of foundational papers and book sections in the domains of statistical and scientific inference. Topics to be covered include philosophy of science, interpretations of probability, Bayesian and frequentist approaches to statistical inference and current controversies about the proper use of p-values and research reproducibility. Recommended preparation: At least 2 quarters of biostatistics and one of epidemiology. Intended for second year Masters students or PhD students with at least 1 year of preceding graduate training.
Terms: Aut | Units: 1
Instructors: Goodman, S. (PI)

STATS 285: Massive Computational Experiments, Painlessly

Ambitious Data Science requires massive computational experimentation; the entry ticket for a solid PhD in some fields is now to conduct experiments involving 1 million CPU hours. Recently several groups have created efficient computational environments that make it painless to run such massive experiments. This course reviews state-of-the-art practices for doing massive computational experiments on compute clusters in a painless and reproducible manner. Students will learn how to automate their computing experiments, first using nuts-and-bolts tools such as Perl and Bash, and later using available comprehensive frameworks such as ClusterJob and CodaLab, which enable them to take on ambitious Data Science projects. The course also features a few guest lectures by renowned scientists in the field of Data Science. Students should have a familiarity with computational experiments and be facile in some high-level computer language such as R, Matlab, or Python.
Terms: Aut | Units: 2
Instructors: Donoho, D. (PI)

STATS 298: Industrial Research for Statisticians

Master's-level research as in 299, but conducted for an off-campus employer with the approval and supervision of a faculty adviser. Students must submit a written final report upon completion of the internship in order to receive credit. Repeatable for credit. Prerequisite: enrollment in Statistics M.S. program. IMPORTANT: F-1 international students enrolled in this CPT course cannot start working without first obtaining a CPT-endorsed I-20 from Bechtel International Center (enrolling in the CPT course alone is insufficient to meet federal immigration regulations).
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable 3 times (up to 3 units total)

STATS 299: Independent Study

For Statistics M.S. students only. Reading or research program under the supervision of a Statistics faculty member. May be repeated for credit.
Terms: Aut, Win, Spr, Sum | Units: 1-5 | Repeatable for credit

STATS 300A: Theory of Statistics I

Finite sample optimality of statistical procedures; Decision theory: loss, risk, admissibility; Principles of data reduction: sufficiency, ancillarity, completeness; Statistical models: exponential families, group families, nonparametric families; Point estimation: optimal unbiased and equivariant estimation, Bayes estimation, minimax estimation; Hypothesis testing and confidence intervals: uniformly most powerful tests, uniformly most accurate confidence intervals, optimal unbiased and invariant tests. Prerequisites: Real analysis, introductory probability (at the level of STATS 116), and introductory statistics.
Terms: Aut | Units: 3

STATS 303: Statistics Faculty Research Presentations

For Statistics first and second year PhD students only. Discussion of statistics topics and research areas; consultation with PhD advisors.
Terms: Aut | Units: 1 | Repeatable 2 times (up to 2 units total)
Instructors: Taylor, J. (PI)

STATS 305A: Applied Statistics I

Statistics of real valued responses. Review of multivariate normal distribution theory. Univariate regression. Multiple regression. Constructing features from predictors. Geometry and algebra of least squares: subspaces, projections, normal equations, orthogonality, rank deficiency, Gauss-Markov. Gram-Schmidt, the QR decomposition and the SVD. Interpreting coefficients. Collinearity. Dependence and heteroscedasticity. Fits and the hat matrix. Model diagnostics. Model selection, Cp/AIC and crossvalidation, stepwise, lasso. Multiple comparisons. ANOVA, fixed and random effects. Use of bootstrap and permutations. Emphasis on problem sets involving substantive computations with data sets. Prerequisites: consent of instructor, 116, 200, applied statistics course, CS 106A, MATH 114.
Terms: Aut | Units: 3

STATS 310A: Theory of Probability I (MATH 230A)

Mathematical tools: sigma algebras, measure theory, connections between coin tossing and Lebesgue measure, basic convergence theorems. Probability: independence, Borel-Cantelli lemmas, almost sure and Lp convergence, weak and strong laws of large numbers. Large deviations. Weak convergence; central limit theorems; Poisson convergence; Stein's method. Prerequisites: STATS 116, MATH 171.
Terms: Aut | Units: 3

STATS 311: Information Theory and Statistics (EE 377)

Information theoretic techniques in probability and statistics. Fano, Assouad, and Le Cam methods for optimality guarantees in estimation. Large deviations and concentration inequalities (Sanov's theorem, hypothesis testing, the entropy method, concentration of measure). Approximation of (Bayes) optimal procedures, surrogate risks, f-divergences. Penalized estimators and minimum description length. Online game playing, gambling, no-regret learning. Prerequisites: EE 276 (or equivalent) or STATS 300A.
Terms: Aut | Units: 3

STATS 319: Literature of Statistics

Literature study of topics in statistics and probability culminating in oral and written reports. May be repeated for credit.
Terms: Aut, Win, Spr | Units: 1 | Repeatable for credit

STATS 323: Sequential Analysis (STATS 223)

This course will survey the history of sequential analysis from its origin in the 1940s via its continuing role in clinical trials to current activity in machine learning. Subject to the limitations of time, the following topics will be discussed: parametric and semi-parametric hypothesis testing from Wald to sequential clinical trials; fixed precision estimation; change-point detection and estimation; iterative stochastic algorithms and machine learning; anytime-valid inference; optimal stopping, dynamic programming, and stochastic control; multi-armed bandits; applications. Prerequisites: for 223, Stats 200 or equivalent; for 323, Stats 300A and 310A.
Terms: Aut | Units: 3

STATS 335: The Challenge Problems Paradigm in Empirical Machine Learning and Beyond

In many fields of science and technology, empirical research has been making rapid progress by implicitly following a little-studied research paradigm (CPP) with several distinctive features: a shared public database, a common task (for example, prediction of class labels or a response variable from given input features), an objective scoring rule that quantifies performance on that task, a leaderboard that tracks performance of submissions, and a set of enrolled competitors who each try to improve the current best-known performance on that task. In the context of Empirical Machine Learning, this is explicitly the famous "Kaggle" model; however, Kaggle didn't originate this approach, and many research disciplines use the same ingredients, in many cases implicitly or tacitly. As we know, the CPP anchored recent claims of progress in image understanding and in natural language processing. In this course we will review the many instances of and variations on the CPP that exist in modern research, not only in the standard areas of empirical machine learning (computer vision and natural language understanding) but also in academic empirical finance and computational hard sciences. We will discuss evidence that the CPP itself is a kind of secret sauce, rather than the specific technologies that are spotlighted because of CPP. We will discuss software platforms implementing CPP, including Kaggle, but also academic platforms like CodaLab, which is often used for challenge problems in natural language processing, and Nightingale Open Science, which is used for challenge problems involving potentially protected health information. Prerequisite: an introductory statistics or machine learning course.
Terms: Aut | Units: 3
Instructors: Donoho, D. (PI)

STATS 366: Modern Statistics for Modern Biology (BIOS 221, STATS 256)

Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research, and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables), and nonparametric testing (bootstrap, permutation, and Monte Carlo methods). Hands-on; uses R and covers many Bioconductor packages. Prerequisite: Working knowledge of R and two core Biology courses. Note that the 155 offering is a writing intensive course for undergraduates only and requires instructor consent. (WIM). See https://web.stanford.edu/class/bios221/index.html
Terms: Aut | Units: 3

STATS 390: Consulting Workshop

Skills required of practicing statistical consultants, including exposure to statistical applications. Students participate as consultants in the department's drop-in consulting service, analyze client data, and prepare formal written reports. Seminar provides supervised experience in short term consulting. May be repeated for credit. Prerequisites: graduate course work in applied statistics or data analysis, and consent of instructor.
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable for credit

STATS 398: Industrial Research for Statisticians

Doctoral research as in 399, but must be conducted for an off-campus employer. A final report acceptable to the advisor outlining work activity, problems investigated, key results, and any follow-up projects they expect to perform is required. The report is due at the end of the quarter in which the course is taken. May be repeated for credit. Prerequisite: Statistics Ph.D. candidate. IMPORTANT: F-1 international students enrolled in this CPT course cannot start working without first obtaining a CPT-endorsed I-20 from Bechtel International Center (enrolling in the CPT course alone is insufficient to meet federal immigration regulations).
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable for credit
© Stanford University