STATS 32: Introduction to R for Undergraduates

This short course runs for weeks one through five of the quarter. It is recommended for undergraduate students who want to use R in the humanities or social sciences and for students who want to learn the basics of R programming. The goal of the short course is to familiarize students with R's tools for data analysis. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven. No prior programming experience is needed. Topics covered include basic data structures, file I/O, data transformation and visualization, simple statistical tests, and some useful packages in R. Prerequisite: undergraduate student. Priority given to non-engineering students. Laptops necessary for use in class.
Terms: Aut, Spr | Units: 1

STATS 60: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 160)

Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 100: Mathematics of Sports

This course will teach you how statistics and probability can be applied in sports, in order to evaluate team and individual performance, build optimal in-game strategies and ensure fairness between participants. Topics will include examples drawn from multiple sports such as basketball, baseball, soccer, football and tennis. The course is intended to focus on data-based applications, and will involve computations in R with real data sets via tutorial sessions and homework assignments. Prereqs: No statistical or programming background is assumed, but introductory courses, e.g., Stats 60, 101, or 116, are recommended. Prior knowledge of linear algebra (e.g., Math 51) and basic probability is strongly recommended.
Terms: Win | Units: 3 | UG Reqs: GER:DB-Math, WAY-AQR

STATS 110: Statistical Methods in Engineering and the Physical Sciences

Introduction to statistics for engineers and physical scientists. Topics: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, elementary experimental design. Prerequisite: one year of calculus. Please note that students must enroll in one section in addition to the main lecture.
Terms: Aut | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 116: Theory of Probability

Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. Introduction to the laws of large numbers and central limit theorem. Prerequisites: MATH 52 and familiarity with infinite series, or equivalent. Undergraduate students enroll for 5 units, graduate students enroll for 4 units. Undergraduate students must enroll in one section in addition to the main lecture. Sections are optional for graduate students. Note: Autumn 2023-24 is the last time this course will be offered. It will be replaced by STATS 117 and STATS 118 in 2024-25.
Terms: Aut | Units: 4-5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 117: Theory of Probability I

Introduction to probability theory, including probability axioms, conditional probability, independence, random variables, and expectation. Joint, marginal, and conditional distributions. Discrete models (binomial, hypergeometric, Poisson) and continuous models (normal, exponential). Prerequisites: Single-variable calculus including infinite series (e.g., MATH 21) and at least one MATH course at Stanford. May not be taken for credit by students with credit in STATS 116, CS 109, MATH 151, or MS&E 120.
Terms: Spr, Sum | Units: 3

STATS 118: Theory of Probability II

Continuation of STATS 117, with a focus on probability topics useful for statistics. Sampling distributions of sums, means, variances, and order statistics of random variables. Convolutions, moment generating functions, and limit theorems. Probability distributions useful in statistics (gamma, beta, chi-square, t, multivariate normal). Prerequisites: a calculus-based first course in probability (such as STATS 117, CS 109, or MS&E 120) and multivariable calculus, including multiple integrals (MATH 52 or equivalent, can be taken concurrently). May not be taken for credit by students with credit in STATS 116.
Terms: Sum | Units: 4
Instructors: Hwang, J. (PI)

STATS 141: Biostatistics (BIO 141)

Introductory statistical methods for biological data: describing data (numerical and graphical summaries); introduction to probability; and statistical inference (hypothesis tests and confidence intervals). Intermediate statistical methods: comparing groups (analysis of variance); analyzing associations (linear and logistic regression); and methods for categorical data (contingency tables and odds ratio). Course content integrated with statistical computing in R.
Terms: Win | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR

STATS 160: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 60)

Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5

STATS 191: Introduction to Applied Statistics

Statistical tools for modern data analysis. Topics include regression and prediction, elements of the analysis of variance, bootstrap, and cross-validation. Emphasis is on conceptual rather than theoretical understanding. Applications to social/biological sciences. Student assignments/projects require use of the software package R. Prerequisite: introductory statistical methods course. Recommended: 60, 110, or 141.
Terms: Spr, Sum | Units: 3 | UG Reqs: GER:DB-Math, WAY-AQR

STATS 195: Introduction to R

This short course runs for four weeks. It is recommended for students who want to use R in statistics, science or engineering courses, and for students who want to learn the basics of data science with R. The goal of the short course is to familiarize students with some of the most important R tools for data analysis. Lectures will focus on learning by example and assignments will be application-driven. No prior programming experience is assumed.
Terms: Win | Units: 1
Instructors: Zhang, I. (PI)

STATS 199: Independent Study

For undergraduates.
Terms: Aut, Win, Spr, Sum | Units: 1-15 | Repeatable for credit

STATS 200: Introduction to Statistical Inference

Modern statistical concepts and procedures derived from a mathematical framework. Statistical inference, decision theory; point and interval estimation, tests of hypotheses; Neyman-Pearson theory. Bayesian analysis; maximum likelihood, large sample theory. Prerequisite: STATS 116. Please note that students must enroll in one section in addition to the main lecture.
Terms: Aut, Win, Sum | Units: 4

STATS 202: Data Mining and Analysis

Data mining is used to discover patterns and relationships in data. Emphasis is on large complex data sets such as those in very large databases or through web mining. Topics: decision trees, association rules, clustering, case based methods, and data visualization. Prereqs: Introductory courses in statistics or probability (e.g., Stats 60), linear algebra (e.g., Math 51), and computer programming (e.g., CS 105). May not be taken for credit by students with credit in STATS 216 or 216V.
Terms: Aut, Sum | Units: 3

STATS 203: Introduction to Regression Models and Analysis of Variance

Modeling and interpretation of observational and experimental data using linear and nonlinear regression methods. Model building and selection methods. Multivariable analysis. Fixed and random effects models. Experimental design. Prerequisites: A post-calculus introductory probability course, e.g. STATS 116, basic computer programming knowledge, some familiarity with matrix algebra, and a pre- or co-requisite post-calculus mathematical statistics course, e.g. STATS 200.
Terms: Win | Units: 3

STATS 205: Introduction to Nonparametric Statistics

Nonparametric regression and nonparametric density estimation, modern nonparametric techniques, nonparametric confidence interval estimates, nearest neighbor algorithms (with non-linear features), wavelet, bootstrap. Nonparametric analogs of the one- and two-sample t-tests and analysis of variance.
Terms: Spr | Units: 3

STATS 206: Applied Multivariate Analysis (BIODS 206)

Introduction to the statistical analysis of several quantitative measurements on each observational unit. Emphasis is on concepts, computer-intensive methods. Examples from economics, education, geology, psychology. Topics: multiple regression, multivariate analysis of variance, principal components, factor analysis, canonical correlations, multidimensional scaling, clustering. Pre- or corequisite: 200.
Terms: Aut | Units: 3
Instructors: Owen, A. (PI); Li, H. (TA)

STATS 207: Introduction to Time Series Analysis (STATS 307)

Time series models used in economics and engineering. Trend fitting, autoregressive and moving average models and spectral analysis, Kalman filtering, and state-space models. Seasonality, transformations, and introduction to financial time series. Prerequisite: basic course in Statistics at the level of 200.
Terms: Spr | Units: 3

STATS 208: Bootstrap, Cross-Validation, and Sample Re-use

By re-using the sample data, sometimes in ingenious ways, we can evaluate the accuracy of predictions, test the significance of a conclusion, place confidence bounds on an unknown parameter, select the best prediction architecture, and develop more accurate predictors. In this course, we will describe the many ways that samples get reused to achieve these goals, including the bootstrap, the parametric bootstrap, cross-validation, conformal prediction, random forests, and sample splitting. We also develop basic theory justifying such methods. Prerequisite: course in statistics or probability.
Terms: Win | Units: 3
Instructors: Donoho, D. (PI); Wang, Y. (TA)

STATS 209: Introduction to Causal Inference

This course introduces the fundamental ideas and methods in causal inference, with examples drawn from education, economics, medicine, and digital marketing. Topics include potential outcomes, randomization, observational studies, matching, covariate adjustment, AIPW, heterogeneous treatment effects, instrumental variables, regression discontinuity, and synthetic controls. Prerequisites: basic probability and statistics, familiarity with R.
Terms: Aut | Units: 3

STATS 211: Meta-research: Appraising Research Findings, Bias, and Meta-analysis (CHPR 206, EPI 206, MED 206)

Open to graduate, medical, and undergraduate students. Appraisal of the quality and credibility of research findings; evaluation of sources of bias. Meta-analysis as a quantitative (statistical) method for combining results of independent studies. Examples from medicine, epidemiology, genomics, ecology, social/behavioral sciences, education. Collaborative analyses. Project involving generation of a meta-research project or reworking and evaluation of an existing published meta-analysis. Prerequisite: knowledge of basic statistics.
Terms: Win | Units: 3

STATS 214: Machine Learning Theory (CS 229M)

How do we use mathematical thinking to design better machine learning methods? This course focuses on developing mathematical tools for answering this question. This course will cover fundamental concepts and principled algorithms in machine learning, particularly those that are related to modern large-scale non-linear models. The topics include concentration inequalities, generalization bounds via uniform convergence, non-convex optimization, implicit regularization effect in deep learning, and unsupervised learning and domain adaptations. Prerequisites: linear algebra (MATH 51 or CS 205), probability theory (STATS 116, MATH 151 or CS 109), and machine learning (CS 229, STATS 229, or STATS 315A).
Terms: Aut | Units: 3

STATS 215: Statistical Models in Biology

Poisson and renewal processes, Markov chains in discrete and continuous time, branching processes, diffusion. Applications to models of nucleotide evolution, recombination, the Wright-Fisher process, coalescence, genetic mapping, sequence analysis. Theoretical material approximately the same as in STATS 217, but emphasis is on examples drawn from applications in biology, especially genetics. Prerequisite: 116 or equivalent.
Terms: Win | Units: 3

STATS 216: Introduction to Statistical Learning

Overview of supervised learning, with a focus on regression and classification methods. Syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; some unsupervised learning: principal components and clustering (k-means and hierarchical). Computing is done in R, through tutorial sessions and homework assignments. This math-light course is offered via video segments (MOOC style), and in-class problem solving sessions. Prereqs: Introductory courses in statistics or probability (e.g., Stats 60 or Stats 101), linear algebra (e.g., Math 51), and computer programming (e.g., CS 105). May not be taken for credit by students with credit in STATS 202 or STATS 216V.
Terms: Win | Units: 3

STATS 216V: Introduction to Statistical Learning

Overview of supervised learning, with a focus on regression and classification methods. Syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; some unsupervised learning: principal components and clustering (k-means and hierarchical). Computing is done in R, through tutorial sessions and homework assignments. This math-light course is offered remotely only via video segments (MOOC style). TAs will host remote weekly office hours using an online platform such as Zoom. There are four homework assignments, a midterm, and a final exam, all of which are administered remotely. Prereqs: Introductory courses in statistics or probability (e.g., Stats 60 or Stats 101), linear algebra (e.g., Math 51), and computer programming (e.g., CS 105). May not be taken for credit by students with credit in STATS 202 or STATS 216.
Terms: Sum | Units: 3
Instructors: Bodwin, K. (PI)

STATS 217: Introduction to Stochastic Processes I

Discrete and continuous time Markov chains, Poisson processes, random walks, branching processes, first passage times, recurrence and transience, stationary distributions. Non-Statistics masters students may want to consider taking STATS 215 instead. Prerequisite: a post-calculus introductory probability course, e.g., STATS 116.
Terms: Win, Sum | Units: 3

STATS 218: Introduction to Stochastic Processes II

Renewal theory, Brownian motion, Gaussian processes, second order processes, martingales.
Terms: Spr | Units: 3
Instructors: Li, S. (PI); Zhou, Y. (TA)

STATS 219: Stochastic Processes (MATH 136)

Introduction to measure theory, Lp spaces and Hilbert spaces. Random variables, expectation, conditional expectation, conditional distribution. Uniform integrability, almost sure and Lp convergence. Stochastic processes: definition, stationarity, sample path continuity. Examples: random walk, Markov chains, Gaussian processes, Poisson processes, martingales. Construction and basic properties of Brownian motion. Prerequisite: STATS 116 or MATH 151 or equivalent. Recommended: MATH 115 or equivalent. http://statweb.stanford.edu/~adembo/math-136/
Terms: Win | Units: 4

STATS 223: Sequential Analysis (STATS 323)

This course will survey the history of sequential analysis from its origin in the 1940s via its continuing role in clinical trials to current activity in machine learning. Subject to the limitations of time, the following topics will be discussed: parametric and semi-parametric hypothesis testing from Wald to sequential clinical trials; fixed precision estimation; change-point detection and estimation; iterative stochastic algorithms and machine learning; anytime-valid inference; optimal stopping, dynamic programming, and stochastic control; multi-armed bandits; applications. Prerequisites: for 223, Stats 200 or equivalent; for 323, Stats 300A and 310A.
Terms: Aut | Units: 3

STATS 229: Machine Learning (CS 229)

Topics: statistical pattern recognition, linear and non-linear regression, non-parametric methods, exponential family, GLMs, support vector machines, kernel methods, deep learning, model/feature selection, learning theory, ML advice, clustering, density estimation, EM, dimensionality reduction, ICA, PCA, reinforcement learning and adaptive control, Markov decision processes, approximate dynamic programming, and policy search. Prerequisites: knowledge of basic computer science principles and skills at a level sufficient to write a reasonably non-trivial computer program in Python/NumPy to the equivalency of CS 106A, CS 106B, or CS 106X, familiarity with probability theory to the equivalency of CS 109, MATH 151, or STATS 116, and familiarity with multivariable calculus and linear algebra to the equivalency of MATH 51 or CS 205.
Terms: Aut, Win, Sum | Units: 3-4

STATS 232: Machine Learning for Sequence Modeling (CS 229B)

Sequence data and time series are becoming increasingly ubiquitous in fields as diverse as bioinformatics, neuroscience, health, environmental monitoring, finance, speech recognition/generation, video processing, and natural language processing. Machine learning has become an indispensable tool for analyzing such data; in fact, sequence models lie at the heart of recent progress in AI like GPT-3. This class integrates foundational concepts in time series analysis with modern machine learning methods for sequence modeling. Connections and key differences will be highlighted, as well as how grounding modern neural network approaches with traditional interpretations can enable powerful leaps forward. You will learn theoretical fundamentals, but the focus will be on gaining practical, hands-on experience with modern methods through real-world case studies. You will walk away with a broad and deep perspective of sequence modeling and key ways in which such data are not just 1D images.
Terms: Aut | Units: 3-4
Instructors: Fox, E. (PI)

STATS 242: NeuroTech Training Seminar (NSUR 239)

This is a required course for students in the NeuroTech training program, and is also open to other graduate students interested in learning the skills necessary for neurotechnology careers in academia or industry. Over the academic year, topics will include: emerging research in neurotechnology, communication skills, team science, leadership and management, intellectual property, entrepreneurship and more.
Terms: Aut, Win, Spr | Units: 1 | Repeatable 9 times (up to 9 units total)

STATS 249: Experimental Immersion in Neuroscience (NSUR 249)

This course provides students from technical backgrounds (e.g., physics, applied physics, electrical or chemical engineering, bioengineering, computer science, statistics) the opportunity to learn how they can apply their expertise to advancing experimental research in the neurosciences. Students will visit one neuroscience lab per week to watch experiments, understand the technical apparatus and animal models being used, discuss the questions being addressed, and interact with students and others conducting the research. This course is strongly encouraged for students who wish to apply to the NeuroTech graduate training program. Enrollment is limited; if you are interested in registering, please first complete the form at https://forms.gle/QXmkVfCqeS4zHmwB7 and someone will follow up with you with a permission code.
Terms: Aut | Units: 1

STATS 250: Mathematical Finance (MATH 238)

Stochastic models of financial markets. Risk neutral pricing for derivatives, hedging strategies and management of risk. Multidimensional portfolio theory and introduction to statistical arbitrage. Prerequisite: Math 136 or equivalent. NOTE: Undergraduates require instructor permission to enroll. Undergraduates interested in taking the course should contact the instructor for permission, providing information about relevant background such as other courses taken.
Terms: Win | Units: 3
Instructors: Papanicolaou, G. (PI)

STATS 251: Clinical Trial Design in the Age of Precision Medicine (BIODS 250)

This course offers an overview of the statistical foundations of modern clinical trial design in precision medicine research. Starting from a quick review of the traditional clinical development paradigm, from Phase I to III clinical trials for medical product approval through Phase IV post-marketing studies for safety evaluation, and of its challenges in time and societal costs, we will introduce recently developed innovative designs and their statistical methodology across all phases of clinical trials. You are expected to learn the statistical considerations for novel phase I-II trial designs, master protocols for umbrella, platform and basket trials, adaptive and enrichment designs including subgroup selections, estimand, surrogate and composite endpoints, integration of real-world evidence and patient-focused medical product development, and meta-analysis of clinical trial endpoints. Prerequisites: Working knowledge of statistics and R.
Terms: Win | Units: 3

STATS 256: Modern Statistics for Modern Biology (BIOS 221, STATS 366)

Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables) as well as nonparametric testing (bootstrap, permutation and Monte Carlo methods). Hands-on: students use R and work with many Bioconductor packages. Prerequisite: Working knowledge of R and two core Biology courses. Note that the 155 offering is a writing intensive course for undergraduates only and requires instructor consent. (WIM). See https://web.stanford.edu/class/bios221/index.html
Terms: Aut | Units: 3

STATS 260A: Workshop in Biostatistics (BIODS 260A)

Applications of data science techniques to current problems in biology, medicine and healthcare. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write a two page critical summary of one of the workshops, with the choice made by the student.
Terms: Aut | Units: 1-2 | Repeatable for credit

STATS 260B: Workshop in Biostatistics (BIODS 260B)

Applications of data science techniques to current problems in biology, medicine and healthcare. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write a two page critical summary of one of the workshops, with the choice made by the student.
Terms: Win | Units: 1-2 | Repeatable for credit

STATS 260C: Workshop in Biostatistics (BIODS 260C)

Applications of data science techniques to current problems in biology, medicine and healthcare. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write a two page critical summary of one of the workshops, with the choice made by the student.
Terms: Spr | Units: 1-2 | Repeatable for credit

STATS 261: Intermediate Biostatistics: Analysis of Discrete Data (BIOMEDIN 233, EPI 261)

Methods for analyzing data from case-control and cross-sectional studies: the 2x2 table, chi-square test, Fisher's exact test, odds ratios, Mantel-Haenszel methods, stratification, tests for matched data, logistic regression, conditional logistic regression. Emphasis is on data analysis in SAS or R. Special topics: cross-fold validation and bootstrap inference.
Terms: Win | Units: 3

STATS 262: Intermediate Biostatistics: Regression, Prediction, Survival Analysis (EPI 262)

Methods for analyzing longitudinal data. Topics include Kaplan-Meier methods, Cox regression, hazard ratios, time-dependent variables, longitudinal data structures, profile plots, missing data, modeling change, MANOVA, repeated-measures ANOVA, GEE, and mixed models. Emphasis is on practical applications. Prerequisites: basic ANOVA and linear regression.
Terms: Spr | Units: 3

STATS 264: Foundations of Statistical and Scientific Inference (EPI 264)

The course will consist of readings and discussion of foundational papers and book sections in the domains of statistical and scientific inference. Topics to be covered include philosophy of science, interpretations of probability, Bayesian and frequentist approaches to statistical inference and current controversies about the proper use of p-values and research reproducibility. Recommended preparation: At least 2 quarters of biostatistics and one of epidemiology. Intended for second year Masters students or PhD students with at least 1 year of preceding graduate training.
Terms: Aut | Units: 1
Instructors: Goodman, S. (PI)

STATS 270: Bayesian Statistics (STATS 370)

This course will treat Bayesian statistics at a relatively advanced level. Assuming familiarity with standard probability and multivariate distribution theory, we will provide a discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. In particular, we will examine the construction of priors and the asymptotic properties of likelihoods and posterior distributions. The discussion will include but will not be limited to the case of finite dimensional parameter space. There will also be some discussions on the computational algorithms useful for Bayesian inference. Prerequisites: Stats 116 or equivalent probability course, plus basic programming knowledge; basic calculus, analysis and linear algebra strongly recommended; Stats 200 or equivalent statistical theory course desirable.
Terms: Spr | Units: 3
Instructors: Wong, W. (PI); Lu, S. (TA)

STATS 285: Massive Computational Experiments, Painlessly

Ambitious Data Science requires massive computational experimentation; the entry ticket for a solid PhD in some fields is now to conduct experiments involving 1 million CPU hours. Recently several groups have created efficient computational environments that make it painless to run such massive experiments. This course reviews state-of-the-art practices for doing massive computational experiments on compute clusters in a painless and reproducible manner. Students will learn how to automate their computing experiments, first using nuts-and-bolts tools such as Perl and Bash, and later using available comprehensive frameworks such as ClusterJob and CodaLab, which enable them to take on ambitious Data Science projects. The course also features a few guest lectures by renowned scientists in the field of Data Science. Students should have a familiarity with computational experiments and be facile in some high-level computer language such as R, Matlab, or Python.
Terms: Aut | Units: 2
Instructors: Donoho, D. (PI)

STATS 298: Industrial Research for Statisticians

Masters-level research as in 299, but conducted for an off-campus employer with the approval and supervision of a faculty adviser. Students must submit a written final report upon completion of the internship in order to receive credit. Repeatable for credit. Prerequisite: enrollment in Statistics M.S. program. IMPORTANT: F-1 international students enrolled in this CPT course cannot start working without first obtaining a CPT-endorsed I-20 from Bechtel International Center (enrolling in the CPT course alone is insufficient to meet federal immigration regulations).
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable 3 times (up to 3 units total)

STATS 299: Independent Study

For Statistics M.S. students only. Reading or research program under the supervision of a Statistics faculty member. May be repeated for credit.
Terms: Aut, Win, Spr, Sum | Units: 1-5 | Repeatable for credit

STATS 300A: Theory of Statistics I

Finite sample optimality of statistical procedures; Decision theory: loss, risk, admissibility; Principles of data reduction: sufficiency, ancillarity, completeness; Statistical models: exponential families, group families, nonparametric families; Point estimation: optimal unbiased and equivariant estimation, Bayes estimation, minimax estimation; Hypothesis testing and confidence intervals: uniformly most powerful tests, uniformly most accurate confidence intervals, optimal unbiased and invariant tests. Prerequisites: Real analysis, introductory probability (at the level of STATS 116), and introductory statistics.
Terms: Aut | Units: 3

STATS 300B: Theory of Statistics II

Elementary decision theory; loss and risk functions, Bayes estimation; UMVU estimator, minimax estimators, shrinkage estimators. Hypothesis testing and confidence intervals: Neyman-Pearson theory; UMP tests and uniformly most accurate confidence intervals; use of unbiasedness and invariance to eliminate nuisance parameters. Large sample theory: basic convergence concepts; robustness; efficiency; contiguity, locally asymptotically normal experiments; convolution theorem; asymptotically UMP and maximin tests. Asymptotic theory of likelihood ratio and score tests. Rank permutation and randomization tests; jackknife, bootstrap, subsampling and other resampling methods. Further topics: sequential analysis, optimal experimental design, empirical processes with applications to statistics, Edgeworth expansions, density estimation, time series.
Terms: Win | Units: 3

STATS 300C: Theory of Statistics III

Decision theory formulation of statistical problems. Minimax, admissible procedures. Complete class theorems ("all" minimax or admissible procedures are "Bayes"), Bayes procedures, conjugate priors, hierarchical models. Bayesian nonparametrics: Dirichlet, tail-free, Pólya trees, Bayesian sieves. Inconsistency of Bayes rules.
Terms: Spr | Units: 3

STATS 301: Statistics Teaching Practicum

Ordinarily for Statistics first year PhD students. Discussion of effective teaching, assessment, and course design. Students practice teaching in a guided environment. There will be a total of 10 course meetings spread out across autumn, winter, and spring quarters, but students enroll in spring quarter.
Terms: Spr | Units: 1 | Repeatable 3 times (up to 3 units total)
Instructors: Sun, D. (PI)

STATS 302: Qualifying Exams Workshop

Prepares Statistics Ph.D. students for the qualifying exams by reviewing relevant course topics and problem solving strategies.
Terms: Sum | Units: 5-10

STATS 303: Statistics Faculty Research Presentations

For Statistics first and second year PhD students only. Discussion of statistics topics and research areas; consultation with PhD advisors.
Terms: Aut | Units: 1 | Repeatable 2 times (up to 2 units total)
Instructors: Taylor, J. (PI)

STATS 305A: Applied Statistics I

Statistics of real-valued responses. Review of multivariate normal distribution theory. Univariate regression. Multiple regression. Constructing features from predictors. Geometry and algebra of least squares: subspaces, projections, normal equations, orthogonality, rank deficiency, Gauss-Markov. Gram-Schmidt, the QR decomposition and the SVD. Interpreting coefficients. Collinearity. Dependence and heteroscedasticity. Fits and the hat matrix. Model diagnostics. Model selection, Cp/AIC and cross-validation, stepwise, lasso. Multiple comparisons. ANOVA, fixed and random effects. Use of bootstrap and permutations. Emphasis on problem sets involving substantive computations with data sets. Prerequisites: consent of instructor, 116, 200, applied statistics course, CS 106A, MATH 114.
Terms: Aut | Units: 3

STATS 305B: Applied Statistics II

This course uses exponential family structure to motivate generalized linear models and other useful applied techniques including survival analysis methods and Bayes and empirical Bayes analyses. The lectures are based on a forthcoming book whose notes will be distributed. Prerequisites: 305A or consent of the instructor.
Terms: Win | Units: 3

STATS 305C: Applied Statistics III

Methods for multivariate responses. Theory, computation, and practice for multivariate statistical tools. Topics may include multivariate Gaussian models, probabilistic graphical models, MCMC and variational Bayesian inference, dimensionality reduction, principal components, factor analysis, independent components analysis, canonical correlations, linear discriminant analysis, hierarchical clustering, bi-clustering, multidimensional scaling and variants (e.g., Isomap, spectral clustering, t-SNE), matrix completion, topic modeling, and state space models. Extensive work with data involving programming, ideally in Python and/or R. Prerequisites: Stats 305A and Stats 305B or consent of the instructor.
Terms: Spr | Units: 3

STATS 307: Introduction to Time Series Analysis (STATS 207)

Time series models used in economics and engineering. Trend fitting, autoregressive and moving average models and spectral analysis, Kalman filtering, and state-space models. Seasonality, transformations, and introduction to financial time series. Prerequisite: basic course in Statistics at the level of 200.
Terms: Spr | Units: 3

STATS 310A: Theory of Probability I (MATH 230A)

Mathematical tools: sigma algebras, measure theory, connections between coin tossing and Lebesgue measure, basic convergence theorems. Probability: independence, Borel-Cantelli lemmas, almost sure and Lp convergence, weak and strong laws of large numbers. Large deviations. Weak convergence; central limit theorems; Poisson convergence; Stein's method. Prerequisites: STATS 116, MATH 171.
Terms: Aut | Units: 3

STATS 310B: Theory of Probability II (MATH 230B)

Conditional expectations, discrete time martingales, stopping times, uniform integrability, applications to 0-1 laws, Radon-Nikodym Theorem, ruin problems, etc. Other topics as time allows selected from (i) local limit theorems, (ii) renewal theory, (iii) discrete time Markov chains, (iv) random walk theory, (v) ergodic theory. http://statweb.stanford.edu/~adembo/stat-310b. Prerequisite: 310A or MATH 230A.
Terms: Win | Units: 3

STATS 310C: Theory of Probability III (MATH 230C)

Continuous time stochastic processes: martingales, Brownian motion, stationary independent increments, Markov jump processes and Gaussian processes. Invariance principle, random walks, LIL and functional CLT. Markov and strong Markov property. Infinitely divisible laws. Some ergodic theory. Prerequisite: 310B or MATH 230B. http://statweb.stanford.edu/~adembo/stat-310c/
Terms: Spr | Units: 3
Instructors: Dembo, A. (PI); Tung, N. (TA)

STATS 311: Information Theory and Statistics (EE 377)

Information theoretic techniques in probability and statistics. Fano, Assouad, and Le Cam methods for optimality guarantees in estimation. Large deviations and concentration inequalities (Sanov's theorem, hypothesis testing, the entropy method, concentration of measure). Approximation of (Bayes) optimal procedures, surrogate risks, f-divergences. Penalized estimators and minimum description length. Online game playing, gambling, no-regret learning. Prerequisites: EE 276 (or equivalent) or STATS 300A.
Terms: Aut | Units: 3

STATS 315A: Modern Applied Statistics: Learning

Overview of supervised learning. Linear regression and related methods. Model selection, least angle regression and the lasso, stepwise methods. Classification. Linear discriminant analysis, logistic regression, and support vector machines (SVMs). Basis expansions, splines and regularization. Kernel methods. Generalized additive models. Kernel smoothing. Gaussian mixtures and the EM algorithm. Model assessment and selection: cross-validation and the bootstrap. Pathwise coordinate descent. Sparse graphical models. Prerequisites: STATS 305A, 305B, 305C or consent of instructor.
Terms: Win | Units: 3

STATS 317: Stochastic Processes

Semimartingales, stochastic integration, Ito's formula, Girsanov's theorem. Gaussian and related processes. Stationary/isotropic processes. Integral geometry and geometric probability. Maxima of random fields and applications to spatial statistics and imaging.
Terms: Win | Units: 3
Instructors: Li, S. (PI)

STATS 318: Modern Markov Chains (MATH 235)

Tools for understanding Markov chains as they arise in applications. Random walk on graphs, reversible Markov chains, Metropolis algorithm, Gibbs sampler, hybrid Monte Carlo, auxiliary variables, hit and run, Swendsen-Wang algorithms, geometric theory, Poincaré-Nash-Cheeger-Log-Sobolev inequalities. Comparison techniques, coupling, stationary times, Harris recurrence, central limit theorems, and large deviations.
Terms: Win | Units: 3

STATS 319: Literature of Statistics

Literature study of topics in statistics and probability culminating in oral and written reports. May be repeated for credit.
Terms: Aut, Win, Spr | Units: 1 | Repeatable for credit

STATS 323: Sequential Analysis (STATS 223)

This course will survey the history of sequential analysis from its origin in the 1940s via its continuing role in clinical trials to current activity in machine learning. Subject to the limitations of time, the following topics will be discussed: parametric and semi-parametric hypothesis testing from Wald to sequential clinical trials; fixed precision estimation; change-point detection and estimation; iterative stochastic algorithms and machine learning; anytime-valid inference; optimal stopping, dynamic programming, and stochastic control; multi-armed bandits; applications. Prerequisites: for 223, Stats 200 or equivalent; for 323, Stats 300A and 310A.
Terms: Aut | Units: 3

STATS 335: The Challenge Problems Paradigm in Empirical Machine Learning and Beyond

In many fields of science and technology, empirical research has been making rapid progress by implicitly following a little-studied research paradigm (CPP) with several distinctive features: a shared public database, a common task (for example, prediction of class labels or a response variable from given input features), an objective scoring rule that quantifies performance on that task, a leaderboard that tracks performance of submissions, and a set of enrolled competitors who each try to improve the current best-known performance on that task. In the context of Empirical Machine Learning, this is explicitly the famous "Kaggle" model; however, Kaggle didn't originate this approach, and many research disciplines use the same ingredients, in many cases implicitly or tacitly. As we know, the CPP anchored recent claims of progress in image understanding and in natural language processing. In this course we will review the many instances of and variations on the CPP that exist in modern research, not only in the standard areas of empirical machine learning (computer vision and natural language understanding) but also in academic empirical finance and computational hard sciences. We will discuss evidence that the CPP itself is a kind of secret sauce, rather than the specific technologies that are spotlighted because of CPP. We will discuss software platforms implementing CPP, including Kaggle, but also academic platforms like CodaLab, which is often used for challenge problems in natural language processing, and Nightingale Open Science, which is used for challenge problems involving potentially protected health information. Prerequisite: an introductory statistics or machine learning course.
Terms: Aut | Units: 3
Instructors: Donoho, D. (PI)

STATS 352: Topics in Computing for Data Science (BIODS 352)

A seminar-style course with lectures on a range of computational topics important for modern data-intensive science, jointly supported by the Statistics department and Stanford Data Science, and suitable for advanced undergraduate/graduate students engaged in either research on data science techniques (statistical or computational, for example) or research in scientific fields relying on advanced data science to achieve its goals. Seminars will alternate a presentation of a topic, usually by an expert on that topic, typically leading to exercises applying the techniques, with a follow up lecture to further discuss the topic and the exercises. Prerequisites: Understanding of basic modern data science and competence in related programming, e.g., in R or Python. https://stats352.stanford.edu/
Terms: Spr | Units: 1

STATS 361: Causal Inference

This course covers statistical underpinnings of causal inference, with a focus on experimental design and data-driven decision making. Topics include randomization, potential outcomes, observational studies, propensity score methods, matching, double robustness, semiparametric efficiency, treatment heterogeneity, structural models, instrumental variables, principal stratification, mediation, regression discontinuities, synthetic controls, interference, sensitivity analysis, policy learning, dynamic treatment rules, invariant prediction, graphical models, and structure learning. We will also discuss the relevance of optimization and machine learning tools to causal inference. Prerequisite: STATS 300A and STATS 300B, or equivalent graduate-level coursework on the theory of statistics.
Terms: Spr | Units: 3
Instructors: Wager, S. (PI); Jing, A. (TA)

STATS 362: Topic: Monte Carlo

Random numbers and vectors: inversion, acceptance-rejection, copulas. Variance reduction: antithetics, stratification, control variates, importance sampling. MCMC: Markov chains, detailed balance, Metropolis-Hastings, random walk Metropolis, independence sampler, Gibbs sampling, slice sampler, hybrids of Gibbs and Metropolis, tempering. Sequential Monte Carlo. Quasi-Monte Carlo. Randomized quasi-Monte Carlo. Examples, problems and motivation from Bayesian statistics, machine learning, computational finance and graphics. May be repeated for credit.
Terms: Win | Units: 3
Instructors: Owen, A. (PI); Pan, Z. (TA)

STATS 366: Modern Statistics for Modern Biology (BIOS 221, STATS 256)

Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables) as well as nonparametric testing (bootstrap, permutation and Monte Carlo methods). Hands-on: students use R and cover many Bioconductor packages. Prerequisite: Working knowledge of R and two core Biology courses. Note that the 155 offering is a writing intensive course for undergraduates only and requires instructor consent. (WIM). See https://web.stanford.edu/class/bios221/index.html
Terms: Aut | Units: 3

STATS 370: Bayesian Statistics (STATS 270)

This course will treat Bayesian statistics at a relatively advanced level. Assuming familiarity with standard probability and multivariate distribution theory, we will provide a discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. In particular, we will examine the construction of priors and the asymptotic properties of likelihoods and posterior distributions. The discussion will include but will not be limited to the case of finite dimensional parameter space. There will also be some discussions on the computational algorithms useful for Bayesian inference. Prerequisites: Stats 116 or equivalent probability course, plus basic programming knowledge; basic calculus, analysis and linear algebra strongly recommended; Stats 200 or equivalent statistical theory course desirable.
Terms: Spr | Units: 3
Instructors: Wong, W. (PI); Lu, S. (TA)

STATS 375: Mathematical Problems in Machine Learning (MATH 276)

Mathematical tools to understand modern machine learning systems. Generalization in machine learning, the classical view: uniform convergence, Rademacher complexity. Generalization from stability. Implicit (algorithmic) regularization. Infinite-dimensional models: reproducing kernel Hilbert spaces. Random features approximations to kernel methods. Connections to neural networks, and neural tangent kernel. Nonparametric regression. Asymptotic behavior of wide neural networks. Properties of convolutional networks. Prerequisites: EE364A or equivalent; Stat310A or equivalent.
Terms: Spr | Units: 3
Instructors: Montanari, A. (PI)

STATS 390: Consulting Workshop

Skills required of practicing statistical consultants, including exposure to statistical applications. Students participate as consultants in the department's drop-in consulting service, analyze client data, and prepare formal written reports. Seminar provides supervised experience in short term consulting. May be repeated for credit. Prerequisites: graduate course work in applied statistics or data analysis, and consent of instructor.
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable for credit

STATS 398: Industrial Research for Statisticians

Doctoral research as in 399, but must be conducted for an off-campus employer. A final report acceptable to the advisor outlining work activity, problems investigated, key results, and any follow-up projects they expect to perform is required. The report is due at the end of the quarter in which the course is taken. May be repeated for credit. Prerequisite: Statistics Ph.D. candidate. IMPORTANT: F-1 international students enrolled in this CPT course cannot start working without first obtaining a CPT-endorsed I-20 from Bechtel International Center (enrolling in the CPT course alone is insufficient to meet federal immigration regulations).
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable for credit