STATS 48N:
Riding the Data Wave
Imagine collecting a bit of your saliva and sending it to one of the personalized genomics companies: for very little money you will get back information about hundreds of thousands of variable sites in your genome. Records of exposure to a variety of chemicals in the areas where you have lived are only a few clicks away on the web, as are thousands of studies and informal reports on the effects of different diets, to which you can compare your own. What does this all mean for you? Never before in history have humans recorded so much information about themselves and the world that surrounds them. Nor has this data been so readily available to the layperson. Expressions such as "data deluge" are used to describe such wealth as well as the loss of proper bearings that it often generates. How to summarize all this information in a useful way? How to boil down millions of numbers to just a meaningful few? How to convey the gist of the story in a picture without misleading oversimplifications? To answer these questions we need to consider the use of the data, appreciate the diversity that they represent, and understand how people instinctively interpret numbers and pictures. Each week, we will consider a different data set to be summarized with a different goal. We will review analyses of similar problems carried out in the past and explore whether and how the same tools can be useful today. We will pay attention to contemporary media (newspapers, blogs, etc.) to identify settings similar to the ones we are examining and critique the displays and summaries documented there. Taking an experimental approach, we will evaluate the effectiveness of different data summaries in conveying the desired information by testing them on subsets of the enrolled students.
Terms: Aut

Units: 3

UG Reqs: WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 50:
Mathematics of Sports (MCS 100)
The use of mathematics, statistics, and probability in the analysis of sports performance, sports records, and strategy. Topics include mathematical analysis of the physics of sports and the determinations of optimal strategies. New diagnostic statistics and strategies for each sport. Corequisite: STATS 60, 110 or 116.
Terms: Aut

Units: 3

UG Reqs: GER:DBMath

Grading: Letter or Credit/No Credit
STATS 60:
Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 160)
Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum

Units: 5

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 110:
Statistical Methods in Engineering and the Physical Sciences
Introduction to statistics for engineers and physical scientists. Topics: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, elementary experimental design. Prerequisite: one year of calculus.
Terms: Aut, Sum

Units: 4-5

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 116:
Theory of Probability
Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. Introduction to the laws of large numbers and central limit theorem. Prerequisites: MATH 52 and familiarity with infinite series, or equivalent.
Terms: Aut, Spr, Sum

Units: 3-5

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 141:
Biostatistics (BIO 141)
Introductory statistical methods for biological data: describing data (numerical and graphical summaries); introduction to probability; and statistical inference (hypothesis tests and confidence intervals). Intermediate statistical methods: comparing groups (analysis of variance); analyzing associations (linear and logistic regression); and methods for categorical data (contingency tables and odds ratio). Course content integrated with statistical computing in R.
Terms: Aut

Units: 3-5

UG Reqs: GER:DBMath, WAYAQR

Grading: Letter or Credit/No Credit
STATS 160:
Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 60)
Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum

Units: 5

Grading: Letter or Credit/No Credit
STATS 167:
Probability: Ten Great Ideas About Chance (PHIL 166, PHIL 266, STATS 267)
Foundational approaches to thinking about chance in matters such as gambling, the law, and everyday affairs. Topics include: chance and decisions; the mathematics of chance; frequencies, symmetry, and chance; Bayes' great idea; chance and psychology; misuses of chance; and harnessing chance. Emphasis is on the philosophical underpinnings and problems. Prerequisite: exposure to probability or a first course in statistics at the level of STATS 60 or 116.
Terms: not given this year

Units: 4

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 191:
Introduction to Applied Statistics
Statistical tools for modern data analysis. Topics include regression and prediction, elements of the analysis of variance, bootstrap, and cross-validation. Emphasis is on conceptual rather than theoretical understanding. Applications to social/biological sciences. Student assignments/projects require use of the software package R. Recommended: 60, 110, or 141.
Terms: Win

Units: 3-4

UG Reqs: GER:DBMath, WAYAQR

Grading: Letter or Credit/No Credit
STATS 199:
Independent Study
For undergraduates.
Terms: Aut, Win, Spr, Sum

Units: 1-15

Repeatable for credit

Grading: Satisfactory/No Credit
Instructors:
Candes, E. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Duchi, J. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Jackman, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Lemley, M. (PI);
Mackey, L. (PI);
Montanari, A. (PI);
Mukherjee, R. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)
STATS 200:
Introduction to Statistical Inference
Modern statistical concepts and procedures derived from a mathematical framework. Statistical inference, decision theory; point and interval estimation, tests of hypotheses; Neyman-Pearson theory. Bayesian analysis; maximum likelihood, large sample theory. Prerequisite: 116.
Terms: Win, Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 203:
Introduction to Regression Models and Analysis of Variance
Modeling and interpretation of observational and experimental data using linear and nonlinear regression methods. Model building and selection methods. Multivariable analysis. Fixed and random effects models. Experimental design. Pre- or corequisite: 200.
Terms: Spr, Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 206:
Applied Multivariate Analysis
Introduction to the statistical analysis of several quantitative measurements on each observational unit. Emphasis is on concepts, computer-intensive methods. Examples from economics, education, geology, psychology. Topics: multiple regression, multivariate analysis of variance, principal components, factor analysis, canonical correlations, multidimensional scaling, clustering. Pre- or corequisite: 200.
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 207:
Introduction to Time Series Analysis
Time series models used in economics and engineering. Trend fitting, autoregressive and moving average models and spectral analysis, Kalman filtering, and state-space models. Seasonality, transformations, and introduction to financial time series. Prerequisite: basic course in Statistics at the level of 200.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 208:
Introduction to the Bootstrap
The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. By substituting computation in place of mathematical formulas, it permits the statistical analysis of complicated estimators. Topics: nonparametric assessment of standard errors, biases, and confidence intervals; related resampling methods including the jackknife, cross-validation, and permutation tests. Theory and applications. Prerequisite: course in statistics or probability.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 211:
Meta-research: Appraising Research Findings, Bias, and Meta-analysis (HRP 206, MED 206)
Open to graduate, medical, and undergraduate students. Appraisal of the quality and credibility of research findings; evaluation of sources of bias. Meta-analysis as a quantitative (statistical) method for combining results of independent studies. Examples from medicine, epidemiology, genomics, ecology, social/behavioral sciences, education. Collaborative analyses. Project involving generation of a meta-research project or reworking and evaluation of an existing published meta-analysis. Prerequisite: knowledge of basic statistics.
Terms: Win

Units: 3

Grading: Medical Satisfactory/No Credit
STATS 213:
Introduction to Graphical Models (STATS 313)
Multivariate Normal Distribution and Inference, Wishart distributions, graph theory, probabilistic Markov models, pairwise and global Markov property, decomposable graph, Markov equivalence, MLE for DAG models and undirected graphical models, Bayesian inference for DAG models and undirected graphical models. Prerequisites: STATS 217, STATS 200 (preferably STATS 300A), MATH 104 or equivalent class in linear algebra.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 216:
Introduction to Statistical Learning
Overview of supervised learning, with a focus on regression and classification methods. Syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; some unsupervised learning: principal components and clustering (k-means and hierarchical). Computing is done in R, through tutorial sessions and homework assignments. This math-light course is offered via video segments (MOOC style) and in-class problem-solving sessions. Prerequisites: first courses in statistics, linear algebra, and computing.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 216V:
Introduction to Statistical Learning
Overview of supervised learning, with a focus on regression and classification methods. Syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines; some unsupervised learning: principal components and clustering (k-means and hierarchical). Computing is done in R, through tutorial sessions and homework assignments. This math-light course is offered remotely only via video segments (MOOC style). TAs will host remote weekly office hours using an online platform such as Google Hangout or BlueJeans. There are four homework assignments, a midterm, and a final exam. Prerequisites: first courses in statistics, linear algebra, and computing.
Terms: Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 217:
Introduction to Stochastic Processes
Discrete and continuous time Markov chains, Poisson processes, random walks, branching processes, first passage times, recurrence and transience, stationary distributions. Non-Statistics master's students may want to consider taking STATS 215 instead. Prerequisite: STATS 116 or consent of instructor.
Terms: Win, Sum

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 218:
Introduction to Stochastic Processes
Renewal theory, Brownian motion, Gaussian processes, second order processes, martingales.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 219:
Stochastic Processes (MATH 136)
Introduction to measure theory, Lp spaces and Hilbert spaces. Random variables, expectation, conditional expectation, conditional distribution. Uniform integrability, almost sure and Lp convergence. Stochastic processes: definition, stationarity, sample path continuity. Examples: random walk, Markov chains, Gaussian processes, Poisson processes, Martingales. Construction and basic properties of Brownian motion. Prerequisite: STATS 116 or MATH 151 or equivalent. Recommended: MATH 115 or equivalent.
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 231:
Statistical Learning Theory (CS 229T)
(Same as STATS 231) How do we formalize what it means for an algorithm to learn from data? This course focuses on developing mathematical tools for answering this question. We will present various common learning algorithms and prove theoretical guarantees about them. Topics include online learning, kernel methods, generalization bounds (uniform convergence), and spectral methods. Prerequisites: A solid background in linear algebra and probability theory, statistics and machine learning (STATS 315A or CS 229). Convex optimization (EE 364a) is helpful but not required.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 237:
Theory of Investment Portfolios and Derivative Securities
Asset returns and their volatilities. Markowitz's portfolio theory, capital asset pricing model, multifactor pricing models. Measures of market risk. Financial derivatives and hedging. Black-Scholes pricing of European options. Valuation of American options. Implied volatility and the Greeks. Prerequisite: STATS 116 or equivalent.
Terms: Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 243:
Financial Models and Statistical Methods in Active Risk Management (CME 243)
(SCPD students register for 243P.) Market risk and credit risk, credit markets. Back testing, stress testing and Monte Carlo methods. Logistic regression, generalized linear models and generalized mixed models. Loan prepayment and default as competing risks. Survival and hazard functions, correlated default intensities, frailty and contagion. Risk surveillance, early warning and adaptive control methodologies. Banking and bank regulation, asset and liability management. Prerequisite: STATS 240 or equivalent.
Terms: Win

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 243P:
Financial Models and Statistical Methods in Risk Management
For SCPD students; see STATS 243.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 250:
Mathematical Finance (MATH 238)
Stochastic models of financial markets. Forward and futures contracts. European options and equivalent martingale measures. Hedging strategies and management of risk. Term structure models and interest rate derivatives. Optimal stopping and American options. Corequisites: MATH 236 and 227 or equivalent.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 267:
Probability: Ten Great Ideas About Chance (PHIL 166, PHIL 266, STATS 167)
Foundational approaches to thinking about chance in matters such as gambling, the law, and everyday affairs. Topics include: chance and decisions; the mathematics of chance; frequencies, symmetry, and chance; Bayes' great idea; chance and psychology; misuses of chance; and harnessing chance. Emphasis is on the philosophical underpinnings and problems. Prerequisite: exposure to probability or a first course in statistics at the level of STATS 60 or 116.
Terms: not given this year

Units: 4

Grading: Letter or Credit/No Credit
STATS 270:
A Course in Bayesian Statistics (STATS 370)
Advanced-level Bayesian statistics. Topics: Discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. Examination of the construction of priors and the asymptotic properties of likelihoods and posterior densities. Discussion including but not limited to the case of finite dimensional parameter space. Prerequisite: familiarity with standard probability and multivariate distribution theory.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 290:
Paradigms for Computing with Data
Advanced programming and computing techniques to support projects in data analysis and related research. For Statistics graduate students and others whose research involves data analysis and development of associated computational software. Prerequisites: Programming experience including familiarity with R; computing at least at the level of CS 106; statistics at the level of STATS 110 or 141.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 299:
Independent Study
For Statistics M.S. students only. Reading or research program under the supervision of a Statistics faculty member. May be repeated for credit.
Terms: Aut, Win, Spr, Sum

Units: 1-10

Repeatable for credit

Grading: Letter or Credit/No Credit
Instructors:
Bacallado, S. (PI);
Baiocchi, M. (PI);
Benjamini, Y. (PI);
Candes, E. (PI);
Chatterjee, S. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Duchi, J. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Khare, A. (PI);
Lai, T. (PI);
Mackey, L. (PI);
Montanari, A. (PI);
Mukherjee, R. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Ross, K. (PI);
Sabatti, C. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)
STATS 300:
Advanced Topics in Statistics
Topic: Exploratory Multivariate Data Analysis. Describing and visualizing data with principal component analysis (PCA) for continuous data, correspondence analysis (CA) for contingency tables, multiple correspondence analysis (MCA) for categorical data, factorial analysis for mixed data (FAMD) for both continuous and categorical data, and multiple factor analysis (MFA) for data structured into groups of variables. Studying and visualization of the correlation between groups of variables with the RV coefficient. Performing PCA with missing values, matrix completion of continuous and categorical data with principal components. Examples from sensory analysis, public health, genetics. All the analysis will be performed with R.
Terms: Sum

Units: 2-3

Repeatable for credit

Grading: Letter or Credit/No Credit
STATS 300A:
Theory of Statistics
Finite sample optimality of statistical procedures; Decision theory: loss, risk, admissibility; Principles of data reduction: sufficiency, ancillarity, completeness; Statistical models: exponential families, group families, nonparametric families; Point estimation: optimal unbiased and equivariant estimation, Bayes estimation, minimax estimation; Hypothesis testing and confidence intervals: uniformly most powerful tests, uniformly most accurate confidence intervals, optimal unbiased and invariant tests. Prerequisites: Real analysis, introductory probability (at the level of STATS 116), and introductory statistics.
Terms: Aut

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 300B:
Theory of Statistics
Elementary decision theory; loss and risk functions, Bayes estimation; UMVU estimator, minimax estimators, shrinkage estimators. Hypothesis testing and confidence intervals: Neyman-Pearson theory; UMP tests and uniformly most accurate confidence intervals; use of unbiasedness and invariance to eliminate nuisance parameters. Large sample theory: basic convergence concepts; robustness; efficiency; contiguity, locally asymptotically normal experiments; convolution theorem; asymptotically UMP and maximin tests. Asymptotic theory of likelihood ratio and score tests. Rank permutation and randomization tests; jackknife, bootstrap, subsampling and other resampling methods. Further topics: sequential analysis, optimal experimental design, empirical processes with applications to statistics, Edgeworth expansions, density estimation, time series.
Terms: Win

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 300C:
Theory of Statistics
Decision theory formulation of statistical problems. Minimax, admissible procedures. Complete class theorems ("all" minimax or admissible procedures are "Bayes"), Bayes procedures, conjugate priors, hierarchical models. Bayesian nonparametrics: Dirichlet, tail-free, Pólya trees, Bayesian sieves. Inconsistency of Bayes rules.
Terms: Spr

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 305:
Introduction to Statistical Modeling
Review of univariate regression. Multiple regression. Geometry, subspaces, orthogonality, projections, normal equations, rank deficiency, estimable functions and Gauss-Markov theorem. Computation via QR decomposition, Gram-Schmidt orthogonalization and the SVD. Interpreting coefficients, collinearity, graphical displays. Fits and the hat matrix, leverage & influence, diagnostics, weighted least squares and resistance. Model selection, Cp/AIC and cross-validation, stepwise, lasso. Basis expansions, splines. Multivariate normal distribution theory. ANOVA: Sources of measurements, fixed and random effects, randomization. Emphasis on problem sets involving substantive computations with data sets. Prerequisites: consent of instructor, 116, 200, applied statistics course, CS 106A, MATH 114.
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 306A:
Methods for Applied Statistics
Regression modeling extended to categorical data. Logistic regression. Log-linear models. Generalized linear models. Discriminant analysis. Categorical data models from information retrieval and Internet modeling. Prerequisite: 305 or equivalent.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 306B:
Methods for Applied Statistics: Unsupervised Learning
Unsupervised learning techniques in statistics, machine learning, and data mining.
Terms: Spr

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 310A:
Theory of Probability (MATH 230A)
Mathematical tools: sigma algebras, measure theory, connections between coin tossing and Lebesgue measure, basic convergence theorems. Probability: independence, Borel-Cantelli lemmas, almost sure and Lp convergence, weak and strong laws of large numbers. Large deviations. Weak convergence; central limit theorems; Poisson convergence; Stein's method. Prerequisites: 116, MATH 171.
Terms: Aut

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 310B:
Theory of Probability (MATH 230B)
Conditional expectations, discrete time martingales, stopping times, uniform integrability, applications to 0-1 laws, Radon-Nikodym Theorem, ruin problems, etc. Other topics as time allows selected from (i) local limit theorems, (ii) renewal theory, (iii) discrete time Markov chains, (iv) random walk theory, (v) ergodic theory. Prerequisite: 310A or MATH 230A.
Terms: Win

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 310C:
Theory of Probability (MATH 230C)
Continuous time stochastic processes: martingales, Brownian motion, stationary independent increments, Markov jump processes and Gaussian processes. Invariance principle, random walks, LIL and functional CLT. Markov and strong Markov property. Infinitely divisible laws. Some ergodic theory. Prerequisite: 310B or MATH 230B.
Terms: Spr

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 311:
Information Theory and Statistics (EE 377)
Information theoretic techniques in probability and statistics. Fano, Assouad, and Le Cam methods for optimality guarantees in estimation. Large deviations and concentration inequalities (Sanov's theorem, hypothesis testing, the entropy method, concentration of measure). Approximation of (Bayes) optimal procedures, surrogate risks, f-divergences. Penalized estimators and minimum description length. Online game playing, gambling, no-regret learning. Prerequisites: EE 376A (or equivalent) or STATS 300A.
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 313:
Introduction to Graphical Models (STATS 213)
Multivariate Normal Distribution and Inference, Wishart distributions, graph theory, probabilistic Markov models, pairwise and global Markov property, decomposable graph, Markov equivalence, MLE for DAG models and undirected graphical models, Bayesian inference for DAG models and undirected graphical models. Prerequisites: STATS 217, STATS 200 (preferably STATS 300A), MATH 104 or equivalent class in linear algebra.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 317:
Stochastic Processes
Semimartingales, stochastic integration, Ito's formula, Girsanov's theorem. Gaussian and related processes. Stationary/isotropic processes. Integral geometry and geometric probability. Maxima of random fields and applications to spatial statistics and imaging.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 324:
Multivariate Analysis
Classic multivariate statistics: properties of the multivariate normal distribution, determinants, volumes, projections, matrix square roots, the singular value decomposition; Wishart distributions, Hotelling's T-square; principal components, canonical correlations, Fisher's discriminant, the Cauchy projection formula.
Terms: not given this year

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 330:
An Introduction to Compressed Sensing (CME 362)
Compressed sensing is a new data acquisition theory asserting that one can design nonadaptive sampling techniques that condense the information in a compressible signal into a small amount of data. This revelation may change the way engineers think about signal acquisition. Course covers fundamental theoretical ideas, numerical methods in large-scale convex optimization, hardware implementations, connections with statistical estimation in high dimensions, and extensions such as recovery of data matrices from few entries (famous Netflix Prize).
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 331:
Survival Analysis
The course introduces basic concepts, theoretical basis and statistical methods associated with survival data. Topics include censoring, Kaplan-Meier estimation, log-rank test, proportional hazards regression, accelerated failure time model, multivariate failure time analysis and competing risks. The traditional counting process/martingale methods as well as modern empirical process methods will be covered. Prerequisite: Understanding of basic probability theory and statistical inference methods.
Terms: Win

Units: 2

Grading: Letter or Credit/No Credit
STATS 345:
Statistical and Machine Learning Methods for Genomics (BIO 268, BIOMEDIN 245, CS 373, GENE 245)
Introduction to statistical and computational methods for genomics. Sample topics include: expectation maximization, hidden Markov model, Markov chain Monte Carlo, ensemble learning, probabilistic graphical models, kernel methods and other modern machine learning paradigms. Rationales and techniques illustrated with existing implementations used in population genetics, disease association, and functional regulatory genomics studies. Instruction includes lectures and discussion of readings from primary literature. Homework and projects require implementing some of the algorithms and using existing toolkits for analysis of genomic datasets.
Terms: Spr

Units: 3

Grading: Medical Option (MedLtrCR/NC)
STATS 360:
Advanced Statistical Methods for Earth System Analysis (EESS 260)
Introduction for graduate students to important issues in data analysis relevant to earth system studies. Emphasis on methodology, concepts and implementation (in R), rather than formal proofs. Likely topics include the bootstrap, nonparametric methods, regression in the presence of spatial and temporal correlation, extreme value analysis, time-series analysis, high-dimensional regressions and change-point models. Topics subject to change each year. Prerequisites: STATS 110 or equivalent.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 367:
Statistical Models in Genetics
Statistical problems in association and linkage analysis of qualitative and quantitative traits in human and experimental populations; sequence alignment and analysis; population genetics/evolution (Wright-Fisher model, Kingman coalescent, models of nucleotide substitution); related computational algorithms. Prerequisites: knowledge of probability through elementary stochastic processes and statistics through likelihood theory.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 370:
A Course in Bayesian Statistics (STATS 270)
Advanced-level Bayesian statistics. Topics: Discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. Examination of the construction of priors and the asymptotic properties of likelihoods and posterior densities. Discussion including but not limited to the case of finite dimensional parameter space. Prerequisite: familiarity with standard probability and multivariate distribution theory.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 376A:
Information Theory (EE 376A)
The fundamental ideas of information theory. Entropy and intrinsic randomness. Data compression to the entropy limit. Huffman coding. Arithmetic coding. Channel capacity, the communication limit. Gaussian channels. Kolmogorov complexity. Asymptotic equipartition property. Information theory and Kelly gambling. Applications to communication and data compression. Prerequisite: EE 178 or STATS 116, or equivalent.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 376B:
Network Information Theory (EE 376B)
Network information theory deals with the fundamental limits on information flow in networks and the optimal coding schemes that achieve these limits. It aims to extend Shannon's point-to-point information theory and the Ford-Fulkerson max-flow min-cut theorem to networks with multiple sources and destinations. The course presents the basic results and tools in the field in a simple and unified manner. Topics covered include: multiple access channels, broadcast channels, interference channels, channels with state, distributed source coding, multiple description coding, network coding, relay channels, interactive communication, and noisy network coding. Prerequisites: EE 376A.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 396:
Research Workshop in Computational Biology
Applications of Computational Statistics and Data Mining to Biological Data. Attendance mandatory. Instructor approval required.
Terms: Aut, Win, Spr

Units: 1-2

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 399:
Research
Research work as distinguished from independent study of nonresearch character listed in 199. May be repeated for credit.
Terms: Aut, Win, Spr, Sum

Units: 1-10

Repeatable for credit

Grading: Satisfactory/No Credit
Instructors:
Bacallado, S. (PI);
Baiocchi, M. (PI);
Benjamini, Y. (PI);
Candes, E. (PI);
Chatterjee, S. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Duchi, J. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Mackey, L. (PI);
Montanari, A. (PI);
Mukherjee, R. (PI);
Narasimhan, B. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)
STATS 42Q:
Undergraduate Admissions to Selective Universities - a Statistical Perspective
The goal is the building of a statistical model, based on applicant data, for predicting admission to selective universities. The model will consider factors such as gender, ethnicity, legacy status, public-private schooling, test scores, effects of early action, and athletics. Common misconceptions and statistical pitfalls are investigated. The applicant data are not those associated with any specific university.
Terms: not given this year

Units: 2

Grading: Satisfactory/No Credit
STATS 90:
Mathematics in the Real World (MATH 16)
Introduction to non-calculus applications of mathematical ideas and principles in real-world problems. Topics include probability and counting, basic statistical concepts, geometric series. Applications include insurance, gambler's ruin, false positives in disease testing, present value of money, and mortgages. No knowledge of calculus required. Enrollment limited to students who do not have Stanford credit for a high school or college course in calculus or statistics.
Terms: Spr

Units: 3

UG Reqs: GER:DB-Math

Grading: Letter or Credit/No Credit
STATS 155:
Statistical Methods in Computational Genetics
The computational methods necessary for the construction and evaluation of sequence alignments and phylogenies built from molecular data and genetic data such as microarrays and database searches. How to formulate biological problems in an algorithmically decomposed form, and building blocks common to many problems, such as Markovian models and multivariate analyses. Some software covered in labs (Python, Biopython, XGobi, MrBayes, HMMER, Probe). Prerequisites: knowledge of probability equivalent to STATS 116, STATS 202 and one class in computing at the CS 106 level. Writing intensive course for undergraduates only. Instructor consent required. (WIM)
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 195:
Introduction to R (CME 195)
This short course runs for the first four weeks of the quarter and is offered in fall and spring. It is recommended for students who want to use R in statistics, science, or engineering courses and for students who want to learn the basics of R programming. The goal of the short course is to familiarize students with R's tools for scientific computing. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven. No prior programming experience is needed. Topics covered include basic data structures, file I/O, graphs, control structures, etc., and some useful packages in R.
Terms: Aut, Spr

Units: 1

Grading: Satisfactory/No Credit
STATS 198:
Practical Training
For students majoring in Mathematical and Computational Science only. Students obtain employment in a relevant industrial or research activity to enhance their professional experience.
Terms: Aut, Win, Spr, Sum

Units: 1-3

Repeatable for credit

Grading: Letter or Credit/No Credit
STATS 201:
Design and Analysis of Experiments
Theory and applications. Factors that affect response. Optimum levels of parameters. How to balance theory and practical design techniques. Prerequisites: basic statistics and probability theory.
Terms: not given this year

Units: 3-5

Grading: Letter (ABCD/NP)
STATS 202:
Data Mining and Analysis
Data mining is used to discover patterns and relationships in data. Emphasis is on large complex data sets such as those in very large databases or through web mining. Topics: decision trees, association rules, clustering, case-based methods, and data visualization. Prerequisites: probability at the level of STATS 116 and familiarity with linear algebra.
Terms: Aut, Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 205:
Introduction to Nonparametric Statistics
Nonparametric analogs of the one- and two-sample t-tests and analysis of variance; the sign test, median test, Wilcoxon's tests, and the Kruskal-Wallis and Friedman tests, tests of independence. Nonparametric regression and nonparametric density estimation, modern nonparametric techniques, nonparametric confidence interval estimates.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 209:
Statistical Methods for Group Comparisons and Causal Inference (EDUC 260X, HRP 239)
Critical examination of statistical methods in social science applications, especially for cause and effect determinations. Topics: path analysis, multilevel models, matching and propensity score methods, analysis of covariance, instrumental variables, compliance, longitudinal data, mediating and moderating variables. See http://web.stanford.edu/~rag/stat209/. Prerequisite: intermediate-level statistical methods.
Terms: Win

Units: 3

Grading: Medical Option (Med-Ltr-CR/NC)
STATS 212:
Applied Statistics with SAS
Data analysis and implementation of statistical tools in SAS. Topics: reading in and describing data, categorical data, dates and longitudinal data, correlation and regression, nonparametric comparisons, ANOVA, multiple regression, multivariate data analysis, using arrays and macros in SAS. Prerequisite: statistical techniques at the level of STATS 191 or 203; knowledge of SAS not required.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 221:
Introduction to Mathematical Finance
Interest rate and discounted value. Financial derivatives, hedging, and risk management. Stochastic models of financial markets, introduction to Ito calculus and stochastic differential equations. Black-Scholes pricing of European options. Optimal stopping and American options. Prerequisites: MATH 53, STATS 116, or equivalents.
Terms: not given this year

Units: 3-4

Grading: Letter or Credit/No Credit
STATS 222:
Statistical Methods for Longitudinal Data (EDUC 351A)
Research designs and statistical procedures for time-ordered (repeated-measures) data. The analysis of longitudinal panel data is central to empirical research on learning, development, aging, and the effects of interventions. Topics include: measurement of change, growth curve models, analysis of durations including survival analysis, experimental and non-experimental group comparisons, reciprocal effects, stability. See http://web.stanford.edu/~rag/stat222/. Prerequisite: intermediate statistical methods.
Terms: Aut

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 238:
The Future of Finance (ECON 152, ECON 252, PUBLPOL 364)
If you are interested in a career in finance or that touches finance (legal, regulatory, corporate, public policy), this course will give you a useful perspective. We will survey the players and current landscape of the global markets as the world continues to evolve from the financial crisis. We will discuss the sweeping change underway at the policy level by regulators and legislators around the world and this will include guest-lecturer perspectives on where the greatest opportunities exist for students entering or touching the world of finance today. The course will also review, in a non-technical way, the basics of the financial derivatives and other quantitative techniques that are a core part of the global capital markets. Elements used in grading: Class Participation, Attendance, Final Paper. Consent Application: To apply for this course, students must complete and email the Consent Application found on the Public Policy website to the instructor at tbeder@stanford.edu. Please visit https://publicpolicy.stanford.edu/academics/undergraduate/forms to locate the Consent Application Form for this class. The form is located on the Public Policy website under "Academics" and "Forms." See Consent Application Form for submission deadline.
Terms: Win

Units: 2

Grading: Letter or Credit/No Credit
STATS 239:
Mathematical and Computational Finance Seminar (CME 242)
Terms: Aut

Units: 1

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 239A:
Workshop in Quantitative Finance
Topics of current interest.
Terms: not given this year

Units: 1

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 239B:
Workshop in Quantitative Finance (CME 239B)
Topics of current interest. May be repeated for credit.
Terms: not given this year

Units: 1

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 240:
Statistical Methods in Finance
(SCPD students register for 240P.) Regression analysis and applications to investment models. Principal components and multivariate analysis. Likelihood inference and Bayesian methods. Financial time series. Estimation and modeling of volatilities. Statistical methods for portfolio management. Prerequisite: STATS 200 or equivalent.
Terms: Aut

Units: 3-4

Grading: Letter or Credit/No Credit
STATS 240P:
Statistical Methods in Finance
For SCPD students; see 240.
Terms: Aut

Units: 3

Grading: Letter or Credit/No Credit
STATS 241:
Data-driven financial and risk econometrics
(SCPD students register for 241P.) Substantive and empirical modeling approaches in options, interest rate, and credit markets. Nonlinear least squares, logistic regression and generalized linear models. Nonparametric regression and model selection. Multivariate time series modeling and forecasting. Vector autoregressive models and cointegration. Risk measures, models and analytics. Prerequisite or corequisite: STATS 240 or equivalent.
Terms: alternate years, not given next year

Units: 3-4

Grading: Letter or Credit/No Credit
STATS 241P:
Data-driven financial and risk econometrics
For SCPD students; see STATS 241.
Terms: alternate years, not given next year

Units: 3

Grading: Letter or Credit/No Credit
STATS 242:
Algorithmic Trading and Quantitative Strategies
An introduction to financial trading strategies based on methods of statistical arbitrage that can be automated. Methodologies related to high-frequency data and stylized facts on asset returns; models of order book dynamics and order placement, dynamic trade planning with feedback; momentum strategies, pairs trading. Emphasis on developing and implementing models that reflect the market and behavioral patterns. Prerequisite: STATS 240 or equivalent.
Terms: Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 244:
Quantitative Trading: Algorithms, Data, and Optimization
Statistical trading rules and performance evaluation. Active portfolio management and dynamic investment strategies. Data analytics and models of transactions data. Limit order book dynamics in electronic exchanges. Algorithmic trading, informatics, and optimal execution. Market making and inventory control. Risk management and regulatory issues. Prerequisites: STATS 240 or equivalent.
Terms: Win

Units: 2-4

Grading: Letter or Credit/No Credit
STATS 253:
Analysis of Spatial and Temporal Data
A unified treatment of methods for spatial data, time series, and other correlated data from the perspective of regression with correlated errors. Two main paradigms for dealing with autocorrelation: covariance modeling (kriging) and autoregressive processes. Bayesian methods. Prerequisites: applied linear algebra (MATH 103 or equivalent), statistical estimation (STATS 200 or CS 229), and linear regression (STATS 203 or equivalent).
Terms: Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 260A:
Workshop in Biostatistics (HRP 260A)
Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, the student must also write an acceptable one-page summary of each of two workshops of the student's choosing.
Terms: Aut

Units: 1-2

Repeatable for credit

Grading: Medical Satisfactory/No Credit
STATS 260B:
Workshop in Biostatistics (HRP 260B)
Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, the student must also write an acceptable one-page summary of each of two workshops of the student's choosing.
Terms: Win

Units: 1-2

Repeatable for credit

Grading: Medical Satisfactory/No Credit
STATS 260C:
Workshop in Biostatistics (HRP 260C)
Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, the student must also write an acceptable one-page summary of each of two workshops of the student's choosing.
Terms: Spr

Units: 1-2

Repeatable for credit

Grading: Medical Satisfactory/No Credit
STATS 261:
Intermediate Biostatistics: Analysis of Discrete Data (BIOMEDIN 233, HRP 261)
Methods for analyzing data from case-control and cross-sectional studies: the 2x2 table, chi-square test, Fisher's exact test, odds ratios, Mantel-Haenszel methods, stratification, tests for matched data, logistic regression, conditional logistic regression. Emphasis is on data analysis in SAS. Special topics: cross-fold validation and bootstrap inference.
Terms: Win

Units: 3

Grading: Medical Option (Med-Ltr-CR/NC)
STATS 262:
Intermediate Biostatistics: Regression, Prediction, Survival Analysis (HRP 262)
Methods for analyzing longitudinal data. Topics include Kaplan-Meier methods, Cox regression, hazard ratios, time-dependent variables, longitudinal data structures, profile plots, missing data, modeling change, MANOVA, repeated-measures ANOVA, GEE, and mixed models. Emphasis is on practical applications. Prerequisites: basic ANOVA and linear regression.
Terms: Spr

Units: 3

Grading: Medical Option (Med-Ltr-CR/NC)
STATS 263:
Design of Experiments (STATS 363)
Experiments vs observation. Confounding. Randomization. ANOVA. Blocking. Latin squares. Factorials and fractional factorials. Split plot. Response surfaces. Mixture designs. Optimal design. Central composite. Box-Behnken. Taguchi methods. Computer experiments and space-filling designs. Prerequisites: probability at STATS 116 level or higher, and at least one course in linear models.
Terms: Aut

Units: 3

Grading: Letter (ABCD/NP)
STATS 266:
Advanced Statistical Methods for Observational Studies (EDUC 260B, SOMGEN 290)
Design principles and statistical methods for observational studies, particularly for cause and effect determinations. Topics include: matching methods, sensitivity analysis, instrumental variables, graphical models, marginal structural models. 3-unit registration requires a small project and presentation. Computing is in R. Prerequisites: HRP 261 and 262 or STATS 209 (HRP 239), or equivalent. See http://rogosateaching.com/somgen290/
Terms: Spr

Units: 2-3

Grading: Medical Option (Med-Ltr-CR/NC)
STATS 297:
Practical Training
For students in the M.S. program in Financial Mathematics only. Students obtain employment, with the approval and supervision of a faculty member, in a relevant industrial or research activity to enhance their professional experience. Students must submit a written final report upon completion of the internship in order to receive credit. May be repeated once for credit. Prerequisite: consent of adviser.
Terms: Aut, Win, Spr, Sum

Units: 1-3

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 298:
Industrial Research for Statisticians
Masters-level research as in 299, but it must be conducted for an off-campus employer, with the approval and supervision of a faculty adviser. Students must submit a written final report upon completion of the internship in order to receive credit. May be repeated once for credit. Prerequisite: enrollment in Statistics M.S. or Ph.D. program, prior to candidacy.
Terms: Aut, Win, Spr, Sum

Units: 1-3

Repeatable for credit

Grading: Letter or Credit/No Credit
Instructors:
Candes, E. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Montanari, A. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI);
Ementon, S. (GP);
Gates, C. (GP)
STATS 302:
Qualifying Exams Workshop
Prepares Statistics Ph.D. students for the qualifying exams by reviewing relevant course topics and problem-solving strategies.
Terms: Sum

Units: 3

Grading: Credit/No Credit
STATS 303:
PhD First Year Student Workshop
For Statistics First Year PhD students only. Discussion of relevant topics in first year student courses, consultation with PhD advisor.
Terms: Aut, Win, Spr, Sum

Units: 1

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 312:
Statistical Methods in Neuroscience
The goal is to discuss statistical methods for neuroscience in their natural habitat: the research questions, measurement technologies and experiment designs used in modern neuroscience. We will emphasize both the choice and quality of the methods, as well as the reporting, interpretation and visualization of results. Likely topics include preprocessing and signal extraction for single-neuron and neuroimaging technologies, statistical models for single response, encoding and decoding models, multiple responses and parametric maps, and testing. Participation includes analyzing methods and real data, discussing papers in class, and a final project. Requirements: we will assume familiarity with linear models, likelihoods, etc. Students who have not taken graduate-level statistics courses are required to contact the instructor. Background in neuroscience is not assumed.
Terms: Win

Units: 3

Grading: Letter or Credit/No Credit
STATS 314:
Advanced Statistical Methods
Topic this year is empirical likelihood. Empirical likelihood (EL) allows likelihood-based inferences without assuming any parametric form for the likelihood. It is based instead on reweighting the sample values. It provides data-driven shapes for confidence regions and confidence bands. EL tests have competitive power. This course covers: nonparametric maximum likelihood and likelihood ratios, censoring and truncation, biased sampling, estimating equations, GMM, Bayesian bootstrap, Euclidean and Kullback-Leibler log likelihoods and recent research directions.
Terms: not given this year

Units: 3

Repeatable for credit

Grading: Letter or Credit/No Credit
STATS 315A:
Modern Applied Statistics: Learning
Overview of supervised learning. Linear regression and related methods. Model selection, least angle regression and the lasso, stepwise methods. Classification. Linear discriminant analysis, logistic regression, and support vector machines (SVMs). Basis expansions, splines and regularization. Kernel methods. Generalized additive models. Kernel smoothing. Gaussian mixtures and the EM algorithm. Model assessment and selection: cross-validation and the bootstrap. Pathwise coordinate descent. Sparse graphical models. Prerequisites: STATS 305, 306A,B or consent of instructor.
Terms: Win

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 315B:
Modern Applied Statistics: Data Mining
Two-part sequence. New techniques for predictive and descriptive learning using ideas that bridge gaps among statistics, computer science, and artificial intelligence. Emphasis is on statistical aspects of their application and integration with more standard statistical methodology. Predictive learning refers to estimating models from data with the goal of predicting future outcomes, in particular, regression and classification models. Descriptive learning is used to discover general patterns and relationships in data without a predictive goal, viewed from a statistical perspective as computer-automated exploratory analysis of large complex data sets.
Terms: Spr

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 316:
Stochastic Processes on Graphs
Local weak convergence, Gibbs measures on trees, cavity method, and replica symmetry breaking. Examples include random k-satisfiability, the assignment problem, spin glasses, and neural networks. Prerequisite: 310A or equivalent.
Terms: Aut

Units: 1-3

Grading: Letter or Credit/No Credit
STATS 318:
Modern Markov Chains
Tools for understanding Markov chains as they arise in applications. Random walk on graphs, reversible Markov chains, Metropolis algorithm, Gibbs sampler, hybrid Monte Carlo, auxiliary variables, hit and run, Swendsen-Wang algorithms, geometric theory, Poincaré-Nash-Cheeger-Log-Sobolev inequalities. Comparison techniques, coupling, stationary times, Harris recurrence, central limit theorems, and large deviations.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 319:
Literature of Statistics
Literature study of topics in statistics and probability culminating in oral and written reports. May be repeated for credit.
Terms: Aut, Spr

Units: 1-3

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 320:
Heterogeneous Data with Kernels
Mathematical and computational methods necessary to understanding analysis of heterogeneous data using generalized inner products and kernels. For areas that need to integrate data from various sources: biology, environmental and chemical engineering, molecular biology, bioinformatics. Topics: Distances, inner products and duality. Multivariate projections. Complex heterogeneous data structures (networks, trees, categorical as well as multivariate continuous data). Canonical correlation analysis, canonical correspondence analysis. Kernel methods in statistics. Representer theorem. Kernels on graphs. Kernel versions of standard statistical procedures. Data cubes and tensor methods.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 321:
Modern Applied Statistics: Transposable Data
Topics: clustering, biclustering, and spectral clustering. Data analysis using the singular value decomposition, nonnegative decomposition, and generalizations. Plaid model, aspect model, and additive clustering. Correspondence analysis, Rasch model, and independent component analysis. Page rank, hubs, and authorities. Probabilistic latent semantic indexing. Recommender systems. Applications to genomics and information retrieval. Prerequisites: 315A,B, 305/306A,B, or consent of instructor.
Terms: not given this year

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 322:
Function Estimation in White Noise
Gaussian white noise model sequence space form. Hyperrectangles, quadratic convexity, and Pinsker's theorem. Minimax estimation on Lp balls and Besov spaces. Role of wavelets and unconditional bases. Linear and threshold estimators. Oracle inequalities. Optimal recovery and universal thresholding. Stein's unbiased risk estimator and threshold choice. Complexity penalized model selection. Connecting fast wavelet algorithms and theory. Beyond orthogonal bases.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 325:
Multivariate Analysis and Random Matrices in Statistics
Topics on Multivariate Analysis and Random Matrices in Statistics (full description TBA)
Terms: not given this year

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 329:
Large-Scale Simultaneous Inference
Estimation, testing, and prediction for microarray-like data. Modern scientific technologies, typified by microarrays and imaging devices, produce inference problems with thousands of parallel cases to consider simultaneously. Topics: empirical Bayes techniques, James-Stein estimation, large-scale simultaneous testing, false discovery rates, local fdr, proper choice of null hypothesis (theoretical, permutation, empirical nulls), power, effects of correlation on tests and estimation accuracy, prediction methods, related sets of cases ("enrichment"), effect size estimation. Theory and methods illustrated on a variety of large-scale data sets.
Terms: not given this year

Units: 1-3

Grading: Letter or Credit/No Credit
STATS 333:
Modern Spectral Analysis
Traditional spectral analysis encompassed Fourier methods and their elaborations, under the assumption of a simple superposition of sinusoids, independent of time. This enables development of efficient and effective computational schemes, such as the FFT. Since many systems change in time, it becomes of interest to generalize classical spectral analysis to the time-varying setting. In addition, classical methods suffer from resolution limits which we hope to surpass. In this topics course, we follow two threads. On the one hand, we consider the "estimation of instantaneous frequencies and decomposition of source signals, which may be time-varying". The thread begins with the empirical mode decomposition (EMD) for non-stationary signal decomposition into intrinsic mode functions (IMFs), introduced by N. Huang et al. [1], together with its machinery of the sifting process and computation of the Hilbert spectrum, resulting in the so-called adaptive harmonic model (AHM). Next, this thread considers the wavelet synchrosqueezing transform (WSST) proposed by Daubechies et al. [2], which attempts to estimate instantaneous frequencies (IFs) via the frequency re-assignment (FRA) rule, which facilitates non-stationary signal decomposition. In reference [3], a real-time method is proposed for computing the FRA rule; and in reference [4], the exact number of AHM components is determined with more precise estimation of the IFs, for more accurate extraction of the signal components and polynomial-like trend. In another thread, recent developments in optimization have been applied to obtain time-varying spectra or very high-resolution spectra; in particular, references [5]-[8] give examples of recent results where convex estimation is applied to obtain new and more highly resolved spectral estimates, some with time-varying structure.
Terms: Spr

Units: 3

Grading: Letter or Credit/No Credit
STATS 338:
Topics in Biostatistics
Data monitoring and interim analysis of clinical trials. Design of Phase I, II, III trials. Survival analysis. Longitudinal data analysis.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 341:
Applied Multivariate Statistics
Theory, computational aspects, and practice of a variety of important multivariate statistical tools for data analysis. Topics include classical multivariate Gaussian and undirected graphical models, graphical displays. PCA, SVD and generalizations including canonical correlation analysis, linear discriminant analysis, correspondence analysis, with focus on recent variants. Factor analysis and independent component analysis. Multidimensional scaling and its variants (e.g. Isomap, spectral clustering). Students are expected to program in R. Prerequisite: STATS 305 or equivalent.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 344:
Introduction to Statistical Genetics (GENE 244)
Statistical methods for analyzing human genetics studies of Mendelian disorders and common complex traits. Probable topics include: principles of population genetics; epidemiologic designs; familial aggregation; segregation analysis; linkage analysis; linkage-disequilibrium-based association mapping approaches; and genome-wide analysis based on high-throughput genotyping platforms. Prerequisite: STATS 116 or equivalent or consent of instructor.
Terms: alternate years, given next year

Units: 3

Grading: Medical Option (Med-Ltr-CR/NC)
STATS 350:
Topics in Probability Theory: Probabilistic Concepts in Statistical Physics and Information Theory
Concentration of measure techniques. Mean field models for disordered systems: infinite size limit, computing the free energy, ultrametricity, dynamics. Interpolation techniques and infinite size limit in information theory and coding. May be repeated once for credit. Prerequisite: 310A or equivalent.
Terms: not given this year

Units: 1-3

Repeatable for credit

Grading: Letter or Credit/No Credit
STATS 351:
Random Walks, Networks and Environment
Selected material about probability on trees and networks, random walk in random and non-random environments, percolation and related interacting particle systems. Prerequisite: Exposure to measure-theoretic probability and to stochastic processes.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 351A:
An Introduction to Random Matrix Theory (MATH 231A)
Patterns in the eigenvalue distribution of typical large matrices, which also show up in physics (energy distribution in scattering experiments), combinatorics (length of longest increasing subsequence), first passage percolation and number theory (zeros of the zeta function). Classical compact ensembles (random orthogonal matrices). The tools of determinantal point processes.
Terms: not given this year

Units: 3

Grading: Letter (ABCD/NP)
STATS 355:
Observational Studies (HRP 255)
This course will cover statistical methods for the design and analysis of observational studies. Topics for the course will include the potential outcomes framework for causal inference; randomized experiments; methods for controlling for observed confounders in observational studies; sensitivity analysis for hidden bias; instrumental variables; tests of hidden bias; coherence; and design of observational studies.
Terms: not given this year

Units: 2-3

Grading: Letter or Credit/No Credit
STATS 362:
Topic: Monte Carlo
Random numbers and vectors: inversion, acceptance-rejection, copulas. Variance reduction: antithetics, stratification, control variates, importance sampling. MCMC: Markov chains, detailed balance, Metropolis-Hastings, random walk Metropolis, independence sampler, Gibbs sampling, slice sampler, hybrids of Gibbs and Metropolis, tempering. Sequential Monte Carlo. Quasi-Monte Carlo. Randomized quasi-Monte Carlo. Examples, problems and motivation from Bayesian statistics, machine learning, computational finance and graphics. May be repeated for credit.
Terms: Spr

Units: 2-3

Repeatable for credit

Grading: Letter or Credit/No Credit
STATS 363:
Design of Experiments (STATS 263)
Experiments vs observation. Confounding. Randomization. ANOVA. Blocking. Latin squares. Factorials and fractional factorials. Split plot. Response surfaces. Mixture designs. Optimal design. Central composite. Box-Behnken. Taguchi methods. Computer experiments and space-filling designs. Prerequisites: probability at STATS 116 level or higher, and at least one course in linear models.
Terms: Aut

Units: 3

Grading: Letter (ABCD/NP)
STATS 366:
Modern Statistics for Modern Biology (BIOS 221)
Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables) as well as nonparametric testing (bootstrap, permutation and Monte Carlo methods). Hands-on; uses R and covers many Bioconductor packages. Prerequisite: Minimal familiarity with computers. Instructor consent.
Terms: Sum

Units: 3

Grading: Letter or Credit/No Credit
STATS 374:
Large Deviations Theory (MATH 234)
Combinatorial estimates and the method of types. Large deviation probabilities for partial sums and for empirical distributions, Cramer's and Sanov's theorems and their Markov extensions. Applications in statistics, information theory, and statistical mechanics. Prerequisite: MATH 230A or STATS 310. Offered every 2-3 years.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 375:
Inference in Graphical Models
Graphical models as a unifying framework for describing the statistical relationships between large sets of variables; computing the marginal distribution of one or a few such variables. Focus is on sparse graphical structures, low-complexity algorithms, and their analysis. Topics include: variational inference; message passing algorithms; belief propagation; generalized belief propagation; survey propagation. Analysis techniques: correlation decay; distributional recursions. Applications from engineering, computer science, and statistics. Prerequisite: EE 278, STATS 116, or CS 228. Recommended: EE 376A or STATS 217.
Terms: not given this year

Units: 3

Grading: Letter or Credit/No Credit
STATS 390:
Consulting Workshop
Skills required of practicing statistical consultants, including exposure to statistical applications. Students participate as consultants in the department's drop-in consulting service, analyze client data, and prepare formal written reports. Seminar provides supervised experience in short-term consulting. May be repeated for credit. Prerequisites: course work in applied statistics or data analysis, and consent of instructor.
Terms: Aut, Win, Spr, Sum

Units: 1-3

Repeatable for credit

Grading: Satisfactory/No Credit
STATS 397:
PhD Oral Exam Workshop
For Statistics PhD students defending their dissertation.
Terms: Spr

Units: 1

Grading: Satisfactory/No Credit
STATS 398:
Industrial Research for Statisticians
Doctoral research as in 298, but must be conducted for an off-campus employer. Final report required. May be repeated for credit. Prerequisite: Statistics Ph.D. candidate.
Terms: Aut, Win, Spr, Sum

Units: 1-3

Repeatable for credit

Grading: Letter or Credit/No Credit
Instructors:
Candes, E. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Montanari, A. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)
Terms: Aut, Win, Spr, Sum

Units: 0

Repeatable for credit

Grading: TGR
Instructors:
Candes, E. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Montanari, A. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)
STATS 802:
TGR Dissertation
Terms: Aut, Win, Spr, Sum

Units: 0

Repeatable for credit

Grading: TGR
Instructors:
Candes, E. (PI);
Dembo, A. (PI);
Diaconis, P. (PI);
Donoho, D. (PI);
Efron, B. (PI);
Friedman, J. (PI);
Hastie, T. (PI);
Holmes, S. (PI);
Johnstone, I. (PI);
Lai, T. (PI);
Montanari, A. (PI);
Olkin, I. (PI);
Olshen, R. (PI);
Owen, A. (PI);
Rajaratnam, B. (PI);
Rogosa, D. (PI);
Romano, J. (PI);
Siegmund, D. (PI);
Switzer, P. (PI);
Taylor, J. (PI);
Tibshirani, R. (PI);
Walther, G. (PI);
Wong, W. (PI);
Zhang, N. (PI)