STATS 48N: Riding the Data Wave

Imagine collecting a bit of your saliva and sending it to one of the personalized genomics companies: for very little money you will get back information about hundreds of thousands of variable sites in your genome. Records of exposure to a variety of chemicals in the areas where you have lived are only a few clicks away on the web, as are thousands of studies and informal reports on the effects of different diets, to which you can compare your own. What does this all mean for you? Never before in history have humans recorded so much information about themselves and the world that surrounds them. Nor has this data been so readily available to the layperson. Expressions such as "data deluge" are used to describe such wealth as well as the loss of proper bearings that it often generates. How to summarize all this information in a useful way? How to boil down millions of numbers to just a meaningful few? How to convey the gist of the story in a picture without misleading oversimplifications? To answer these questions we need to consider the use of the data, appreciate the diversity that they represent, and understand how people instinctively interpret numbers and pictures. Each week, we will consider a different data set to be summarized with a different goal. We will review analyses of similar problems carried out in the past and explore if and how the same tools can be useful today. We will pay attention to contemporary media (newspapers, blogs, etc.) to identify settings similar to the ones we are examining and critique the displays and summaries documented there. Taking an experimental approach, we will evaluate the effectiveness of different data summaries in conveying the desired information by testing them on subsets of the enrolled students.
Terms: Aut | Units: 3 | UG Reqs: WAY-AQR, WAY-FR
Instructors: Sabatti, C. (PI)

STATS 60: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 160)

Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 90: Mathematics and Statistics in the Real World (MATH 16)

This is an introductory quantitative literacy course that offers an introduction to the mathematics (outside of calculus) used in real-world problems. Topics include: (a) Exponential functions, compound interest, population growth. (b) Geometric series, applications to mortgage payments, amortization of loans, present value of money, drug doses and blood levels. (c) First-order approximation, estimating areas and volumes. (d) Basic probability: Bayes's rule, false positives in disease detection and drug testing. (e) Basic descriptive statistics: mean, median, standard deviation. (f) Least squares and linear regression.
Terms: Win | Units: 3 | UG Reqs: GER:DB-Math

STATS 110: Statistical Methods in Engineering and the Physical Sciences

Introduction to statistics for engineers and physical scientists. Topics: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, elementary experimental design. Prerequisite: one year of calculus.
Terms: Aut, Sum | Units: 4-5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 116: Theory of Probability

Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. Introduction to the laws of large numbers and central limit theorem. Prerequisites: MATH 52 and familiarity with infinite series, or equivalent.
Terms: Aut, Spr, Sum | Units: 3-5 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 141: Biostatistics (BIO 141)

Introductory statistical methods for biological data: describing data (numerical and graphical summaries); introduction to probability; and statistical inference (hypothesis tests and confidence intervals). Intermediate statistical methods: comparing groups (analysis of variance); analyzing associations (linear and logistic regression); and methods for categorical data (contingency tables and odds ratio). Course content integrated with statistical computing in R.
Terms: Aut | Units: 3-5 | UG Reqs: GER:DB-Math, WAY-AQR
Instructors: Li, L. (PI)

STATS 160: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 60)

Techniques for organizing data, computing, and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum | Units: 5

STATS 166: Statistical and Machine Learning Methods for Genomics (BIO 268, BIOMEDIN 245, CS 373, GENE 245, STATS 345)

Computational algorithms for human genetics research. Topics include: permutation, bootstrap, expectation maximization, hidden Markov models, and Markov chain Monte Carlo. Rationales and techniques illustrated with existing implementations commonly used in population genetics research, disease association studies, and genomics analysis. Prerequisite: GENE 244 or consent of instructor.
Terms: Spr | Units: 3 | UG Reqs: WAY-AQR
Instructors: Tang, H. (PI); Chen, H. (TA)

STATS 167: Probability: Ten Great Ideas About Chance (PHIL 166, PHIL 266, STATS 267)

Foundational approaches to thinking about chance in matters such as gambling, the law, and everyday affairs. Topics include: chance and decisions; the mathematics of chance; frequencies, symmetry, and chance; Bayes' great idea; chance and psychology; misuses of chance; and harnessing chance. Emphasis is on the philosophical underpinnings and problems. Prerequisite: exposure to probability or a first course in statistics at the level of STATS 60 or 116.
Terms: Spr | Units: 4 | UG Reqs: GER:DB-Math, WAY-AQR, WAY-FR

STATS 191: Introduction to Applied Statistics

Statistical tools for modern data analysis. Topics include regression and prediction, elements of the analysis of variance, bootstrap, and cross-validation. Emphasis is on conceptual rather than theoretical understanding. Applications to social/biological sciences. Student assignments/projects require use of the software package R. Recommended: 60, 110, or 141.
Terms: Aut | Units: 3-4 | UG Reqs: GER:DB-Math, WAY-AQR
Instructors: Taylor, J. (PI)

STATS 198: Practical Training

For students majoring in Mathematical and Computational Science only. Students obtain employment in a relevant industrial or research activity to enhance their professional experience.
Terms: Aut, Win, Spr, Sum | Units: 1-3 | Repeatable 2 times (up to 6 units total)

STATS 200: Introduction to Statistical Inference

Modern statistical concepts and procedures derived from a mathematical framework. Statistical inference, decision theory; point and interval estimation, tests of hypotheses; Neyman-Pearson theory. Bayesian analysis; maximum likelihood, large sample theory. Prerequisite: 116.
Terms: Win, Sum | Units: 3

STATS 202: Data Mining and Analysis

Data mining is used to discover patterns and relationships in data. Emphasis is on large complex data sets such as those in very large databases or through web mining. Topics: decision trees, association rules, clustering, case based methods, and data visualization.
Terms: Aut, Sum | Units: 3

STATS 203: Introduction to Regression Models and Analysis of Variance

Modeling and interpretation of observational and experimental data using linear and nonlinear regression methods. Model building and selection methods. Multivariable analysis. Fixed and random effects models. Experimental design. Pre- or corequisite: 200.
Terms: Win | Units: 3

STATS 206: Applied Multivariate Analysis

Introduction to the statistical analysis of several quantitative measurements on each observational unit. Emphasis is on concepts, computer-intensive methods. Examples from economics, education, geology, psychology. Topics: multiple regression, multivariate analysis of variance, principal components, factor analysis, canonical correlations, multidimensional scaling, clustering. Pre- or corequisite: 200.
Terms: Aut, Sum | Units: 3

STATS 207: Introduction to Time Series Analysis

Time series models used in economics and engineering. Trend fitting, autoregressive and moving average models and spectral analysis, Kalman filtering, and state-space models. Seasonality, transformations, and introduction to financial time series. Prerequisite: basic course in Statistics at the level of 200.
Terms: Spr | Units: 3
Instructors: Donoho, D. (PI); Shen, M. (TA)

STATS 208: Introduction to the Bootstrap

The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. By substituting computation in place of mathematical formulas, it permits the statistical analysis of complicated estimators. Topics: nonparametric assessment of standard errors, biases, and confidence intervals; related resampling methods including the jackknife, cross-validation, and permutation tests. Theory and applications. Prerequisite: course in statistics or probability.
Terms: Spr | Units: 3
Instructors: Khalessi, S. (PI)

STATS 209: Understanding Statistical Models and their Social Science Applications (EDUC 260X, HRP 239)

Critical examination of statistical methods in social science applications, especially for cause and effect determinations. Topics: path analysis, multilevel models, matching and propensity score methods, analysis of covariance, instrumental variables, compliance, longitudinal data, mediating and moderating variables. See http://www-stat.stanford.edu/~rag/stat209. Prerequisite: intermediate-level statistical methods.
Terms: Win | Units: 3

STATS 211: Meta-research: Appraising Research Findings, Bias, and Meta-analysis (HRP 206, MED 206)

Open to graduate, medical, and undergraduate students. Appraisal of the quality and credibility of research findings; evaluation of sources of bias. Meta-analysis as a quantitative (statistical) method for combining results of independent studies. Examples from medicine, epidemiology, genomics, ecology, social/behavioral sciences, education. Collaborative analyses. Project involving generation of a meta-research project or reworking and evaluation of an existing published meta-analysis. Prerequisite: knowledge of basic statistics.
Terms: Win | Units: 3

STATS 212: Applied Statistics with SAS

Data analysis and implementation of statistical tools in SAS. Topics: reading in and describing data, categorical data, dates and longitudinal data, correlation and regression, nonparametric comparisons, ANOVA, multiple regression, multivariate data analysis, using arrays and macros in SAS. Prerequisite: statistical techniques at the level of STATS 191 or 203; knowledge of SAS not required.
Last offered: Summer 2011 | Units: 3

STATS 213: Introduction to Graphical Models

Multivariate Normal Distribution and Inference, Wishart distributions, graph theory, probabilistic Markov models, pairwise and global Markov property, decomposable graph, Markov equivalence, MLE for DAG models and undirected graphical models, Bayesian inference for DAG models and undirected graphical models. Prerequisites: STATS 116, MATH 104 or equivalent class in linear algebra.
Terms: Aut | Units: 3

STATS 215: Statistical Models in Biology

Poisson and renewal processes, Markov chains in discrete and continuous time, branching processes, diffusion. Applications to models of nucleotide evolution, recombination, the Wright-Fisher process, coalescence, genetic mapping, sequence analysis. Theoretical material approximately the same as in STATS 217, but emphasis is on examples drawn from applications in biology, especially genetics. Prerequisite: 116 or equivalent.
Terms: Win | Units: 3
Instructors: Siegmund, D. (PI); Li, J. (TA)

STATS 217: Introduction to Stochastic Processes

Discrete and continuous time Markov chains, Poisson processes, random walks, branching processes, first passage times, recurrence and transience, stationary distributions. Non-Statistics masters students may want to consider taking STATS 215 instead. Prerequisite: STATS 116 or consent of instructor.
Terms: Win, Sum | Units: 3

STATS 218: Introduction to Stochastic Processes

Renewal theory, Brownian motion, Gaussian processes, second order processes, martingales.
Terms: Spr | Units: 3
Instructors: Bogdan, K. (PI); Lee, M. (TA)

STATS 219: Stochastic Processes (MATH 136)

Introduction to measure theory, Lp spaces and Hilbert spaces. Random variables, expectation, conditional expectation, conditional distribution. Uniform integrability, almost sure and Lp convergence. Stochastic processes: definition, stationarity, sample path continuity. Examples: random walk, Markov chains, Gaussian processes, Poisson processes, martingales. Construction and basic properties of Brownian motion. Prerequisite: STATS 116 or MATH 151 or equivalent. Recommended: MATH 115 or equivalent.
Terms: Aut | Units: 3
Instructors: Camilier, I. (PI)

STATS 222: Statistical Methods for Longitudinal Data (EDUC 351A)

Research designs and statistical procedures for time-ordered (repeated-measures) data. The analysis of longitudinal panel data is central to empirical research on learning and development. Topics: measurement of change, growth curve models, analysis of durations including survival analysis, experimental and non-experimental group comparisons, reciprocal effects, stability. See http://www-stat.stanford.edu/~rag/stat222/ . Prerequisite: intermediate statistical methods.
Terms: Spr | Units: 2-3
Instructors: Rogosa, D. (PI)

STATS 231: Statistical Learning Theory (CS 229T)

(Same as STATS 231) For a given learning problem, what methods should be employed, and under what assumptions can we expect them to work? This course focuses on developing algorithms for various scenarios (e.g., high-dimensional, online, unsupervised) as well as theoretical analyses of these algorithms. Topics include kernel methods, generalization bounds, spectral methods, online learning, and nonparametric Bayes. Prerequisites: a solid background in linear algebra and probability theory. Basic exposure to statistics and machine learning (STATS 315A or CS 229) and graphical models (CS 228) is helpful but not essential.
Terms: Win | Units: 3
Instructors: Liang, P. (PI)

STATS 237: Theory of Investment Portfolios and Derivative Securities

Asset returns and their volatilities. Markowitz's portfolio theory, capital asset pricing model, multifactor pricing models. Measures of market risk. Financial derivatives and hedging. Black-Scholes pricing of European options. Valuation of American options. Implied volatility and the Greeks. Prerequisite: STATS 116 or equivalent.
Terms: Sum | Units: 3
Instructors: Camilier, I. (PI); Li, J. (TA)

STATS 239A: Workshop in Quantitative Finance

Topics of current interest.
Terms: Aut | Units: 1 | Repeatable for credit
Instructors: Camilier, I. (PI)

STATS 239B: Workshop in Quantitative Finance

Topics of current interest. May be repeated for credit.
Terms: Spr | Units: 1 | Repeatable for credit
Instructors: Camilier, I. (PI)

STATS 240: Statistical Methods in Finance

(SCPD students register for 240P.) Regression analysis and applications to investment models. Principal components and multivariate analysis. Likelihood inference and Bayesian methods. Financial time series. Estimation and modeling of volatilities. Statistical methods for portfolio management. Prerequisite: STATS 200 or equivalent.
Terms: Aut | Units: 3-4
Instructors: Lai, T. (PI)

STATS 240P: Statistical Methods in Finance

For SCPD students; see 240.
Terms: Aut | Units: 3
Instructors: Lai, T. (PI)

STATS 241: Financial Modeling Methodology and Applications

(SCPD students register for 241P.) Substantive and empirical modeling approaches in options and interest rate markets. Nonlinear least squares and nonparametric regression. Multivariate time series modeling and forecasting. Applications of canonical correlation analysis and cointegration. Statistical trading strategies and their evaluation. Prerequisite: 240 or equivalent.
Last offered: Winter 2012 | Units: 3-4

STATS 241P: Financial Modeling Methodology and Applications

For SCPD students; see 241.
Last offered: Winter 2012 | Units: 3

STATS 242: Algorithmic Trading and Quantitative Strategies

An introduction to financial trading strategies based on methods of statistical arbitrage that can be automated. Methodologies related to high frequency data and stylized facts on asset returns; models of order book dynamics and order placement, dynamic trade planning with feedback; momentum strategies, pairs trading. Emphasis on developing and implementing models that reflect the market and behavioral patterns. Prerequisite: STATS 240 or equivalent.
Terms: Sum | Units: 3
Instructors: Velu, R. (PI); Kuang, Y. (TA)

STATS 243: Statistical Models and Methods for Risk Management and Surveillance

(SCPD students register for 243P.) Market risk and credit risk, credit markets. Back testing, stress testing and Monte Carlo methods. Logistic regression, generalized linear models and generalized mixed models. Loan prepayment and default as competing risks. Survival and hazard functions, correlated default intensities, frailty and contagion. Risk surveillance, early warning and adaptive control methodologies. Banking and bank regulation, asset and liability management. Prerequisite: STATS 240 or equivalent.
Terms: Win | Units: 3-4

STATS 243P: Statistical Models and Methods for Risk Management and Surveillance

For SCPD students; see 243.
Terms: Win | Units: 3
Instructors: Lai, T. (PI)

STATS 250: Mathematical Finance (MATH 238)

Stochastic models of financial markets. Forward and futures contracts. European options and equivalent martingale measures. Hedging strategies and management of risk. Term structure models and interest rate derivatives. Optimal stopping and American options. Corequisites: MATH 236 and 227 or equivalent.
Terms: Win | Units: 3
Instructors: Papanicolaou, G. (PI)

STATS 253: Spatial Statistics (STATS 352)

Statistical descriptions of spatial variability, spatial random functions, grid models, spatial partitions, spatial sampling, linear and nonlinear interpolation and smoothing with error estimation, Bayes methods and pattern simulation from posterior distributions, multivariate spatial statistics, spatial classification, nonstationary spatial statistics, space-time statistics and estimation of time trends from monitoring data, spatial point patterns, models of attraction and repulsion. Applications to earth and environmental sciences, meteorology, astronomy, remote-sensing, ecology, materials.
Last offered: Spring 2009 | Units: 3

STATS 260A: Workshop in Biostatistics (HRP 260A)

Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write an acceptable one page summary of two of the workshops, with choices made by the student.
Terms: Aut | Units: 1-2 | Repeatable for credit

STATS 260B: Workshop in Biostatistics (HRP 260B)

Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write an acceptable one page summary of two of the workshops, with choices made by the student.
Terms: Win | Units: 1-2 | Repeatable for credit

STATS 260C: Workshop in Biostatistics (HRP 260C)

Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, in addition to attending every workshop, the student is required to write an acceptable one page summary of two of the workshops, with choices made by the student.
Terms: Spr | Units: 1-2 | Repeatable for credit

STATS 261: Intermediate Biostatistics: Analysis of Discrete Data (BIOMEDIN 233, HRP 261)

Methods for analyzing data from case-control and cross-sectional studies: the 2x2 table, chi-square test, Fisher's exact test, odds ratios, Mantel-Haenszel methods, stratification, tests for matched data, logistic regression, conditional logistic regression. Emphasis is on data analysis in SAS. Special topics: cross-fold validation and bootstrap inference.
Terms: Win | Units: 3
Instructors: Sainani, K. (PI)

STATS 262: Intermediate Biostatistics: Regression, Prediction, Survival Analysis (HRP 262)

Methods for analyzing longitudinal data. Topics include Kaplan-Meier methods, Cox regression, hazard ratios, time-dependent variables, longitudinal data structures, profile plots, missing data, modeling change, MANOVA, repeated-measures ANOVA, GEE, and mixed models. Emphasis is on practical applications. Prerequisites: basic ANOVA and linear regression.
Terms: Spr | Units: 3
Instructors: Sainani, K. (PI)

STATS 267: Probability: Ten Great Ideas About Chance (PHIL 166, PHIL 266, STATS 167)

Foundational approaches to thinking about chance in matters such as gambling, the law, and everyday affairs. Topics include: chance and decisions; the mathematics of chance; frequencies, symmetry, and chance; Bayes' great idea; chance and psychology; misuses of chance; and harnessing chance. Emphasis is on the philosophical underpinnings and problems. Prerequisite: exposure to probability or a first course in statistics at the level of STATS 60 or 116.
Terms: Spr | Units: 4

STATS 270: A Course in Bayesian Statistics (STATS 370)

Advanced-level Bayesian statistics. Topics: Discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. Examination of the construction of priors and the asymptotic properties of likelihoods and posterior densities. Discussion including but not limited to the case of finite dimensional parameter space. Prerequisite: familiarity with standard probability and multivariate distribution theory.
Terms: Aut, Win | Units: 3
Instructors: Sabatti, C. (PI); Sun, D. (TA)

STATS 290: Paradigms for Computing with Data

Advanced programming and computing techniques to support projects in data analysis and related research. For Statistics graduate students and others whose research involves data analysis and development of associated computational software. Prerequisites: Programming experience including familiarity with R; computing at least at the level of CS 106; statistics at the level of STATS 110 or 141.
Terms: Win | Units: 3

STATS 297: Practical Training

For students in the M.S. program in Financial Mathematics only. Students obtain employment, with the approval and supervision of a faculty member, in a relevant industrial or research activity to enhance their professional experience. Students must submit a written final report upon completion of the internship in order to receive credit. May be repeated once for credit. Prerequisite: consent of adviser.
Terms: Aut, Win, Spr, Sum | Units: 1-3 | Repeatable 2 times (up to 6 units total)
Instructors: Lai, T. (PI)

STATS 298: Industrial Research for Statisticians

Masters-level research as in 299, but with the approval and supervision of a faculty adviser, it must be conducted for an off-campus employer. Students must submit a written final report upon completion of the internship in order to receive credit. May be repeated once for credit. Prerequisite: enrollment in Statistics M.S. or Ph.D. program, prior to candidacy.
Terms: Aut, Win, Spr, Sum | Units: 1-3 | Repeatable 2 times (up to 6 units total)

STATS 299: Independent Study

For Statistics M.S. students only. Reading or research program under the supervision of a Statistics faculty member. May be repeated for credit.
Terms: Aut, Win, Spr, Sum | Units: 1-10 | Repeatable for credit

STATS 300: Advanced Topics in Statistics

May be repeated for credit.
Terms: Sum | Units: 2-3 | Repeatable for credit

STATS 300A: Theory of Statistics

Elementary decision theory; loss and risk functions, Bayes estimation; UMVU estimator, minimax estimators, shrinkage estimators. Hypothesis testing and confidence intervals: Neyman-Pearson theory; UMP tests and uniformly most accurate confidence intervals; use of unbiasedness and invariance to eliminate nuisance parameters. Large sample theory: basic convergence concepts; robustness; efficiency; contiguity, locally asymptotically normal experiments; convolution theorem; asymptotically UMP and maximin tests. Asymptotic theory of likelihood ratio and score tests. Rank permutation and randomization tests; jackknife, bootstrap, subsampling and other resampling methods. Further topics: sequential analysis, optimal experimental design, empirical processes with applications to statistics, Edgeworth expansions, density estimation, time series.
Terms: Aut | Units: 2-3
Instructors: Romano, J. (PI)

STATS 300B: Theory of Statistics

Elementary decision theory; loss and risk functions, Bayes estimation; UMVU estimator, minimax estimators, shrinkage estimators. Hypothesis testing and confidence intervals: Neyman-Pearson theory; UMP tests and uniformly most accurate confidence intervals; use of unbiasedness and invariance to eliminate nuisance parameters. Large sample theory: basic convergence concepts; robustness; efficiency; contiguity, locally asymptotically normal experiments; convolution theorem; asymptotically UMP and maximin tests. Asymptotic theory of likelihood ratio and score tests. Rank permutation and randomization tests; jackknife, bootstrap, subsampling and other resampling methods. Further topics: sequential analysis, optimal experimental design, empirical processes with applications to statistics, Edgeworth expansions, density estimation, time series.
Terms: Win | Units: 2-4

STATS 300C: Theory of Statistics

Decision theory formulation of statistical problems. Minimax, admissible procedures. Complete class theorems ("all" minimax or admissible procedures are "Bayes"), Bayes procedures, conjugate priors, hierarchical models. Bayesian nonparametrics: Dirichlet, tail-free, Polya trees, Bayesian sieves. Inconsistency of Bayes rules.
Terms: Spr | Units: 2-4

STATS 302: Qualifying Exams Workshop

Prepares Statistics Ph.D. students for the qualifying exams by reviewing relevant course topics and problem-solving strategies.
Terms: Sum | Units: 3

STATS 303: PhD First Year Student Workshop

For Statistics First Year PhD students only. Discussion of relevant topics in first year student courses, consultation with PhD advisor.
Terms: Aut, Win, Spr, Sum | Units: 1 | Repeatable 4 times (up to 4 units total)

STATS 305: Introduction to Statistical Modeling

Review of univariate regression. Multiple regression. Geometry, subspaces, orthogonality, projections, normal equations, rank deficiency, estimable functions and Gauss-Markov theorem. Computation via QR decomposition, Gram-Schmidt orthogonalization and the SVD. Interpreting coefficients, collinearity, graphical displays. Fits and the hat matrix, leverage and influence, diagnostics, weighted least squares and resistance. Model selection, Cp/AIC and cross-validation, stepwise, lasso. Basis expansions, splines. Multivariate normal distribution theory. ANOVA: Sources of measurements, fixed and random effects, randomization. Emphasis on problem sets involving substantive computations with data sets. Prerequisites: consent of instructor, 116, 200, applied statistics course, CS 106A, MATH 114.
Terms: Aut | Units: 2-4
Instructors: Hastie, T. (PI)

STATS 306A: Methods for Applied Statistics

Regression modeling extended to categorical data. Logistic regression. Loglinear models. Generalized linear models. Discriminant analysis. Categorical data models from information retrieval and Internet modeling. Prerequisite: 305 or equivalent.
Terms: Win | Units: 2-4

STATS 306B: Methods for Applied Statistics: Unsupervised Learning

Unsupervised learning techniques in statistics, machine learning, and data mining.
Terms: Spr | Units: 2-3

STATS 310A: Theory of Probability (MATH 230A)

Mathematical tools: sigma algebras, measure theory, connections between coin tossing and Lebesgue measure, basic convergence theorems. Probability: independence, Borel-Cantelli lemmas, almost sure and Lp convergence, weak and strong laws of large numbers. Large deviations. Weak convergence; central limit theorems; Poisson convergence; Stein's method. Prerequisites: 116, MATH 171.
Terms: Aut | Units: 2-4
Instructors: Montanari, A. (PI)

STATS 310B: Theory of Probability (MATH 230B)

Conditional expectations, discrete time martingales, stopping times, uniform integrability, applications to 0-1 laws, Radon-Nikodym Theorem, ruin problems, etc. Other topics as time allows selected from (i) local limit theorems, (ii) renewal theory, (iii) discrete time Markov chains, (iv) random walk theory, (v) ergodic theory. Prerequisite: 310A or MATH 230A.
Terms: Win | Units: 2-3

STATS 310C: Theory of Probability (MATH 230C)

Continuous time stochastic processes: martingales, Brownian motion, stationary independent increments, Markov jump processes and Gaussian processes. Invariance principle, random walks, LIL and functional CLT. Markov and strong Markov property. Infinitely divisible laws. Some ergodic theory. Prerequisite: 310B or MATH 230B.
Terms: Spr | Units: 2-4

STATS 315A: Modern Applied Statistics: Learning

Overview of supervised learning. Linear regression and related methods. Model selection, least angle regression and the lasso, stepwise methods. Classification. Linear discriminant analysis, logistic regression, and support vector machines (SVMs). Basis expansions, splines and regularization. Kernel methods. Generalized additive models. Kernel smoothing. Gaussian mixtures and the EM algorithm. Model assessment and selection: cross-validation and the bootstrap. Pathwise coordinate descent. Sparse graphical models. Prerequisites: STATS 305, 306A,B or consent of instructor.
Terms: Win | Units: 2-3

STATS 315B: Modern Applied Statistics: Data Mining

Two-part sequence. New techniques for predictive and descriptive learning using ideas that bridge gaps among statistics, computer science, and artificial intelligence. Emphasis is on statistical aspects of their application and integration with more standard statistical methodology. Predictive learning refers to estimating models from data with the goal of predicting future outcomes, in particular, regression and classification models. Descriptive learning is used to discover general patterns and relationships in data without a predictive goal, viewed from a statistical perspective as computer automated exploratory analysis of large complex data sets.
Terms: Spr | Units: 2-3

STATS 317: Stochastic Processes

Semimartingales, stochastic integration, Ito's formula, Girsanov's theorem. Gaussian and related processes. Stationary/isotropic processes. Integral geometry and geometric probability. Maxima of random fields and applications to spatial statistics and imaging.
Terms: Spr | Units: 3
Instructors: Siegmund, D. (PI); Su, W. (TA)

STATS 318: Modern Markov Chains

Tools for understanding Markov chains as they arise in applications. Random walk on graphs, reversible Markov chains, Metropolis algorithm, Gibbs sampler, hybrid Monte Carlo, auxiliary variables, hit and run, Swendsen-Wang algorithms, geometric theory, Poincare-Nash-Cheeger-Log-Sobolev inequalities. Comparison techniques, coupling, stationary times, Harris recurrence, central limit theorems, and large deviations.
Terms: Spr | Units: 3

STATS 319: Literature of Statistics

Literature study of topics in statistics and probability culminating in oral and written reports. May be repeated for credit.
Terms: Aut, Spr | Units: 1-3 | Repeatable for credit

STATS 324: Multivariate Analysis

Classic multivariate statistics: properties of the multivariate normal distribution, determinants, volumes, projections, matrix square roots, the singular value decomposition; Wishart distributions, Hotelling's T-square; principal components, canonical correlations, Fisher's discriminant, the Cauchy projection formula.
| Units: 2-3

STATS 325: Multivariate Analysis and Random Matrices in Statistics

Topics on Multivariate Analysis and Random Matrices in Statistics (full description TBA)
Terms: Aut | Units: 2-3
Instructors: Johnstone, I. (PI)

STATS 345: Statistical and Machine Learning Methods for Genomics (BIO 268, BIOMEDIN 245, CS 373, GENE 245, STATS 166)

Computational algorithms for human genetics research. Topics include: permutation, bootstrap, expectation maximization, hidden Markov models, and Markov chain Monte Carlo. Rationales and techniques illustrated with existing implementations commonly used in population genetics research, disease association studies, and genomics analysis. Prerequisite: GENE 244 or consent of instructor.
Terms: Spr | Units: 3
Instructors: Tang, H. (PI); Chen, H. (TA)

STATS 355: Observational Studies (HRP 255)

This course will cover statistical methods for the design and analysis of observational studies. Topics for the course will include the potential outcomes framework for causal inference; randomized experiments; methods for controlling for observed confounders in observational studies; sensitivity analysis for hidden bias; instrumental variables; tests of hidden bias; coherence; and design of observational studies.
Terms: Win | Units: 2-3
Instructors: Baiocchi, M. (PI)

STATS 366: Modern Statistics for Modern Biology (BIOS 221)

Application-based course in nonparametric statistics. Modern toolbox of visualization and statistical methods for the analysis of data, with examples drawn from immunology, microbiology, cancer research, and ecology. Methods covered include multivariate methods (PCA and extensions), sparse representations (trees, networks, contingency tables), and nonparametric testing (bootstrap, permutation, and Monte Carlo methods). Hands-on: students use R and cover many Bioconductor packages. Prerequisite: Minimal familiarity with computers. Instructor consent.
Terms: Sum | Units: 3

STATS 370: A Course in Bayesian Statistics (STATS 270)

Advanced-level Bayesian statistics. Topics: Discussion of the mathematical and theoretical foundation for Bayesian inferential procedures. Examination of the construction of priors and the asymptotic properties of likelihoods and posterior densities. Discussion including but not limited to the case of finite dimensional parameter space. Prerequisite: familiarity with standard probability and multivariate distribution theory.
Terms: Aut, Win | Units: 3
Instructors: Sabatti, C. (PI); Sun, D. (TA)

STATS 374: Large Deviations Theory (MATH 234)

Combinatorial estimates and the method of types. Large deviation probabilities for partial sums and for empirical distributions, Cramer's and Sanov's theorems and their Markov extensions. Applications in statistics, information theory, and statistical mechanics. Prerequisite: MATH 230A or STATS 310.
Terms: Win | Units: 3
Instructors: Dembo, A. (PI)

STATS 375: Inference in Graphical Models

Graphical models as a unifying framework for describing the statistical relationships between large sets of variables; computing the marginal distribution of one or a few such variables. Focus is on sparse graphical structures, low-complexity algorithms, and their analysis. Topics include: variational inference; message passing algorithms; belief propagation; generalized belief propagation; survey propagation. Analysis techniques: correlation decay; distributional recursions. Applications from engineering, computer science, and statistics. Prerequisite: EE 278, STATS 116, or CS 228. Recommended: EE 376A or STATS 217.
| Units: 3

STATS 376A: Information Theory (EE 376A)

The fundamental ideas of information theory. Entropy and intrinsic randomness. Data compression to the entropy limit. Huffman coding. Arithmetic coding. Channel capacity, the communication limit. Gaussian channels. Kolmogorov complexity. Asymptotic equipartition property. Information theory and Kelly gambling. Applications to communication and data compression. Prerequisite: EE178/278A or STATS 116, or equivalent.
Terms: Win | Units: 3
Instructors: Weissman, T. (PI)

STATS 390: Consulting Workshop

Skills required of practicing statistical consultants, including exposure to statistical applications. Students participate as consultants in the department's drop-in consulting service, analyze client data, and prepare formal written reports. Seminar provides supervised experience in short term consulting. May be repeated for credit. Prerequisites: course work in applied statistics or data analysis, and consent of instructor.
Terms: Aut, Win, Spr, Sum | Units: 1-3 | Repeatable for credit

STATS 396: Research Workshop in Computational Biology

Applications of Computational Statistics and Data Mining to Biological Data. Attendance mandatory. Instructor approval required.
| Units: 1-2 | Repeatable 3 times (up to 6 units total)

STATS 397: PhD Oral Exam Workshop

For Statistics PhD students defending their dissertation.
Terms: Spr | Units: 1

STATS 398: Industrial Research for Statisticians

Doctoral research as in 298, but must be conducted for an off-campus employer. Final report required. May be repeated for credit. Prerequisite: Statistics Ph.D. candidate.
Terms: Aut, Win, Spr, Sum | Units: 1-3 | Repeatable for credit

STATS 399: Research

Research work as distinguished from independent study of nonresearch character listed in 199. May be repeated for credit.
Terms: Aut, Win, Spr, Sum | Units: 1-10 | Repeatable for credit

STATS 42Q: Undergraduate Admissions to Selective Universities - a Statistical Perspective

The goal is the building of a statistical model, based on applicant data, for predicting admission to selective universities. The model will consider factors such as gender, ethnicity, legacy status, public-private schooling, test scores, effects of early action, and athletics. Common misconceptions and statistical pitfalls are investigated. The applicant data are not those associated with any specific university.
| Units: 2

STATS 205: Introduction to Nonparametric Statistics

Nonparametric analogs of the one- and two-sample t-tests and analysis of variance; the sign test, median test, Wilcoxon's tests, and the Kruskal-Wallis and Friedman tests, tests of independence. Nonparametric regression and nonparametric density estimation, modern nonparametric techniques, nonparametric confidence interval estimates.
| Units: 3

STATS 221: Introduction to Mathematical Finance

Interest rate and discounted value. Financial derivatives, hedging, and risk management. Stochastic models of financial markets, introduction to Ito calculus and stochastic differential equations. Black-Scholes pricing of European options. Optimal stopping and American options. Prerequisites: MATH 53, STATS 116, or equivalents.
| Units: 3-4

STATS 238: Policy & Strategy Issues in Financial Engineering

(Same as LAW 564). This is a non-technical course that focuses on a series of case studies, each designed to illuminate a serious public policy issue raised by the evolution of modern financial engineering. These include discussions of Freddie Mac, Fannie Mae, sub-prime and Alt-A mortgages, and the flaws of AAA CDOs; the spectacular losses by Orange County and the Florida Local Government Investment Pool and the challenges posed by unregulated investment pools; how credit default swaps are likely to change with central clearing, using the PIIGS (Portugal/Ireland/Iceland/Greece/Spain), the monolines, AIG, Lehman, and MF Global as examples; views of rogue trading, using the similarities and disparities of Askin, Madoff, Barings, Soc Gen, and UBS for discussion; and Risk Management 101: the why/how/where/when firms went wrong, plus what to keep and what to throw out in the next phase of risk programs, among other case studies. The subject matter is, by necessity, multi-disciplinary, so the course is particularly suited to students with an interest in public policy and the evolution of modern financial markets. This includes students from the law or business schools, or the public policy, economics, EES, political science, or financial math and engineering programs, among others. Several themes will tie the case studies, reading, and discussions together:
- Is this an example of an innovation that got too far ahead of existing operations, risk management, legal, accounting, regulatory, or supervisory oversight?
- How might temporary infrastructure be implemented without stifling innovation or growth?
- How might losses be avoided by requiring permanent infrastructure sooner? Will Dodd-Frank, Basel III, etc., help to prevent such problems? What are the potential unintended consequences?
- Is this an example of improperly viewing exposures that are subject to uncertainty, or incorrectly modeling risk, or both?
Guest speakers will be invited to share their experiences. This course aims to provide a practitioner's view of financial engineering over the past 3½ decades, as well as a broad understanding of what went right and what went wrong, plus cutting-edge views of the future of financial engineering. Prerequisite: STATS 237 or equivalent and consent of instructor.
| Units: 2
Instructors: Beder, T. (PI)

STATS 314: Advanced Statistical Methods

Topic this year is multiple hypothesis testing. The demand for new methodology for the simultaneous testing of many hypotheses is driven by modern applications in genomics, imaging, astronomy, and finance. High dimensionality: how tests of many hypotheses may be considered simultaneously. Classical techniques and recent developments. Stepwise methods, generalized error rates such as the false discovery rate, and the role of resampling. May be repeated for credit.
| Units: 2-3 | Repeatable for credit

STATS 316: Stochastic Processes on Graphs

Local weak convergence, Gibbs measures on trees, cavity method, and replica symmetry breaking. Examples include random k-satisfiability, the assignment problem, spin glasses, and neural networks. Prerequisite: 310A or equivalent.
| Units: 1-3

STATS 320: Heterogeneous Data with Kernels

Mathematical and computational methods necessary for understanding the analysis of heterogeneous data using generalized inner products and kernels. For areas that need to integrate data from various sources: biology, environmental and chemical engineering, molecular biology, bioinformatics. Topics: Distances, inner products, and duality. Multivariate projections. Complex heterogeneous data structures (networks, trees, categorical as well as multivariate continuous data). Canonical correlation analysis, canonical correspondence analysis. Kernel methods in statistics. Representer theorem. Kernels on graphs. Kernel versions of standard statistical procedures. Data cubes and tensor methods.
| Units: 3

STATS 321: Modern Applied Statistics: Transposable Data

Topics: clustering, biclustering, and spectral clustering. Data analysis using the singular value decomposition, nonnegative decomposition, and generalizations. Plaid model, aspect model, and additive clustering. Correspondence analysis, Rasch model, and independent component analysis. Page rank, hubs, and authorities. Probabilistic latent semantic indexing. Recommender systems. Applications to genomics and information retrieval. Prerequisites: 315A,B, 305/306A,B, or consent of instructor.
| Units: 2-3

STATS 322: Function Estimation in White Noise

Gaussian white noise model sequence space form. Hyperrectangles, quadratic convexity, and Pinsker's theorem. Minimax estimation on Lp balls and Besov spaces. Role of wavelets and unconditional bases. Linear and threshold estimators. Oracle inequalities. Optimal recovery and universal thresholding. Stein's unbiased risk estimator and threshold choice. Complexity penalized model selection. Connecting fast wavelet algorithms and theory. Beyond orthogonal bases.
| Units: 3

STATS 329: Large-Scale Simultaneous Inference

Estimation, testing, and prediction for microarray-like data. Modern scientific technologies, typified by microarrays and imaging devices, produce inference problems with thousands of parallel cases to consider simultaneously. Topics: empirical Bayes techniques, James-Stein estimation, large-scale simultaneous testing, false discovery rates, local fdr, proper choice of null hypothesis (theoretical, permutation, empirical nulls), power, effects of correlation on tests and estimation accuracy, prediction methods, related sets of cases ("enrichment"), effect size estimation. Theory and methods illustrated on a variety of large-scale data sets.
| Units: 1-3

STATS 330: An Introduction to Compressed Sensing (CME 362)

Compressed sensing is a new data acquisition theory asserting that one can design nonadaptive sampling techniques that condense the information in a compressible signal into a small amount of data. This revelation may change the way engineers think about signal acquisition. Course covers fundamental theoretical ideas, numerical methods in large-scale convex optimization, hardware implementations, connections with statistical estimation in high dimensions, and extensions such as recovery of data matrices from few entries (famous Netflix Prize).
| Units: 3

STATS 338: Topics in Biostatistics

Data monitoring and interim analysis of clinical trials. Design of Phase I, II, III trials. Survival analysis. Longitudinal data analysis.
| Units: 3

STATS 341: Applied Multivariate Statistics

Theory, computational aspects, and practice of a variety of important multivariate statistical tools for data analysis. Topics include classical multivariate Gaussian and undirected graphical models, graphical displays. PCA, SVD and generalizations including canonical correlation analysis, linear discriminant analysis, correspondence analysis, with focus on recent variants. Factor analysis and independent component analysis. Multidimensional scaling and its variants (e.g. Isomap, spectral clustering). Students are expected to program in R. Prerequisite: STATS 305 or equivalent.
| Units: 3

STATS 351A: An Introduction to Random Matrix Theory (MATH 231A)

Patterns in the eigenvalue distribution of typical large matrices, which also show up in physics (energy distribution in scattering experiments), combinatorics (length of longest increasing subsequence), first passage percolation, and number theory (zeros of the zeta function). Classical compact ensembles (random orthogonal matrices). The tools of determinantal point processes.
| Units: 3

STATS 352: Spatial Statistics (STATS 253)

Statistical descriptions of spatial variability, spatial random functions, grid models, spatial partitions, spatial sampling, linear and nonlinear interpolation and smoothing with error estimation, Bayes methods and pattern simulation from posterior distributions, multivariate spatial statistics, spatial classification, nonstationary spatial statistics, space-time statistics and estimation of time trends from monitoring data, spatial point patterns, models of attraction and repulsion. Applications to earth and environmental sciences, meteorology, astronomy, remote-sensing, ecology, materials.
| Units: 3

STATS 362: Monte Carlo

Random numbers and vectors: inversion, acceptance-rejection, copulas. Variance reduction: antithetics, stratification, control variates, importance sampling. MCMC: Markov chains, detailed balance, Metropolis-Hastings, random walk Metropolis, independence sampler, Gibbs sampling, slice sampler, hybrids of Gibbs and Metropolis, tempering. Sequential Monte Carlo. Quasi-Monte Carlo. Randomized quasi-Monte Carlo. Examples, problems, and motivation from Bayesian statistics, machine learning, computational finance, and graphics.
| Units: 2-3