STATS 48N: Riding the Data Wave
Imagine collecting a bit of your saliva and sending it in to one of the personalized genomics companies: for very little money you will get back information about hundreds of thousands of variable sites in your genome. Records of exposure to a variety of chemicals in the areas where you have lived are only a few clicks away on the web, as are thousands of studies and informal reports on the effects of different diets, to which you can compare your own. What does this all mean for you? Never before in history have humans recorded so much information about themselves and the world that surrounds them. Nor has this data been so readily available to the lay person. Expressions such as "data deluge" are used to describe such wealth as well as the loss of proper bearings that it often generates. How to summarize all this information in a useful way? How to boil down millions of numbers to just a meaningful few? How to convey the gist of the story in a picture without misleading oversimplifications? To answer these questions we need to consider the use of the data, appreciate the diversity that they represent, and understand how people instinctively interpret numbers and pictures. Each week, we will consider a different data set to be summarized with a different goal. We will review analyses of similar problems carried out in the past and explore if and how the same tools can be useful today. We will pay attention to contemporary media (newspapers, blogs, etc.) to identify settings similar to the ones we are examining and critique the displays and summaries documented there. Taking an experimental approach, we will evaluate the effectiveness of different data summaries in conveying the desired information by testing them on subsets of the enrolled students.
Terms: Aut

Units: 3

UG Reqs: WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
Instructors:
Sabatti, C. (PI)
STATS 60: Introduction to Statistical Methods: Precalculus (PSYCH 10, STATS 160)
Techniques for organizing data and for computing and interpreting measures of central tendency, variability, and association. Estimation, confidence intervals, tests of hypotheses, t-tests, correlation, and regression. Possible topics: analysis of variance and chi-square tests, computer statistical packages.
Terms: Aut, Win, Spr, Sum

Units: 5

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 101: Data Science 101
http://web.stanford.edu/class/stats101/ . This course will provide a hands-on introduction to statistics and data science. Students will engage with the fundamental ideas in inferential and computational thinking. Each week, we will explore a core topic comprising three lectures and two labs (a module), in which students will manipulate real-world data and learn about statistical and computational tools. Students will engage in statistical computing and visualization with current data analytic software (Jupyter, R). The objectives of this course are to have students (1) be able to connect data to underlying phenomena and to think critically about conclusions drawn from data analysis, and (2) be knowledgeable about programming abstractions so that they can later design their own computational inferential procedures. No programming or statistical background is assumed. Freshmen and sophomores interested in data science, computing and statistics are encouraged to attend. Open to graduates as well.
Terms: Aut, Spr

Units: 5

UG Reqs: GER:DBNatSci, WAYAQR

Grading: Letter or Credit/No Credit
Instructors:
Tibshirani, R. (PI); Walther, G. (PI)
STATS 110: Statistical Methods in Engineering and the Physical Sciences
Introduction to statistics for engineers and physical scientists. Topics: descriptive statistics, probability, interval estimation, tests of hypotheses, nonparametric methods, linear regression, analysis of variance, elementary experimental design. Prerequisite: one year of calculus.
Terms: Aut, Sum

Units: 5

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
Instructors:
Miolane, N. (PI)
STATS 116: Theory of Probability
Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. Introduction to the laws of large numbers and central limit theorem. Prerequisites: MATH 52 and familiarity with infinite series, or equivalent.
Terms: Aut, Spr, Sum

Units: 4

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 141: Biostatistics (BIO 141)
Introductory statistical methods for biological data: describing data (numerical and graphical summaries); introduction to probability; and statistical inference (hypothesis tests and confidence intervals). Intermediate statistical methods: comparing groups (analysis of variance); analyzing associations (linear and logistic regression); and methods for categorical data (contingency tables and odds ratio). Course content integrated with statistical computing in R.
Terms: Aut

Units: 5

UG Reqs: GER:DBMath, WAYAQR

Grading: Letter or Credit/No Credit
Instructors:
Holmes, S. (PI)
STATS 167: Probability: Ten Great Ideas About Chance (PHIL 166, PHIL 266, STATS 267)
Foundational approaches to thinking about chance in matters such as gambling, the law, and everyday affairs. Topics include: chance and decisions; the mathematics of chance; frequencies, symmetry, and chance; Bayes' great idea; chance and psychology; misuses of chance; and harnessing chance. Emphasis is on the philosophical underpinnings and problems. Prerequisite: exposure to probability or a first course in statistics at the level of STATS 60 or 116.
Terms: not given this year, last offered Spring 2016

Units: 4

UG Reqs: GER:DBMath, WAYAQR, WAYFR

Grading: Letter or Credit/No Credit
STATS 191: Introduction to Applied Statistics
Statistical tools for modern data analysis. Topics include regression and prediction, elements of the analysis of variance, bootstrap, and cross-validation. Emphasis is on conceptual rather than theoretical understanding. Applications to social/biological sciences. Student assignments/projects require use of the software package R. Prerequisite: introductory statistical methods course. Recommended: 60, 110, or 141.
Terms: Aut

Units: 3

UG Reqs: GER:DBMath, WAYAQR

Grading: Letter or Credit/No Credit
Instructors:
Jeganathan, P. (PI)
THINK 3: Breaking Codes, Finding Patterns
Why are humans drawn to making and breaking codes? To what extent is finding patterns both an art and a science? Cryptography has been used for millennia for secure communications, and its counterpart, cryptanalysis, or code breaking, has been around for just slightly less time. In this course we will explore the history of cryptography and cryptanalysis including the Enigma code, Navajo windtalkers, early computer science and the invention of modern Bayesian inference. We will try our own hand at breaking codes using some basic statistical tools for which no prior experience is necessary. Finally, we will consider the topic of patterns more generally, raising such questions as why we impute meaning to patterns, such as Biblical codes, and why we assume a complexity within a pattern when it's not there, such as the coincidence of birthdays in a group.
Terms: not given this year, last offered Autumn 2018

Units: 4

UG Reqs: THINK, WAYAQR, WAYFR

Grading: Letter (ABCD/NP)
THINK 23: The Cancer Problem: Causes, Treatments, and Prevention
How has our approach to cancer been affected by clinical observations, scientific discoveries, social norms, politics, and economic interests? Approximately one in three Americans will develop invasive cancer during their lifetime; one in five Americans will die as a result of this disease. This course will expose you to multiple ways of approaching the cancer problem, including laboratory research, clinical trials, population studies, public health interventions, and health care economics. We will start with the 18th-century discovery of the relationship between coal tar and cancer, and trace the role of scientific research in revealing the genetic basis of cancer. We will then discuss the development of new treatments for cancer as well as measures to screen for and prevent cancer, including the ongoing debate over tobacco control. Using cancer as a case study, you will learn important aspects of the scientific method, including experimental design, data analysis, and the difference between correlation and causation. You will learn how science can be used and misused with regard to the public good. You will also learn about ways in which social, political, and economic forces shape our knowledge about and response to disease.
Terms: Spr

Units: 4

UG Reqs: THINK, WAYAQR, WAYSMA

Grading: Letter (ABCD/NP)