## BIODS 48N: Riding the Data Wave (STATS 48N)

Imagine collecting a bit of your saliva and sending it in to one of the personalized genomics companies: for very little money you will get back information about hundreds of thousands of variable sites in your genome. Records of exposure to a variety of chemicals in the areas where you have lived are only a few clicks away on the web, as are thousands of studies and informal reports on the effects of different diets, to which you can compare your own. What does this all mean for you? Never before in history have humans recorded so much information about themselves and the world that surrounds them. Nor has this data been so readily available to the layperson. Expressions such as "data deluge" are used to describe such wealth as well as the loss of proper bearings that it often generates. How to summarize all this information in a useful way? How to boil down millions of numbers to just a meaningful few? How to convey the gist of the story in a picture without misleading oversimplifications? To answer these questions we need to consider the use of the data, appreciate the diversity that they represent, and understand how people instinctively interpret numbers and pictures. Each week, we will consider a different data set to be summarized with a different goal. We will review analyses of similar problems carried out in the past and explore if and how the same tools can be useful today. We will pay attention to contemporary media (newspapers, blogs, etc.) to identify settings similar to the ones we are examining and critique the displays and summaries documented there. Taking an experimental approach, we will evaluate the effectiveness of different data summaries in conveying the desired information by testing them on subsets of the enrolled students.

Terms: Aut | Units: 3 | UG Reqs: WAY-AQR, WAY-FR

Instructors: Sabatti, C. (PI)

## BIODS 232: Consulting Workshop on Biomedical Data Science

The Data Studio is a collaboration between Spectrum (The Stanford Center for Clinical and Translational Research and Education) and the Department of Biomedical Data Science (DBDS). The educational goal of this workshop is to provide data science consultation training for students. Data Studio is open to the Stanford community, and we expect it to have educational value for students and postdocs interested in biomedical data science. Most sessions are workshops that provide an extensive and in-depth consultation for a Medical School researcher based on research questions, data, statistical models, and other material prepared by the researcher with the aid of our facilitator. At the workshop, the researcher explains the project, goals, and needs. Experts in the area from across campus will be invited to contribute to the brainstorming. After the workshop, the facilitator will follow up, helping with immediate action items and a summary of the discussion. The last session of each month is devoted to drop-in consulting, during which DBDS faculty are available to provide assistance with your research questions. The workshop develops skills required of practicing biomedical consultants, including exposure to biomedical and health science applications, identification of data science related questions, and selection or development of appropriate statistical and analytic approaches to answer research needs. Students are required to attend the regular workshops and participate in one to two consulting projects as team members under the supervision of faculty members or senior staff. Depending on the nature of the consulting service, students may need to conduct numerical simulations, plan sample sizes, design studies, and analyze client data. A formal written report must be completed at the end of each consulting project. May be repeated for credit. Prerequisites: course work in applied statistics, data analysis, and consent of instructor.

Terms: Aut, Win, Spr | Units: 1-2 | Repeatable for credit

Instructors: Lu, Y. (PI); Sabatti, C. (PI); Tian, L. (PI); Desai, M. (SI); Efron, B. (SI); Lavori, P. (SI); Narasimhan, B. (SI); Tamaresis, J. (SI)

## BIODS 237: Deep Learning in Genomics and Biomedicine (BIOMEDIN 273B, CS 273B, GENE 236)

Recent breakthroughs in high-throughput genomic and biomedical data are transforming biological sciences into "big data" disciplines. In parallel, progress in deep neural networks is revolutionizing fields such as image recognition, natural language processing and, more broadly, AI. This course explores the exciting intersection between these two advances. The course will start with an introduction to deep learning and an overview of the relevant background in genomics and high-throughput biotechnology, focusing on the available data and their relevance. It will then cover the ongoing developments in deep learning (supervised, unsupervised and generative models) with a focus on the applications of these methods to biomedical data, which are beginning to produce dramatic results. In addition to predictive modeling, the course emphasizes how to visualize and extract interpretable biological insights from such models. Recent papers from the literature will be presented and discussed. Students will be introduced to and work with popular deep learning software frameworks. Students will work in groups on a final class project using real-world datasets. Prerequisites: college calculus, linear algebra, basic probability and statistics such as CS 109, and basic machine learning such as CS 229. No prior knowledge of genomics is necessary.

Terms: Aut | Units: 3

Instructors: Kundaje, A. (PI); Zou, J. (PI)

## BIODS 260A: Workshop in Biostatistics (STATS 260A)

Applications of statistical techniques to current problems in medical science. To receive credit for one or two units, a student must attend every workshop. To receive two units, the student must also write an acceptable one-page summary of each of two workshops of the student's choosing.

Terms: Aut | Units: 1-2 | Repeatable for credit

## BIODS 299: Directed Reading and Research

For students wishing to receive credit for directed reading or research time. Prerequisite: consent of instructor.

Terms: Aut, Win, Spr, Sum | Units: 1-18 | Repeatable for credit

Instructors: Bustamante, C. (PI); Hastie, T. (PI); Olshen, R. (PI); Rivas, M. (PI); Sabatti, C. (PI); Salzman, J. (PI); Tian, L. (PI); Tibshirani, R. (PI); Zou, J. (PI)