Master of Science in Public Health Data Science

The Master of Science in Public Health Data Science integrates biostatistics, epidemiology, and computer science. It prepares students for careers in fields with a growing need for people who can learn from data to address important questions in public health and the biomedical sciences.

The MS in Public Health Data Science program is designed to provide students with rigorous quantitative training in the statistical and computational skills needed to manage, analyze, and learn from health data. Students will learn to wrangle, scrape, create, and manage large health-related datasets; summarize, visualize, and interpret data; apply statistical methods to draw conclusions from the data; use machine learning to reveal features of large, complex health-related datasets; learn the statistical theory behind common data science methods; and effectively communicate results and findings to a broad audience. This program is designed to be a terminal degree, but for students interested in pursuing further education, it can lay the foundation for a PhD in Biostatistics, Statistics, Data Science, or Computer Science.

Program Director

Trevor Pickering, PhD

Assistant Professor of Clinical Population and Public Health Sciences

Application Deadlines


Priority: December 1st
Final: May 1st

Core Coursework

This 32-unit degree program can be completed in 4 semesters. It consists of 6 core courses (22 units) in Population and Public Health Sciences, 1 core course in Computer Science (4 units), 1 elective in either Population and Public Health Sciences or Computer Science (3 units), and a Practicum (3 units).

An introduction to the toolsets needed to create workable and reproducible datasets, conduct exploratory analysis and visualizations, learn from data, and summarize and communicate analytic results.

Through this course, students will become familiar with data analysis and regression using R.

Terminology/uses of epidemiology and demography; sources/uses of population data; types of epidemiologic studies; risk assessment; common sources of bias in population studies; principles of screening. 

Techniques for the solution of statistical problems through intensive computing; iterative techniques, randomization tests, the bootstrap, Monte Carlo methods.

Theory of estimation and testing, inference, analysis of variance, theory of regression.

Introduces master's and PhD students in the health sciences to machine learning.


Students must complete a semester-long practicum (PM 606), an instructor-guided course culminating in a health data science project that combines and applies the knowledge acquired through the program.

Program Faculty