Core Coursework
Coursework in the online Master of Science in Health Data Science develops the foundation needed to contribute immediately as a health data scientist. Coursework includes training in the theory, methods, and applications of data science within the health and biomedical sciences and is developed with the rigor of an Ivy League education. The program concludes with a capstone project, where students apply the skills gained during the coursework to develop and tackle a real-world health data science problem. The capstone also provides training in core skills needed for professional success, such as scientific writing, presentation skills, and the ability to translate data science findings for non-data science stakeholders.
- Foundations in Data Science | 0.75 Units
- This course will provide students with a comprehensive introduction to data visualization, wrangling, and analytics in the R programming language. Students will be expected to enter the course with a foundational understanding of linear algebra and calculus; the course will also serve as a review of sequences and series, derivatives, and integration. Students can expect to be exposed to basic Machine Learning, LaTeX, version control with Git/GitHub, and High Performance Computing (HPC).
- Foundations of Biostatistics | 0.75 Units
- This course will cover foundational topics for biostatistics, including probability, random variables and probability distributions, sampling distributions, the central limit theorem, p-values and confidence intervals, hypothesis testing, parametric and non-parametric test statistics, and categorical data analyses. The course requires the R Language for Statistical Computing.
- Ethics in Health Data Science | 0.40 Units
- This short course will review core ethical concerns the modern health data scientist must face, including bias in data acquisition, model training and development, and application. This course relies heavily on reading, summarizing, and critiquing the primary literature.
- Data Wrangling | 0.75 Units
- Data wrangling is the process of mapping and transforming data into new formats for the increased ease and efficiency of downstream analysis. In this course, students will learn about the different types of data structures and formats, and how to create, merge, subset, and manipulate these structures.
- Data Visualization | 0.75 Units
- Data visualization is a critical and necessary step of data analysis. This course will teach best practices for visualizing data, including exploratory data visualization and effective communication of statistical analysis. Students will become competent users of R graphics and R-Shiny. Real-life biomedical and health-related data will be used when possible.
- Foundations of Regression for Health Data Science | 0.75 Units
- The two courses in the regression for health data science series cover the theory and applications of regression-based statistical modeling as practiced within the health data sciences. Both courses emphasize the dual goals of modeling: (i) prediction and (ii) causal inference. This first course presents the foundations of statistical inference (estimation and hypothesis testing) for linear models and generalized linear models (e.g., logistic and Poisson regression). The course requires the R Language for Statistical Computing.
- Advanced Regression for Health Data Science | 0.75 Units
- This second course on regression will cover the theory and application of time-to-event modeling, hierarchical models for clustered and longitudinal data, generalized linear mixed models, and applications of penalized (regularized) regression. The course requires the R Language for Statistical Computing.
- Research Design for Health Data Science | 0.75 Units
- This course will cover key study designs used within health data science. Key epidemiological concepts, including incidence, prevalence, attributable risk, latency of disease occurrence, confounding, effect modification, bias, and generalizability, will also be covered. The course will also cover the foundations of classification metrics as applied to screening and diagnostic tests. The course requires the R Language for Statistical Computing.
- Systems Thinking for Health Data Science | 0.75 Units
- In this course, students will learn how to formulate a health data science research project and practice their scientific communication skills. Topics include defining and justifying the problem statement, developing a conceptual framework, identifying relevant data sources to address the problem, and identifying the strengths and limitations of the data sources.
- Programming for Health Data Sciences | 0.75 Units
- This course covers the essential concepts of programming needed for health data sciences. Computational approaches to problem solving will be presented through live code examples and in-class exercises in Python, Bash scripting, and High Performance Computing (HPC).
- Statistical Learning for Big Data | 0.75 Units
- The course will present an overview of many of the approaches used for big data, focusing on analytical methods and algorithms tailored to the health data sciences. The course will use R to apply data reduction, classification, and optimization techniques to big data. Special attention will be given to students’ active learning through programming in the R statistical software package.
- Genomic Data Science | 0.40 Units
- The sequencing of the complete genomes of many organisms has transformed biology into an information science. This means a health data scientist must possess both molecular and computational skills in order to mine biological data for insights. This course will review key data science topics as applied to genomic research, including current topics.
- Foundations of Machine Learning | 0.75 Units
- This course provides a comprehensive introduction to machine learning methods and techniques based on practical application. Various machine learning concepts and methods, such as natural language processing and deep learning, will be covered.
- Applied AI | 0.75 Units
- This course provides a working foundation of artificial intelligence models, with a major focus on generative AI and Large Language Models (LLMs). Topics include key concepts related to building AI-powered applications and pipelines, including pre- and post-processing, Retrieval-Augmented Generation (RAG), vector stores, tool use, and agentic behavior.
- Capstone | 2.2 Units
- In the capstone, students participate in an intensive, self-driven project within health data science. The goals of the capstone are to apply the skills acquired from the completed coursework, while also refining the professional skills needed to formulate, execute, and disseminate project findings.
TALK TO OUR ADMISSIONS TEAM
Courtney Theroux
DIRECTOR OF ENROLLMENT MANAGEMENT
Amanda Williams
ASSOCIATE DIRECTOR OF ADMISSIONS AND RECRUITMENT
Hannah Kassel
ASSISTANT DIRECTOR, ADMISSIONS AND RECRUITMENT
Mia Soucy
ADMISSIONS MANAGER
Geisel.MPH.MS.Admissions@Dartmouth.edu
(603) 646-5834