Computational Tools for Reproducible Data Science

By: Harvard University
Level: Beginner, Intermediate
Duration: 8 Weeks

Course Name: Principles, Statistical and Computational Tools for Reproducible Data Science

Learn skills and tools that support data science and reproducible research, to ensure you can trust your own research results, reproduce them yourself, and communicate them to others.

Today the principles and techniques of reproducible research are more important than ever, across diverse disciplines from astrophysics to political science. No one wants to do research that can’t be reproduced, so this course is for anyone doing data-intensive research. While many of us come from a biomedical background, this course is for a broad audience of data scientists.

To meet the needs of the scientific community, this course will examine the fundamentals of methods and tools for reproducible research. Led by experienced faculty from the Harvard T.H. Chan School of Public Health, you will participate in six modules that will include several case studies that illustrate the significant impact of reproducible research methods on scientific discovery.

Course By: Harvard University
Instructors: Curtis Huttenhower, John Quackenbush, Lorenzo Trippa, Christine Choirat

Course Details

This course will appeal to students and professionals in biostatistics, computational biology, bioinformatics, and data science. The course content will blend video lectures, case studies, peer-to-peer engagements, and use of computational tools and platforms (such as R/RStudio and Git/GitHub), culminating in the presentation of a final reproducible research project.

We’ll cover Fundamentals of Reproducible Science; Case Studies; Data Provenance; Statistical Methods for Reproducible Science; Computational Tools for Reproducible Science; and Reproducible Reporting Science.

These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.

What will you learn?

  • Understand a series of concepts, thought patterns, analysis paradigms, and computational and statistical tools that together support data science and reproducible research.
  • Fundamentals of reproducible science using case studies that illustrate various practices.
  • Key elements for ensuring data provenance and reproducible experimental design.
  • Statistical methods for reproducible data analysis.
  • Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows.
  • How to develop new methods and tools for reproducible research and reporting, and how to write your own reproducible paper.

Other Details

The skills you learn in this course will help you create your own environment in which you can easily carry out reproducible research, and encourage and integrate with similar environments for your collaborators and colleagues.

Course Instructors

Curtis Huttenhower: Associate Professor of Computational Biology and Bioinformatics, Harvard University
John Quackenbush: Professor of Computational Biology and Bioinformatics, Harvard University
Lorenzo Trippa: Associate Professor of Biostatistics, Harvard University
Christine Choirat: Research Scientist, Harvard University

We have tried to provide the most up-to-date information about this Computational Tools for Reproducible Data Science course. However, if you find that this free course is no longer available or that its details have changed, please let us know. Our team will make the necessary changes.
