This project is an open-source repository containing the official course materials for the Johns Hopkins Data Science Specialization on Coursera. It covers the full specialization curriculum, including modules on R programming, data cleaning, exploratory analysis, reproducible research, statistical inference, regression models, practical machine learning, and developing data products. Each module folder includes lectures, assignments, and supporting resources. The repository supports both self-learners and students enrolled in the Coursera track, with content authored by professors Brian Caffo, Jeff Leek, and Roger Peng. Materials are licensed under Creative Commons (CC-NC-SA), which permits non-commercial reuse and adaptation as long as derivative works are shared under the same terms. With contributions from both instructors and the community, it has become a widely used resource for foundational data science education.
Features
- Provides structured lecture notes and assignments
- Includes R programming exercises and examples
- Covers reproducible research workflows
- Contains projects on statistical inference and regression
- Offers machine learning practice materials
- Includes modules for developing interactive data products