Data Analysis
Full course description
This course prepares students to work as successful “data scientists”. It covers the crucial processes of inspecting, cleaning, transforming, restoring and preparing data for modelling. Through case studies ("clinics"), students explore the different types of data that a modern data scientist has to deal with. Furthermore, several techniques from machine learning and mathematical modelling (multiple regression, classification, tree-based models, dimensionality reduction, etc.) are presented from the data analysis perspective, and students learn how to apply these techniques to different types of data. Finally, the cornerstone of data analysis is presented: correct communication of the analysis outcome (storytelling, visualization, etc.).
Prerequisites
None.
Desired prior knowledge: Simulation and Statistical Analysis.
Recommended reading
Selected chapters from the following textbooks:
- A. Downey, Think Stats: Exploratory Data Analysis
- G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning (with Applications in R)
- J. VanderPlas, Python Data Science Handbook
- S. Skiena, The Data Science Design Manual
- W. McKinney, Python for Data Analysis
- C. Albon, Machine Learning with Python Cookbook