[ 951SMDSSPDK17 ] KV Statistical Principles of Data Science

Workload 6 ECTS
Education level M1 - Master's programme 1st year
Study areas Statistics
Responsible person Helga Wagner
Hours per week 3 hpw
Coordinating university Johannes Kepler University Linz
Detailed information
Original study plan Master's programme Statistics 2015W
Objectives Students know basic concepts and tools of statistics for data analysis. They can apply methods designed for big data and high-dimensional inference and know which pitfalls to avoid in data analysis.
Subject Basic concepts of statistics: estimation, testing, prediction and classification, clustering

basic statistical tools: frequentist vs. Bayesian inference; common statistical models; model selection and model averaging

big data and large-scale inference: big "n" vs. big "p"; sparse modelling and Lasso; random forests, boosting, shrinkage and empirical Bayes

pitfalls: correlation vs. causation; all models are wrong; garbage in - garbage out; common sources of bias; Simpson's paradox and the perils of aggregating data; data mining, multiple hypothesis testing and the false discovery rate; curse of dimensionality, spurious correlation, incidental endogeneity

Criteria for evaluation Homework plus written exam.
Language English
Study material Bradley Efron and Trevor Hastie: Computer Age Statistical Inference. Cambridge University Press 2016.
Changing subject? No
On-site course
Maximum number of participants -
Assignment procedure Assignment according to priority