Multivariate Statistics

EC: 8
Location: Utrecht University
Weeks: 37 - 51
Lecture: Thursday, 10:00 - 12:45
Provider: Stochastics (Star)
Links: Course page (requires login)

Summary

Prerequisites:
  • Basic knowledge of probability theory
  • Basic knowledge of statistics
  • Basic knowledge of linear algebra
  • Basic knowledge of calculus
  • Basic knowledge of R (students may use other programming languages such as Python or Matlab, but solutions to problems will always be given in R)
Note that we will briefly review less familiar concepts such as Lagrange multipliers, so some gaps in knowledge are not a problem.
Aim of the course:
In multivariate statistics we observe multiple measurements for each individual observation. Examples are the vital signs of a patient, such as heart rate, blood pressure and respiratory rate, or household expenditures on housing, food, education and entertainment. The focus is on finding and modelling dependencies between these variables so that we can gain insight into the underlying mechanisms.
A particular challenge is posed by the case where the dimension of the observations is large. Collecting data is much cheaper nowadays than in the past, so working with huge data sets is no longer unusual. We will tackle this problem by means of dimension reduction, among other techniques. Graphical tools will help us to understand and visualize the structure of big and complex data sets.
Often, we cannot assume that all observations are homogeneous and follow the same probability model. In this case we want to discover groups within the data set and classify observations into them.
At the end of the course, students will be able to analyze complex data sets. They can handle large data sets, investigate the underlying dependence structure, identify subpopulations and test for the equality of their means and covariances. Students will understand the mathematical foundations of multivariate statistical procedures and thus understand their limitations. They will be able to derive theoretical properties such as consistency and asymptotic normality of new estimators. Students can adapt existing hypothesis tests to their needs or construct new tests based on general principles.
Rules about homework / exam:
Doing homework is voluntary but recommended. The final grade is based on the written exam only. To pass the course, the grade for the (retake) exam should be at least 5.5.
Lecture notes / literature:
Lecture notes are published online. They are based on the books:
  • Wolfgang Härdle and Leopold Simar (2024): Applied Multivariate Statistical Analysis
  • Theodore Anderson (2003): An Introduction to Multivariate Statistical Analysis