High-dimensional Probability

EC: 8
Location: Utrecht University
Weeks: 37 - 51
Lecture: Thursday, 14:00 - 16:45
Provider: Stochastics (Star)
Links: Course page (requires login)

Description

High-dimensional probability theory studies random objects in high-dimensional spaces, such as random vectors, matrices, and processes. High-dimensional probabilistic problems arise in various fields, such as statistical analysis of high-dimensional data, theory of (large) random graphs, machine learning, numerical linear algebra, randomized algorithms, convex geometry, and compressed sensing. Although these application areas differ considerably, there is an important circle of probabilistic principles and techniques, including concentration inequalities and chaining methods, that is commonly used in all of them. The aim of this course is to give an introduction to these ideas, with a view towards applications in data science.
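The concentration phenomenon mentioned above can be illustrated numerically. The following sketch (not part of the course materials, just an assumed example) shows a basic instance: the Euclidean norm of a standard Gaussian vector in R^n concentrates tightly around sqrt(n), with fluctuations of constant order that do not grow with the dimension.

```python
import numpy as np

# Illustrative sketch: concentration of the norm of a Gaussian vector.
# For g ~ N(0, I_n), the norm ||g|| is close to sqrt(n) with high
# probability; its standard deviation stays O(1) as n grows.
rng = np.random.default_rng(0)
n, trials = 10_000, 1_000
samples = rng.standard_normal((trials, n))
norms = np.linalg.norm(samples, axis=1)

print(norms.mean() / np.sqrt(n))  # close to 1
print(norms.std())                # O(1), independent of n
```

Running the same experiment with larger n makes the relative fluctuation norms.std() / sqrt(n) shrink, which is exactly the high-dimensional concentration effect the course develops rigorously.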

Topics

We aim to discuss the core techniques of high-dimensional probability, including concentration inequalities and chaining methods for stochastic processes.

During the theoretical development of these topics, we will discuss several applications in mathematical data science, such as data dimension reduction, community detection in networks, matrix completion, clustering, semidefinite programming, and sparse linear regression.
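As a hedged illustration of one of these applications, data dimension reduction, the sketch below (an assumed example, not taken from the course) projects points from a high-dimensional space to a much lower-dimensional one with a random matrix, in the spirit of the Johnson-Lindenstrauss lemma: pairwise distances are approximately preserved.

```python
import numpy as np

# Dimension reduction via a random Gaussian projection: map points from
# R^d to R^k (k << d) with a matrix of i.i.d. N(0, 1/k) entries, so that
# pairwise Euclidean distances are approximately preserved.
rng = np.random.default_rng(1)
d, k, m = 5_000, 500, 20
X = rng.standard_normal((m, d))               # m data points in R^d
P = rng.standard_normal((d, k)) / np.sqrt(k)  # random projection matrix
Y = X @ P                                     # projected points in R^k

# Compare one pairwise distance before and after the projection.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
print(proj / orig)  # close to 1
```

The quality of the approximation is governed by concentration inequalities for the random matrix P, one of the recurring themes of the course.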


Prerequisites

It is required to have a firm grasp of undergraduate probability theory and linear algebra.

Knowledge of measure-theoretic probability theory is not required.

No prior knowledge from other Mastermath courses is required.

Primary literature

Vershynin, Roman. High-dimensional probability: An introduction with applications in data science. Vol. 47. Cambridge University Press, 2018. We will use a pre-publication version of the second edition of this book (freely available online: https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-2.pdf)


Van Handel, Ramon. Probability in high dimensions. Lecture notes, Princeton University, 2016. (available online: http://web.math.princeton.edu/~rvan/APC550.pdf)


Sources for further reading

Blum, Avrim, John Hopcroft, and Ravindran Kannan. Foundations of data science. Cambridge University Press, 2020. (available online: https://www.cs.cornell.edu/jeh/book.pdf)


Boucheron, Stéphane, Gábor Lugosi, and Pascal Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.


Ledoux, Michel. The concentration of measure phenomenon. American Mathematical Society, 2001.


Talagrand, Michel. Upper and lower bounds for stochastic processes. Springer, 2014.


Examination

During the course there will be four homework assignments, which will be completed in pairs. At the end of the course there will be a final exam and a re-take exam; these will be either written or oral, depending on the number of participants. The final grade for the course is determined by the average homework grade (20%) and the exam or re-take exam grade (80%). A sufficient grade (5.5 or higher) on the exam or re-take exam is required to pass the course.
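The weighting described above can be written out explicitly. This is only a sketch of the stated formula, using hypothetical helper names; it is not an official grading tool.

```python
# Sketch of the grading scheme: 20% homework average + 80% exam grade,
# where a sufficient exam (or re-take) grade of at least 5.5 is required
# to pass, regardless of the homework average.
def final_grade(homework_avg: float, exam: float) -> float:
    return 0.2 * homework_avg + 0.8 * exam

def exam_sufficient(exam: float) -> bool:
    return exam >= 5.5

print(round(final_grade(8.0, 7.0), 2))  # 7.2
```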