Mathematics for Machine Learning

Course

Lecturers:
Enrico Bozzo, Dimitri Breda, Paolo Vidoni (UNIUD), Leonardo Egidi (UNITS), Carla Manni (TOR VERGATA), Gabor Orosz (MICHIGAN)

Board Contact:
Dimitri Breda

SSD: MAT/08

CFU: 2 CFU (28-hour SUPE course) + 2 CFU for possible assignment

Period: May–June 2024

Lessons / Hours: 28 hours

Program:

Setting prehistory aside, we can place the dawn of Artificial Intelligence in the 1950s. Machine Learning is the branch of Artificial Intelligence in which learning occurs directly from data; it took its first steps in the 1990s. Deep Learning is a particular Machine Learning technique based on computing structures known as Neural Networks. In 2012 the AlexNet neural network won the ImageNet challenge (competitions are very important in this area), beating its competitors by a wide margin (a recognition rate of about 85% versus 75%). Since then progress has been constant, to the point that image recognition is now considered, in some respects, a solved problem. Recently, “general purpose” networks such as ChatGPT have entered the daily news (and our lives) and promise an enormous range of applications. These technologies have developed so rapidly that a satisfactory theoretical framework describing how they work is still lacking. Tools from linear algebra, optimization and statistics are certainly necessary, though certainly not sufficient, for developing such a theory.

To appropriately deal with all these aspects, the course is divided into three parts, coordinated respectively by Enrico Bozzo, Paolo Vidoni and Dimitri Breda.

In the first part, coordinated by Enrico Bozzo, we will look at several aspects of the mathematics of neural networks. Among other things, we will discuss least squares, which lets us introduce the important notions of underfitting and overfitting, as well as expressive power, the gradient method and backpropagation. An invited guest will talk about wavelets and convolution (convolutional networks, such as AlexNet itself, are an extremely important class of networks for problems involving images and videos).
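As an informal illustration of underfitting and overfitting in a least-squares setting, the following is a minimal sketch and not course material. The target function, noise level, sample sizes and polynomial degrees are all assumptions chosen for the example, and the helper names `fit_polynomial` and `mean_squared_error` are hypothetical.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit: minimise ||V c - y||_2 over coefficients c."""
    V = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)
    return coeffs

def mean_squared_error(coeffs, x, y):
    """Mean squared residual of the fitted polynomial on the points (x, y)."""
    V = np.vander(x, len(coeffs), increasing=True)
    return float(np.mean((V @ coeffs - y) ** 2))

def target(x):
    # Assumed "true" function generating the data (illustrative choice).
    return np.sin(np.pi * x)

rng = np.random.default_rng(0)
x_train = np.linspace(-1.0, 1.0, 15)
y_train = target(x_train) + 0.1 * rng.standard_normal(x_train.size)  # noisy samples
x_test = np.linspace(-0.95, 0.95, 200)
y_test = target(x_test)                                              # noise-free reference

# degree -> (training error, test error)
errors = {}
for degree in (1, 3, 12):
    c = fit_polynomial(x_train, y_train, degree)
    errors[degree] = (
        mean_squared_error(c, x_train, y_train),
        mean_squared_error(c, x_test, y_test),
    )

for degree, (train_err, test_err) in errors.items():
    print(f"degree {degree:2d}: train MSE = {train_err:.4f}, test MSE = {test_err:.4f}")
```

The training error can only decrease as the degree grows, because each model contains the previous one. The degree-1 fit is too rigid for this target (underfitting). A high-degree fit follows the noise in the training points and will typically, though not always, do worse on the test points than a moderate degree (overfitting).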

In the part coordinated by Paolo Vidoni we will discuss statistical aspects of optimization methods, with particular attention to the stochastic gradient method and to regularization techniques (lasso, ridge regression and boosting algorithms). Estimation methods based on Markov chain Monte Carlo simulation will be introduced. Applications to sports data, a rapidly developing field of application, will also be shown.
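To give a flavour of regularization, here is a minimal sketch of ridge regression through its closed-form solution w = (XᵀX + λI)⁻¹Xᵀy. The synthetic data, the coefficient vector `w_true` and the grid of penalty values are assumptions made only for this illustration, and the function name `ridge` is hypothetical.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 5))                 # assumed synthetic design matrix
w_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])    # assumed "true" coefficients
y = X @ w_true + 0.1 * rng.standard_normal(50)   # noisy responses

# Norm of the estimated coefficients for increasing penalty strength.
lambdas = (0.0, 1.0, 10.0, 100.0)
norms = [float(np.linalg.norm(ridge(X, y, lam))) for lam in lambdas]

for lam, norm in zip(lambdas, norms):
    print(f"lambda = {lam:6.1f}: ||w|| = {norm:.4f}")
```

With λ = 0 the estimate coincides with ordinary least squares (here XᵀX is invertible). Increasing λ shrinks the coefficient vector towards zero, trading some bias for lower variance.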

In the part coordinated by Dimitri Breda we will look at other data-driven techniques that have become established over the last decade, or were introduced more recently, especially in the field of dynamical systems. For example, we will discuss dynamic mode decomposition, sparse identification of nonlinear dynamics and neural differential equations. Applications will be presented both to classical dynamical systems generated by ordinary differential equations (e.g. the Lorenz system) and to infinite-dimensional systems (e.g. delay problems, with guest lecturer Gabor Orosz, University of Michigan).
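As a small illustration of dynamic mode decomposition, the sketch below applies the standard SVD-based ("exact") DMD construction to snapshots of a known linear map and recovers its eigenvalues. The test system (a damped rotation in the plane), the number of snapshots and the function name `dmd` are assumptions chosen for the example. This is not necessarily the formulation used in the course.

```python
import numpy as np

def dmd(X, Y, rank):
    """SVD-based DMD: approximate the linear map A with Y ≈ A X.

    X, Y : arrays of shape (state_dim, n_snapshots), where Y holds the
           snapshots of X advanced by one time step.
    Returns the DMD eigenvalues and the corresponding modes.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    YVS = (Y @ Vh.conj().T) / s        # Y V S^{-1}; division scales column j by 1/s[j]
    A_tilde = U.conj().T @ YVS         # reduced operator U^H A U
    eigvals, W = np.linalg.eig(A_tilde)
    modes = YVS @ W                    # DMD modes in the full state space
    return eigvals, modes

# Assumed test system: x_{k+1} = A x_k with A a rotation by 0.3 rad scaled by 0.95.
theta, rho = 0.3, 0.95
A = rho * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

snapshots = np.empty((2, 21))
snapshots[:, 0] = [1.0, 0.0]
for k in range(20):
    snapshots[:, k + 1] = A @ snapshots[:, k]

eigvals, modes = dmd(snapshots[:, :-1], snapshots[:, 1:], rank=2)
print("DMD eigenvalues:", eigvals)
print("moduli:", np.abs(eigvals))
```

For this noise-free linear example the DMD eigenvalues coincide, up to rounding, with the eigenvalues 0.95·exp(±0.3i) of A. Their modulus gives the decay rate per step and their argument gives the rotation angle per step.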