Seminar

Differential calculus for machine learning

Edouard Pauwels (IRIT, Université Toulouse 3 Paul Sabatier)

April 6, 2023, 09:30–10:45

Auditorium A3

Maths Job Market Seminar

Abstract

The two pillars of training modern AI models are continuous optimization, with gradient-type methods, and automated derivative calculus, with automatic differentiation (AD). Yet many neural network models include non-differentiable components (75% of torchvision models), and, despite its worldwide empirical success, nonsmooth AD lacks a proper mathematical model. First, I will illustrate how the formal application of differential calculus to non-differentiable objects can generate problematic derivative artifacts. Then I will introduce a weak notion of generalized derivative, the conservative gradient, and show its compatibility with the two pillars: compositional calculus and gradient-type optimization, thereby providing a rigorous model for nonsmooth AD applicable to virtually all existing network architectures. The end of the talk will be dedicated to extensions of conservative calculus beyond compositional operations.
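
For illustration (an example of my own, not taken from the announced talk): a minimal PyTorch sketch of the kind of derivative artifact the abstract alludes to. The function f(x) = relu(x) - relu(-x) equals x everywhere, so its true derivative is 1 at every point, yet autograd reports 0 at x = 0, because each ReLU branch uses the convention relu'(0) = 0.

    import torch

    # f(x) = relu(x) - relu(-x) is identically equal to x,
    # so the true derivative is 1 for every x.
    def f(x):
        return torch.relu(x) - torch.relu(-x)

    x = torch.tensor(0.0, requires_grad=True)
    f(x).backward()
    print(x.grad)  # tensor(0.) -- autograd returns 0, not the true derivative 1

The conservative-gradient framework described in the abstract is precisely a setting in which outputs such as this one remain compatible with compositional calculus and gradient-type optimization.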