Stochastic subgradient descent on weakly convex functions escapes active strict saddles

Sholom Schechtman (Télécom SudParis)

April 18, 2024, 11:00–12:15


Room Auditorium 3 - JJ Laffont

MAD-Stat. Seminar


It was established in the 1990s by Pemantle, Brandière, and Duflo that stochastic gradient descent (SGD) escapes the saddle points of a smooth function. Since the critical points of a typical (generic) smooth function are either local minima or saddle points, this result can be interpreted as generic convergence of SGD to local minima. The purpose of this talk is to extend this result to the class of non-smooth, weakly convex functions. We will first investigate the generic properties of non-smooth, semi-algebraic functions (or, more generally, functions definable in an o-minimal structure). As recently shown by Davis and Drusvyatskiy, active strict saddles form a generic type of critical point in this class. Second, we will show that on weakly convex functions, SGD avoids active strict saddles with probability one. As a consequence, "generically," on a weakly convex, semi-algebraic function SGD converges to a local minimum.
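To illustrate the phenomenon (this sketch is not from the talk; the test function, step sizes, and noise model are my own illustrative choices), consider f(x, y) = |x| - y^2, a standard example of a weakly convex function with an active strict saddle at the origin: f is sharp along x but strictly concave along y. Stochastic subgradient descent started at the saddle is pushed off it by the noise and then drifts away along the y direction:

```python
import random

def subgrad_f(x, y):
    """One subgradient of f(x, y) = |x| - y**2 (weakly convex; the origin
    is an active strict saddle: sharp in x, strictly concave in y)."""
    gx = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)  # element of the subdifferential of |x|
    gy = -2.0 * y
    return gx, gy

def stochastic_subgradient_descent(x, y, steps=5000, seed=0):
    rng = random.Random(seed)
    for k in range(1, steps + 1):
        gx, gy = subgrad_f(x, y)
        gamma = 0.5 / k**0.75  # diminishing steps: sum gamma = inf, sum gamma**2 < inf
        # Noisy subgradient step: zero-mean Gaussian noise on each coordinate.
        x -= gamma * (gx + rng.gauss(0.0, 1.0))
        y -= gamma * (gy + rng.gauss(0.0, 1.0))
    return x, y

x, y = stochastic_subgradient_descent(0.0, 0.0)
print(f"final iterate: x = {x:.3f}, |y| large: {abs(y) > 1.0}, f = {abs(x) - y*y:.3e}")
```

The iterates keep x close to the sharp minimum along the active manifold {x = 0}, while |y| grows and f decreases well below its saddle value f(0, 0) = 0, i.e., the noise prevents convergence to the active strict saddle, matching the probability-one avoidance result discussed in the talk.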