A study of the long-run behavior of stochastic gradient descent via large deviations

March 12, 2026, 11:00–12:15

Toulouse

Room Auditorium 3

MAD-Stat. Seminar

Abstract

We examine the long-run distribution of stochastic gradient descent (SGD) in general, non-convex problems. Specifically, we seek to understand which regions of the problem's state space are more likely to be visited by SGD, and by how much. Using an approach based on the theory of large deviations and randomly perturbed dynamical systems, we show that the long-run distribution of SGD resembles the Boltzmann-Gibbs distribution of equilibrium thermodynamics, with temperature equal to the method's step size and energy levels determined by the problem's objective and the statistics of the noise.

Joint work with W. Azizian, J. Malick, and P. Mertikopoulos.
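The Boltzmann-Gibbs picture described in the abstract can be illustrated with a small numerical experiment. The sketch below is not the speakers' method or code. It is a toy setting assumed purely for illustration: a one-dimensional tilted double-well objective `f`, constant-step SGD with additive Gaussian gradient noise of standard deviation `sigma`, and the standard small-step diffusion heuristic, under which the long-run density is roughly proportional to exp(-2 f(x) / (eta sigma^2)). All names and parameter values (`tilt`, `eta`, `sigma`, the grid bounds) are hypothetical choices.

```python
import numpy as np


def f(x, tilt=0.1):
    """Toy non-convex objective (assumed for illustration): tilted double well."""
    return 0.25 * (x**2 - 1.0) ** 2 + tilt * x


def grad_f(x, tilt=0.1):
    """Exact gradient of the toy objective f."""
    return x**3 - x + tilt


def sgd_left_fraction(eta=0.05, sigma=2.0, n_chains=1000,
                      n_steps=20000, burn_in=5000, seed=0):
    """Fraction of post-burn-in time that SGD iterates spend in x < 0.

    Runs n_chains independent SGD trajectories in parallel, each using
    gradients corrupted by N(0, sigma^2) noise (an assumed noise model).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_chains)  # all chains start at x = 0, between the two wells
    left = 0.0
    for k in range(n_steps):
        noisy_grad = grad_f(x) + sigma * rng.standard_normal(n_chains)
        x = x - eta * noisy_grad
        if k >= burn_in:
            left += np.mean(x < 0.0)
    return left / (n_steps - burn_in)


def gibbs_left_fraction(eta=0.05, sigma=2.0):
    """Mass that the heuristic Gibbs density exp(-f / T) assigns to x < 0.

    T = eta * sigma^2 / 2 is the effective temperature under the diffusion
    heuristic; it scales linearly with the step size eta.
    """
    grid = np.linspace(-4.0, 4.0, 8001)  # uniform grid, so sums approximate integrals
    temperature = eta * sigma**2 / 2.0
    energy = f(grid)
    weights = np.exp(-(energy - energy.min()) / temperature)
    return weights[grid < 0.0].sum() / weights.sum()


if __name__ == "__main__":
    print("SGD occupancy of the deeper (left) well:", sgd_left_fraction())
    print("Gibbs prediction for the same region:   ", gibbs_left_fraction())
```

In this toy case the deeper well sits near x = -1, so both quantities should exceed one half and should roughly agree. Lowering `eta` lowers the effective temperature and concentrates the predicted mass further on the deeper well, although the chains then take longer to cross between wells and need more steps to equilibrate.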
