Working paper

Asymptotic study of stochastic adaptive algorithm in non-convex landscape

Sébastien Gadat and Ioana Gavra

Abstract

This paper studies asymptotic properties of adaptive algorithms widely used in optimization and machine learning, among them Adagrad and RMSprop, which underlie most black-box deep learning algorithms. Our setting is non-convex landscape optimization: we consider a one-time-scale parametrization and cover the cases where these algorithms are used with or without mini-batches. Adopting the point of view of stochastic algorithms, we establish the almost sure convergence of these methods, when used with a decreasing step size, towards the set of critical points of the target function. Under a mild additional assumption on the noise, we also obtain convergence towards the set of minimizers of the function. Along the way, we obtain a "convergence rate" for the methods, in the vein of the work of [GL13].
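As a rough illustration of the setting described in the abstract, the sketch below runs the standard textbook forms of Adagrad and RMSprop on a simple non-convex function with noisy gradients. It is not the one-time-scale parametrization analyzed in the paper: the test function, the function names `grad_f` and `adaptive_sgd`, the step-size schedule `gamma0 / sqrt(n)` and every default value are assumptions chosen for illustration only.

```python
import numpy as np


def grad_f(x):
    # Gradient of the (assumed) non-convex test function
    # f(x) = sum(x**4 / 4 - x**2 / 2); per coordinate its critical
    # points are -1 and 1 (minimizers) and 0 (local maximizer).
    return x**3 - x


def adaptive_sgd(x0, n_steps=20000, method="adagrad", beta=0.9,
                 gamma0=0.2, noise_std=0.1, eps=1e-8, seed=0):
    """Textbook Adagrad / RMSprop with noisy gradients (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)  # per-coordinate statistic of squared gradients
    for n in range(1, n_steps + 1):
        # Noisy gradient oracle: exact gradient plus centered Gaussian noise.
        g = grad_f(x) + noise_std * rng.standard_normal(x.shape)
        if method == "adagrad":
            # Adagrad: accumulate the sum of squared gradients, so the
            # effective step size decreases on its own.
            v += g**2
            step = gamma0
        else:
            # RMSprop: exponential moving average of squared gradients,
            # combined here with an explicitly decreasing step size.
            v = beta * v + (1.0 - beta) * g**2
            step = gamma0 / np.sqrt(n)
        x -= step * g / (np.sqrt(v) + eps)
    return x


if __name__ == "__main__":
    x0 = [2.0, -0.5, 0.3]
    for method in ("adagrad", "rmsprop"):
        x = adaptive_sgd(x0, method=method)
        print(method, x, np.abs(grad_f(x)).max())
```

On this toy problem the iterates are expected to settle near the minimizers at -1 or 1 in each coordinate, which is only a numerical illustration of the kind of behaviour the paper establishes rigorously under its own assumptions.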

Keywords

Stochastic optimization; Stochastic adaptive algorithm; Convergence of random variables

Replaced by

Sébastien Gadat and Ioana Gavra, Asymptotic study of stochastic adaptive algorithm in non-convex landscape, Journal of Machine Learning Research, n. 228, August 2022, pp. 1–54.

Reference

Sébastien Gadat and Ioana Gavra, Asymptotic study of stochastic adaptive algorithm in non-convex landscape, TSE Working Paper, n. 21-1175, January 2021.

Published in

TSE Working Paper, n. 21-1175, January 2021