Understanding the acceleration phenomenon via high-resolution differential equations

DOI: 10.1007/s10107-021-01681-8 Publication Date: 2021-07-06T14:04:42Z
ABSTRACT
Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms, Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method, we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and the heavy-ball method, but also allow the identification of a term that we refer to as the "gradient correction," which is present in NAG-SC and responsible for the qualitative difference between the two methods. We further use the high-resolution ODE framework to study Nesterov's accelerated gradient method for (non-strongly) convex functions, uncovering a hitherto unknown result: that it minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-SC, we obtain a family of new methods that are shown to maintain the accelerated convergence rates for smooth strongly convex functions.
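The contrast between the two discrete algorithms the abstract refers to can be sketched on a strongly convex quadratic. This is a minimal illustration, not the paper's analysis: the step size `s = 1/L` and the momentum coefficient `(1 - sqrt(mu*s)) / (1 + sqrt(mu*s))` are standard textbook choices for these methods, and the test problem is an assumed toy example. The only structural difference between the two loops is that NAG-SC evaluates the gradient at the extrapolated point `y_k` rather than at `x_k`; the resulting difference of gradients is the discrete counterpart of the "gradient correction" term.

```python
import numpy as np

# Toy strongly convex quadratic f(x) = 0.5 x^T A x with mu = 0.1, L = 1.
A = np.diag([0.1, 1.0])
mu, L = 0.1, 1.0
s = 1.0 / L                                        # step size s = 1/L
momentum = (1 - np.sqrt(mu * s)) / (1 + np.sqrt(mu * s))

def grad(x):
    return A @ x

def heavy_ball(x0, iters=200):
    # Polyak: x_{k+1} = x_k + momentum*(x_k - x_{k-1}) - s*grad(x_k)
    # Gradient is evaluated at the current iterate x_k.
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = x + momentum * (x - x_prev) - s * grad(x), x
    return x

def nag_sc(x0, iters=200):
    # NAG-SC: extrapolate first, then take the gradient step at y_k.
    # grad(y_k) - grad(x_k) plays the role of the gradient correction.
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + momentum * (x - x_prev)
        x, x_prev = y - s * grad(y), x
    return x

x0 = np.array([1.0, 1.0])
print("heavy-ball:", np.linalg.norm(heavy_ball(x0)))
print("NAG-SC:   ", np.linalg.norm(nag_sc(x0)))
```

On this well-conditioned toy problem both methods converge; the paper's point is that their low-resolution ODE limits coincide, whereas the high-resolution ODEs retain the gradient-correction term and so separate the two algorithms.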