Publication
INFORMS 2023
Invited talk
Stochastic Optimization Methods With Momentum
Abstract
The empirical risk minimization (ERM) problem arises in most machine learning tasks, including logistic regression and some neural networks. Stochastic Gradient Descent (SGD) has been widely used to solve this problem thanks to its scalability and efficiency on large-scale tasks. Many variants of SGD employ momentum techniques, which incorporate past gradient information into the descent direction. Since momentum methods offer encouraging practical performance, it is desirable to study their theoretical properties and apply that knowledge to algorithm design. In this talk, we provide an overview of stochastic momentum methods for the ERM problem and highlight practical algorithms and settings where momentum methods may have theoretical or heuristic advantages over plain SGD.
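To make the update rule the abstract alludes to concrete, below is a minimal sketch (not taken from the talk) of heavy-ball momentum SGD applied to an ERM objective, illustrated with logistic regression on synthetic data. The names `lr` (step size), `beta` (momentum coefficient), and the problem sizes are illustrative assumptions, not values specified in the abstract.

```python
# Minimal sketch of SGD with heavy-ball momentum on a logistic-regression ERM
# problem. Hyperparameters (lr, beta, batch) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.1 * rng.normal(size=n) > 0).astype(float)

def stochastic_grad(w, idx):
    """Gradient of the logistic loss on the minibatch indexed by `idx`."""
    p = 1.0 / (1.0 + np.exp(-(X[idx] @ w)))   # sigmoid predictions
    return X[idx].T @ (p - y[idx]) / len(idx)

def sgd_momentum(lr=0.1, beta=0.9, batch=32, epochs=20):
    w = np.zeros(d)
    v = np.zeros(d)                            # momentum buffer (past gradients)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), n // batch):
            g = stochastic_grad(w, idx)
            v = beta * v + g                   # accumulate past gradient information
            w = w - lr * v                     # descend along the momentum direction
    return w

w_hat = sgd_momentum()
```

Setting `beta = 0` recovers plain SGD; the contrast between the two is the kind of comparison the talk discusses.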