Momentum

From Cornell University Computational Optimization Open Textbook - Optimization Wiki
Revision as of 19:43, 24 November 2021 by Tcl74 (talk | contribs) (→‎Introduction)

Authors: Thomas Lee, Greta Gasswint, Elizabeth Henning (SYSEN5800 Fall 2021)

Introduction

Momentum is an extension to the gradient descent optimization algorithm that builds inertia in a search direction to overcome local minima and the oscillation of noisy gradients (1). It is based on the concept of momentum in physics. A classic analogy is a ball rolling down a hill that gathers enough momentum to cross a plateau region and reach a global minimum (2). Momentum adds history to the parameter updates, which can significantly accelerate the optimization process. The amount of history included in the update equation is controlled by a hyperparameter (1) that takes a value between 0 and 1. A momentum of 0 is equivalent to gradient descent without momentum (1). A higher momentum value means that more gradients from the past (the history) are considered (2).


(1) https://machinelearningmastery.com/gradient-descent-with-momentum-from-scratch/

(2) https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
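
The role of the momentum hyperparameter described above can be illustrated with a minimal sketch. It assumes one common formulation of the update, in which a velocity term accumulates past gradients (velocity = momentum * velocity - learning_rate * gradient) and is then added to the parameter. The function name `gradient_descent_momentum`, the test function f(x) = x^2, and the values chosen for the learning rate and step count are illustrative assumptions, not part of the original text.

```python
def gradient_descent_momentum(grad, x0, learning_rate=0.1, momentum=0.9, n_steps=100):
    """Minimize a 1-D function from its gradient using a velocity (momentum) term.

    Hypothetical helper for illustration; assumes the classical update
    velocity = momentum * velocity - learning_rate * grad(x).
    """
    x = x0
    velocity = 0.0  # running history of past gradient steps
    for _ in range(n_steps):
        # Keep a fraction of the previous step, then add the new gradient step.
        velocity = momentum * velocity - learning_rate * grad(x)
        x = x + velocity
    return x


def grad_f(x):
    """Gradient of the assumed test function f(x) = x**2."""
    return 2.0 * x


# momentum=0.0 keeps no history, so each step is a plain gradient descent step.
x_plain = gradient_descent_momentum(grad_f, x0=5.0, momentum=0.0)
# momentum=0.9 carries 90% of the previous step into the next one.
x_momentum = gradient_descent_momentum(grad_f, x0=5.0, momentum=0.9)
```

Both runs move toward the minimizer at x = 0 of this test function. On such a simple, well-conditioned problem momentum is not expected to help much; the sketch is only meant to show where the hyperparameter enters the update and that a value of 0 reduces it to ordinary gradient descent.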

Theory, methodology, and/or algorithmic discussions

Definition

hi

Some heading

hi

Some heading

hi

Another heading

hi

Another heading

hi

Graphical Explanation

hi

hi

hi

another header

  • hi

another heading

  • Blah:

Some Example

  • More text

Numerical Example

Some header

hi

Applications

Some example

  • An example of this is

Conclusion

hi

References