Momentum
Authors: Thomas Lee, Greta Gasswint, Elizabeth Henning (SYSEN5800 Fall 2021)
Introduction
Momentum is an extension of the gradient descent optimization algorithm that builds inertia in the search direction to overcome local minima and the oscillation caused by noisy gradients (1). It is based on the concept of momentum from physics: a classic analogy is a ball rolling down a hill that gathers enough momentum to carry it across a plateau region and on to the global minimum (2). Momentum incorporates history into the parameter updates, which can significantly accelerate the optimization process. The amount of history included in each update is controlled by a hyperparameter (1), a value ranging from 0 to 1. A momentum of 0 is equivalent to gradient descent without momentum (1), while a higher momentum value means more past gradients (history) are considered (2).
(1) "Gradient Descent With Momentum from Scratch," Machine Learning Mastery. https://machinelearningmastery.com/gradient-descent-with-momentum-from-scratch/
(2) "Stochastic Gradient Descent with Momentum," Towards Data Science. https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
Theory, Methodology, and Algorithmic Discussion
Definition
In its standard form (sometimes called the heavy ball method), momentum augments gradient descent with a velocity term that accumulates past gradients. Given an objective function f(w), a learning rate α, and a momentum hyperparameter β between 0 and 1, the updates at iteration t are:

    v_t = β·v_{t−1} − α·∇f(w_{t−1})
    w_t = w_{t−1} + v_t

where w is the parameter vector, v is the velocity (initialized to 0), and ∇f is the gradient of the objective. The velocity is a decaying sum of all past gradients: the most recent gradients contribute the most, while older gradients are discounted by repeated multiplication by β. Setting β = 0 removes the history term and recovers ordinary gradient descent; values of β close to 1 (0.9 is a common default) give past gradients a large influence on the current step.
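The update rule translates directly into code. Below is a minimal sketch in Python, assuming the toy objective f(w) = w²; the function and parameter names are illustrative, not taken from the referenced articles.

    def grad(w):
        # Gradient of the toy objective f(w) = w**2
        return 2.0 * w

    def momentum_descent(w0, lr=0.1, beta=0.9, steps=5):
        # Gradient descent with momentum; beta is the momentum hyperparameter.
        w, v = w0, 0.0
        path = [w]
        for _ in range(steps):
            v = beta * v - lr * grad(w)   # velocity: decaying sum of past gradients
            w = w + v                     # step along the accumulated velocity
            path.append(round(w, 4))
        return path

    print(momentum_descent(1.0, beta=0.0))   # [1.0, 0.8, 0.64, 0.512, 0.4096, 0.3277]
    print(momentum_descent(1.0, beta=0.9))   # history builds larger steps over time

With beta = 0 the velocity reduces to -lr * grad(w), so the first printed sequence matches ordinary gradient descent exactly, as noted in the introduction.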
Graphical Explanation
The effect of momentum is easiest to see on a loss surface shaped like a long, narrow ravine: steep walls on either side and a gently sloping floor that leads to the minimum. Plain gradient descent steps perpendicular to the local contour lines, so it bounces back and forth between the steep walls, oscillating across the ravine while making slow progress along the floor.

With momentum, the velocity term averages the recent gradients. The gradient components that point across the ravine alternate in sign from step to step and largely cancel, while the components that point along the floor are consistent and accumulate. The resulting path oscillates less and moves toward the minimum faster.

The same picture explains the ball-rolling analogy from the introduction: a ball rolling downhill builds up speed wherever the slope is consistent, which lets it coast across flat plateau regions and shallow local dips instead of stalling in them (2). A small numerical demonstration of this damping behavior follows below.
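The sketch below (an illustrative setup, not taken from the cited articles) runs the momentum update on the ill-conditioned quadratic f(x, y) = x² + 10·y², whose contours form a ravine along the x-axis. A moderate β = 0.5 is used; on a tiny two-dimensional problem like this, a large β such as 0.9 would overshoot the minimum heavily before settling.

    def grad(x, y):
        # Gradient of f(x, y) = x**2 + 10 * y**2 (a ravine along the x-axis)
        return 2.0 * x, 20.0 * y

    def run(beta, lr=0.09, steps=6, x=-5.0, y=1.0):
        vx = vy = 0.0
        path = [(x, y)]
        for _ in range(steps):
            gx, gy = grad(x, y)
            vx = beta * vx - lr * gx   # momentum update, one velocity per coordinate
            vy = beta * vy - lr * gy
            x, y = x + vx, y + vy
            path.append((round(x, 3), round(y, 3)))
        return path

    print(run(beta=0.0))   # y flips sign every step; x is still far from 0
    print(run(beta=0.5))   # y's oscillation damps faster and x approaches 0 sooner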
Numerical Example
Consider minimizing f(w) = w², with gradient ∇f(w) = 2w, starting from w_0 = 1 with v_0 = 0, learning rate α = 0.1, and momentum β = 0.9.

Iteration 1: ∇f(1) = 2, so v_1 = 0.9(0) − 0.1(2) = −0.2 and w_1 = 1 − 0.2 = 0.8.
Iteration 2: ∇f(0.8) = 1.6, so v_2 = 0.9(−0.2) − 0.1(1.6) = −0.34 and w_2 = 0.8 − 0.34 = 0.46.
Iteration 3: ∇f(0.46) = 0.92, so v_3 = 0.9(−0.34) − 0.1(0.92) = −0.398 and w_3 = 0.46 − 0.398 = 0.062.

For comparison, gradient descent without momentum (β = 0) gives w_1 = 0.8, w_2 = 0.64, and w_3 = 0.512: after three iterations it is still far from the minimum at w = 0, while the momentum iterate has nearly reached it. The accumulated velocity does carry the momentum iterate past the minimum on the next step (w_4 ≈ −0.309), after which the oscillation damps out; this overshoot is the price of the added inertia.
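The hand calculation can be checked with a few lines of Python, using the same toy objective and constants as above.

    w, v = 1.0, 0.0
    alpha, beta = 0.1, 0.9
    for t in range(1, 5):
        v = beta * v - alpha * (2.0 * w)   # gradient of f(w) = w**2 is 2w
        w = w + v
        print(t, round(v, 4), round(w, 4))
    # prints: 1 -0.2 0.8 | 2 -0.34 0.46 | 3 -0.398 0.062 | 4 -0.3706 -0.3086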
Applications
Momentum is used extensively in training machine learning models, where gradients computed on mini-batches of data are noisy and loss surfaces are high-dimensional and ill-conditioned (2). An example of this is deep neural network training: stochastic gradient descent with momentum is a standard optimizer in frameworks such as PyTorch and TensorFlow, and momentum-style gradient averaging is also a building block of adaptive optimizers such as Adam. In such frameworks, enabling momentum is typically a one-line change, as the sketch below shows.
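A minimal sketch using PyTorch's built-in SGD optimizer follows; the model and data are illustrative placeholders, and PyTorch's exact momentum formulation differs slightly from the textbook update given in the Definition section.

    import torch

    model = torch.nn.Linear(10, 1)   # illustrative placeholder model
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    x, y = torch.randn(32, 10), torch.randn(32, 1)   # placeholder batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()     # compute gradients of the loss
    optimizer.step()    # momentum-based parameter update applied internally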
Conclusion
Momentum is a simple and widely used extension of gradient descent. By carrying a decaying history of past gradients into each update, it damps the oscillation caused by noisy gradients, accelerates progress along consistent descent directions, and can carry the search past plateaus and shallow local minima (1)(2). Its behavior is governed by a single hyperparameter between 0 and 1, with 0 recovering plain gradient descent.