Momentum
Authors: Thomas Lee, Greta Gasswint, Elizabeth Henning (SYSEN5800 Fall 2021)
Introduction
Momentum is an extension of the gradient descent optimization algorithm that builds inertia in the search direction to overcome local minima and the oscillation caused by noisy gradients (1). It is based on the concept of momentum from physics: a classic analogy is a ball rolling down a hill that gathers enough momentum to carry it across a plateau region and on to the global minimum (2). Momentum incorporates history into the parameter updates, which can significantly accelerate the optimization process. The amount of history included in each update is controlled by a hyperparameter (1), a value ranging from 0 to 1. A momentum of 0 is equivalent to gradient descent without momentum (1), while a higher momentum value means more past gradients (history) are considered (2).
(1) "Gradient Descent With Momentum from Scratch," Machine Learning Mastery. https://machinelearningmastery.com/gradient-descent-with-momentum-from-scratch/
(2) "Stochastic Gradient Descent with Momentum," Towards Data Science. https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
Theory, Methodology, and Algorithmic Discussion
Definition
In its standard form (sometimes called the heavy ball method), momentum augments gradient descent with a velocity term that accumulates past gradients. Given an objective function f(w), a learning rate α, and a momentum hyperparameter β between 0 and 1, the updates at iteration t are:

    v_t = β·v_{t−1} − α·∇f(w_{t−1})
    w_t = w_{t−1} + v_t

where w is the parameter vector, v is the velocity (initialized to 0), and ∇f is the gradient of the objective. The velocity is a decaying sum of all past gradients: the most recent gradients contribute the most, while older gradients are discounted by repeated multiplication by β. Setting β = 0 removes the history term and recovers ordinary gradient descent; values of β close to 1 (0.9 is a common default) give past gradients a large influence on the current step.
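The update rule translates directly into code. Below is a minimal sketch in Python, assuming the toy objective f(w) = w²; the function and parameter names are illustrative, not taken from the referenced articles.

    def grad(w):
        # Gradient of the toy objective f(w) = w**2
        return 2.0 * w

    def momentum_descent(w0, lr=0.1, beta=0.9, steps=5):
        # Gradient descent with momentum; beta is the momentum hyperparameter.
        w, v = w0, 0.0
        path = [w]
        for _ in range(steps):
            v = beta * v - lr * grad(w)   # velocity: decaying sum of past gradients
            w = w + v                     # step along the accumulated velocity
            path.append(round(w, 4))
        return path

    print(momentum_descent(1.0, beta=0.0))   # [1.0, 0.8, 0.64, 0.512, 0.4096, 0.3277]
    print(momentum_descent(1.0, beta=0.9))   # history builds larger steps over time

With beta = 0 the velocity reduces to -lr * grad(w), so the first printed sequence matches ordinary gradient descent exactly, as noted in the introduction.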
Graphical Explanation
The effect of momentum is easiest to see on a loss surface shaped like a long, narrow ravine: steep walls on either side and a gently sloping floor that leads to the minimum. Plain gradient descent steps perpendicular to the local contour lines, so it bounces back and forth between the steep walls, oscillating across the ravine while making slow progress along the floor.

With momentum, the velocity term averages the recent gradients. The gradient components that point across the ravine alternate in sign from step to step and largely cancel, while the components that point along the floor are consistent and accumulate. The resulting path oscillates less and moves toward the minimum faster.

The same picture explains the ball-rolling analogy from the introduction: a ball rolling downhill builds up speed wherever the slope is consistent, which lets it coast across flat plateau regions and shallow local dips instead of stalling in them (2). A small numerical demonstration of this damping behavior follows below.
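The sketch below (an illustrative setup, not taken from the cited articles) runs the momentum update on the ill-conditioned quadratic f(x, y) = x² + 10·y², whose contours form a ravine along the x-axis. A moderate β = 0.5 is used; on a tiny two-dimensional problem like this, a large β such as 0.9 would overshoot the minimum heavily before settling.

    def grad(x, y):
        # Gradient of f(x, y) = x**2 + 10 * y**2 (a ravine along the x-axis)
        return 2.0 * x, 20.0 * y

    def run(beta, lr=0.09, steps=6, x=-5.0, y=1.0):
        vx = vy = 0.0
        path = [(x, y)]
        for _ in range(steps):
            gx, gy = grad(x, y)
            vx = beta * vx - lr * gx   # momentum update, one velocity per coordinate
            vy = beta * vy - lr * gy
            x, y = x + vx, y + vy
            path.append((round(x, 3), round(y, 3)))
        return path

    print(run(beta=0.0))   # y flips sign every step; x is still far from 0
    print(run(beta=0.5))   # y's oscillation damps faster and x approaches 0 sooner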
Numerical Example
Consider minimizing f(w) = w², with gradient ∇f(w) = 2w, starting from w_0 = 1 with v_0 = 0, learning rate α = 0.1, and momentum β = 0.9.

Iteration 1: ∇f(1) = 2, so v_1 = 0.9(0) − 0.1(2) = −0.2 and w_1 = 1 − 0.2 = 0.8.
Iteration 2: ∇f(0.8) = 1.6, so v_2 = 0.9(−0.2) − 0.1(1.6) = −0.34 and w_2 = 0.8 − 0.34 = 0.46.
Iteration 3: ∇f(0.46) = 0.92, so v_3 = 0.9(−0.34) − 0.1(0.92) = −0.398 and w_3 = 0.46 − 0.398 = 0.062.

For comparison, gradient descent without momentum (β = 0) gives w_1 = 0.8, w_2 = 0.64, and w_3 = 0.512: after three iterations it is still far from the minimum at w = 0, while the momentum iterate has nearly reached it. The accumulated velocity does carry the momentum iterate past the minimum on the next step (w_4 ≈ −0.309), after which the oscillation damps out; this overshoot is the price of the added inertia.
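The hand calculation can be checked with a few lines of Python, using the same toy objective and constants as above.

    w, v = 1.0, 0.0
    alpha, beta = 0.1, 0.9
    for t in range(1, 5):
        v = beta * v - alpha * (2.0 * w)   # gradient of f(w) = w**2 is 2w
        w = w + v
        print(t, round(v, 4), round(w, 4))
    # prints: 1 -0.2 0.8 | 2 -0.34 0.46 | 3 -0.398 0.062 | 4 -0.3706 -0.3086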
Applications
Momentum is used extensively in training machine learning models, where gradients computed on mini-batches of data are noisy and loss surfaces are high-dimensional and ill-conditioned (2). An example of this is deep neural network training: stochastic gradient descent with momentum is a standard optimizer in frameworks such as PyTorch and TensorFlow, and momentum-style gradient averaging is also a building block of adaptive optimizers such as Adam. In such frameworks, enabling momentum is typically a one-line change, as the sketch below shows.
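A minimal sketch using PyTorch's built-in SGD optimizer follows; the model and data are illustrative placeholders, and PyTorch's exact momentum formulation differs slightly from the textbook update given in the Definition section.

    import torch

    model = torch.nn.Linear(10, 1)   # illustrative placeholder model
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    x, y = torch.randn(32, 10), torch.randn(32, 1)   # placeholder batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()     # compute gradients of the loss
    optimizer.step()    # momentum-based parameter update applied internally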
Conclusion
Momentum is a simple and widely used extension of gradient descent. By carrying a decaying history of past gradients into each update, it damps the oscillation caused by noisy gradients, accelerates progress along consistent descent directions, and can carry the search past plateaus and shallow local minima (1)(2). Its behavior is governed by a single hyperparameter between 0 and 1, with 0 recovering plain gradient descent.