Stochastic gradient descent
Stochastic gradient descent (abbreviated as SGD) is an iterative method often used in machine learning. It modifies ordinary gradient descent by estimating the gradient from randomly selected data points at each step, starting from a randomly picked weight vector. Gradient descent is a strategy for searching through a large or infinite hypothesis space whenever 1) the hypotheses are continuously parameterized and 2) the errors are differentiable with respect to the parameters. The problem with gradient descent is that converging to a local minimum can take extensive time, and converging to the global minimum is not guaranteed (McGraw Hill, 92). Gradient descent picks a random weight vector and updates it incrementally each time an error calculation is completed, in order to improve convergence (Needell, Ward, Srebro, 14). The method moves in the direction of steepest descent, which reduces the number of iterations and the time taken to search large quantities of data points. In recent years, data sizes have increased immensely, to the point where current processing capabilities are no longer sufficient (Bottou, 177). Stochastic gradient descent is used in neural networks, where it decreases machine computation time while handling increasingly complex, large-scale problems.
Authors: Jonathon Price, Alfred Wong, Tiancheng Yuan, Joshua Matthews, Taiwo Olorunniwo (SYSEN 5800 Fall 2020)
Steward: Fengqi You
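As a brief illustration of the procedure described in the introduction, the following Python sketch applies stochastic gradient descent to a simple least-squares (linear regression) objective. The synthetic data, learning rate, and epoch count are illustrative assumptions rather than details from this article; each update uses the gradient of the squared error for one randomly chosen sample at a time.

```python
import numpy as np

# Minimal sketch of stochastic gradient descent for a least-squares objective.
# The data, learning rate, and number of epochs are illustrative assumptions.

rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + noise
n_samples, n_features = 1000, 3
X = rng.normal(size=(n_samples, n_features))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

# Start from a randomly picked weight vector, as described in the introduction.
w = rng.normal(size=n_features)
learning_rate = 0.01

for epoch in range(20):
    # Visit the samples in a random order each epoch.
    for i in rng.permutation(n_samples):
        x_i, y_i = X[i], y[i]
        error = x_i @ w - y_i            # prediction error for one sample
        gradient = error * x_i           # gradient of 0.5 * error**2 w.r.t. w
        w -= learning_rate * gradient    # incremental steepest-descent step

print("estimated weights:", w)
```

Because each update uses only a single sample, one pass over the data performs many incremental updates, which is what lets the method make progress on very large data sets before a full-batch gradient could even be computed once.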
Problem Formulation
Theory
[Insert text]
Methodology
[Insert text]
Algorithmic Discussion
[Insert text]
Numerical Example
[Insert text]
Application
[Insert text]
Conclusion
Stochastic gradient descent is an algorithm that moves in the direction of steepest descent at each iteration, using a randomly selected portion of the data to estimate the gradient. This greatly decreases the time needed to search large data sets and to reach a local minimum, and the randomness of the updates can help the search make progress toward the global minimum. SGD has many applications in machine learning, geophysics, least mean squares (LMS) filtering, and other areas.
References
- [Insert reference]