Why Momentum Really Works

  • Article
  • Apr 4, 2017
  • #Physics
distill.pub

Here’s a popular story about momentum: gradient descent is a man walking down a hill. He follows the steepest path downwards; his progress is slow, but steady. Momentum is a heavy ball rolling down the same hill. The added inertia acts both as a smoother and an accelerator, dampening oscillations and causing us to barrel through narrow valleys, small humps and local minima.

This standard story isn’t wrong, but it fails to explain many important behaviors of momentum. In fact, momentum can be understood far more precisely if we study it on the right model.

One nice model is the convex quadratic. This model is rich enough to reproduce momentum’s local dynamics in real problems, and yet simple enough to be understood in closed form. This balance gives us powerful traction for understanding this algorithm.
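
As a rough illustration of that model, the sketch below runs plain gradient descent and momentum on a small diagonal convex quadratic, f(w) = 0.5 * sum(lam_i * w_i^2) - sum(b_i * w_i). The update rule used here (z <- beta * z + grad, then w <- w - alpha * z) is one common way to write momentum; the function names, the eigenvalues 1 and 100, the step size, and the momentum value are all assumptions chosen for illustration, not values taken from the text above.

```python
def run_momentum(lams, b, alpha, beta, steps):
    """Minimize f(w) = 0.5 * sum(lam_i * w_i^2) - sum(b_i * w_i).

    beta = 0 reduces to plain gradient descent.
    All names and parameters here are illustrative assumptions.
    """
    w = [0.0] * len(b)  # current iterate, started at the origin
    z = [0.0] * len(b)  # accumulated "velocity" term
    for _ in range(steps):
        # Gradient of the diagonal quadratic: lam_i * w_i - b_i
        grad = [lam * wi - bi for lam, wi, bi in zip(lams, w, b)]
        # Momentum update: decay the old velocity, add the new gradient
        z = [beta * zi + gi for zi, gi in zip(z, grad)]
        # Step against the velocity
        w = [wi - alpha * zi for wi, zi in zip(w, z)]
    return w


def distance(u, v):
    """Euclidean distance between two vectors given as lists."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) ** 0.5


# An ill-conditioned quadratic: curvature 1 in one direction, 100 in the other.
lams = [1.0, 100.0]
b = [1.0, 1.0]
w_star = [bi / lam for bi, lam in zip(b, lams)]  # exact minimizer

w_gd = run_momentum(lams, b, alpha=0.018, beta=0.0, steps=100)
w_mom = run_momentum(lams, b, alpha=0.018, beta=0.8, steps=100)

err_gd = distance(w_gd, w_star)
err_mom = distance(w_mom, w_star)
print("gradient descent error:", err_gd)
print("momentum error:", err_mom)
```

With these particular settings, momentum ends much closer to the minimizer than plain gradient descent after the same number of steps, because the low-curvature direction no longer limits progress as severely. Other step sizes or momentum values can behave quite differently, including divergence if the step size is too large.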

Mentions
Nick (@nickcammarata) · May 10, 2023 · Post from Twitter
gradient descent and optimization in general I also find extremely cool. I recommend this paper to anyone interested in it