Tuesday, June 02, 2026
3:00 PM - 4:00 PM
East Bridge 114
Mathematics & Machine Learning Seminar
Polyak Step Sizes in GD Find Flat Minima
Modern machine learning relies on minimizing high-dimensional loss functions that are typically non-convex but for which it is still easy to find global minima. In fact, the set of global minima is often itself a high-dimensional manifold, and an important question is which minima a given optimization scheme will find. In this talk, I will present ongoing joint work with Jason Altschuler (Penn) and Francesco Caporali (Princeton) that proves a new global convergence result for minimizing such functions. Namely, I will explain how gradient descent with Polyak step sizes provably finds flat minima. I will also show that our theoretically devised optimizer finds flat minima empirically, both in toy models and in pre-trained LLMs.
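For readers unfamiliar with the term, the classical Polyak step size for gradient descent is eta_t = (f(x_t) - f*) / ||grad f(x_t)||^2, where f* is the minimum value of the loss. The sketch below is only an illustration of that textbook rule, not the optimizer from the talk. The toy loss f(x, y) = 0.5 * (x*y - 1)^2, the starting point, and the function names are all assumptions chosen for the example; the loss is picked because its global minima form a curve (x*y = 1) rather than a single point, with f* = 0.

```python
def loss(x, y):
    # Hypothetical toy loss: zero on the whole curve x*y = 1.
    return 0.5 * (x * y - 1.0) ** 2


def grad(x, y):
    # Gradient of the toy loss with respect to (x, y).
    r = x * y - 1.0
    return r * y, r * x


def polyak_gd(x, y, f_star=0.0, steps=200):
    """Gradient descent with the classical Polyak step size.

    f_star is the known minimum value of the loss (0 for this toy loss).
    """
    for _ in range(steps):
        gx, gy = grad(x, y)
        g2 = gx * gx + gy * gy
        if g2 == 0.0:
            # Zero gradient: already at a stationary point, so stop.
            break
        eta = (loss(x, y) - f_star) / g2  # Polyak step size
        x, y = x - eta * gx, y - eta * gy
    return x, y


# Illustrative starting point (an assumption, not from the talk).
x, y = polyak_gd(3.0, 0.5)
print(x, y, loss(x, y))
```

The step size needs the value f* but no tuned learning rate, which is why the rule is natural for losses whose minimum value is known to be zero. Which point on the curve the iterates reach, and how flat the loss is there, is the question the talk addresses; this sketch makes no claim about it.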
For more information, please contact the Math Department by phone at 626-395-4335 or by email at [email protected].
