Optimisation Theory
Convexity, why SGD finds good solutions on non-convex losses, saddle points at scale, momentum as a damped oscillator, and learning-rate schedules as implicit regularisation.
advanced · 10 min read