Deep Learning

Optimisers: SGD, Adam, AdamW, Lion

How the standard optimiser stack evolved from plain SGD through Adam to lower-memory variants like Lion and Muon, and which learning-rate schedules actually work at scale.

intermediate · 9 min read
