arXiv stat.ML
Why Do We Need Warm-up? A Theoretical Perspective
arXiv:2510.03164v2
Abstract: Learning rate warm-up -- increasing the learning rate at the beginning of training -- has become a ubiquitous heuristic in modern deep learning, yet its theoretical foundations remain poorly understood. In this work, we provide a principled explanation for why w