LessWrong AI · Communities

Power Laws in NNs: A Possible Mechanism for Inductive Bias towards Sparse Representations

This post was produced as part of the Iliad Fellowship under the mentorship of Dmitry Vaintrob.

Tl;dr: Power-law ("heavy-tailed") distributions have universality theorems similar to those that make Gaussians common. We observe that many things in ML are power-law distributed, most robustly and interestingly the spectra of