Skip to content
Lilian Weng · Tech Media

Some Math behind Neural Tangent Kernel

Neural networks are well known to be over-parameterized and can often easily fit data with near-zero training loss with decent generalization performance on test dataset. Although all these parameters are initialized at random, the optimization process can consistently lead to similarly good outcomes. And this is true