A Theory of Deep Learning
A proposed theory reframes deep learning through the Neural Tangent Kernel and population-risk dynamics. It argues that spectral flow explains benign overfitting and grokking, and it offers a one-line change to Adam that could speed up training by as much as 5× in some setups.
May 6, 2026