arXiv stat.ML
Derivation of effective gradient flow equations and dynamical truncation of training data in Deep Learning
arXiv:2501.07400v2
Announce Type: replace-cross
Abstract: We derive explicit equations governing the cumulative biases and weights in Deep Learning with ReLU activation function, based on gradient descent for the Euclidean loss in the input layer, and under the assumption that the weights are, in a precise sense, adapt