An initial alignment between neural network and target is needed for gradient descent to learn
arXiv (Cornell University), 2022
Abstract
This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn the target in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also answers an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks that have no explicit knowledge of the target function beyond its INAL.
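The abstract does not give the formal definition of INAL, so the following is only an illustrative sketch under stated assumptions, not the paper's construction. It assumes a correlation-style proxy: the squared correlation between a Boolean target and a single ReLU neuron with Gaussian weights of variance 1/n, averaged over random initializations. The function names, the choice of ReLU, the input dimension, and the two example targets (full parity and a single input bit) are all hypothetical choices made for illustration.

```python
import itertools
import math
import random


def relu(z):
    return max(0.0, z)


def squared_correlation(target, weights, bias, inputs):
    """(E_x[target(x) * relu(w.x + b)])^2 under the uniform distribution on inputs."""
    total = 0.0
    for x in inputs:
        pre_activation = sum(w * xi for w, xi in zip(weights, x)) + bias
        total += target(x) * relu(pre_activation)
    return (total / len(inputs)) ** 2


def estimate_alignment(target, n, draws=100, seed=0):
    """Average the squared correlation over random initializations.

    Assumption: this is a proxy for an alignment measure, not the paper's
    exact INAL definition. Weights and bias are drawn i.i.d. N(0, 1/n).
    """
    rng = random.Random(seed)
    # Enumerate all of {-1, +1}^n so the expectation over x is exact.
    inputs = list(itertools.product((-1, 1), repeat=n))
    std = 1.0 / math.sqrt(n)
    acc = 0.0
    for _ in range(draws):
        weights = [rng.gauss(0.0, std) for _ in range(n)]
        bias = rng.gauss(0.0, std)
        acc += squared_correlation(target, weights, bias, inputs)
    return acc / draws


def parity(x):
    # Product of all +/-1 coordinates: the full parity function.
    return math.prod(x)


def first_bit(x):
    # A target that depends on one coordinate only.
    return x[0]


n = 8
align_parity = estimate_alignment(parity, n)
align_first_bit = estimate_alignment(first_bit, n)
print(f"proxy alignment with parity:    {align_parity:.6f}")
print(f"proxy alignment with first bit: {align_first_bit:.6f}")
```

Under these assumptions the single-bit target should show a much larger proxy alignment than full parity. This is consistent in spirit with the abstract's claim that targets without a noticeable initial alignment are hard for gradient descent to learn.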