Convergence of gradient descent for learning linear neural networks

Advances in Continuous and Discrete Models, 2024, Vol. 2024(1)

Gabin Maxime Nguegnang, Holger Rauhut, Ulrich Terstiege

Abstract

We study the convergence properties of gradient descent for training deep linear neural networks, i.e., deep matrix factorizations, by extending a previous analysis for the related gradient flow. We show that, under suitable conditions on the stepsizes, gradient descent converges to a critical point of the loss function, which in this article is the square loss. Furthermore, we demonstrate that for almost all initializations gradient descent converges to a global minimum in the case of two layers. In the case of three or more layers, we show that gradient descent converges to a global minimum on the manifold of matrices of some fixed rank, where the rank cannot be determined a priori.
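
To make the setting concrete, the following is a minimal sketch of gradient descent on a deep linear network with the square loss. It is an illustration only, not the authors' code. It assumes the loss has the form (1/2)||W_N ... W_1 X - Y||_F^2, and the helper names, the constant stepsize eta = 0.05, the data X, Y, and the initialization are all hypothetical choices made for this example.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def product(Ws, n):
    """Return W_k ... W_1 for Ws = [W_1, ..., W_k]; the n x n identity if Ws is empty."""
    P = identity(n)
    for W in Ws:
        P = matmul(W, P)
    return P


def square_loss(Ws, X, Y):
    """Assumed loss: 0.5 * ||W_N ... W_1 X - Y||_F^2."""
    P = matmul(product(Ws, len(X)), X)
    return 0.5 * sum((p - y) ** 2 for rp, ry in zip(P, Y) for p, y in zip(rp, ry))


def gradient_step(Ws, X, Y, eta):
    """One gradient descent step on all layers at once with stepsize eta."""
    n_in = len(X)
    P = matmul(product(Ws, n_in), X)
    R = [[p - y for p, y in zip(rp, ry)] for rp, ry in zip(P, Y)]  # residual
    RXt = matmul(R, transpose(X))
    new_Ws = []
    for j, W in enumerate(Ws):
        below = product(Ws[:j], n_in)        # W_{j-1} ... W_1
        above = product(Ws[j + 1:], len(W))  # W_N ... W_{j+1}
        # Gradient with respect to W_j: above^T (R X^T) below^T
        grad = matmul(matmul(transpose(above), RXt), transpose(below))
        new_Ws.append([[w - eta * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)])
    return new_Ws


# Hypothetical two-layer example: fit a full-rank 2 x 2 target.
X = identity(2)
Y = [[2.0, 0.0], [0.0, 1.0]]
W_init = [[1.0, 0.1], [0.1, 1.0]]
Ws = [[row[:] for row in W_init], [row[:] for row in W_init]]

initial_loss = square_loss(Ws, X, Y)
for _ in range(2000):
    Ws = gradient_step(Ws, X, Y, eta=0.05)
final_loss = square_loss(Ws, X, Y)
print(initial_loss, final_loss)
```

Passing a list of three or more matrices to `gradient_step` runs the same iteration on a deeper network. The stepsize here is a fixed value picked by hand for this small example; it is not derived from the stepsize conditions mentioned in the abstract.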
