
Stable Nonconvex-Nonconcave Training via Linear Interpolation: Last iterate under cohypomonotonicity


Too Long; Didn't Read

This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Thomas Pethick, EPFL (LIONS) [email protected];

(2) Wanyun Xie, EPFL (LIONS) [email protected];

(3) Volkan Cevher, EPFL (LIONS) [email protected].

6 Last iterate under cohypomonotonicity


The above lemma allows us to obtain last-iterate convergence for IKM on the inexact resolvent by combining it with Theorem C.1.
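To make the scheme concrete, here is a minimal numerical sketch of an inexact Krasnoselskii–Mann (IKM) step: an inner solver approximates the resolvent of γF at the current iterate, and the outer update linearly interpolates between the iterate and that approximation. The simple fixed-point inner solver, the step sizes, and the bilinear test problem are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def inexact_resolvent(F, z, gamma, inner_steps=20):
    # Approximate the resolvent J_{gamma F}(z), i.e. the point w solving
    # w + gamma * F(w) = z, by fixed-point iteration on w = z - gamma * F(w).
    # This contracts when gamma * L < 1 (L = Lipschitz constant of F); it is
    # an illustrative stand-in for the paper's inner solver.
    w = np.array(z, dtype=float)
    for _ in range(inner_steps):
        w = z - gamma * F(w)
    return w

def ikm(F, z0, gamma=0.5, lam=0.5, outer_steps=100, inner_steps=20):
    # Inexact Krasnoselskii-Mann iteration:
    #   z_{k+1} = (1 - lam) * z_k + lam * J~_{gamma F}(z_k),
    # i.e. linear interpolation between the iterate and the inexact resolvent.
    z = np.array(z0, dtype=float)
    for _ in range(outer_steps):
        w = inexact_resolvent(F, z, gamma, inner_steps)
        z = (1 - lam) * z + lam * w
    return z

# Toy example: the saddle point min_x max_y x*y gives F(x, y) = (y, -x),
# whose unique zero is the origin.
F = lambda z: np.array([z[1], -z[0]])
print(ikm(F, [1.0, 1.0]))  # converges toward [0, 0]
```

Here λ ∈ (0, 1) is the interpolation (relaxation) parameter of the outer step; the inner budget `inner_steps` controls how inexact the resolvent evaluation is.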




Remark 6.3. Notice that the rate in Theorem 6.2 has no dependence on ρ. Specifically, it avoids the factor γ/(γ + 2ρ), which Gorbunov et al. (2022b, Thm. 3.2) show is unimprovable for PP. Theorem 6.2 requires that the iterates stay bounded. In Corollary 6.4 we will assume a bounded diameter for simplicity, but it is relatively straightforward to show that the iterates can be guaranteed to remain bounded by controlling the inexactness (see Lemma E.2).
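As a rough illustration of what "controlling the inexactness" could look like in code, the sketch below (reusing `inexact_resolvent` from above) tightens the inner-solver accuracy as the outer iteration proceeds, so the resolvent errors shrink geometrically. The linear budget schedule is an assumption made for illustration, not the precise accuracy condition of Lemma E.2.

```python
def ikm_controlled(F, z0, gamma=0.5, lam=0.5, outer_steps=100):
    # IKM with a growing inner budget: with a contracting inner solver the
    # resolvent error decays like (gamma * L) ** inner_steps, so increasing
    # inner_steps with k makes the error sequence summable. This is one
    # heuristic way to control the inexactness; the exact requirement is
    # the condition stated in Lemma E.2.
    z = np.array(z0, dtype=float)
    for k in range(outer_steps):
        w = inexact_resolvent(F, z, gamma, inner_steps=10 + k)
        z = (1 - lam) * z + lam * w
    return z
```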