
Stable Nonconvex-Nonconcave Training via Linear Interpolation: Conclusion & limitations


Too Long; Didn't Read

This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Thomas Pethick, EPFL (LIONS) [email protected];

(2) Wanyun Xie, EPFL (LIONS) [email protected];

(3) Volkan Cevher, EPFL (LIONS) [email protected].

9 Conclusion & limitations

We have precisely characterized the stabilizing effect of linear interpolation by analyzing it under cohypomonotonicity, and we proved last-iterate convergence rates for our proposed method RAPP. The algorithm is double-looped, which introduces a logarithmic factor in the rate, as noted in Remark E.4. It thus remains open whether last-iterate convergence is achievable with only τ = 2 inner iterations (for which RAPP reduces to EG+ in the unconstrained case). By replacing the inner solver, we subsequently rediscovered and analyzed Lookahead using nonexpansive operators. In that regard, we have only dealt with compositions of operators. It would be interesting to extend the idea to understanding and developing both Federated Averaging and the meta-learning algorithm Reptile (of which Lookahead can be seen as a single-client and single-task instance, respectively); we leave this for future work.
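As a rough illustration (not the authors' pseudocode), the sketch below shows the Lookahead-style outer linear interpolation that this analysis revolves around: an inner solver runs τ steps from the current anchor point, and the anchor is then moved a fraction α of the way toward the resulting inner iterate. The function names (lookahead, inner_step, gda_step), the toy bilinear saddle problem, and all parameter values are hypothetical choices for illustration only.

import numpy as np

def lookahead(x0, inner_step, tau=5, alpha=0.5, outer_iters=200):
    """Lookahead-style linear interpolation (illustrative sketch).

    x0         : initial anchor point
    inner_step : one update of the base (inner) solver, mapping x -> x_new
    tau        : number of inner iterations per outer step
    alpha      : interpolation weight toward the inner iterate
    """
    anchor = np.asarray(x0, dtype=float)
    for _ in range(outer_iters):
        x = anchor.copy()
        for _ in range(tau):              # inner loop: run the base solver
            x = inner_step(x)
        # outer loop: interpolate between the anchor and the inner iterate
        anchor = anchor + alpha * (x - anchor)
    return anchor

# Toy inner solver: simultaneous gradient descent-ascent on f(u, v) = u * v.
def gda_step(z, lr=0.1):
    u, v = z
    return np.array([u - lr * v,          # descend in u
                     v + lr * u])         # ascend in v

print(lookahead(np.array([1.0, 1.0]), gda_step))

In this toy run, plain gradient descent-ascent spirals away from the saddle point at the origin, while the interpolated anchor drifts toward it, which conveys the stabilizing role of the outer interpolation in miniature; the paper's actual guarantees concern RAPP and its cohypomonotone setting, not this simplified example.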