Publication
ICML 2022
Poster

Towards Understanding Convergence of Simultaneous Gradient Descent-Ascent in Minimax Optimization

Abstract

Gradient Descent-Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically, for minxmaxyf(x;y)\min_{x} \max_{y} f(x;y) where ff is strongly-concave in yy and possibly nonconvex in xx, (Lin et al., 2020) proved the convergence of GDA with a stepsize ratio ηy/ηx=Θ(κ2)\eta_y/\eta_x=\Theta(\kappa^2) where ηx\eta_x and ηy\eta_y are the stepsizes for xx and yy and κ\kappa is the condition number for yy. While this stepsize ratio suggests a slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the \emph{local convergence} of general \emph{nonconvex-nonconcave} minimax problems. We demonstrate that a stepsize ratio of Θ(κ)\Theta(\kappa) is necessary and sufficient for local convergence of GDA to a Stackelberg Equilibrium, where κ\kappa is the local condition number for yy. We prove a nearly tight convergence rate with a matching lower bound. We further extend the convergence guarantees to stochastic GDA and extra-gradient methods (EG). Finally, we conduct several numerical experiments to support our theoretical findings.

Date

Publication

ICML 2022