# On the Equivalence between Neural Network and Support Vector Machine

## Abstract

Recent research shows that the dynamics of an infinitely wide neural network (NN) trained by gradient descent can be characterized by the Neural Tangent Kernel (NTK) \citep{jacot2018neural}. Under the squared loss, an infinite-width NN trained by gradient descent with an infinitesimally small learning rate is equivalent to kernel regression with the NTK \citep{arora2019exact}. However, such an equivalence is currently known only for ridge regression \citep{arora2019harnessing}; the equivalence between NNs and other kernel machines (KMs), e.g., the support vector machine (SVM), remains unknown. In this work, we establish the equivalence between NN and SVM, specifically between the infinitely wide NN trained by the soft margin loss and the standard soft margin SVM with NTK trained by subgradient descent. We show that such an NN and SVM have the same dynamics, characterized by an inhomogeneous linear differential (difference) equation, and thus converge to the same solution with the same linear convergence rate under reasonable assumptions on the NN. We further generalize our theory to general loss functions with $\ell_2$ regularization and show the equivalence between NNs and a family of $\ell_2$ regularized KMs with non-asymptotic bounds, which previous results cannot handle. Additionally, we show that every finite-width NN trained by such regularized loss functions is approximately a KM. Finally, we demonstrate three practical applications of our theory: (i) computing a \textit{non-vacuous} generalization bound of an NN via the corresponding KM; (ii) obtaining a \textit{nontrivial} robustness certificate for the infinite-width NN, where existing robustness verification methods fail; (iii) obtaining infinite-width NNs that are intrinsically more robust than those from previous kernel regression.
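As a rough illustration of the SVM side of this equivalence, the sketch below computes an empirical NTK from a finite-width two-layer ReLU network at random initialization and trains a soft margin SVM on that kernel by subgradient descent. The dataset, network width, and hyperparameters (`lam`, `lr`) are illustrative choices for this sketch, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_ntk(X, width=2048):
    """Empirical NTK of a width-`width` two-layer ReLU net at random init.

    The NTK is the Gram matrix of parameter gradients,
    Theta(x, x') = <grad_theta f(x), grad_theta f(x')>.
    """
    W = rng.normal(size=(width, X.shape[1]))   # hidden-layer weights
    a = rng.choice([-1.0, 1.0], size=width)    # output-layer weights
    pre = X @ W.T                              # (n, width) pre-activations
    act = np.maximum(pre, 0.0)                 # ReLU activations
    dact = (pre > 0).astype(float)             # ReLU derivatives
    # Hidden-weight gradients contribute (a * sigma'(Wx)) outer x;
    # contracting over parameters yields the kernel directly.
    K_hidden = ((dact * a) @ (dact * a).T) * (X @ X.T) / width
    K_out = act @ act.T / width                # output-weight gradients
    return K_hidden + K_out

# Toy binary classification data (labels from the first coordinate).
n, d = 40, 5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0])
K = empirical_ntk(X)

# Soft margin SVM in representer form f = K @ alpha, trained by
# subgradient descent on  lam/2 * a'Ka + mean(max(0, 1 - y*f)).
lam, lr, alpha = 1e-2, 0.05, np.zeros(n)
for _ in range(1000):
    f = K @ alpha
    violated = (y * f < 1).astype(float)       # active hinge constraints
    grad = lam * f - K @ (y * violated) / n    # note: d/da of lam/2 a'Ka = lam*K@a = lam*f
    alpha -= lr * grad

train_acc = np.mean(np.sign(K @ alpha) == y)
```

A closed-form (rather than empirical) NTK could be substituted for `empirical_ntk`; the subgradient-descent loop is unchanged, which is the training procedure the equivalence result refers to.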