Rethinking Bi-Level Optimization in Neural Architecture Search: A Gibbs Sampling Perspective
One-Shot architecture search, aiming to explore all possible operations jointly based on a single model, has been an active direction of Neural Architecture Search (NAS). As a well- known one-shot solution, Differentiable Architecture Search (DARTS) performs continuous relaxation on the architecture‚Äôs importance and results in a bi-level optimization problem. As many recent studies have shown, DARTS cannot always work robustly for new tasks, which is mainly due to the approximate solution of the bi-level optimization. In this paper, one-shot neural architecture search is addressed by adopting a directed probabilistic graphical model to represent the joint probability distribution over data and model. Then, neural architectures are searched for and optimized by Gibbs sampling. We re- think the bi-level optimization problem as the task of Gibbs sampling from the posterior distribution, which expresses the preferences for different models given the observed dataset. We evaluate our proposed NAS method ‚Äì GibbsNAS on the search space used in DARTS/ENAS as well as the search space of NAS-Bench-201. Experimental results on multiple search spaces show the efficacy and stability of our approach.