Pareto Rank Surrogate Model for Hardware-aware Neural Architecture Search
Abstract
Hardware-aware Neural Architecture Search (HW-NAS) has recently gained considerable attention by automating the design of efficient deep learning models that meet tight resource and inference-time constraints. However, HW-NAS inherits and exacerbates the high computational cost of general NAS due to its significantly larger search space and more complex evaluation stage. To speed up HW-NAS, existing efforts use surrogate models to predict a neural architecture's accuracy and hardware performance on a specific platform, thereby avoiding the expensive training process and significantly reducing search time. We show that using multiple surrogate models to estimate the different objectives fails to recover the true Pareto front. Therefore, we propose HW-PR-NAS, a novel Pareto rank-preserving surrogate model. HW-PR-NAS is trained with a new loss function that ranks architectures according to their Pareto front. We evaluate our approach on seven hardware platforms, including ASICs, FPGAs, GPUs, and multi-core CPUs. Our results show that HW-PR-NAS achieves up to 2.5x speedup while producing better Pareto fronts than state-of-the-art surrogate models.
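As a minimal illustrative sketch of such a rank-preserving objective (the scalar surrogate score $f_\theta$, the margin $m$, and the pairwise form below are assumptions for illustration, not the paper's actual formulation), a loss that penalizes score orderings contradicting the Pareto ranking can be written as

$$\mathcal{L}(\theta) = \sum_{(a_i, a_j)\,:\, r(a_i) < r(a_j)} \max\bigl(0,\; m - \bigl(f_\theta(a_i) - f_\theta(a_j)\bigr)\bigr),$$

where $r(\cdot)$ denotes an architecture's non-domination (Pareto) rank and the sum runs over pairs of architectures in the training set. Minimizing this loss encourages the surrogate to assign higher scores to architectures on better Pareto fronts, so that ranking by $f_\theta$ preserves the Pareto ordering.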