Publication
AutoML 2017
Conference paper
Bayesian optimization combined with incremental evaluation for neural network architecture optimization
Abstract
The choice of hyperparameters and the selection of algorithms are crucial parts of machine learning. Bayesian optimization methods and successive halving have both been applied successfully to automatic hyperparameter optimization. We therefore propose to combine the two methods by using Bayesian optimization to estimate the initial population of incremental evaluation, our variation of successive halving. We apply the proposed methodology to the challenging problem of automatically optimizing neural network architectures and investigate how state-of-the-art hyperparameter optimization methods perform on this task. In our evaluation of these automatic methods, we achieve human-expert performance on the MNIST data set but are not able to achieve similarly good results on the CIFAR-10 data set. However, the automated methods find shallow convolutional neural networks that outperform hand-crafted shallow neural networks with respect to classification error and training time.
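To illustrate the overall idea described above, the following is a minimal sketch (not the authors' code): Bayesian optimization with a Gaussian-process surrogate proposes the initial population of configurations, and plain successive halving then allocates an increasing budget to the most promising ones (the paper's "incremental evaluation" is its own variation of this step). The objective, search space, population size, and acquisition rule below are all illustrative assumptions; in the paper the evaluation would be training a network for a given budget and measuring validation error.

# Sketch only: BO-seeded successive halving with a toy objective.
# All names (evaluate, sample_config, population sizes) are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def evaluate(config, budget):
    """Stand-in for training a network for `budget` units and returning validation error."""
    lr, width = config
    return (lr - 0.1) ** 2 + (width - 64) ** 2 / 1e4 + 1.0 / budget + rng.normal(0, 0.01)

def sample_config():
    """Draw a random configuration (learning rate, layer width) from a toy search space."""
    return np.array([rng.uniform(1e-3, 1.0), rng.uniform(16, 256)])

# Bayesian optimization: fit a GP surrogate and propose the initial population.
X, y = [], []
for _ in range(5):                              # a few random warm-up evaluations
    c = sample_config()
    X.append(c); y.append(evaluate(c, budget=1))

population = []
gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(8):                              # propose 8 configurations as the population
    gp.fit(np.array(X), np.array(y))
    candidates = np.array([sample_config() for _ in range(256)])
    mu, sigma = gp.predict(candidates, return_std=True)
    best = candidates[np.argmin(mu - 1.0 * sigma)]   # lower-confidence-bound acquisition
    X.append(best); y.append(evaluate(best, budget=1))
    population.append(best)

# Successive halving over the BO-selected population: keep the best half,
# double the budget, and repeat until one configuration remains.
budget = 1
while len(population) > 1:
    scores = [evaluate(c, budget) for c in population]
    order = np.argsort(scores)
    population = [population[i] for i in order[: max(1, len(population) // 2)]]
    budget *= 2

print("selected configuration:", population[0])

The design point this sketch is meant to convey is the division of labor: the surrogate model decides which configurations enter the race, while the halving schedule decides how much training budget each survivor receives.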