Weight features for predicting future model performance of deep neural networks
Deep neural networks (DNNs) frequently require careful tuning of model hyperparameters. Recent research has shown that automated early termination of underperforming runs can speed up hyperparameter searches. However, these studies have used only the learning curve to predict the eventual model performance. In this study, we propose using weight features, extracted from the network weights at an early stage of the learning process, as explanatory variables for predicting the eventual model performance. We conduct hyperparameter-search experiments with various convolutional neural network architectures on three image datasets and apply the random forest method to predict the eventual model performance. The results show that using the weight features improves predictive performance compared with using the learning curve. In all three datasets, the most important feature for the prediction was related to weight changes in the last convolutional layers. Our findings demonstrate that, compared with the conventional use of the learning curve, weight features make it possible to construct prediction models from fewer training samples and to terminate underperforming runs at an earlier stage of DNN training, thus speeding up hyperparameter searches.
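As a rough illustration of what extracting weight features might look like, the following minimal sketch computes per-layer summary statistics from two weight snapshots (at initialization and at an early epoch). The specific statistics, the function names `layer_features` and `weight_features`, and the layer names are assumptions made for illustration; they are not the feature set defined in this study. The resulting feature vector could then be passed, one row per training run, to a random forest regressor such as scikit-learn's `RandomForestRegressor`.

```python
import math
from statistics import fmean, pstdev


def layer_features(w_init, w_early):
    """Summary statistics for one layer, given flat lists of weights.

    Hypothetical feature set: statistics of the early-stage weights plus
    statistics of the change from initialization.
    """
    delta = [b - a for a, b in zip(w_init, w_early)]
    return {
        "mean": fmean(w_early),
        "std": pstdev(w_early),
        "l2_norm": math.sqrt(sum(w * w for w in w_early)),
        "delta_l2_norm": math.sqrt(sum(d * d for d in delta)),
        "delta_mean_abs": fmean([abs(d) for d in delta]),
    }


def weight_features(init_weights, early_weights):
    """Flatten per-layer statistics into one feature dict.

    Both arguments map a layer name to a flat list of weights.
    """
    features = {}
    for name in init_weights:
        stats = layer_features(init_weights[name], early_weights[name])
        for stat, value in stats.items():
            features[f"{name}.{stat}"] = value
    return features


# Toy snapshots for two layers (values are made up for illustration).
init = {"conv1": [0.0, 0.0, 0.0, 0.0], "conv_last": [1.0, -1.0, 1.0, -1.0]}
early = {"conv1": [0.1, -0.1, 0.1, -0.1], "conv_last": [4.0, -1.0, 1.0, 3.0]}
feats = weight_features(init, early)
```

Each run in a hyperparameter search would contribute one such feature vector, paired with its final validation performance as the regression target.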