Diagnosis of Neural Network via Backward Deduction
Although widely used in many areas, deep neural networks suffer from a lack of interpretability. Existing work usually focuses on a single data instance, so the explanations it finds are limited in scope. We argue that a good understanding of a model should include both systematic explanations of its behavior and effective detection of its vulnerabilities. In particular, we propose to use backward deduction to achieve these two goals. Given a constraint on the model output, the deduction backtracks through the architecture to find the corresponding data ranges in the input. At each layer, specific rules and/or algorithms are developed depending on the layer type. The resulting ranges can then be interpreted by sampling exemplary data from them. In experiments, we show that, with different strategies for selecting the data ranges, the sampled fake data can either explain the model or reveal its vulnerabilities.
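
To make the idea of backtracking an output constraint concrete, the following is a minimal sketch under strong simplifying assumptions, not the method developed in this work. It assumes a hypothetical scalar network made only of affine and ReLU layers, so that each layer's preimage of an interval is again a single interval. The names `LAYERS`, `affine_backward`, `relu_backward`, `deduce`, and `sample`, as well as the sampling domain, are illustrative choices.

```python
import math
import random

# Hypothetical toy model: y = -3 * relu(2 * x - 1) + 4, with scalar input x.
LAYERS = [("affine", 2.0, -1.0), ("relu",), ("affine", -3.0, 4.0)]


def forward(x, layers=LAYERS):
    """Run the toy network on a scalar input."""
    for layer in layers:
        if layer[0] == "affine":
            _, a, b = layer
            x = a * x + b
        else:  # "relu"
            x = max(0.0, x)
    return x


def affine_backward(interval, a, b):
    """Preimage of [lo, hi] under x -> a * x + b, assuming a != 0."""
    lo, hi = interval
    p, q = (lo - b) / a, (hi - b) / a
    return (min(p, q), max(p, q))  # a negative slope flips the bounds


def relu_backward(interval):
    """Preimage of [lo, hi] under ReLU, or None if it is empty."""
    lo, hi = interval
    if hi < 0:
        return None  # ReLU never outputs a negative value
    if lo <= 0:
        return (-math.inf, hi)  # every non-positive input maps to 0
    return (lo, hi)


def deduce(output_interval, layers=LAYERS):
    """Backtrack an output interval layer by layer to an input interval."""
    interval = output_interval
    for layer in reversed(layers):
        if layer[0] == "affine":
            interval = affine_backward(interval, layer[1], layer[2])
        else:
            interval = relu_backward(interval)
        if interval is None:
            return None  # the output constraint cannot be satisfied
    return interval


def sample(interval, domain=(-5.0, 5.0), n=5, seed=0):
    """Draw exemplary inputs from the deduced range, clipped to an assumed domain."""
    lo, hi = max(interval[0], domain[0]), min(interval[1], domain[1])
    if lo > hi:
        return []
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for _ in range(n)]
```

For the constraint y >= 1, `deduce((1.0, math.inf))` yields the input range x <= 1, and every point drawn by `sample` from that range satisfies the constraint when passed through `forward`. For y >= 5, the deduction returns `None`, since the toy model never outputs more than 4. Real layers are multidimensional and their preimages are generally not single intervals, which is why layer-specific rules and algorithms are needed.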