Cross-project software defect prediction using feature-based transfer learning
Cross-project defect prediction is taken as an effective means of predicting software defects when the data shortage exists in the early phase of software development. Unfortunately, the precision of cross-project defect prediction is usually poor, largely because of the differences between the reference and the target projects. Having realized the project differences, this paper proposes CPDP, a featurebased transfer learning approach to cross-project defect prediction. The core insight of CPDP is to (1) filter and transfer highlycorrelated data based on data samples in the target projects, and (2) evaluate and choose learning schemas for transferring data sets. Models are then built for predicting defects in the target projects. We have also conducted an evaluation of the proposed approach on PROMISE datasets. The evaluation results show that, the proposed approach adapts to cross-project defect prediction in that f-measure of 81.8% of projects can get improved, and AUC of 54.5% projects improved. It also achieves similar f-measure and AUC as some inner-project defect prediction approaches.