Deep mining generation of lung cancer malignancy models from chest x-ray images
Lung cancer is the leading cause of cancer death and morbidity worldwide. Many studies have shown machine learning models to be effective in detecting lung nodules from chest X-ray images. However, these techniques have yet to be embraced by the medical community due to several practical, ethical, and regulatory constraints stemming from the “black-box” nature of deep learning models. Additionally, most lung nodules visible on chest X-rays are benign; therefore, the narrow task of computer vision-based lung nodule detection cannot be equated to automated lung cancer detection. Addressing both concerns, this study introduces a novel hybrid deep learning and decision tree-based computer vision model, which presents lung cancer malignancy predictions as interpretable decision trees. The deep learning component of this process is trained using a large publicly available dataset on pathological biomarkers associated with lung cancer. These models are then used to inference biomarker scores for chest X-ray images from two independent data sets, for which malignancy metadata is available. Next, multi-variate predictive models were mined by fitting shallow decision trees to the malignancy stratified datasets and interrogating a range of metrics to determine the best model. The best decision tree model achieved sensitivity and specificity of 86.7% and 80.0%, respectively, with a positive predictive value of 92.9%. Decision trees mined using this method may be considered as a starting point for refinement into clinically useful multi-variate lung cancer malignancy models for implementation as a workflow augmentation tool to improve the efficiency of human radiologists.