Early detection of heart failure with varying prediction windows by structured and unstructured data in electronic health records

Heart failure (HF) is increasing in prevalence and is among the most costly diseases to society. Early detection of HF would provide the means to test lifestyle and pharmacologic interventions that may slow disease progression and improve patient outcomes. This study used structured and unstructured data from electronic health records (EHR) to predict the onset of HF, with a particular focus on how prediction accuracy varied with time before diagnosis. EHR data were extracted from a single health care system and used to identify incident HF among primary care patients who received care between 2001 and 2010. A total of 1,684 incident HF cases were identified, and 13,525 controls were selected from the same primary care practices. Models were compared by varying the beginning of the prediction window from 60 to 720 days before HF diagnosis. As the prediction window shortened from 720 to 60 days, the performance [AUC (95% CI)] of the predictive HF models increased from 65% (63%-66%) to 74% (73%-75%) for the unstructured data, from 73% (72%-75%) to 81% (80%-83%) for the structured data, and from 76% (74%-77%) to 83% (77%-85%) for the combined data.
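
The prediction-window comparison can be illustrated with a minimal Python sketch. It is not the study's implementation. It assumes that a model for a given window may only use records dated on or before the start of that window (the diagnosis date minus the window length), and it computes AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The function names `events_before_window` and `auc`, the event codes, and the scores are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def events_before_window(events, index_date, window_days):
    """Keep (event_date, code) pairs dated on or before the window start.

    Assumption: the window start is index_date minus window_days, where
    index_date is the HF diagnosis date (or a matched date for a control).
    """
    cutoff = index_date - timedelta(days=window_days)
    return [(d, code) for d, code in events if d <= cutoff]


def auc(scores, labels):
    """AUC via pairwise comparison of case and control scores (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical record: two coded events before a diagnosis on 2010-01-01.
events = [(date(2009, 1, 1), "A"), (date(2009, 11, 15), "B")]
index_date = date(2010, 1, 1)

for window_days in (60, 720):
    usable = events_before_window(events, index_date, window_days)
    print(window_days, [code for _, code in usable])

# Hypothetical model scores for two cases (label 1) and two controls (label 0).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Under this assumption, a longer prediction window leaves fewer records available to the model, which is consistent with the lower AUC values reported for the 720-day window.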