Automated problem list generation from electronic medical records in IBM Watson
Identifying a patient's important medical problems requires broad and deep medical expertise, as well as significant time to gather the relevant facts from the patient's medical record and to assess their clinical importance in reaching a final conclusion. A patient's medical problem list is by far the most critical information that a physician uses in treating and caring for a patient. Despite this critical role, curation of the problem list, whether manual or automated, has been an unmet need in clinical practice. We developed a machine learning technique in IBM Watson to automatically generate a patient's medical problem list. The machine learning model uses lexical and medical features extracted from a patient's record using NLP techniques. We show that the automated method achieves 70% recall and 67% precision, measured against a gold standard that medical experts created on a set of deidentified patient records from a major hospital system in the US. To the best of our knowledge, this is the first successful machine learning/NLP method for extracting an open-ended set of a patient's medical problems from an Electronic Medical Record (EMR). This paper also contributes a methodology for assessing the accuracy of a medical problem list generation technique.
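To make the reported metrics concrete, the following is a minimal sketch of how precision and recall could be computed for one patient's generated problem list against an expert gold standard. It assumes that problems are compared as exact matches of normalized names; the matching criteria actually used in the evaluation are not stated here. The function name `precision_recall` and the example problem names are hypothetical and not taken from the paper.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a predicted problem list vs. a gold list.

    Assumption: two problems match only if their normalized names are identical.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of generated problems that the experts also listed.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of expert-listed problems that the method generated.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data for a single patient (illustration only).
gold = {"hypertension", "type 2 diabetes", "asthma"}
predicted = {"hypertension", "type 2 diabetes", "anemia", "gout"}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four generated problems are correct and two of the three gold problems are found. Averaging such per-patient scores, or pooling counts across patients, are both plausible aggregation choices; which one underlies the reported figures is not specified in this abstract.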