Data-driven approach for assessing utility of medical tests using electronic medical records
Abstract
Objective: To precisely define the utility of tests in a clinical pathway through data-driven analysis of the electronic medical record (EMR). Materials and methods: The information content was defined in terms of the entropy of the expected value of the test related to a given outcome. A kernel density classifier was used to estimate the necessary distributions. To validate the method, we used data from the EMR of the gastrointestinal department at a university hospital. Blood tests from patients undergoing gastrointestinal surgery were analyzed with respect to a second surgery within 30 days of the index surgery. Results: The information content is clearly reflected in the patient pathway for certain combinations of tests and outcomes. C-reactive protein tests coupled to anastomosis leakage, a severe complication, show a clear pattern of information gain through the patient trajectory, with the greatest gain from the test occurring 3-4 days after the index surgery. Discussion: We have defined the information content in a data-driven and information-theoretic way such that the utility of a test can be precisely defined. The results reflect clinical knowledge. In the case we used, the tests carry little negative impact. The general approach can be expanded to cases that carry a substantial negative impact, such as certain radiological techniques.
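The idea of measuring a test's information content with kernel density estimates can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the function names, and the toy test values are all assumptions made for illustration. The sketch estimates the density of a test value separately for patients with and without the outcome, converts a new test value into a posterior probability with Bayes' rule, and reports the reduction in binary entropy relative to the prior.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def binary_entropy(p):
    """Entropy in bits of a binary outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def posterior(x, dens_pos, dens_neg, prior):
    """P(outcome | test value x) from class-conditional densities and a prior."""
    num = prior * dens_pos(x)
    den = num + (1.0 - prior) * dens_neg(x)
    return num / den if den > 0.0 else prior


def information_gain(x, dens_pos, dens_neg, prior):
    """Entropy of the prior minus entropy of the posterior after observing x."""
    return binary_entropy(prior) - binary_entropy(
        posterior(x, dens_pos, dens_neg, prior)
    )


# Synthetic toy values (NOT study data): test results for patients
# with and without the outcome of interest.
values_with_outcome = [100.0, 120.0, 140.0]
values_without_outcome = [20.0, 40.0, 60.0]

dens_pos = gaussian_kde(values_with_outcome, bandwidth=20.0)  # assumed bandwidth
dens_neg = gaussian_kde(values_without_outcome, bandwidth=20.0)

prior = 0.5  # assumed prior probability of the outcome
gain_informative = information_gain(130.0, dens_pos, dens_neg, prior)
gain_ambiguous = information_gain(80.0, dens_pos, dens_neg, prior)
```

In this toy setting, a value deep inside one group's range yields a gain close to one bit, while a value midway between the groups yields essentially none. The quantity computed here is pointwise and can be negative when the prior is far from 0.5; averaging it over the distribution of test values gives the mutual information between test and outcome, which is non-negative.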