Identifying a medical department based on unstructured data: A big data application in healthcare
Abstract
Health is an individual's most precious asset, and healthcare is one of the vehicles for preserving it. The Indian government's spending on healthcare is relatively low (1.2% of GDP). Consequently, secondary and tertiary government healthcare centers in India (which are presumed to be of above-average quality) are always crowded. In tertiary healthcare centers, such as the All India Institute of Medical Sciences (AIIMS), patients are often unable to articulate their problems correctly to the reception staff, and as a result they are not always directed to the correct department. In this paper, we propose a system that scans a patient's prescriptions, referral letters and medical diagnostic reports, processes the input using OCR (Optical Character Recognition) engines coupled with image processing tools, and directs the patient to the most relevant department. We have implemented and tested parts of this system, in which a patient enters their symptoms and/or provisional diagnosis and the system suggests a department based on this input. Our system suggests the correct department 70.19% of the time. On further investigation, we found that one particular department of the hospital was over-represented in the data. After we eliminated that department from the data, the performance of the system improved to 92.7%. Our system presently makes its suggestions using a random forest algorithm trained on two information repositories: symptom and disease data, and a functional description of each medical department. We expect that performance will improve further once the system is also trained on medicine information and diagnostic imaging data and has access to the complete medical history of the patient.
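To make the classification step concrete, the following is a minimal sketch of how free-text symptoms might be mapped to a department with a random forest. It is not the implementation evaluated in this paper: the toy training data, the department names, the TF-IDF text features, and the helper names `train_department_model` and `suggest_department` are all assumptions made for illustration, and scikit-learn is assumed to be available.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: free-text symptoms paired with a department label.
# A real system would draw on symptom/disease repositories and
# functional descriptions of each department.
TRAINING_DATA = [
    ("chest pain and shortness of breath", "Cardiology"),
    ("palpitations and high blood pressure", "Cardiology"),
    ("itchy skin rash on arms", "Dermatology"),
    ("acne and dry skin patches", "Dermatology"),
    ("blurred vision and eye pain", "Ophthalmology"),
    ("red watery eyes and double vision", "Ophthalmology"),
]


def train_department_model(examples, seed=0):
    """Fit a TF-IDF + random forest pipeline on (text, department) pairs."""
    texts = [text for text, _ in examples]
    labels = [label for _, label in examples]
    # TF-IDF is an assumed feature representation, chosen for simplicity.
    model = make_pipeline(
        TfidfVectorizer(),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    model.fit(texts, labels)
    return model


def suggest_department(model, patient_text):
    """Return the department predicted for one free-text description."""
    return str(model.predict([patient_text])[0])


model = train_department_model(TRAINING_DATA)
suggestion = suggest_department(model, "patient reports chest pain")
print(suggestion)
```

With a corpus this small the prediction is only indicative. The sketch also ignores the class imbalance discussed above, which in practice might be handled by resampling or by the `class_weight` parameter of `RandomForestClassifier`.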