Measuring patient similarities via a deep architecture with medical concept embedding
Evaluating the clinical similarity between pairs of patients is a fundamental problem in healthcare informatics. A proper patient similarity measure enables various downstream applications, such as cohort studies and treatment comparative effectiveness research. One major carrier for conducting patient similarity research is Electronic Health Records (EHRs), which are usually heterogeneous, longitudinal, and sparse. Though existing studies on learning patient similarity from EHRs have proven useful in solving real clinical problems, their applicability is limited by a lack of medical interpretation. Moreover, most previous methods assume a vector-based representation for patients, which typically requires aggregating medical events over a certain time period. As a consequence, the temporal information is lost. In this paper, we propose a patient similarity evaluation framework based on temporal matching of longitudinal patient EHRs. Two efficient methods are presented, one unsupervised and one supervised, both of which preserve the temporal properties of EHRs. The supervised scheme adopts a convolutional neural network architecture and learns an optimal representation of patient clinical records with medical concept embedding. Empirical results on real-world clinical data demonstrate substantial improvement over the baselines.
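The loss of temporal information under aggregation can be illustrated with a toy example. The sketch below is not the paper's method. It assumes a hypothetical encoding in which each patient is a list of visits and each visit is a list of code names, and the code names and the `aggregate` helper are invented for illustration. Two patients with the same events in opposite order collapse to the same aggregated vector, while their visit sequences remain distinct.

```python
from collections import Counter

def aggregate(visits):
    """Collapse a visit sequence into per-code counts (visit order is discarded)."""
    return Counter(code for visit in visits for code in visit)

# Hypothetical patients: same three diagnoses, recorded in opposite order.
patient_a = [["diabetes"], ["hypertension"], ["kidney_disease"]]
patient_b = [["kidney_disease"], ["hypertension"], ["diabetes"]]

# The aggregated (vector-style) views are indistinguishable...
same_aggregate = aggregate(patient_a) == aggregate(patient_b)

# ...while the ordered visit sequences still differ.
same_sequence = patient_a == patient_b

print("identical after aggregation:", same_aggregate)
print("identical as sequences:", same_sequence)
```

A similarity measure computed on the aggregated counts alone would treat these two patients as identical, which is the limitation that motivates matching on the temporal sequence instead.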