Publication
Big Data 2020
Conference paper

Prognostication and Outcome-specific Risk Factor Identification for Diabetes Care via Private-shared Multi-task Learning


Abstract

Diabetes is a chronic disease that affects nearly half a billion people around the globe, and is almost always associated with a number of complications, including kidney failure, blindness, stroke, and heart attack. An important step towards improved diabetes care is to accurately predict the risk of diabetes complications and to identify the risk factors associated with the onset of each complication. In this paper, we study the problem of risk prediction and outcome-specific risk factor identification from readily available patient medical record data. We adopt a private-shared multi-task learning (MTL) model, which jointly models multiple complications, with each task corresponding to the risk modeling of one complication. The MTL formulation not only boosts prediction performance but also enables identification of outcome-specific risk factors. Specifically, we decompose the coefficient matrix, in which each column (vector) holds the coefficients of one complication risk model, into a shared component and an outcome-specific private component. The shared component is assumed to be low-rank to capture the relationships among complications in terms of overall diabetes health condition. The private component is assumed to be non-overlapping and sparse so that it is discriminative among the different complication outcomes. Further, the shared component and the private component for the same complication are assumed to be orthogonal. Extensive experimental results on a type 2 diabetes cohort extracted from a large electronic medical claims database show that the proposed method outperforms baseline models by a significant margin. The identified outcome-specific risk factors also provide meaningful clinical insights. The results demonstrate that simultaneously modeling multiple risks through MTL not only improves prediction performance but also enables identification of outcome-specific risk factors.
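
The decomposition described in the abstract can be read as a penalized objective over a coefficient matrix W = S + P. The following is a minimal sketch under stated assumptions, not the paper's formulation: the function names, the logistic loss, the specific penalty forms (nuclear norm for low rank, L1 for sparsity, off-diagonal Gram entries for non-overlap, squared inner products for orthogonality), and the weights are all hypothetical choices made for illustration.

```python
import numpy as np


def penalty_terms(S, P):
    """Return illustrative penalties for a shared matrix S and private matrix P.

    Both matrices have shape (n_features, n_tasks); column t belongs to
    complication t. The penalty forms are assumptions, not the paper's.
    """
    A = np.abs(P)
    gram = A.T @ A  # entry (s, t) is large when tasks s and t share active features
    return {
        # Nuclear norm: a common convex surrogate for low rank.
        "low_rank": np.linalg.norm(S, ord="nuc"),
        # L1 norm: encourages few nonzero private coefficients.
        "sparse": A.sum(),
        # Off-diagonal Gram mass: zero when private supports do not overlap.
        "overlap": gram.sum() - np.trace(gram),
        # Squared inner product of shared and private columns of the same task.
        "orthogonal": np.sum(np.sum(S * P, axis=0) ** 2),
    }


def objective(X, Y, S, P, weights=None):
    """Mean logistic loss of W = S + P plus weighted penalties (hypothetical)."""
    weights = weights or {"low_rank": 1.0, "sparse": 1.0,
                          "overlap": 1.0, "orthogonal": 1.0}
    logits = X @ (S + P)  # shape (n_patients, n_tasks)
    # Logistic loss for labels in {0, 1}: log(1 + exp(z)) - y * z.
    loss = np.mean(np.logaddexp(0.0, logits) - Y * logits)
    terms = penalty_terms(S, P)
    return loss + sum(weights[name] * value for name, value in terms.items())


# Toy example: 4 features, 2 complications.
# S_demo is rank one; P_demo uses a different feature for each task.
S_demo = np.outer([1.0, 1.0, 0.0, 0.0], [1.0, 2.0])
P_demo = np.array([[0.0, 0.0],
                   [0.0, 0.0],
                   [3.0, 0.0],
                   [0.0, 4.0]])

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 4))                      # synthetic patient features
Y_demo = (rng.random(size=(20, 2)) < 0.5).astype(float)  # synthetic binary outcomes

demo_terms = penalty_terms(S_demo, P_demo)
demo_value = objective(X_demo, Y_demo, S_demo, P_demo)
```

In this toy setup the private columns use disjoint features and are orthogonal to the shared columns, so the overlap and orthogonality penalties vanish. A nonzero entry in column t of P would then be read as a candidate risk factor specific to complication t.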
