Journal of Infectious Diseases

Cohort-Derived Machine Learning Models for Individual Prediction of Chronic Kidney Disease in People Living with Human Immunodeficiency Virus: A Prospective Multicenter Cohort Study

Background: It is unclear whether data-driven machine learning models trained on large epidemiological cohorts can improve the prediction of comorbidities in people living with human immunodeficiency virus (HIV).

Methods: In this proof-of-concept study, we included people living with HIV in the prospective Swiss HIV Cohort Study with a first estimated glomerular filtration rate (eGFR) >60 mL/minute/1.73 m² after 1 January 2002. Our primary outcome was chronic kidney disease (CKD), defined as a confirmed decrease in eGFR to ≤60 mL/minute/1.73 m² on measurements more than 3 months apart. We split the cohort data into a training set (80%), a validation set (10%), and a test set (10%), stratified by CKD status and follow-up length.

Results: Of 12 761 eligible individuals (median baseline eGFR, 103 mL/minute/1.73 m²), 1192 (9%) developed CKD after a median of 8 years. We used 64 static and 502 time-changing variables. Across prediction horizons and algorithms, and in contrast to expert-based standard models, most machine learning models achieved state-of-the-art predictive performance, with areas under the receiver operating characteristic curve ranging from 0.926 to 0.996 and areas under the precision-recall curve ranging from 0.631 to 0.956.

Conclusions: In people living with HIV, we observed state-of-the-art performance in forecasting individual CKD onset with different machine learning algorithms.
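
The 80%/10%/10% split stratified by CKD status and follow-up length, as described in the Methods, can be illustrated with a minimal sketch. This is not the study's code: the function `stratified_split`, the toy cohort, the field names, and the short/long follow-up binning at 8 years are all hypothetical, chosen only to show how stratification keeps the outcome rate similar across the three sets.

```python
import random
from collections import defaultdict


def stratified_split(records, key, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split records into train/validation/test sets, stratum by stratum.

    `key` maps a record to its stratum label (hypothetical here:
    a tuple of CKD status and a follow-up-length bin).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    train, val, test = [], [], []
    for label in sorted(strata):
        members = strata[label]
        rng.shuffle(members)  # random assignment within each stratum
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train.extend(members[:n_train])
        val.extend(members[n_train:n_train + n_val])
        test.extend(members[n_train + n_val:])  # remainder goes to test
    return train, val, test


# Hypothetical toy cohort (not study data): 1000 people, 10% with CKD.
cohort = [
    {"id": i, "ckd": i % 10 == 0, "followup_years": 2 + (i % 7) * 2}
    for i in range(1000)
]


def stratum(rec):
    # Assumed binning: the abstract does not say how follow-up was binned.
    followup_bin = "long" if rec["followup_years"] >= 8 else "short"
    return (rec["ckd"], followup_bin)


train, val, test = stratified_split(cohort, stratum)


def ckd_rate(subset):
    return sum(rec["ckd"] for rec in subset) / len(subset)


for name, subset in (("train", train), ("validation", val), ("test", test)):
    print(f"{name}: n={len(subset)}, CKD rate={ckd_rate(subset):.3f}")
```

Splitting within each stratum, rather than over the pooled cohort, matters for a rare outcome such as CKD (9% in this study): a purely random 10% test set could otherwise contain noticeably more or fewer cases than the training set.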