Discovery of Parkinson's disease states and disease progression modelling: a longitudinal data study using machine learning
BACKGROUND: Parkinson's disease is heterogeneous in symptom presentation and progression. Increased understanding of both aspects can enable better patient management and improve clinical trial design. Previous approaches to modelling Parkinson's disease progression assumed static progression trajectories within subgroups and have not adequately accounted for complex medication effects. Our objective was to develop a statistical progression model of Parkinson's disease that accounts for intra-individual and inter-individual variability and medication effects. METHODS: In this longitudinal data study, data were collected for up to 7 years on 423 patients with early Parkinson's disease and 196 healthy controls from the Parkinson's Progression Markers Initiative (PPMI) longitudinal observational study. A contrastive latent variable model was applied, followed by a novel personalised input-output hidden Markov model to define disease states. Clinical significance of the states was assessed using statistical tests on seven key motor or cognitive outcomes (mild cognitive impairment, dementia, dyskinesia, presence of motor fluctuations, functional impairment from motor fluctuations, Hoehn and Yahr score, and death) not used in the learning phase. The results were validated in an independent sample of 610 patients with Parkinson's disease from the National Institute of Neurological Disorders and Stroke Parkinson's Disease Biomarker Program (PDBP). FINDINGS: PPMI data were downloaded on July 25, 2018, medication information was downloaded on Sept 24, 2018, and PDBP data were downloaded between June 15 and June 24, 2020. The model discovered eight disease states, which are primarily differentiated by functional impairment, tremor, bradykinesia, and neuropsychiatric measures. State 8, the terminal state, had the highest prevalence of key clinical outcomes, including 18 (95%) of 19 recorded instances of dementia.
At study outset, 4 (1%) of 333 patients were in state 8, and 138 (41%) of 333 patients reached state 8 by year 5. However, the ranking of the starting state did not match the ranking of reaching state 8 within 5 years. Overall, patients starting in state 5 had the shortest time to the terminal state (median 2·75 [95% CI 1·75-4·25] years). INTERPRETATION: We developed a statistical progression model of early Parkinson's disease that accounts for intra-individual and inter-individual variability and medication effects. Our predictive model discovered non-sequential, overlapping disease progression trajectories, supporting the use of non-deterministic disease progression models, and suggesting static subtype assignment might be ineffective at capturing the full spectrum of Parkinson's disease progression. FUNDING: Michael J Fox Foundation.