Impact of Clinical and Genomic Factors on COVID-19 Disease Severity
Abstract
To date, 180 million confirmed cases of COVID-19, and more than 3.8 million deaths, have been reported to WHO worldwide. In this paper we address the problem of understanding how the host genome, in concert with clinical variables, influences the severity of COVID-19 in patients. Applying positive-unlabeled machine learning algorithms coupled with RubricOE, a state-of-the-art genomic analysis framework, to UK BioBank data, we extract novel insights into this complex interplay. The algorithm is also sensitive enough to detect the changing influence on disease severity of the emergent B.1.1.7 (alpha) SARS-CoV-2 variant and of changing treatment protocols. The genomic component also implicates biological pathways that can help in understanding the disease etiology. Our work demonstrates that, by carefully interleaving clinical and genomic methodologies, it is possible to build a robust and sensitive model despite significant bias, noise, and incompleteness in both clinical and genomic data.