Probing omics data via harmonic persistent homology
Abstract
Identifying molecular signatures in patients with complex diseases who share underlying symptomatic similarities is a significant challenge in the analysis of high-dimensional multi-omics data. Topological data analysis (TDA) provides a way of extracting such information from the geometric structure of the data and of identifying multiway, higher-order relationships. Here, we propose an application of harmonic persistent homology, which overcomes the ambiguity in assigning topological information to the original data elements of a representative topological cycle. When applied to multi-omics data, this approach reveals hidden patterns that highlight the relationships between different omic profiles, while supporting common tasks in multi-omics analysis such as disease subtyping and, most importantly, biomarker identification for similar latent biological pathways associated with complex diseases. Our experiments on multiple cancer datasets show that harmonic persistent homology effectively dissects multi-omics data to identify biomarkers by detecting representative cycles predictive of disease subtypes.