Driver gene classification reveals a substantial overrepresentation of tumor suppressors among very large chromatin-regulating proteins
Compiling a comprehensive list of cancer driver genes is imperative for oncology diagnostics and drug development. While driver genes are typically discovered by analysis of tumor genomes, infrequently mutated driver genes often evade detection due to limited sample sizes. Here, we address sample size limitations by integrating tumor genomics data with a wide spectrum of gene-specific properties to search for rare drivers, functionally classify them, and detect features characteristic of driver genes. We show that our approach, CAnceR geNe similarity-based Annotator and Finder (CARNAF), enables detection of potentially novel drivers that eluded over a dozen pan-cancer/multi-tumor type studies. In particular, feature analysis reveals a highly concentrated pool of known and putative tumor suppressors among the <1% of genes that encode very large, chromatin-regulating proteins. Thus, our study highlights the need for deeper characterization of very large, epigenetic regulators in the context of cancer causality.