Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics

Mi Yang; Francesca Petralia; Zhi Li; Hongyang Li; Weiping Ma; Xiaoyu Song; Sunkyu Kim; Heewon Lee; Han Yu; Bora Lee; Seohui Bae; Eunji Heo; Jan Kaczmarczyk; Piotr Stępniak; Michał Warchoł; Thomas Yu; Anna P. Calinawan; Paul C. Boutros; Samuel H. Payne; Boris Reva; Tunde Aderinwale; Ebrahim Afyounian; Piyush Agrawal; Mehreen Ali; Alicia Amadoz; Francisco Azuaje; John Bachman; Sherry Bhalla; José Carbonell-Caballero; Priyanka Chakraborty; Kumardeep Chaudhary; Yonghwa Choi; Yoonjung Choi; Cankut Çubuk; Sandeep Kumar Dhanda; Joaquín Dopazo; Laura L. Elo; Ábel Fóthi; Olivier Gevaert; Kirsi Granberg; Russell Greiner; Marta R. Hidalgo; Vivek Jayaswal; Hwisang Jeon; Minji Jeon; Sunil V. Kalmady; Yasuhiro Kambara; Jaewoo Kang; Keunsoo Kang; Tony Kaoma; Harpreet Kaur; Hilal Kazan; Devishi Kesar; Juha Kesseli; Daehan Kim; Keonwoo Kim; Sang Yoon Kim; Sajal Kumar; Yunpeng Liu; Roland Luethy; Swapnil Mahajan; Mehrad Mahmoudian; Arnaud Muller; Petr V. Nazarov; Hien Nguyen; Matti Nykter; Shujiro Okuda; Sungsoo Park; G. P.S. Raghava; Jagath C. Rajapakse; Tommi Rantapero; Hobin Ryu; Francisco Salavert; Sohrab Saraei; Ruby Sharma; Ari Siitonen; Artem Sokolov; Kartik Subramanian; Veronika Suni; Tomi Suomi; Léon Charles Tranchevent; Salman Sadullah Usmani; Tommi Välikangas; Roberto Vega; Hua Zhong; Emily Boja; Henry Rodriguez; Gustavo Stolovitzky; Yuanfang Guan; Pei Wang; David Fenyö; Julio Saez-Rodriguez

doi:10.1016/j.cels.2020.06.013

Cell Systems

Paper

26 Aug 2020

Abstract

A major manifestation of cancer is the alteration of protein levels. However, proteins are harder and more expensive to measure than genes and transcripts. To address this problem, we turned to crowdsourcing through the NCI-CPTAC DREAM proteogenomics challenge. We provided participants with data to build models that predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. We then asked participants to apply these models to genomic and transcriptomic data from other patients and predict the corresponding, unseen (phospho)protein levels. This design allowed us to assess the predictive performance of the proposed methods in an unbiased and “double-blinded” manner. We found that ensemble methods performed better, and we identified which proteins and biological processes are easier or harder to predict. In general, performance was limited, suggesting that (phospho)proteomic profiling cannot yet be replaced by genomic and transcriptomic profiling.
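The train-then-predict-on-unseen-patients setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the scoring metric (per-protein Pearson correlation on held-out patients) is an assumed choice, and the ridge model, the transcript-as-proxy baseline, and the simple averaging ensemble are hypothetical stand-ins rather than any participant's method.

```python
import numpy as np


def pearson_per_protein(observed, predicted):
    """Column-wise Pearson correlation for two (patients x proteins) arrays."""
    obs = observed - observed.mean(axis=0)
    pred = predicted - predicted.mean(axis=0)
    denom = np.sqrt((obs ** 2).sum(axis=0) * (pred ** 2).sum(axis=0))
    return (obs * pred).sum(axis=0) / denom


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression with an intercept (hypothetical model)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    return lambda X_new: (X_new - x_mean) @ W + y_mean


# Synthetic stand-in data (assumption): each protein partly tracks its transcript.
rng = np.random.default_rng(0)
n_patients, n_genes = 120, 30
mrna = rng.normal(size=(n_patients, n_genes))
protein = 0.6 * mrna + rng.normal(scale=0.8, size=(n_patients, n_genes))

# Training patients have both data types; test patients' proteins stay hidden
# from the models and are used only for scoring.
train, test = slice(0, 80), slice(80, None)

predict_ridge = fit_ridge(mrna[train], protein[train], alpha=10.0)
pred_ridge = predict_ridge(mrna[test])
pred_mrna = mrna[test]                          # baseline: transcript as proxy
pred_ensemble = (pred_ridge + pred_mrna) / 2.0  # simple averaging ensemble

scores = {
    name: pearson_per_protein(protein[test], pred).mean()
    for name, pred in [
        ("ridge", pred_ridge),
        ("mrna_proxy", pred_mrna),
        ("ensemble", pred_ensemble),
    ]
}

for name, score in scores.items():
    print(f"{name}: mean per-protein Pearson r = {score:.3f}")
```

Scoring each protein separately, rather than pooling all values, is what makes it possible to ask which proteins are easier or harder to predict.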
