Publication
Proteomics
Paper
Centralized data analysis of a large interlaboratory proteomics project: A feasibility study
Abstract
The human Plasma Proteome Project (PPP) is a large-scale collaboration between many laboratories. One of the most demanding tasks in the PPP involved the analysis of very large amounts of raw MS/MS data produced by the participants. The main approach for managing this task was letting the participants analyze their own data and submit the results to the central PPP repository as lists of identified proteins and peptides. To complement this distributed approach, we also performed centralized analysis of the raw MS/MS data provided by the participants. Due to the data redundancy inherent in such a project, centralized analysis has the potential to reduce the computational effort by reducing redundancy before the analysis. Centralized analysis can also unify the process and take advantage of data sharing among laboratories to improve protein identification and validation. The process we employed included removing low-quality spectra, clustering spectra by mutual similarity, and applying uniform peptide and protein identification procedures. To demonstrate the process, we analyzed 5.28 million MS/MS spectra derived by eight laboratories from tryptic peptides of serum and plasma proteins. © 2005 Wiley-VCH Verlag GmbH & Co. KGaA.
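The abstract names the steps of the centralized process (removing low-quality spectra, clustering spectra by mutual similarity) but gives no algorithmic detail. The sketch below illustrates one plausible reading of those two steps and is not the authors' method. Everything specific in it is an assumption made for illustration: the peak-count quality filter, the 1.0 m/z binning, the cosine similarity measure, the greedy single-pass clustering, and the names `MIN_PEAKS`, `BIN_WIDTH`, `SIM_THRESHOLD`, `bin_spectrum`, and `cluster_spectra`.

```python
import math
from collections import defaultdict

# All constants are hypothetical values chosen for illustration only.
BIN_WIDTH = 1.0       # m/z bin width used to compare spectra
MIN_PEAKS = 5         # spectra with fewer peaks are treated as low quality
SIM_THRESHOLD = 0.8   # minimum cosine similarity to join an existing cluster


def bin_spectrum(peaks):
    """Convert (mz, intensity) pairs into a unit-length sparse vector."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / BIN_WIDTH)] += intensity
    norm = math.sqrt(sum(v * v for v in vec.values()))
    if norm == 0:
        return {}
    return {k: v / norm for k, v in vec.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def cluster_spectra(spectra):
    """Drop low-quality spectra, then cluster the rest greedily.

    Each surviving spectrum joins the first cluster whose representative
    (the cluster's first member) is similar enough; otherwise it starts
    a new cluster. Returns a list of clusters of input indices.
    """
    representatives = []
    clusters = []
    for idx, peaks in enumerate(spectra):
        if len(peaks) < MIN_PEAKS:
            continue  # assumed quality filter: too few peaks
        vec = bin_spectrum(peaks)
        for c, rep in enumerate(representatives):
            if cosine(vec, rep) >= SIM_THRESHOLD:
                clusters[c].append(idx)
                break
        else:
            representatives.append(vec)
            clusters.append([idx])
    return clusters


# Made-up toy spectra: two near-duplicates, one distinct, one too sparse.
example = [
    [(100.2, 10), (200.4, 50), (300.1, 30), (400.7, 20), (500.3, 5)],
    [(100.3, 11), (200.5, 48), (300.2, 31), (400.6, 19), (500.4, 6)],
    [(150.1, 40), (250.2, 10), (350.3, 25), (450.4, 35), (550.5, 15)],
    [(100.0, 5), (200.0, 5)],
]
print(cluster_spectra(example))
```

Under these assumptions, only one representative per cluster would need to go through the peptide identification step, which is how removing redundancy before the analysis could reduce the computational effort. A real pipeline would likely also compare precursor masses and use a more careful quality score.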