A PROTEOGENOMICS DATA-DRIVEN KNOWLEDGE BASE OF HUMAN CANCER
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 treatment-naïve primary tumors spanning 10 cancer types, many of them with matched normal adjacent tissues. Each sample is profiled by whole genome sequencing, whole exome sequencing, methylation array, RNA-Seq, miRNA-Seq, proteomics, and phosphoproteomics. We performed harmonized, systematic computational analyses of these data, both within each cancer type and across all cancer types. Precomputed results are organized into ~40,000 gene-, protein-, mutation-, and phenotype-centric web pages, which can be browsed in this portal. Information in the portal includes phosphosite detectability across all studies; tumor vs. normal differences at the mRNA, protein, and phosphosite levels; mutation and phenotype associations for individual mRNAs, proteins, and phosphosites; cis-associations across omics layers; and pairwise associations between mRNAs, proteins, and phosphosites and kinases/phosphatases. Results are displayed in sortable, searchable, and filterable tables, zoomable lollipop plots, heatmaps, and several types of statistical plots, such as survival plots, scatter plots comparing phosphosite- and protein-level changes, and Manhattan plots visualizing pan-cancer, multi-omics results. mRNA-, protein-, and phosphosite-level results are fed directly into WebGestalt for pathway and network interpretation. With easily browsable data on 19,701 coding genes, 126,547 phosphosites, and 256 genotypes and phenotypes from 10 cancer types and pan-cancer analyses, LinkedOmicsKB makes CPTAC data readily useful to the broad cancer research community.
Please cite: A proteogenomics data-driven knowledge base of human cancer, Yuxing Liao, Sara R. Savage, Yongchao Dou, Zhiao Shi, Xinpei Yi, Wen Jiang, Jonathan T. Lei, Bing Zhang, Cell Systems, 2023.