Systematic Analysis of Missing Proteins Provides Clues to Help Define All of the Protein-Coding Genes on Human Chromosome 1
Citations over time: top 11% of 2013 papers
Abstract
Our first proteomic exploration of human chromosome 1 began in 2012 (CCPD 1.0), and the genome-wide characterization of the human proteome through public resources revealed that 32–39% of proteins on chromosome 1 remain unidentified. To characterize all of the missing proteins, we applied an OMICS-integrated analysis of three human liver cell lines (Hep3B, MHCC97H, and HCCLM3) using mRNA and ribosome nascent-chain complex-bound mRNA deep sequencing and proteome profiling, contributing mass spectrometric evidence of 60 additional chromosome 1 gene products. Integration of the annotation information from public databases revealed that 84.6% of genes on chromosome 1 had high-confidence protein evidence. Hierarchical analysis demonstrated that the remaining 320 missing genes were either experimentally or biologically explainable; 128 genes were found to be tissue-specific or rarely expressed in some tissues, whereas 91 proteins were uncharacterized mainly due to database annotation diversity, 89 were genes with low mRNA abundance or unsuitable protein properties, and 12 genes were identifiable theoretically because of a high abundance of mRNAs/RNC-mRNAs and the existence of proteotypic peptides. The relatively large contribution made by the identification of enriched transcription factors suggested specific enrichment of low-abundance protein classes, and SRM/MRM could capture high-priority missing proteins. Detailed analyses of the differentially expressed genes indicated that several gene families located on chromosome 1 may play critical roles in mediating hepatocellular carcinoma invasion and metastasis. All mass spectrometry proteomics data corresponding to our study were deposited in the ProteomeXchange under the identifiers PXD000529, PXD000533, and PXD000535.
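The breakdown of the 320 missing genes can be checked with a few lines of arithmetic. The sketch below uses only the counts and the 84.6% figure quoted in the abstract. The implied total number of chromosome 1 genes is not stated in the abstract; it is derived here under the assumption that the 320 missing genes are exactly the complement of the 84.6% with high-confidence protein evidence, so treat it as an approximation. The variable names are illustrative.

```python
# Category counts for the missing genes, taken from the abstract.
missing_by_category = {
    "tissue-specific or rarely expressed": 128,
    "database annotation diversity": 91,
    "low mRNA abundance or unsuitable protein properties": 89,
    "theoretically identifiable": 12,
}

# The four categories should account for all 320 missing genes.
total_missing = sum(missing_by_category.values())

# Assumption (not stated in the abstract): the missing genes are the
# complement of the 84.6% of genes with high-confidence protein evidence.
fraction_with_evidence = 0.846
implied_total_genes = total_missing / (1 - fraction_with_evidence)

# Share of the missing set explained by each category.
category_shares = {
    name: count / total_missing for name, count in missing_by_category.items()
}

print(f"missing genes accounted for: {total_missing}")
print(f"implied chromosome 1 gene total (approx.): {implied_total_genes:.0f}")
for name, share in category_shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, the four categories sum to the stated 320, and the implied gene total comes out at roughly 2,078.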