BioXpress: an integrated RNA-seq-derived gene expression database for pan-cancer analysis
Abstract
BioXpress is a gene expression and cancer association database in which expression levels are mapped to genes using RNA-seq data obtained from The Cancer Genome Atlas, the International Cancer Genome Consortium, Expression Atlas and publications. The BioXpress database includes expression data from 64 cancer types, 6361 patients and 17 469 genes, 9513 of which display differential expression between tumor and normal samples. In addition to data retrieved directly from RNA-seq data repositories, manual biocuration of publications supplements the cancer association annotations available in the database. All cancer types are mapped to Disease Ontology terms to facilitate a uniform pan-cancer analysis. The BioXpress database can be searched by HUGO Gene Nomenclature Committee gene symbol or UniProtKB/RefSeq accession, or queried by cancer type with specified significance filters. This interface, together with pre-computed downloadable files of genes differentially expressed in multiple cancers, enables straightforward retrieval and display of a broad set of cancer-related genes.
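As an illustration of querying by cancer type with significance filters, the following is a minimal sketch of filtering a table of differentially expressed genes. The column names (`gene_symbol`, `cancer_type`, `log2_fold_change`, `adjusted_p_value`), the thresholds and the rows are all placeholder assumptions; they are not taken from the BioXpress download files, whose actual layout may differ.

```python
import csv
import io

# Placeholder table. The column names and every value below are hypothetical,
# chosen only to illustrate the filtering step; they are not BioXpress data.
SAMPLE_TSV = """gene_symbol\tcancer_type\tlog2_fold_change\tadjusted_p_value
GENE_A\tcancer type X\t2.1\t0.001
GENE_B\tcancer type X\t-0.3\t0.40
GENE_C\tcancer type Y\t-1.8\t0.004
GENE_A\tcancer type Y\t0.9\t0.03
"""


def filter_genes(rows, cancer_type, max_adj_p=0.05, min_abs_log2fc=1.0):
    """Return (gene, log2 fold change, adjusted p) for rows passing both filters."""
    hits = []
    for row in rows:
        if row["cancer_type"] != cancer_type:
            continue
        adj_p = float(row["adjusted_p_value"])
        log2fc = float(row["log2_fold_change"])
        # Keep genes that are both significant and changed by a sizeable amount.
        if adj_p <= max_adj_p and abs(log2fc) >= min_abs_log2fc:
            hits.append((row["gene_symbol"], log2fc, adj_p))
    return hits


# Parse the tab-separated text into one dict per row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
print(filter_genes(rows, "cancer type X"))
```

To run the same filter on a real download, the in-memory string could be replaced by an opened file, with the column names adjusted to match that file's header.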