The establishment of reference sequence for SARS‐CoV‐2 and variation analysis
Abstract
Starting around December 2019, an epidemic of pneumonia, which was named COVID-19 by the World Health Organization, broke out in Wuhan, China, and is spreading throughout the world. A new coronavirus, named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by the Coronavirus Study Group of the International Committee on Taxonomy of Viruses, was soon found to be the cause. At present, the sensitivity of clinical nucleic acid detection is limited, and it is still unclear whether this limitation is related to genetic variation in the virus. In this study, we retrieved 95 full-length genomic sequences of SARS-CoV-2 strains from the National Center for Biotechnology Information and GISAID databases, established the reference sequence by conducting multiple sequence alignment and phylogenetic analyses, and analyzed sequence variations along the SARS-CoV-2 genome. Homology among all viral strains was generally high: 99.99% (99.91%-100%) at the nucleotide level and 99.99% (99.79%-100%) at the amino acid level. Although overall variation in open-reading frame (ORF) regions was low, 13 variation sites were identified in the 1a, 1b, S, 3a, M, 8, and N regions, among which positions nt28144 in ORF 8 and nt8782 in ORF 1a showed mutation rates of 30.53% (29/95) and 29.47% (28/95), respectively. These findings suggest that there may be selective mutations in SARS-CoV-2, and that certain regions should be avoided when designing primers and probes. Establishment of the reference sequence for SARS-CoV-2 could benefit not only biological study of this virus but also diagnosis, clinical monitoring, and intervention of SARS-CoV-2 infection in the future.
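The per-site mutation rates quoted in the abstract (for example, 29/95 = 30.53%) are counts of strains that differ from the majority base at one alignment column, divided by the total number of strains. The sketch below illustrates that arithmetic only; it is not the authors' pipeline, which the abstract does not describe in detail. The function name `variation_sites` and the toy alignment are hypothetical, and the input is assumed to be sequences that have already been aligned to equal length.

```python
from collections import Counter


def variation_sites(aligned, min_rate=0.0):
    """Find variable columns in an alignment (hypothetical helper).

    Returns {1-based position: (majority_base, variant_count, rate)},
    where rate = variant_count / number of sequences.
    """
    if not aligned:
        return {}
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    total = len(aligned)
    sites = {}
    for i in range(length):
        # Count only unambiguous bases; gaps and N are ignored in the tally.
        counts = Counter(
            seq[i].upper() for seq in aligned if seq[i].upper() in "ACGT"
        )
        if len(counts) < 2:
            continue  # column is invariant
        majority, majority_count = counts.most_common(1)[0]
        variants = sum(counts.values()) - majority_count
        rate = variants / total
        if rate >= min_rate:
            sites[i + 1] = (majority, variants, rate)
    return sites


# Toy data (made up): 95 four-base "genomes", 29 of which differ at column 3.
toy = ["ACTG"] * 66 + ["ACCG"] * 29
for pos, (base, count, rate) in variation_sites(toy).items():
    print(pos, base, count, f"{rate:.2%}")  # prints: 3 T 29 30.53%
```

Here the majority base at each column stands in for the reference; with a real reference sequence, one would compare each strain against that sequence instead.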