A Review of the SARS-CoV-2 (COVID-19) Genome and Proteome

SARS-CoV-2 (COVID-19) Genome and Protein Functions


SARS-CoV-2 has a ~29.9-kilobase positive-sense RNA genome that contains as many as 29 open reading frames. Though the exact number of functional proteins remains to be established, there are at least 16 nonstructural proteins (nsp), four structural proteins, and at least six or seven accessory proteins. Drawing on previous work with SARS-CoV and other coronaviruses, scientists have identified functions for the majority of these factors, though work is ongoing. A schematic of the SARS-CoV-2 genome is shown in the figure above, and the known or hypothesized functions of the viral proteins are summarized below in Table 1.

In December 2020, the United Kingdom reported a new variant from the B.1.1.7 lineage, designated “VUI-202012/01” by Public Health England, that is significantly more transmissible. It is spreading throughout the UK, primarily in London and Southeast England. The variant genome carries 17 mutations, many of them in the viral spike protein. The most notable are N501Y, in the receptor-binding domain, which was shown to increase ACE2 receptor affinity, and P681H, which lies adjacent to the furin cleavage site. There are also mutations in the ORF1ab, ORF8, and N sequences.
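
Labels such as N501Y and P681H follow the usual shorthand for amino acid substitutions: the one-letter code of the original residue, the residue position in the protein, and the one-letter code of the replacement. The following is a minimal Python sketch of how such labels can be parsed and checked against the approximate spike length given in Table 1. The names `Substitution`, `parse_substitution`, and `SPIKE_LENGTH` are hypothetical and introduced only for illustration.

```python
import re
from typing import NamedTuple

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Approximate spike length taken from Table 1 (~1273 a.a.); an assumption
# used here only to sanity-check residue positions.
SPIKE_LENGTH = 1273


class Substitution(NamedTuple):
    ref: str       # original amino acid (one-letter code)
    position: int  # 1-based residue position in the protein
    alt: str       # replacement amino acid (one-letter code)


_PATTERN = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'N501Y' into (ref, position, alt)."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


if __name__ == "__main__":
    for label in ("N501Y", "P681H"):
        sub = parse_substitution(label)
        in_range = 1 <= sub.position <= SPIKE_LENGTH
        print(f"{label}: {sub.ref} -> {sub.alt} at residue {sub.position} "
              f"(within spike length: {in_range})")
```

This sketch handles single-residue substitutions only; deletions and other mutation types use different notation and would need separate handling.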




Table 1. Putative Functions of SARS-CoV-2 Proteins

| Protein | Functions | References |
| --- | --- | --- |
| Spike (S) (ORF2) | The full-length spike precursor (~1273 a.a. in SARS-CoV-2) is cleaved into the glycosylated subunits S1 and S2 (S2’). S1 binds the host receptor, ACE2, while S2 mediates fusion of the viral and host membranes. | 1 |
| Nucleocapsid (N) (ORF9a) | Nucleocapsid (~419 a.a. in SARS-CoV-2) binds viral genomic RNA and forms a helical ribonucleocapsid. Involved in genome protection, viral RNA replication, virion assembly, and immune evasion (including IFN-I suppression). Interacts with M and nsp3 proteins. | 2 |
| Membrane (M) (ORF5) | Membrane/matrix protein (~222 a.a. in SARS-CoV-2) is the most abundant structural component of the virion and is highly conserved. Mediates assembly and budding of viral particles by recruiting other structural proteins to the ER-Golgi intermediate compartment (ERGIC). Interacts with N to package RNA into the virion. Interacts with accessory proteins 3a and 7a. May also mitigate the host immune response. | 3 |
| Envelope (E) (ORF4) | Envelope small membrane protein (~75 a.a. in SARS-CoV-2) is a single-pass type III membrane protein involved in viral assembly, budding, and pathogenesis. Localizes to ERGIC. Forms a homopentameric ion channel and is a viroporin. Interacts with M, N, 3a, and 7a. | 4 |
| nsp1 | Nonstructural protein 1 (nsp1; ~180 a.a. in SARS-CoV-2) likely inhibits host translation by interacting with the 40S ribosomal subunit, leading to degradation of host mRNAs through cleavage near their 5’ UTRs. Promotes viral gene expression and immunoevasion in part by interfering with interferon-mediated signaling. | 5 |
| nsp2 | nsp2 (~638 a.a. in SARS-CoV-2) interacts with host factors prohibitin 1 and prohibitin 2, which are involved in many cellular processes including mitochondrial biogenesis. nsp2 may alter the intracellular milieu and perturb host intracellular signaling. | 6 |
| nsp3 | nsp3 (~1945 a.a. in SARS-CoV-2) is a multi-pass membrane protein containing the papain-like protease (PLpro), which processes the viral polyprotein to release nsp1, nsp2, and nsp3. It also exhibits deubiquitinating and deISGylating activities. Interacts with nsp4 and nsp6. | 7 |
| nsp4 | nsp4 (~500 a.a. in SARS-CoV-2) is a multi-pass membrane protein required for viral replication. Together with nsp3, it induces the assembly of, and localizes to, double-membrane cytoplasmic vesicles. | 8 |
| nsp5 | nsp5 (3CLpro; ~306 a.a. in SARS-CoV-2) cleaves at 11 sites in the polyprotein to release nsp4-nsp16. It is also responsible for nsp maturation. | 9 |
| nsp6 | nsp6 (~290 a.a. in SARS-CoV-2) is a multi-pass membrane protein that, together with nsp3 and nsp4, induces double-membrane vesicles in infected cells. It also limits autophagosome expansion and interferes with autophagosome delivery of viral factors to lysosomes for destruction. | 10,11 |
| nsp7 | nsp7 (~83 a.a. in SARS-CoV-2) forms a hexadecamer with nsp8 as a cofactor for the RNA-dependent RNA polymerase nsp12. May have processivity or RNA primase function. | 12 |
| nsp8 | nsp8 (~198 a.a. in SARS-CoV-2) forms a hexadecamer with nsp7 as a cofactor for the RNA-dependent RNA polymerase nsp12. May have processivity or RNA primase function. Mutation of certain residues in nsp8 is lethal to SARS-CoV because it disrupts RNA synthesis. | 13 |
| nsp9 | nsp9 (~113 a.a. in SARS-CoV-2) functions in viral replication as a dimeric ssRNA-binding protein. | 13 |
| nsp10 | nsp10 (~139 a.a. in SARS-CoV-2) forms a dodecamer and interacts with both nsp14 and nsp16 to stimulate their respective 3’-5’ exoribonuclease and 2’-O-methyltransferase activities in the formation of the viral mRNA capping machinery. | 13 |
| nsp11 | nsp11 (~13-23 a.a., depending on the CoV species) is a pp1a cleavage product at the nsp10/11 boundary. For pp1ab, it is a frameshift product that becomes the N-terminus of nsp12. Its function, if any, is unknown. | 13 |
| nsp12 | nsp12 (~932 a.a. in SARS-CoV-2) is the RNA-dependent RNA polymerase (RdRp) that performs both replication and transcription of the viral genome. It has >95% identity to the SARS-CoV polymerase and is inhibited by the nucleoside analogue remdesivir. | 13 |
| nsp13 | nsp13 (~601 a.a. in SARS-CoV-2) is a multifunctional superfamily 1 helicase capable of using both dsDNA and dsRNA as substrates with 5’-3’ polarity. In addition to working with nsp12 in viral genome replication, it is also involved in viral mRNA capping. It associates with nucleoprotein in membranous complexes. | 14 |
| nsp14 | nsp14 (~527 a.a. in SARS-CoV-2) has both 3’-5’ exoribonuclease (proofreading during RNA replication) and N7-guanine methyltransferase (viral mRNA capping) activities. Interacts with nsp10. | 13 |
| nsp15 | nsp15 (~346 a.a. in SARS-CoV-2) is an endoribonuclease that favors cleavage of RNA at the 3’-ends of uridylates. Loss of nsp15 affects both viral replication and pathogenesis. It is also required for evasion of host cell dsRNA sensors. | 15 |
| nsp16 | nsp16 (~298 a.a. in SARS-CoV-2) interacts with and is activated by nsp10. Its 2’-O-methyltransferase activity is essential for viral mRNA capping. It may also work against host cell antiviral sensors. | 13 |
| ORF3a | ORF3a (~275 a.a. in SARS-CoV-2) is a multi-pass membrane protein that forms a homotetrameric viroporin in SARS-CoV. It interacts with accessory protein 7a, M, S, and E. May be involved in viral release. Importantly, it also activates both NF-κB and the NLRP3 inflammasome and contributes to the generation of cytokine storm. | 16 |
| ORF3b | ORF3b (~22 a.a. in SARS-CoV-2) differs from its 154 a.a. SARS-CoV ortholog due to the presence of four premature stop codons. Along with N and ORF6, ORF3b appears to block induction of IFN-I. This 22-residue variant is also present in SARS-CoV-2-related viral genomes in bats and pangolins. | 17 |
| ORF6 | ORF6 (~61 a.a. in SARS-CoV-2) appears to be a virulence factor in SARS-CoV. It was shown to be an antagonist of type I interferons (IFNs) and is involved in viral escape from the host innate immune system. | 18 |
| ORF7a | ORF7a (~121 a.a. in SARS-CoV-2) is a type I membrane protein that interacts with bone marrow stromal antigen 2 (BST-2) in SARS-CoV. BST-2 tethers virions to the host’s plasma membrane. ORF7a binding inhibits BST-2 glycosylation and interferes with this restriction activity. ORF7a also interacts with S, M, E, and ORF3a in SARS-CoV. | 19 |
| ORF7b | ORF7b (~43 a.a. in SARS-CoV-2) is a type III integral transmembrane protein in the Golgi apparatus. In SARS-CoV, it appears to be a viral attenuation factor. It may be involved in human infectivity of SARS-CoV-2. | 20 |
| ORF8 | ORF8 (~121 a.a. in SARS-CoV-2) has only 30% identity to the intact ORF8 of SARS-CoV and might be a luminal ER membrane-associated protein. It may trigger ATF6 activation and affect the unfolded protein response (UPR). Like ORF7b, it may be involved in human infectivity of SARS-CoV-2. | 20,21,22 |
| ORF9b | ORF9b (~97 a.a. in SARS-CoV-2) is encoded by an alternative ORF within the N gene. In SARS-CoV, it localizes to mitochondria and affects mitochondrial morphology and function, ultimately undermining host cell interferon responses. | 24 |
| ORF9c | ORF9c (~70 a.a. in SARS-CoV), also located in the N coding region, interacts with various host proteins including Sigma receptors, implying involvement in lipid remodeling and the ER stress response. It also might target NF-κB signaling. | 25 |
| ORF10 | ORF10 (~38 a.a. in SARS-CoV-2) interacts with factors in the CUL2 RING E3 ligase complex and thus may modulate ubiquitination. | 25 |
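
The approximate nsp lengths in Table 1 allow a quick arithmetic check on the replicase polyprotein. The Python sketch below sums the lengths of nsp1-nsp10 and nsp12-nsp16 (nsp11 is left out because, per the table, in pp1ab it becomes the N-terminus of nsp12), estimates what share of the ~29.9 kb genome that coding capacity represents, and counts the junctions between adjacent proteins. Attributing each junction to a protease by its upstream protein is an illustrative assumption based on the nsp3 and nsp5 rows, and all variable names are hypothetical.

```python
# Approximate lengths (a.a.) copied from Table 1, in polyprotein order.
# nsp11 is omitted: in pp1ab it forms the N-terminus of nsp12.
NSP_LENGTHS = {
    "nsp1": 180, "nsp2": 638, "nsp3": 1945, "nsp4": 500, "nsp5": 306,
    "nsp6": 290, "nsp7": 83, "nsp8": 198, "nsp9": 113, "nsp10": 139,
    "nsp12": 932, "nsp13": 601, "nsp14": 527, "nsp15": 346, "nsp16": 298,
}

GENOME_LENGTH_NT = 29_900  # ~29.9 kb, as stated in the text

# Total residues and the nucleotides needed to encode them (3 per codon).
total_residues = sum(NSP_LENGTHS.values())
coding_nt = total_residues * 3
genome_fraction = coding_nt / GENOME_LENGTH_NT

# Junctions between adjacent proteins (dicts preserve insertion order).
order = list(NSP_LENGTHS)
junctions = list(zip(order, order[1:]))

# Assumption for illustration: the junctions directly downstream of nsp1,
# nsp2, and nsp3 are attributed to PLpro (which releases nsp1-nsp3), and
# all remaining junctions to 3CLpro (nsp5).
plpro_sites = [j for j in junctions if j[0] in {"nsp1", "nsp2", "nsp3"}]
clpro_sites = [j for j in junctions if j not in plpro_sites]

if __name__ == "__main__":
    print(f"pp1ab residues (approx.): {total_residues}")
    print(f"coding nucleotides (approx.): {coding_nt}")
    print(f"share of genome: {genome_fraction:.0%}")
    print(f"PLpro junctions: {len(plpro_sites)}")
    print(f"3CLpro junctions: {len(clpro_sites)}")
```

Under these assumptions the table's lengths add up to 7096 residues, or roughly 71% of the genome, and the 11 junctions attributed to 3CLpro agree with the 11 cleavage sites listed for nsp5.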



  1. Cell. 2020 Mar 4. pii: S0092-8674(20)30229-4.
  2. Sci China Life Sci. 2020 Apr 10.
  3. Virol J. 2009 Jun 18;6:79.
  4. Virol J. 2019 May 27;16(1):69.
  5. PLoS One. 2013 Apr 29;8(4):e62416.
  6. J Virol. 2009 Oct;83(19):10314-8.
  7. Antiviral Res. 2015 Mar;115:21-38.
  8. Virology. 2017 Oct;510:165-174.
  9. Acta Pharm Sin B. 2020 Feb 27.
  10. mBio. 2013 Aug 13;4(4).
  11. Autophagy. 2014 Aug 1;10(8):1426-1441.
  12. Nat Commun. 2019 May 28;10(1):2342.
  13. Adv Virus Res. 2016;96:59-126.
  14. Sci Rep. 2020 Mar 11;10(1):4481.
  15. Proc Natl Acad Sci U S A. 2017 May 23;114(21):E4251-E4260.
  16. FASEB J. 2019 Aug;33(8):8865-8877.
  17. bioRxiv 2020 Epub.
  18. J Microbiol Immunol Infect. 2017 Jun;50(3):277-285.
  19. J Virol. 2015 Dec;89(23):11820-33.
  20. Virol J. 2009 Aug 24;6:131.
  21. Sci Rep. 2018 Oct 11;8(1):15177.
  22. Virology. 2009 May 10;387(2):402-13.
  23. J Virol. 2020 Apr 1;JVI.00411-20.
  24. J Immunol. 2014 Sep 15;193(6):3080-9.
  25. Nature. 2020 Apr 30.