
The current mutation

ID: V8594
DNA: 20997A>G
Protein: S6999S
Position: 21261








COV2Var annotation categories







Summary information of mutation (20997A>G)

Basic Information about Mutation.

Gene Information
  Gene ID: GU280_gp01_pp1ab
  Gene Name: ORF1ab_pp1ab
  Gene Type: protein_coding
  Genome position: 21261
  Reference genome (GenBank ID): NC_045512.2
  Mutation type: synonymous_variant
DNA Level
  DNA Mutation: 20997A>G
  Ref Seq: A
  Mut Seq: G
Protein Level
  Protein 1-letter Mutation: S6999S
  Protein 3-letter Mutation: Ser6999Ser
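
The genome position (21261) and the codon number (6999) listed above can be cross-checked from the coding-sequence coordinate 20997. The following is a minimal sketch, assuming the ORF1ab (pp1ab) coding sequence in NC_045512.2 is annotated as join(266..13468,13468..21555), so that the -1 ribosomal frameshift re-reads nucleotide 13468. The function names are illustrative and are not part of COV2Var.

```python
# Assumption: ORF1ab (pp1ab) CDS in NC_045512.2 is join(266..13468,13468..21555);
# the -1 ribosomal frameshift means genome position 13468 is read twice.
CDS_SEGMENTS = [(266, 13468), (13468, 21555)]

def cds_to_genome(cds_pos, segments=CDS_SEGMENTS):
    """Map a 1-based CDS coordinate to a 1-based genome coordinate."""
    offset = cds_pos
    for start, end in segments:
        length = end - start + 1
        if offset <= length:
            return start + offset - 1
        offset -= length  # move on to the next CDS segment
    raise ValueError("CDS position lies beyond the annotated segments")

def codon_of(cds_pos):
    """Return (codon number, position within the codon), both 1-based."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

print(cds_to_genome(20997))  # 21261
print(codon_of(20997))       # (6999, 3)
```

Under this assumption the change falls on the third base of codon 6999, which is consistent with the synonymous annotation S6999S.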

Overview of the genomic positions of Mutation.
Note: The 12 annotated genes were retrieved from GenBank (Accession: NC_045512.2). "MP" represents the genomic position of the mutation.





Analyzing the distribution of mutation (20997A>G) across geographic regions, temporal trends, and lineages

The count of genome sequences harboring this mutation and its distribution across global regions offer insights into regional variations.
Note: The distribution of the mutation is shown across 218 geographical regions, with color representing the genome sequence count. The data is obtained from GISAID's metadata, specifically capturing the regional distribution of genomic sequences.



The dynamic count of genome sequences containing this mutation over time.
Note: Clicking the "Count" or "Cumulative Count" button toggles the view. Count represents the number of genome sequences per month. Cumulative count represents the accumulated total count up to the respective month. The data is obtained from GISAID's metadata, specifically capturing the collection date of genomic sequences.
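
The relation between the two views can be sketched as a running sum. This example assumes that the monthly count plotted here equals the "Total lineage monthly counts" column of the lineage table on this page; only the first five months with data are used.

```python
from itertools import accumulate

# Monthly totals copied from the "Total lineage monthly counts" column of the
# lineage table on this page (assumption: these equal the plotted monthly counts).
monthly = {"2020-04": 2, "2020-07": 1, "2020-08": 5, "2020-09": 3, "2020-10": 4}

months = sorted(monthly)  # "YYYY-MM" strings sort chronologically
cumulative = dict(zip(months, accumulate(monthly[m] for m in months)))

for month in months:
    print(month, monthly[month], cumulative[month])
```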



For every time point represented in the graph above, identifying the top 3 lineages with the highest count of genome sequences carrying this mutation aids in pinpointing noteworthy lineages for further analysis.
Note: Users can filter the lineages by entering a "Year-Month" term in the search box. For example, entering 2020-01 will display lineages that appeared in January 2020. The data is obtained from GISAID's metadata, specifically capturing the collection date of genomic sequences.

Collection date Lineage Total lineage monthly counts Lineage-specific monthly counts Lineage-specific monthly frequency
2020-10 B.1.177 4 3 7.50e-1
2020-10 B.1.1.70 4 1 2.50e-1
2020-11 B.1.2 2 1 5.00e-1
2020-11 B.1.258.17 2 1 5.00e-1
2020-12 B.1.2 14 6 4.29e-1
2020-12 B.1.1.63 14 2 1.43e-1
2020-12 B.1.177 14 2 1.43e-1
2020-04 B.1.371 2 1 5.00e-1
2020-04 B.1.595 2 1 5.00e-1
2020-07 B.1.546 1 1 1.00e+0
2020-08 B.1.1.28 5 5 1.00e+0
2020-09 B.1.1.28 3 3 1.00e+0
2021-01 B.1.2 43 18 4.19e-1
2021-01 B.1.1.63 43 8 1.86e-1
2021-01 B.1.1.25 43 4 9.30e-2
2021-10 AY.39.1 672 628 9.35e-1
2021-10 AY.4 672 10 1.49e-2
2021-10 B.1.617.2 672 6 8.93e-3
2021-11 AY.39.1 767 648 8.45e-1
2021-11 AY.109 767 33 4.30e-2
2021-11 AY.4.2 767 19 2.48e-2
2021-12 AY.39.1 514 417 8.11e-1
2021-12 AY.109 514 21 4.09e-2
2021-12 AY.4.2 514 17 3.31e-2
2021-02 B.1.2 32 21 6.56e-1
2021-02 B.1.427 32 4 1.25e-1
2021-02 B.1.1.7 32 2 6.25e-2
2021-03 B.1.427 32 12 3.75e-1
2021-03 B.1.2 32 11 3.44e-1
2021-03 B.1.1.7 32 8 2.50e-1
2021-04 B.1.1.7 17 12 7.06e-1
2021-04 B.1.1.318 17 2 1.18e-1
2021-04 B.1.2 17 2 1.18e-1
2021-05 B.1.1.7 7 5 7.14e-1
2021-05 B.1.617.2 7 1 1.43e-1
2021-05 B.1.626 7 1 1.43e-1
2021-06 AY.39.1 13 3 2.31e-1
2021-06 B.1.1.7 13 3 2.31e-1
2021-06 P.1.4 13 3 2.31e-1
2021-07 AY.39.1 210 195 9.29e-1
2021-07 P.1.4 210 5 2.38e-2
2021-07 AY.25 210 2 9.52e-3
2021-08 AY.39.1 665 632 9.50e-1
2021-08 B.1.617.2 665 10 1.50e-2
2021-08 AY.43 665 5 7.52e-3
2021-09 AY.39.1 890 849 9.54e-1
2021-09 B.1.617.2 890 7 7.87e-3
2021-09 AY.4 890 6 6.74e-3
2022-01 BA.1.1 76 26 3.42e-1
2022-01 AY.39.1 76 25 3.29e-1
2022-01 AY.109 76 7 9.21e-2
2022-10 BA.5.5 21 7 3.33e-1
2022-10 XBB.1 21 6 2.86e-1
2022-10 BQ.1.1.1 21 4 1.90e-1
2022-11 BQ.1.1.1 66 41 6.21e-1
2022-11 BQ.1.1.3 66 17 2.58e-1
2022-11 BA.5.6.4 66 2 3.03e-2
2022-12 BQ.1.1.1 125 96 7.68e-1
2022-12 BQ.1.1.3 125 8 6.40e-2
2022-12 BQ.1.1 125 5 4.00e-2
2022-02 BA.1.1 50 32 6.40e-1
2022-02 BA.1 50 6 1.20e-1
2022-02 AY.109 50 3 6.00e-2
2022-03 BA.1.1 37 26 7.03e-1
2022-03 BA.2 37 6 1.62e-1
2022-03 BA.1.15 37 3 8.11e-2
2022-04 BA.1.1 14 4 2.86e-1
2022-04 BA.2 14 4 2.86e-1
2022-04 BA.2.36 14 3 2.14e-1
2022-05 BA.2 12 8 6.67e-1
2022-05 BA.2.9 12 2 1.67e-1
2022-05 BA.1.1 12 1 8.33e-2
2022-06 BA.5.5 13 5 3.85e-1
2022-06 BA.2 13 3 2.31e-1
2022-06 BA.2.3.12 13 2 1.54e-1
2022-07 BA.5.5 72 58 8.06e-1
2022-07 BA.5.6 72 5 6.94e-2
2022-07 BA.4.6 72 3 4.17e-2
2022-08 BA.5.5 38 34 8.95e-1
2022-08 BA.5.6 38 2 5.26e-2
2022-08 BA.5.1.3 38 1 2.63e-2
2022-09 BA.5.5 20 13 6.50e-1
2022-09 BA.5 20 3 1.50e-1
2022-09 BA.4 20 1 5.00e-2
2023-01 BQ.1.1.1 42 23 5.48e-1
2023-01 BA.2.75.4 42 5 1.19e-1
2023-01 BQ.1.1.3 42 4 9.52e-2
2023-02 BN.1.3 2 1 5.00e-1
2023-02 BQ.1.1.1 2 1 5.00e-1
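
The last column of the table above is the lineage-specific monthly count divided by the total monthly count. A minimal sketch of that calculation and of the "Year-Month" filter, using three rows copied from the table (the function names are illustrative):

```python
# Rows copied from the table above:
# (collection month, lineage, total monthly count, lineage-specific monthly count)
rows = [
    ("2020-10", "B.1.177", 4, 3),
    ("2020-10", "B.1.1.70", 4, 1),
    ("2021-09", "AY.39.1", 890, 849),
]

def monthly_frequency(lineage_count, total_count):
    """Share of that month's mutation-carrying sequences assigned to one lineage."""
    return lineage_count / total_count

def filter_by_month(rows, year_month):
    """Keep only rows whose collection month matches a "YYYY-MM" search term."""
    return [r for r in rows if r[0] == year_month]

for month, lineage, total, count in filter_by_month(rows, "2020-10"):
    print(month, lineage, f"{monthly_frequency(count, total):.2e}")
```

Python prints two-digit exponents (for example 7.50e-01), whereas the table uses the shorter form 7.50e-1.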

The count of genome sequences and the frequency of this mutation in each lineage.
Note: Displaying mutation frequencies (>0.01) among 2,735 lineages. Mutation count represents the count of sequences carrying this mutation. Users can filter the lineages by entering a search term in the search box. For example, entering "A.1" will display A.1 lineages. The data is obtained from GISAID's metadata, specifically capturing the lineage of genomic sequences.

Mutation ID Lineage Mutation frequency Mutation count Earliest lineage emergence Latest lineage emergence
V8594 AY.39.1 1.52e-1 3398 2021-1-6 2022-2-9
V8594 AY.109 1.65e-2 64 2021-1-10 2022-2-21
V8594 P.1.4 1.02e-2 11 2021-3-19 2021-11-11
V8594 BQ.1.1.1 3.55e-2 165 2022-1-4 2023-2-20
V8594 BA.5.6.4 1.88e-2 3 2022-7-3 2023-1-24
V8594 Y.1 2.63e-2 2 2020-11-3 2021-3-26






Examining mutation (20997A>G) found in abundant sequences of non-human animal hosts

Exploring mutation presence across 35 non-human animal hosts for cross-species transmission.
Note: We retained the mutations that appear in at least three non-human animal hosts' sequences. The data is obtained from GISAID's metadata, specifically capturing the host of genomic sequences.

Animal host Lineage Source region Collection date Accession ID




Association between mutation (20997A>G) and patients of different ages, genders, and statuses

Note: The logistic regression model was employed to examine changes in patient data before and after the mutation. The logistic regression model was conducted using the glm function in R. The data is obtained from GISAID's metadata, specifically capturing the patient status, gender, and age of genomic sequences.

Analyzing the association between mutation and patient status.
Note: We categorized the data into different patient statuses (ambulatory, deceased, homebound, hospitalized, mild, and recovered) based on GISAID classifications. In the analysis exploring the association between mutation and patient status, the model included mutation, patient status, patient age, gender, sequence region of origin, and sequence collection time point. An 'increase' direction means that the presence of this mutation increases the proportion of the corresponding effect; a 'decrease' direction means that it decreases that proportion. A p-value lower than 0.001 signifies a notable difference between the populations with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient status Ambulatory -1.63e+1 2.88e+3 -5.68e-3 9.95e-1 Decrease
Deceased 3.23e+1 8.79e+4 3.67e-4 1.00e+0 Increase
Homebound -1.19e-13 1.87e+5 -6.36e-19 1.00e+0 Decrease
Hospitalized 2.96e+0 1.77e+0 1.67e+0 9.41e-2 Increase
Mild -1.41e+1 3.14e+3 -4.50e-3 9.96e-1 Decrease
Recovered -1.85e+1 1.51e+3 -1.22e-2 9.90e-1 Decrease
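
The Z-value and P-value columns in the table above are consistent with a Wald test, in which the estimate is divided by its standard error and a two-sided normal p-value is taken. The sketch below reproduces the "Hospitalized" row from its rounded Estimate and SE; it assumes this is how the glm summary values were derived and does not refit the model.

```python
import math

def wald_test(estimate, se):
    """Return (z, two-sided p-value) for a coefficient under a normal approximation."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# "Hospitalized" row: Estimate 2.96e+0, SE 1.77e+0 (rounded values from the table)
z, p = wald_test(2.96, 1.77)
print(f"z = {z:.2f}, p = {p:.3f}")

# Threshold stated in the note: p < 0.001 is treated as a notable difference.
print("notable" if p < 0.001 else "not notable")
```

Because the inputs are rounded to three significant figures, the recomputed values agree with the table only approximately.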

Analyzing the association between mutation and patient age.
Note: We categorized the data into different patient age groups (0-17, 18-39, 40-64, 65-84, and 85+). In the analysis exploring the association between mutation and patient age, the model included mutation, patient age, gender, sequence region of origin, and sequence collection time point. An 'increase' direction means that the presence of this mutation increases the proportion of the corresponding effect; a 'decrease' direction means that it decreases that proportion. A p-value lower than 0.001 signifies a notable difference between the populations with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient age, years 0-17 2.81e-1 8.44e-2 3.33e+0 8.72e-4 Increase
18-39 -2.79e-1 4.85e-2 -5.75e+0 9.11e-9 Decrease
40-64 4.12e-2 5.01e-2 8.22e-1 4.11e-1 Increase
65-84 2.30e-1 7.20e-2 3.19e+0 1.43e-3 Increase
>=85 1.35e-2 1.50e-1 8.99e-2 9.28e-1 Increase

Analyzing the association between mutation and patient gender.
Note: We categorized the data by patient gender (male and female). In the analysis exploring the association between mutation and patient gender, the model included mutation, patient gender, patient age, sequence region of origin, and sequence collection time point. An 'increase' direction means that the presence of this mutation increases the proportion of the corresponding effect; a 'decrease' direction means that it decreases that proportion. A p-value lower than 0.001 signifies a notable difference between the populations with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient gender Male 8.23e-3 4.75e-2 1.73e-1 8.63e-1 Increase





Investigating natural selection at mutation (20997A>G) site for genetic adaptation and diversity

Note: Investigating the occurrence of positive selection or negative selection at this mutation site reveals implications for genetic adaptation and diversity.

The MEME method within the HyPhy software was employed to analyze positive selection. MEME: episodic selection.
Note: List of sites found to be under episodic selection by MEME (p < 0.05). "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site P-value Lineage Method

The FEL method within the HyPhy software was employed to analyze both positive and negative selection. FEL: pervasive selection on small datasets.
Note: List of sites found to be under pervasive selection by FEL (p < 0.05). A beta value greater than alpha signifies positive selection, while a beta value smaller than alpha signifies negative selection. "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site Alpha Beta P-value Lineage Method

The FUBAR method within the HyPhy software was employed to analyze both positive and negative selection. FUBAR: pervasive selection on large datasets.
Note: List of sites found to be under pervasive selection by FUBAR (prob > 0.95). A prob[alpha < beta] value exceeding 0.95 indicates positive selection, while a prob[alpha > beta] value exceeding 0.95 indicates negative selection. "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site Prob[alpha>beta] Prob[alpha<beta] Lineage Method
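
The decision rules stated in the FEL and FUBAR notes above can be written as two small functions. This is a sketch of the stated thresholds only; it does not run HyPhy, and the input values in the example are hypothetical, since no sites are listed for this mutation.

```python
def classify_fel(alpha, beta, p_value, threshold=0.05):
    """FEL rule from the note: significant if p < 0.05; beta > alpha is positive selection."""
    if p_value >= threshold:
        return "not significant"
    if beta > alpha:
        return "positive"
    if beta < alpha:
        return "negative"
    return "neutral"

def classify_fubar(prob_alpha_gt_beta, prob_alpha_lt_beta, threshold=0.95):
    """FUBAR rule from the note: a posterior probability above 0.95 decides the direction."""
    if prob_alpha_lt_beta > threshold:
        return "positive"
    if prob_alpha_gt_beta > threshold:
        return "negative"
    return "not significant"

# Hypothetical values for illustration only
print(classify_fel(alpha=1.2, beta=0.3, p_value=0.01))
print(classify_fubar(prob_alpha_gt_beta=0.02, prob_alpha_lt_beta=0.97))
```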




Alterations in protein physicochemical properties induced by mutation (20997A>G)

Understanding the alterations in protein physicochemical properties can reveal the evolutionary processes and adaptive changes of viruses
Note: ProtParam software was used for the analysis of physicochemical properties. Significant change threshold: A change exceeding 10% compared to the reference is considered a significant change. "GRAVY" is an abbreviation for "grand average of hydropathicity".

Group Protein name Molecular weight Theoretical PI Extinction coefficients Aliphatic index GRAVY
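
The 10% threshold in the note above can be sketched as a relative-change check. The formula |mutant - reference| / |reference| is an assumption about how the change is measured, and the values in the example are hypothetical.

```python
def is_significant_change(reference, mutant, threshold=0.10):
    """True if the property differs from the reference by more than the threshold (relative)."""
    if reference == 0:
        raise ValueError("relative change is undefined for a zero reference value")
    return abs(mutant - reference) / abs(reference) > threshold

# Hypothetical property values for illustration only
print(is_significant_change(100.0, 111.0))  # 11% change
print(is_significant_change(100.0, 105.0))  # 5% change
```

A synonymous mutation such as S6999S leaves the protein sequence unchanged, so every property would show zero change under this check.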




Alterations in protein stability induced by mutation (20997A>G)

The impact of mutations on protein stability directly or indirectly affects the biological characteristics, adaptability, and transmission capacity of the virus
Note: I-Mutant 2.0 was utilized to analyze the effects of mutations on protein stability. A pH of 7 and a temperature of 25°C are used to replicate the in vitro environment. A pH of 7.4 and a temperature of 37°C are used to simulate the in vivo environment.

Mutation Protein name Mutation type Position ΔDDG Stability pH Temperature Condition




Impact on protein function induced by mutation (20997A>G)

The impact of mutations on protein function
Note: The MutPred2 software was used to predict the pathogenicity of a mutation and to give the molecular mechanism of pathogenicity. A score above 0.5 indicates an increased likelihood of pathogenicity. "Pr" is the abbreviation for "proportion". "P" is the abbreviation for "p-value".

Mutation Protein name Mutation type Score Molecular mechanisms




Exploring mutation (20997A>G) distribution within intrinsically disordered protein regions

Intrinsically disordered proteins (IDPs) are proteins or protein regions that have no unique 3D structure. In viral proteins, mutations in the disordered regions are critical for immune evasion and antibody escape, suggesting potential additional implications for vaccines and monoclonal therapeutic strategies.
Note: The IUPred3 software was utilized for analyzing IDPs. A score greater than 0.5 is considered indicative of an IDP. In the plot, "POS" represents the position of the mutation.





Alterations in enzyme cleavage sites induced by mutation (20997A>G)

Exploring the impact of mutations on the cleavage sites of 28 enzymes.
Note: The PeptideCutter software was used for detecting enzyme cleavage sites. Increased cleavage sites are those present in the mutated protein but absent from the reference protein. Conversely, decreased cleavage sites are those present in the reference protein but absent from the mutated protein.

Mutation Protein name Genome position Enzyme name Increased cleavage sites Decreased cleavage sites




Impact of spike protein mutation (20997A>G) on antigenicity and immunogenicity

Investigating the impact of mutations on antigenicity and immunogenicity carries important implications for vaccine design and our understanding of immune responses.
Note: An absolute change greater than 0.0102 (three times the median across sites) in antigenicity score is considered significant. An absolute change greater than 0.2754 (three times the median across sites) in immunogenicity score is considered significant. The VaxiJen tool was utilized for antigenicity analysis; antigens with a VaxiJen prediction score above 0.4 are considered candidate antigens. The IEDB tool was used for immunogenicity analysis; an MHC I immunogenicity score > 0 indicates a higher probability of stimulating an immune response.

Group Protein name Protein region Antigenicity score Immunogenicity score




Impact of mutation (20997A>G) on viral transmissibility by the affinity between RBD and ACE2 receptor

Unraveling the impact of mutations on the interaction between the receptor binding domain (RBD) and ACE2 receptor using deep mutational scanning (DMS) experimental data to gain insights into their effects on viral transmissibility.
Note: The ΔBinding affinity represents the disparity between the binding affinity of a mutation and the reference binding affinity. A positive Δbinding affinity value (Δlog10(KD,app) > 0) signifies an increased affinity between RBD and ACE2 receptor due to the mutation. Conversely, a negative value (Δlog10(KD,app) < 0) indicates a reduced affinity between RBD and ACE2 receptor caused by the mutation. A p-value smaller than 0.05 indicates significance. "Ave mut bind" represents the average binding affinity of this mutation. "Ave ref bind" refers to the average binding affinity at a site without any mutation (reference binding affinity).

Mutation Protein name Protein region Mutation Position Ave mut bind Ave ref bind ΔBinding affinity P-value Image


The interface between the receptor binding domain (RBD) and ACE2 receptor is depicted in the crystal structure 6M0J.
Note: The structure 6M0J encompasses the RBD range of 333 to 526. The binding sites (403-406, 408, 417, 439, 445-447, 449, 453, 455-456, 473-478, 484-498, and 500-506) on the RBD that interface with ACE2 are indicated in magenta. These RBD binding sites were identified through the interface footprints experiment. The ACE2 binding sites within the interface are shown in cyan, representing residues within 5Å proximity to the RBD binding sites. A mutation within the RBD range of 333 to 526 is depicted in red.






Impact of mutation (20997A>G) on immune escape by the affinity between RBD and antibody/serum

By utilizing experimental data from deep mutational scanning (DMS), we can uncover how mutations affect the interaction between the receptor binding domain (RBD) and antibodies/serum. This approach provides valuable insights into strategies for evading the host immune response.
Note: We considered a mutation to mediate strong escape if the escape score exceeded 0.1 (10% of the maximum score of 1). A total of 1,504 antibodies/serum data were collected for this analysis. "Condition name" refers to the name of the antibodies/serum. "Mut escape score" represents the escape score of the mutation in that specific condition. "Avg mut escape score" indicates the average escape score of the mutation site in that condition, considering the occurrence of this mutation and other mutations. Class 1 antibodies bind to an epitope only in the RBD “up” conformation, and are the most abundant. Class 2 antibodies bind to the RBD both in “up” and “down” conformations. Class 3 and class 4 antibodies both bind outside the ACE2 binding site. Class 3 antibodies bind the RBD in both the open and closed conformation, while class 4 antibodies bind only in the open conformation.

Mutation Condition name Condition type Condition subtype Condition year Mut escape score Avg mut escape score




Investigating the co-mutation patterns of mutation (20997A>G) across 2,735 viral lineages

Investigating the co-mutation patterns of SARS-CoV-2 across 2,735 viral lineages to unravel the cooperative effects of different mutations. In biological research, correlation analysis of mutation sites helps us understand whether there is a close relationship or interaction between certain mutations.
Note: The Spearman correlation coefficient is used to calculate the correlation between two mutations within each Pango lineage. The Holm–Bonferroni method was used for multiple testing adjustment. We retained mutation pairs with correlation values greater than 0.6 or less than -0.6 and Holm–Bonferroni corrected p-values less than 0.05.

Associated mutation ID DNA mutation Mutation type Protein name Protein mutation Correlation coefficient Lineage
V1552 6542C>T missense_variant ORF1ab_pp1a T2181I 7.07e-1 BA.1.1.2
V2454 13217A>G splice_region_variant&stop_retained_variant ORF1ab_pp1a Ter4406Ter 6.32e-1 AY.20
V4204 3380A>G missense_variant S D1127G 6.74e-1 AY.20
V8673 324T>C synonymous_variant S T108T 7.75e-1 AY.20
V2062 10463C>T missense_variant ORF1ab_pp1a T3488I 9.16e-1 AY.109
V8587 20967A>G synonymous_variant ORF1ab_pp1ab T6989T 9.92e-1 AY.109
V864 2912C>T missense_variant ORF1ab_pp1a P971L 9.42e-1 AY.109
V9127 3771T>C synonymous_variant S D1257D 9.49e-1 AY.109
V9348 385C>T synonymous_variant M L129L 8.67e-1 AY.109
V2454 13217A>G splice_region_variant&stop_retained_variant ORF1ab_pp1a Ter4406Ter 7.56e-1 AY.112
V7585 13107C>T synonymous_variant ORF1ab_pp1a V4369V 7.71e-1 AY.112
V968 3287C>T missense_variant ORF1ab_pp1a P1096L 8.02e-1 AY.112
V3132 18638A>G missense_variant ORF1ab_pp1ab K6213R 1.00e+0 AY.116
V5141 180G>T missense_variant ORF8 L60F 1.00e+0 AY.116
V9060 3342C>T synonymous_variant S I1114I 1.00e+0 AY.116
V1694 7664A>G missense_variant ORF1ab_pp1a K2555R 7.07e-1 AY.121
V5155 197_202delGTTCTA disruptive_inframe_deletion ORF8 G66_K68delinsE 7.07e-1 AY.1
V2976 17561C>T missense_variant ORF1ab_pp1ab T5854I 1.00e+0 AY.34.2
V4398 160G>T missense_variant ORF3a A54S 7.07e-1 AY.34.2
V4955 251C>T missense_variant ORF7a P84L 1.00e+0 AY.34.2
V826 2782G>T missense_variant ORF1ab_pp1a D928Y 1.00e+0 AY.34.2
V6197 2328T>C synonymous_variant ORF1ab_pp1a A776A 1.00e+0 AY.34
V5696 31T>C missense_variant ORF10 F11L 1.00e+0 AY.38
V6375 3627G>A synonymous_variant ORF1ab_pp1a E1209E 8.53e-1 AY.4.5
V236 352T>C missense_variant ORF1ab_pp1a Y118H 8.43e-1 AY.46.6
V2575 13986G>T missense_variant ORF1ab_pp1ab L4662F 6.04e-1 AY.46.6
V6033 1200T>C synonymous_variant ORF1ab_pp1a G400G 1.00e+0 AY.4.6
V2516 13633G>T missense_variant ORF1ab_pp1ab D4545Y 7.07e-1 AY.46
V4061 2240C>T missense_variant S T747I 7.07e-1 AY.62
V5278 47C>T missense_variant N T16M 1.00e+0 AY.62
V909 3058G>A missense_variant ORF1ab_pp1a V1020I 1.00e+0 AY.62
V9323 279C>T synonymous_variant M L93L 7.07e-1 AY.62
V5329 118C>T missense_variant N R40C 6.67e-1 AY.75
V6194 2286T>C synonymous_variant ORF1ab_pp1a D762D 1.00e+0 B.1.1.207
V8779 1167T>C synonymous_variant S D389D 1.00e+0 B.1.1.207
V1017 3472C>T missense_variant ORF1ab_pp1a P1158S 6.32e-1 B.1.1.25
V5772 *4354C>T downstream_gene_variant S None 1.00e+0 B.1.1.25
V6413 3942G>T synonymous_variant ORF1ab_pp1a A1314A 6.32e-1 B.1.1.28
V2067 10496A>G missense_variant ORF1ab_pp1a K3499R 1.00e+0 B.1.1.416
V2443 13159C>T missense_variant ORF1ab_pp1a R4387C 7.07e-1 B.1.1.416
V3228 19322C>T missense_variant ORF1ab_pp1ab T6441I 1.00e+0 B.1.1.416
V5399 455C>T missense_variant N A152V 1.00e+0 B.1.1.416
V5513 702G>T missense_variant N M234I 8.94e-1 B.1.1.416
V6730 6276C>T synonymous_variant ORF1ab_pp1a H2092H 1.00e+0 B.1.1.416
V7239 10329C>T synonymous_variant ORF1ab_pp1a N3443N 8.94e-1 B.1.1.416
V9448 168G>T synonymous_variant ORF7a L56L 8.16e-1 B.1.1.416
V3267 19631C>T missense_variant ORF1ab_pp1ab A6544V 8.55e-1 B.1.1.63
V5682 8A>G missense_variant ORF10 Y3C 8.04e-1 B.1.1.63
V6141 1932C>T synonymous_variant ORF1ab_pp1a D644D 8.04e-1 B.1.1.63
V7644 13596C>T synonymous_variant ORF1ab_pp1ab D4532D 6.38e-1 B.1.1.63
V3924 1480T>C missense_variant S S494P 1.00e+0 B.1.1.70
V1302 5279C>T missense_variant ORF1ab_pp1a T1760I 7.07e-1 B.1.177.44
V2767 15958C>T missense_variant ORF1ab_pp1ab H5320Y 7.07e-1 B.1.177.44
V4340 80A>G missense_variant ORF3a D27G 7.07e-1 B.1.239
V6996 8517C>T synonymous_variant ORF1ab_pp1a S2839S 1.00e+0 B.1.239
V9762 972C>T synonymous_variant N V324V 1.00e+0 B.1.239
V4313 41C>T missense_variant ORF3a T14I 7.07e-1 B.1.258.17
V6666 5880C>T synonymous_variant ORF1ab_pp1a F1960F 7.07e-1 BA.2.13
V6035 1206C>T synonymous_variant ORF1ab_pp1a R402R 8.16e-1 BA.2.3.12
V6164 2133G>A synonymous_variant ORF1ab_pp1a T711T 1.00e+0 BA.2.3.12
V7027 8721C>T synonymous_variant ORF1ab_pp1a D2907D 1.00e+0 BA.2.3.12
V7384 11568C>T synonymous_variant ORF1ab_pp1a A3856A 1.00e+0 BA.2.3.12
V7414 11760C>T synonymous_variant ORF1ab_pp1a S3920S 1.00e+0 BA.2.3.12
V7611 13284C>T synonymous_variant ORF1ab_pp1ab D4428D 6.32e-1 BA.2.3.12
V5984 834T>C synonymous_variant ORF1ab_pp1a N278N 1.00e+0 BA.2.36
V6890 7671G>T synonymous_variant ORF1ab_pp1a A2557A 1.00e+0 BA.2.75.4
V3602 250C>A missense_variant S L84I 1.00e+0 BA.4.6.4
V7765 14454G>A synonymous_variant ORF1ab_pp1ab K4818K 1.00e+0 BA.4.6.4
V9828 75C>T synonymous_variant ORF10 N25N 1.00e+0 BA.4.6.4
V227 334G>T missense_variant ORF1ab_pp1a G112C 8.16e-1 BA.4
V6083 1588C>T synonymous_variant ORF1ab_pp1a L530L 6.55e-1 BA.4
V6925 7893T>C synonymous_variant ORF1ab_pp1a S2631S 7.07e-1 BA.4
V214 296G>A missense_variant ORF1ab_pp1a R99H 7.07e-1 BA.5.1.3
V46 -92G>A upstream_gene_variant ORF1ab_pp1a None 7.07e-1 BA.5.2.13
V1836 8549C>T missense_variant ORF1ab_pp1a A2850V 6.71e-1 BA.5
V5726 *4304G>T downstream_gene_variant S None 8.94e-1 BA.5
V1840 8570T>C missense_variant ORF1ab_pp1a V2857A 1.00e+0 BE.4.1
V2831 16595C>T missense_variant ORF1ab_pp1ab A5532V 1.00e+0 BE.4.1
V714 2245A>G missense_variant ORF1ab_pp1a T749A 1.00e+0 BE.4.1
V8972 2637G>A synonymous_variant S A879A 1.00e+0 BE.4.1
V9668 525G>T synonymous_variant N G175G 1.00e+0 BF.8
V9757 954G>A synonymous_variant N S318S 1.00e+0 BF.8
V1162 4390C>T missense_variant ORF1ab_pp1a R1464W 7.07e-1 B
V3730 635T>C missense_variant S L212S 1.00e+0 B
V818 2739G>T missense_variant ORF1ab_pp1a E913D 7.07e-1 BN.1.2
V714 2245A>G missense_variant ORF1ab_pp1a T749A 7.07e-1 BQ.1.11
V6488 4518A>G synonymous_variant ORF1ab_pp1a S1506S 1.00e+0 BQ.1.1.2
V6995 8508A>G synonymous_variant ORF1ab_pp1a T2836T 1.00e+0 BQ.1.1.2
V6996 8517C>T synonymous_variant ORF1ab_pp1a S2839S 1.00e+0 BQ.1.1.2
V714 2245A>G missense_variant ORF1ab_pp1a T749A 1.00e+0 BQ.1.1.2
V7573 12996C>T synonymous_variant ORF1ab_pp1a C4332C 1.00e+0 BQ.1.1.2
V163 184C>T missense_variant ORF1ab_pp1a P62S 7.07e-1 BQ.1.1.5
V3058 18119T>C missense_variant ORF1ab_pp1ab V6040A 7.07e-1 BQ.1.1.5
V7347 11322T>C synonymous_variant ORF1ab_pp1a N3774N 1.00e+0 BQ.1.1.5
V7463 12135C>A synonymous_variant ORF1ab_pp1a L4045L 1.00e+0 BQ.1.1.7
V9106 3597T>C synonymous_variant S D1199D 1.00e+0 BQ.1.1.7
V3528 42G>T missense_variant S Q14H 7.07e-1 BQ.1.8
V3447 20813C>T missense_variant ORF1ab_pp1ab T6938I 7.06e-1 DR.1
V522 1493C>T missense_variant ORF1ab_pp1a A498V 8.16e-1 DR.1
V7844 15102T>C synonymous_variant ORF1ab_pp1ab H5034H 7.06e-1 DR.1
V1267 4958C>T missense_variant ORF1ab_pp1a T1653I 1.00e+0 P.1.4
V3442 20770C>T missense_variant ORF1ab_pp1ab L6924F 6.01e-1 P.1.4
V3934 1562_1563insTCC disruptive_inframe_insertion S P521dup 1.00e+0 B.1.546
V1539 6461C>T missense_variant ORF1ab_pp1a T2154I 1.00e+0 B.1.566
V1810 8390C>T missense_variant ORF1ab_pp1a S2797F 1.00e+0 B.1.566
V1811 8393A>G missense_variant ORF1ab_pp1a K2798R 1.00e+0 B.1.566
V3267 19631C>T missense_variant ORF1ab_pp1ab A6544V 1.00e+0 B.1.566
V3526 35C>T missense_variant S S12F 1.00e+0 B.1.566
V5682 8A>G missense_variant ORF10 Y3C 1.00e+0 B.1.566
V6446 4191C>T synonymous_variant ORF1ab_pp1a A1397A 1.00e+0 B.1.566
V8924 2298T>C synonymous_variant S A766A 1.00e+0 B.1.566
V8994 2820C>T synonymous_variant S S940S 1.00e+0 B.1.566
V9736 871C>T synonymous_variant N L291L 1.00e+0 B.1.566
V531 1555G>A missense_variant ORF1ab_pp1a G519S 1.00e+0 B.1.626
V5368 307G>T missense_variant N D103Y 1.00e+0 B.1.626
V7256 10461C>T synonymous_variant ORF1ab_pp1a T3487T 1.00e+0 B.1.626
V9133 3792G>T synonymous_variant S V1264V 1.00e+0 B.1.626
V3747 662C>T missense_variant S S221L 1.00e+0 BN.1.1.1
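
The retention rule described in the note for this table (|correlation| > 0.6 and Holm–Bonferroni corrected p < 0.05) can be sketched as follows. The correlation coefficients and raw p-values are assumed to have been computed beforehand; the pair labels and numbers in the example are hypothetical, because the page does not report p-values.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

def retain_pairs(pairs, rho_cutoff=0.6, alpha=0.05):
    """pairs: list of (label, spearman_rho, raw_p). Keep strong, significant pairs."""
    adjusted = holm_bonferroni([p for _, _, p in pairs])
    return [
        (label, rho, q)
        for (label, rho, _), q in zip(pairs, adjusted)
        if abs(rho) > rho_cutoff and q < alpha
    ]

# Hypothetical mutation pairs within one lineage, for illustration only
pairs = [("pair_a", 0.92, 0.001), ("pair_b", 0.65, 0.04), ("pair_c", -0.30, 0.0005)]
for label, rho, q in retain_pairs(pairs):
    print(label, rho, q)
```

Here pair_c is dropped because its correlation is too weak, even though its adjusted p-value is small.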





Manual curation of mutation (20997A>G)-related literature from PubMed

The pubmed.mineR and pubmed-mapper were utilized for extracting literature from PubMed, followed by manual filtering.
Note: PubMed: (COVID-19 [Title/Abstract] OR SARS-COV-2 [Title/Abstract]) AND (DNA mutation [Title/Abstract] OR Protein mutation-1 letter [Title/Abstract] OR Protein mutation-3 letter [Title/Abstract]).

DNA level Protein level Paper title Journal name Publication year Pubmed ID
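
The search template in the note above can be filled in for a specific mutation. This sketch assumes that the three placeholders (DNA mutation, 1-letter and 3-letter protein mutation) are replaced by the values listed at the top of this page; it only builds the query string and does not contact PubMed or use pubmed.mineR.

```python
def build_pubmed_query(dna, protein_1, protein_3):
    """Fill the Title/Abstract search template with one mutation's identifiers."""
    virus = "(COVID-19 [Title/Abstract] OR SARS-COV-2 [Title/Abstract])"
    terms = " OR ".join(f"{term} [Title/Abstract]" for term in (dna, protein_1, protein_3))
    return f"{virus} AND ({terms})"

query = build_pubmed_query("20997A>G", "S6999S", "Ser6999Ser")
print(query)
```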