
The current mutation

ID: V169
DNA: 221C>T
Protein: S74L
Position: 486








COV2Var annotation categories







Summary information of mutation (221C>T)

Basic Information about Mutation.

  Gene Information   Gene ID   GU280_gp01_pp1a
  Gene Name   ORF1ab_pp1a
  Gene Type   protein_coding
  Genome position   486
  Reference genome   GenBank ID: NC_045512.2
  Mutation type   missense_variant
  DNA Level   DNA Mutation: 221C>T
  Ref Seq: C
  Mut Seq: T
  Protein Level   Protein 1-letter Mutation: S74L
  Protein 3-letter Mutation: Ser74Leu
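
The DNA-level name, genome position, and protein-level name above are related by simple coordinate arithmetic. The sketch below assumes the ORF1ab coding sequence starts at genome position 266 (the "Protein start" this page lists for the leader protein); the function names are illustrative and not part of COV2Var.

```python
# Assumption: the coding sequence starts at genome position 266, the
# "Protein start" value listed for the leader protein on this page.
CDS_START = 266


def cds_to_genome(cds_pos, cds_start=CDS_START):
    """Convert a 1-based coding-sequence position to a 1-based genome position."""
    return cds_start + cds_pos - 1


def codon_number(cds_pos):
    """Return the 1-based codon (amino acid) number that contains a CDS position."""
    return (cds_pos - 1) // 3 + 1


def position_in_codon(cds_pos):
    """Return 1, 2, or 3: where inside its codon the CDS position falls."""
    return (cds_pos - 1) % 3 + 1


print(cds_to_genome(221))      # genome position of 221C>T
print(codon_number(221))       # residue number in S74L
print(position_in_codon(221))  # codon position hit by the substitution
```

Under that assumption, 221C>T maps to genome position 486 and to codon 74, which agrees with the values listed in this record.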

Overview of the genomic positions of Mutation.
Note: The 12 annotated genes were retrieved from GenBank (Accession: NC_045512.2). "MP" represents the genomic position of the mutation.





Analyzing the distribution of mutation (221C>T) across geographic regions, temporal trends, and lineages

The count of genome sequences harboring this mutation and its distribution across global regions offer insights into regional variations.
Note: The distribution of the mutation across 218 geographical regions is shown, with color representing genome sequence counts. The data is obtained from GISAID's metadata, specifically capturing the regional distribution of genomic sequences.



The dynamic count of genome sequences containing this mutation over time.
Note: Clicking the "Count" or "Cumulative Count" button toggles the view. Count represents the number of genome sequences per month. Cumulative count represents the accumulated total count up to the respective month. The data is obtained from GISAID's metadata, specifically capturing the collection date of genomic sequences.
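
The relationship between the two views can be sketched as a running sum. The example assumes that the "Total lineage monthly counts" column further down this page gives the monthly number of sequences carrying the mutation; the five months used here are copied from that column.

```python
from itertools import accumulate

# Monthly totals copied from the "Total lineage monthly counts" column
# (assumed to be the number of sequences carrying the mutation that month).
monthly = {"2020-02": 2, "2020-03": 1, "2020-04": 3, "2020-05": 3, "2020-06": 4}

# Zero-padded "YYYY-MM" strings sort chronologically as plain strings.
months = sorted(monthly)
counts = [monthly[m] for m in months]

# Cumulative count: total accumulated up to and including each month.
cumulative = list(accumulate(counts))

for month, count, total in zip(months, counts, cumulative):
    print(month, count, total)
```

Sorting before accumulating matters here because the table on this page is not listed in strict chronological order.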



For every time point represented in the graph above, identifying the top 3 lineages with the highest count of genome sequences carrying this mutation aids in pinpointing noteworthy lineages for further analysis.
Note: Users can filter the lineages by entering a "Year-Month" term in the search box. For example, entering 2020-01 will display lineages that appeared in January 2020. The data is obtained from GISAID's metadata, specifically capturing the collection date of genomic sequences.

Collection date Lineage Total lineage monthly counts Lineage-specific monthly counts Lineage-specific monthly frequency
2020-10 B.1.2 2 2 1.00e+0
2020-11 AM.4 3 2 6.67e-1
2020-11 B.1.2 3 1 3.33e-1
2020-12 B.1.177 7 2 2.86e-1
2020-12 B.1.1.63 7 1 1.43e-1
2020-12 B.1.2 7 1 1.43e-1
2020-02 B 2 2 1.00e+0
2020-03 B.1.1 1 1 1.00e+0
2020-04 B.1.110 3 2 6.67e-1
2020-04 B.1 3 1 3.33e-1
2020-05 B.1.1.220 3 2 6.67e-1
2020-05 B.1.147 3 1 3.33e-1
2020-06 B.1.1.301 4 2 5.00e-1
2020-06 B.1.369 4 1 2.50e-1
2020-06 B.1.401 4 1 2.50e-1
2020-07 B.1.1 10 6 6.00e-1
2020-07 B.1 10 1 1.00e-1
2020-07 B.1.582 10 1 1.00e-1
2020-08 B.1 2 1 5.00e-1
2020-08 B.1.1.284 2 1 5.00e-1
2020-09 B.1.128 4 3 7.50e-1
2020-09 B.1.1.284 4 1 2.50e-1
2021-01 B.1.1.7 5 4 8.00e-1
2021-01 B 5 1 2.00e-1
2021-10 AY.4 176 54 3.07e-1
2021-10 AY.43 176 28 1.59e-1
2021-10 AY.46 176 20 1.14e-1
2021-11 AY.4 295 69 2.34e-1
2021-11 AY.46.6 295 57 1.93e-1
2021-11 AY.126 295 34 1.15e-1
2021-12 AY.4 198 22 1.11e-1
2021-12 AY.4.2 198 18 9.09e-2
2021-12 AY.46.6 198 15 7.58e-2
2021-02 B.1.2 15 5 3.33e-1
2021-02 B.1.1.416 15 3 2.00e-1
2021-02 B.1.1.519 15 3 2.00e-1
2021-03 B.1.1.7 18 8 4.44e-1
2021-03 B.1.1.519 18 4 2.22e-1
2021-03 B.1.36.35 18 2 1.11e-1
2021-04 B.1.1.7 18 13 7.22e-1
2021-04 B.1.617.2 18 3 1.67e-1
2021-04 AY.112 18 1 5.56e-2
2021-05 B.1.1.7 17 13 7.65e-1
2021-05 C.36.3 17 2 1.18e-1
2021-05 B.1.427 17 1 5.88e-2
2021-06 AY.24 12 3 2.50e-1
2021-06 B.1.1.7 12 3 2.50e-1
2021-06 AY.4 12 2 1.67e-1
2021-07 AY.42 34 8 2.35e-1
2021-07 AY.4 34 7 2.06e-1
2021-07 AY.122 34 4 1.18e-1
2021-08 AY.4 169 77 4.56e-1
2021-08 B.1.617.2 169 17 1.01e-1
2021-08 AY.122 169 14 8.28e-2
2021-09 AY.4 211 92 4.36e-1
2021-09 AY.46 211 38 1.80e-1
2021-09 B.1.617.2 211 10 4.74e-2
2022-01 BA.1.1 189 61 3.23e-1
2022-01 BA.1 189 31 1.64e-1
2022-01 BA.1.17.2 189 17 8.99e-2
2022-10 BE.10 242 159 6.57e-1
2022-10 BA.5.3.1 242 25 1.03e-1
2022-10 BA.4.1 242 5 2.07e-2
2022-11 BE.10 1124 1038 9.23e-1
2022-11 BA.5.3.1 1124 21 1.87e-2
2022-11 BQ.1.1 1124 12 1.07e-2
2022-12 BE.10 669 540 8.07e-1
2022-12 BQ.1.1 669 24 3.59e-2
2022-12 BA.5.3.1 669 17 2.54e-2
2022-02 BA.1.1 211 93 4.41e-1
2022-02 BA.2 211 46 2.18e-1
2022-02 BA.1 211 24 1.14e-1
2022-03 BA.2 224 157 7.01e-1
2022-03 BA.1.1 224 22 9.82e-2
2022-03 BA.2.9 224 9 4.02e-2
2022-04 BA.2 139 98 7.05e-1
2022-04 BA.2.3 139 10 7.19e-2
2022-04 BA.5.3.1 139 9 6.47e-2
2022-05 BA.2 134 61 4.55e-1
2022-05 BA.5.3.1 134 44 3.28e-1
2022-05 BA.2.3 134 7 5.22e-2
2022-06 BA.5.3.1 145 105 7.24e-1
2022-06 BA.2 145 10 6.90e-2
2022-06 BA.5.5 145 6 4.14e-2
2022-07 BA.5.3.1 176 90 5.11e-1
2022-07 BA.5.2.20 176 8 4.55e-2
2022-07 BE.1 176 8 4.55e-2
2022-08 BA.5.3.1 95 37 3.89e-1
2022-08 BE.1 95 13 1.37e-1
2022-08 BA.5.1.23 95 4 4.21e-2
2022-09 BA.5.3.1 114 44 3.86e-1
2022-09 BE.10 114 15 1.32e-1
2022-09 BA.5 114 7 6.14e-2
2023-01 BE.10 242 157 6.49e-1
2023-01 XBB.1.5 242 22 9.09e-2
2023-01 BN.1.3 242 6 2.48e-2
2023-02 BE.10 45 14 3.11e-1
2023-02 XBB.1.5 45 9 2.00e-1
2023-02 CH.1.1.3 45 7 1.56e-1
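
The last column of the table above appears to be the lineage-specific count divided by the monthly total. A minimal sketch of that calculation, with a hypothetical helper `sci` that mimics the page's number format:

```python
def sci(x):
    """Format a number like the tables on this page, e.g. 0.9235 -> '9.23e-1'."""
    mantissa, exponent = f"{x:.2e}".split("e")
    return f"{mantissa}e{int(exponent):+d}"


def monthly_frequency(lineage_count, month_total):
    """Share of a month's mutation-carrying sequences that belong to one lineage."""
    return lineage_count / month_total


# Rows copied from the table: (month, lineage, month total, lineage count).
rows = [
    ("2022-11", "BE.10", 1124, 1038),
    ("2020-11", "AM.4", 3, 2),
    ("2020-10", "B.1.2", 2, 2),
]

for month, lineage, total, count in rows:
    print(month, lineage, sci(monthly_frequency(count, total)))
```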

The count of genome sequences and the frequency of this mutation in each lineage.
Note: Displaying mutation frequencies (>0.01) among 2,735 lineages. Mutation count represents the count of sequences carrying this mutation. Users can filter the lineages by entering a search term in the search box. For example, entering "A.1" will display A.1 lineages. The data is obtained from GISAID's metadata, specifically capturing the lineage of genomic sequences.

Mutation ID Lineage Mutation frequency Mutation count Earliest lineage emergence Latest lineage emergence
V169 BA.5.3.1 4.05e-2 394 2022-2-11 2023-2-16
V169 BE.10 9.84e-1 1928 2022-1-12 2023-2-15
V169 DC.1 1.36e-2 3 2022-8-8 2023-1-9






Examining mutation (221C>T) found in abundant sequences of non-human animal hosts

Exploring mutation presence across 35 non-human animal hosts for cross-species transmission.
Note: We retained mutations that appear in at least three non-human animal hosts' sequences. The data is obtained from GISAID's metadata, specifically capturing the host of genomic sequences.

Animal host Lineage Source region Collection date Accession ID




Association between mutation (221C>T) and patients of different ages, genders, and statuses

Note: The logistic regression model was employed to examine changes in patient data before and after the mutation. The logistic regression model was conducted using the glm function in R. The data is obtained from GISAID's metadata, specifically capturing the patient status, gender, and age of genomic sequences.

Analyzing the association between mutation and patient status.
Note: we categorized the data into different patient statuses (ambulatory, deceased, homebound, hospitalized, mild, and recovered) based on GISAID classifications. In the analysis exploring the association between mutation and patient status, the model included mutation, patient status, patient age, gender, sequence region of origin, and sequence collection time point. In the 'increase' direction of the mutation, it means that when this mutation occurs, it increases the corresponding effect proportion. In the 'decrease' direction of the mutation, it means that when this mutation occurs, it decreases the corresponding effect proportion. A p-value lower than 0.001 signifies a notable differentiation between the population with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient status Ambulatory 1.92e+0 1.05e+0 1.83e+0 6.68e-2 Increase
Deceased -1.55e+0 1.28e+0 -1.21e+0 2.25e-1 Decrease
Homebound -1.33e-16 9.11e+4 -1.46e-21 1.00e+0 Decrease
Hospitalized 2.12e+0 6.31e-1 3.37e+0 7.63e-4 Increase
Mild -1.30e+1 3.13e+3 -4.16e-3 9.97e-1 Decrease
Recovered -2.30e+0 6.28e-1 -3.67e+0 2.46e-4 Decrease
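
The columns of this table are linked in the usual way for a logistic regression summary: the Z-value is the estimate divided by its standard error, and the P-value is the two-sided normal tail probability. The sketch below recomputes them from the rounded values shown above, so small differences from the table are expected; the function names are illustrative.

```python
import math


def wald_test(estimate, se):
    """Return (z, two-sided p) for a coefficient estimate and its standard error."""
    z = estimate / se
    # Two-sided tail of the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def direction(estimate):
    """Label the sign of the estimate as in the 'Direction' column."""
    return "Increase" if estimate > 0 else "Decrease"


# (effect, estimate, SE) copied from the table above.
rows = [("Ambulatory", 1.92, 1.05), ("Hospitalized", 2.12, 0.631), ("Recovered", -2.30, 0.628)]

for effect, estimate, se in rows:
    z, p = wald_test(estimate, se)
    flag = "significant" if p < 0.001 else "not significant"
    print(f"{effect}: z={z:.2f}, p={p:.2e}, {direction(estimate)}, {flag}")
```

With the page's threshold of p < 0.001, only the Hospitalized and Recovered rows pass.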

Analyzing the association between mutation and patient age.
Note: we categorized the data into different patient age groups (0-17, 18-39, 40-64, 65-84, and 85+). In the analysis exploring the association between mutation and patient age, the model included mutation, patient age, gender, sequence region of origin, and sequence collection time point. In the 'increase' direction of the mutation, it means that when this mutation occurs, it increases the corresponding effect proportion. In the 'decrease' direction of the mutation, it means that when this mutation occurs, it decreases the corresponding effect proportion. A p-value lower than 0.001 signifies a notable differentiation between the population with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient age, years 0-17 -7.11e-2 1.59e-1 -4.48e-1 6.54e-1 Decrease
18-39 -4.61e-2 7.27e-2 -6.34e-1 5.26e-1 Decrease
40-64 -2.79e-3 7.14e-2 -3.90e-2 9.69e-1 Decrease
65-84 3.64e-2 9.65e-2 3.77e-1 7.06e-1 Increase
>=85 3.65e-1 1.82e-1 2.01e+0 4.45e-2 Increase

Analyzing the association between mutation and patient gender.
Note: we categorized the data by patient gender (male and female). In the analysis exploring the association between mutation and patient gender, the model included mutation, patient gender, patient age, sequence region of origin, and sequence collection time point. In the 'increase' direction of the mutation, it means that when this mutation occurs, it increases the corresponding effect proportion. In the 'decrease' direction of the mutation, it means that when this mutation occurs, it decreases the corresponding effect proportion. A p-value lower than 0.001 signifies a notable differentiation between the population with and without the mutation.

Attribute Effect Estimate SE Z-value P-value Direction
Patient gender Male -5.29e-2 7.15e-2 -7.40e-1 4.59e-1 Decrease





Investigating natural selection at mutation (221C>T) site for genetic adaptation and diversity

Note: Investigating the occurrence of positive selection or negative selection at this mutation site reveals implications for genetic adaptation and diversity.

The MEME method within the HyPhy software was employed to analyze positive selection. MEME: episodic selection.
Note: List of sites found to be under episodic selection by MEME (p < 0.05). "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site P-value Lineage Method

The FEL method within the HyPhy software was employed to analyze both positive and negative selection. FEL: pervasive selection on small datasets.
Note: List of sites found to be under pervasive selection by FEL (p < 0.05). A beta value greater than alpha signifies positive selection, while a beta value smaller than alpha signifies negative selection. "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site Alpha Beta P-value Lineage Method
leader 266 805 180 74 4.79 0.00 4.00e-2 BA.5.2.3 FEL
leader 266 805 180 74 3.05 0.00 3.00e-2 C.37 FEL
leader 266 805 180 74 5.97 0.00 2.00e-2 P.1.15 FEL
leader 266 805 180 74 5.23 0.00 3.00e-2 P.1.7 FEL
leader 266 805 180 74 4.61 0.00 2.00e-2 P.2 FEL
leader 266 805 180 74 7.66 0.00 3.00e-2 XBB.1.2 FEL
leader 266 805 180 74 7.96 0.00 3.00e-2 A.2.5 FEL
leader 266 805 180 74 5.86 0.00 2.00e-2 AY.101 FEL
leader 266 805 180 74 6.79 0.00 3.00e-2 AY.107 FEL
leader 266 805 180 74 7.94 0.00 2.00e-2 AY.114 FEL
leader 266 805 180 74 9.50 0.00 3.00e-2 AY.122.1 FEL
leader 266 805 180 74 31.52 0.00 1.00e-2 AY.134 FEL
leader 266 805 180 74 10.00 0.00 2.00e-2 B.1.1.25 FEL
leader 266 805 180 74 5.63 0.00 2.00e-2 B.1.1.28 FEL
leader 266 805 180 74 1.50 0.00 4.00e-2 B.1.36 FEL
leader 266 805 180 74 9.38 0.00 3.00e-2 B.1.575 FEL
leader 266 805 180 74 38.84 0.00 5.00e-2 B.1.587 FEL
leader 266 805 180 74 3.00 0.00 2.00e-2 B.1.258 FEL
leader 266 805 180 74 3.58 0.00 1.00e-2 B.1.2 FEL
leader 266 805 180 74 1.96 0.00 3.00e-2 B.1 FEL
leader 266 805 180 74 7.86 0.00 2.00e-2 BA.1.16 FEL
leader 266 805 180 74 3.85 0.00 3.00e-2 P.1.14 FEL
leader 266 805 180 74 3.69 0.00 1.00e-2 P.1 FEL
leader 266 805 180 74 4.77 0.52 2.00e-2 AY.100 FEL
leader 266 805 180 74 5.90 1.15 4.00e-2 AY.102 FEL
leader 266 805 180 74 3.57 0.45 3.00e-2 AY.112 FEL
leader 266 805 180 74 12.44 0.75 0.00e+0 AY.118 FEL
leader 266 805 180 74 4.38 0.00 1.00e-2 AY.120 FEL
leader 266 805 180 74 6.46 0.63 2.00e-2 AY.121 FEL
leader 266 805 180 74 3.71 0.00 3.00e-2 AY.129 FEL
leader 266 805 180 74 5.26 0.00 3.00e-2 AY.14 FEL
leader 266 805 180 74 3.21 0.00 3.00e-2 AY.20 FEL
leader 266 805 180 74 4.10 0.00 1.00e-2 AY.39 FEL
leader 266 805 180 74 2.81 0.00 3.00e-2 AY.43 FEL
leader 266 805 180 74 2.81 0.00 4.00e-2 AY.9.2 FEL
leader 266 805 180 74 4.40 0.00 1.00e-2 AY.99.2 FEL
leader 266 805 180 74 3.95 0.00 2.00e-2 B.1.160 FEL
leader 266 805 180 74 5.82 0.40 0.00e+0 B.1.1 FEL
leader 266 805 180 74 4.46 0.00 2.00e-2 B.1.221 FEL
leader 266 805 180 74 34.58 0.00 5.00e-2 B.1.597 FEL
leader 266 805 180 74 29.21 0.00 3.00e-2 BA.2.10.3 FEL
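
The rule stated in the FEL note (beta versus alpha, with p < 0.05) can be written as a small classifier. This is a sketch of the rule as described on this page, not of HyPhy itself; `classify_fel` is a hypothetical name.

```python
def classify_fel(alpha, beta, p_value, threshold=0.05):
    """Classify a site from FEL output using the rule in the note above."""
    # The page rounds p-values, so a displayed 5.00e-2 may have been
    # below the threshold before rounding; this sketch takes values as shown.
    if p_value >= threshold:
        return "not significant"
    if beta > alpha:
        return "positive"
    if beta < alpha:
        return "negative"
    return "neutral"


# (lineage, alpha, beta, p-value) copied from the table above.
rows = [("AY.134", 31.52, 0.00, 0.01), ("AY.102", 5.90, 1.15, 0.04), ("B.1", 1.96, 0.00, 0.03)]

for lineage, alpha, beta, p_value in rows:
    print(lineage, classify_fel(alpha, beta, p_value))
```

Every row in the table above has beta smaller than alpha, so under this rule site 74 is classified as negatively selected in each listed lineage.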

The FUBAR method within the HyPhy software was employed to analyze both positive and negative selection. FUBAR: pervasive selection on large datasets.
Note: List of sites found to be under pervasive selection by FUBAR (prob > 0.95). A prob[alpha < beta] value exceeding 0.95 indicates positive selection, while a prob[alpha > beta] value exceeding 0.95 indicates negative selection. "Protein Start" corresponds to the protein's starting genomic position. "Protein End" corresponds to the protein's ending genomic position. The term 'site' represents a selection site within the protein.

Protein name Protein start Protein end Protein length Site Prob[alpha>beta] Prob[alpha<beta] Lineage Method
leader 266 805 180 74 9.50e-1 2.00e-2 BA.1.21 FUBAR
leader 266 805 180 74 9.60e-1 3.00e-2 P.1.7 FUBAR
leader 266 805 180 74 9.50e-1 3.00e-2 P.2 FUBAR
leader 266 805 180 74 9.70e-1 2.00e-2 A.2.5 FUBAR
leader 266 805 180 74 9.60e-1 3.00e-2 AY.114 FUBAR
leader 266 805 180 74 9.90e-1 1.00e-2 AY.134 FUBAR
leader 266 805 180 74 9.80e-1 1.00e-2 B.1.1.25 FUBAR
leader 266 805 180 74 9.60e-1 2.00e-2 B.1.1.28 FUBAR
leader 266 805 180 74 9.80e-1 1.00e-2 B.1.575 FUBAR
leader 266 805 180 74 9.70e-1 2.00e-2 B.1.2 FUBAR
leader 266 805 180 74 9.70e-1 2.00e-2 P.1 FUBAR
leader 266 805 180 74 9.80e-1 1.00e-2 AY.100 FUBAR
leader 266 805 180 74 9.80e-1 0.00e+0 AY.102 FUBAR
leader 266 805 180 74 1.00e+0 0.00e+0 AY.118 FUBAR
leader 266 805 180 74 9.50e-1 3.00e-2 AY.120 FUBAR
leader 266 805 180 74 9.80e-1 1.00e-2 AY.121 FUBAR
leader 266 805 180 74 9.70e-1 2.00e-2 AY.39 FUBAR
leader 266 805 180 74 9.60e-1 2.00e-2 AY.99.2 FUBAR
leader 266 805 180 74 1.00e+0 0.00e+0 B.1.1 FUBAR
leader 266 805 180 74 9.80e-1 1.00e-2 B.1.597 FUBAR
leader 266 805 180 74 9.70e-1 2.00e-2 BA.2.10.3 FUBAR
leader 266 805 180 74 2.00e-2 9.60e-1 CH.1 FUBAR
leader 266 805 180 74 2.00e-2 9.70e-1 CM.3 FUBAR
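
The FUBAR criterion from the note can be sketched the same way, using the two posterior probabilities in the table. The function name and the strict comparison are assumptions of this example.

```python
def classify_fubar(prob_alpha_gt_beta, prob_alpha_lt_beta, cutoff=0.95):
    """Classify a site from FUBAR posterior probabilities, per the note above."""
    # Probabilities on the page are rounded, so a displayed 9.50e-1 may have
    # exceeded the cutoff before rounding; this sketch takes values as shown.
    if prob_alpha_lt_beta > cutoff:
        return "positive"
    if prob_alpha_gt_beta > cutoff:
        return "negative"
    return "not significant"


# (lineage, Prob[alpha>beta], Prob[alpha<beta]) copied from the table above.
rows = [("AY.118", 1.00, 0.00), ("B.1.2", 0.97, 0.02), ("CH.1", 0.02, 0.96), ("CM.3", 0.02, 0.97)]

for lineage, p_neg, p_pos in rows:
    print(lineage, classify_fubar(p_neg, p_pos))
```

By this rule CH.1 and CM.3 are the only lineages in the table where site 74 shows evidence of positive selection.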




Alterations in protein physicochemical properties induced by mutation (221C>T)

Understanding the alterations in protein physicochemical properties can reveal the evolutionary processes and adaptive changes of viruses.
Note: ProtParam software was used for the analysis of physicochemical properties. Significant change threshold: A change exceeding 10% compared to the reference is considered a significant change. "GRAVY" is an abbreviation for "grand average of hydropathicity".

Group Protein name Molecular weight Theoretical PI Extinction coefficients Aliphatic index GRAVY
Mutation ORF1ab_pp1a 490015 6.04 543550 89.08 -0.022
Reference ORF1ab_pp1a 489988.91 6.04 543550 88.99 -0.023
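
The 10% threshold from the note can be checked directly against the two rows above. This sketch assumes that "change" means the absolute difference relative to the reference value; the dictionary keys are labels chosen for the example.

```python
# Values copied from the table above.
reference = {
    "Molecular weight": 489988.91,
    "Theoretical pI": 6.04,
    "Extinction coefficient": 543550,
    "Aliphatic index": 88.99,
    "GRAVY": -0.023,
}
mutant = {
    "Molecular weight": 490015,
    "Theoretical pI": 6.04,
    "Extinction coefficient": 543550,
    "Aliphatic index": 89.08,
    "GRAVY": -0.022,
}


def relative_change(mut, ref):
    """Absolute change of the mutant value relative to the reference value."""
    return abs(mut - ref) / abs(ref)


# Flag a property when its relative change exceeds 10%.
significant = {
    name: relative_change(mutant[name], reference[name]) > 0.10 for name in reference
}

for name, flagged in significant.items():
    print(f"{name}: {relative_change(mutant[name], reference[name]):.4%} significant={flagged}")
```

Under that reading, none of the five properties changes by more than 10% for S74L.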




Alterations in protein stability induced by mutation (221C>T)

The impact of mutations on protein stability directly or indirectly affects the biological characteristics, adaptability, and transmission capacity of the virus.
Note: I-Mutant 2.0 was utilized to analyze the effects of mutations on protein stability. pH 7 and a temperature of 25°C are employed to replicate the in vitro environment. pH 7.4 and a temperature of 37°C are utilized to simulate the in vivo environment.

Mutation Protein name Mutation type Position DDG Stability pH Temperature Condition
S74L ORF1ab_pp1a Point 74 0.43 Increase 7 25 Environment
S74L ORF1ab_pp1a Point 74 0.53 Increase 7.4 37 Internal




Impact on protein function induced by mutation (221C>T)

The impact of mutations on protein function
Note: The MutPred2 software was used to predict the pathogenicity of a mutation and the molecular mechanisms of pathogenicity. A score above 0.5 indicates an increased likelihood of pathogenicity. "Pr" is the abbreviation for "proportion". "P" is the abbreviation for "p-value".

Mutation Protein name Mutation type Score Molecular mechanisms
S74L ORF1ab_pp1a Point 0.335 Loss of Strand (Pr = 0.26 | P = 0.04)
Altered DNA_binding (Pr = 0.17 | P = 0.04)
Altered Cytoplasmic_loop (Pr = 0.15 | P = 0.02)
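
For readers who want to work with these rows programmatically, the mechanism strings can be split into a name, a "Pr" value, and a "P" value. The regular expression below is written against the three strings shown above and is only a sketch; it is not a MutPred2 output parser.

```python
import re

# Matches strings such as "Loss of Strand (Pr = 0.26 | P = 0.04)".
PATTERN = re.compile(r"^(?P<mechanism>.+?) \(Pr = (?P<pr>[\d.]+) \| P = (?P<p>[\d.]+)\)$")


def parse_mechanism(text):
    """Return (mechanism, Pr, P) parsed from one mechanism string."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized mechanism string: {text!r}")
    return match["mechanism"], float(match["pr"]), float(match["p"])


score = 0.335  # MutPred2 score for S74L, copied from the table above
mechanisms = [
    "Loss of Strand (Pr = 0.26 | P = 0.04)",
    "Altered DNA_binding (Pr = 0.17 | P = 0.04)",
    "Altered Cytoplasmic_loop (Pr = 0.15 | P = 0.02)",
]

print("above 0.5 threshold:", score > 0.5)
for line in mechanisms:
    print(parse_mechanism(line))
```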




Exploring mutation (221C>T) distribution within intrinsically disordered protein regions

Intrinsically disordered proteins (IDPs) are proteins or protein regions that have no unique 3D structure. In viral proteins, mutations in the disordered regions are critical for immune evasion and antibody escape, suggesting potential additional implications for vaccines and monoclonal therapeutic strategies.
Note: The IUPred3 software was utilized for analyzing IDPs. A score greater than 0.5 is considered indicative of an IDP. In the plot, "POS" represents the position of the mutation.





Alterations in enzyme cleavage sites induced by mutation (221C>T)

Exploring the impact of mutations on the cleavage sites of 28 enzymes.
Note: The PeptideCutter software was used to detect enzyme cleavage sites. Increased cleavage sites are sites present in the mutated protein but absent from the reference protein. Conversely, decreased cleavage sites are sites present in the reference protein but lost in the mutated protein.

Mutation Protein name Genome position Enzyme name Increased cleavage sites Decreased cleavage sites
S74L ORF1ab_pp1a 486 Proteinase K IKRLDARTAP (pos: 74) NA
S74L ORF1ab_pp1a 486 Thermolysin FIKRLDARTA (pos: 73) NA
S74L ORF1ab_pp1a 486 Chymotrypsin-low specificity IKRLDARTAP (pos: 74) NA
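
The increased and decreased columns amount to a set difference between the cleavage sites of the mutated and reference sequences. The sketch below illustrates this with a deliberately simplified rule (cleave after any residue in a fixed set), which is an assumption and much cruder than PeptideCutter's actual enzyme models. It also assumes that the L in the window "FIKRLDARTA" is residue 74, so the window starts at residue 70, and that the reference window differs only by the serine implied by S74L.

```python
# Assumption: simplified Proteinase K rule, "cleave after any of these residues".
PROTEINASE_K_P1 = set("AEFILTVWY")


def cleavage_positions(fragment, first_position, p1_residues):
    """Return the residue numbers after which the simplified rule would cleave."""
    return {first_position + i for i, aa in enumerate(fragment) if aa in p1_residues}


FIRST = 70                         # assumed residue number of the first letter
mutant_fragment = "FIKRLDARTA"     # window shown on this page, L at residue 74
reference_fragment = "FIKRSDARTA"  # same window with the reference serine restored

ref_sites = cleavage_positions(reference_fragment, FIRST, PROTEINASE_K_P1)
mut_sites = cleavage_positions(mutant_fragment, FIRST, PROTEINASE_K_P1)

increased = mut_sites - ref_sites  # sites gained by the mutation
decreased = ref_sites - mut_sites  # sites lost by the mutation

print("increased:", sorted(increased))
print("decreased:", sorted(decreased) or "NA")
```

With these assumptions the only gained site is at residue 74 and no site is lost, which is consistent with the Proteinase K row above.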




Impact of spike protein mutation (221C>T) on antigenicity and immunogenicity

Investigating the impact of mutations on antigenicity and immunogenicity carries important implications for vaccine design and our understanding of immune responses.
Note: An absolute change greater than 0.0102 (three times the median across sites) in antigenicity score is considered significant. An absolute change greater than 0.2754 (three times the median across sites) in immunogenicity score is considered significant. The VaxiJen tool was utilized for antigenicity analysis; antigens with a prediction score of more than 0.4 are considered candidate antigens. The IEDB tool was used for immunogenicity analysis; an MHC I immunogenicity score > 0 indicates a higher probability of stimulating an immune response.

Group Protein name Protein region Antigenicity score Immunogenicity score




Impact of mutation (221C>T) on viral transmissibility by the affinity between RBD and ACE2 receptor

Unraveling the impact of mutations on the interaction between the receptor binding domain (RBD) and ACE2 receptor using deep mutational scanning (DMS) experimental data to gain insights into their effects on viral transmissibility.
Note: The ΔBinding affinity represents the disparity between the binding affinity of a mutation and the reference binding affinity. A positive Δbinding affinity value (Δlog10(KD,app) > 0) signifies an increased affinity between RBD and ACE2 receptor due to the mutation. Conversely, a negative value (Δlog10(KD,app) < 0) indicates a reduced affinity between RBD and ACE2 receptor caused by the mutation. A p-value smaller than 0.05 indicates significance. "Ave mut bind" represents the average binding affinity of this mutation. "Ave ref bind" refers to the average binding affinity at a site without any mutation (reference binding affinity).

Mutation Protein name Protein region Mutation Position Ave mut bind Ave ref bind ΔBinding affinity P-value Image


The interface between the receptor binding domain (RBD) and ACE2 receptor is depicted in the crystal structure 6M0J.
Note: The structure 6M0J encompasses the RBD range of 333 to 526. The binding sites (403-406, 408, 417, 439, 445-447, 449, 453, 455-456, 473-478, 484-498, and 500-506) on the RBD that interface with ACE2 are indicated in magenta. These binding sites were identified through the interface footprints experiment. The ACE2 binding sites within the interface are shown in cyan, representing residues within 5Å proximity to the RBD binding sites. The mutation within the RBD range of 333 to 526 is depicted in red.
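
To test whether a given spike residue falls on the listed interface, the range notation in the note can be expanded into a set of residue numbers. This is an illustrative helper, with the site list copied from the note above.

```python
# Interface sites copied from the note above (spike residue numbering).
INTERFACE = "403-406, 408, 417, 439, 445-447, 449, 453, 455-456, 473-478, 484-498, 500-506"
RBD = range(333, 527)  # RBD range 333 to 526, inclusive


def expand_ranges(spec):
    """Expand a string like '403-406, 408' into a set of integers."""
    residues = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            residues.update(range(start, end + 1))
        else:
            residues.add(int(part))
    return residues


interface = expand_ranges(INTERFACE)

print(len(interface), "interface residues")
print("417 on interface:", 417 in interface)
print("499 on interface:", 499 in interface)
```

These numbers refer to spike protein residues, so they are not comparable to the ORF1ab residue number or the genome position of the mutation described in this record.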






Impact of mutation (221C>T) on immune escape by the affinity between RBD and antibody/serum

By utilizing experimental data from deep mutational scanning (DMS), we can uncover how mutations affect the interaction between the receptor binding domain (RBD) and antibodies/serum. This approach provides valuable insights into strategies for evading the host immune response.
Note: We considered a mutation to mediate strong escape if the escape score exceeded 0.1 (10% of the maximum score of 1). A total of 1,504 antibodies/serum data were collected for this analysis. "Condition name" refers to the name of the antibodies/serum. "Mut escape score" represents the escape score of the mutation in that specific condition. "Avg mut escape score" indicates the average escape score of the mutation site in that condition, considering the occurrence of this mutation and other mutations. Class 1 antibodies bind to an epitope only in the RBD “up” conformation, and are the most abundant. Class 2 antibodies bind to the RBD both in “up” and “down” conformations. Class 3 and class 4 antibodies both bind outside the ACE2 binding site. Class 3 antibodies bind the RBD in both the open and closed conformation, while class 4 antibodies bind only in the open conformation.

Mutation Condition name Condition type Condition subtype Condition year Mut escape score Avg mut escape score




Investigating the co-mutation patterns of mutation (221C>T) across 2,735 viral lineages

Investigating the co-mutation patterns of SARS-CoV-2 across 2,735 viral lineages to unravel the cooperative effects of different mutations. In biological research, correlation analysis of mutation sites helps us understand whether there is a close relationship or interaction between certain mutations.
Note: The Spearman correlation coefficient is used to calculate the correlation between two mutations within each Pango lineage. Holm–Bonferroni method was used for multiple test adjustment. We retained mutation pairs with correlation values greater than 0.6 or less than -0.6 and Holm–Bonferroni corrected p-values less than 0.05.
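
The procedure in the note can be sketched in plain Python: rank-based correlation on presence/absence vectors, Holm–Bonferroni adjustment of the p-values, then the two filters. The vectors and p-values below are made up for illustration, and the helper names are hypothetical; the database's own pipeline is not shown on this page.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks.

    Undefined (division by zero) if either vector is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running_max
    return adjusted


def keep_pair(rho, adjusted_p):
    """Apply the retention rule from the note above."""
    return abs(rho) > 0.6 and adjusted_p < 0.05


# Made-up presence (1) / absence (0) of two mutations in six sequences.
mutation_a = [1, 1, 0, 0, 1, 0]
mutation_b = [1, 1, 0, 0, 0, 0]
print(round(spearman(mutation_a, mutation_b), 3))
print(holm_bonferroni([0.01, 0.04, 0.03]))
```

The toy vectors give a coefficient of about 0.707, a value that also appears often in the table below and is typical of correlations computed on short binary vectors.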

Associated mutation ID DNA mutation Mutation type Protein name Protein mutation Correlation coefficient Lineage
V2825 16531G>T missense_variant ORF1ab_pp1ab V5511L 1.00e+0 B.1.147
V404 1041G>T missense_variant ORF1ab_pp1a E347D 7.07e-1 AY.29
V6669 5886C>T synonymous_variant ORF1ab_pp1a D1962D 1.00e+0 B.1.429
V9260 12C>T synonymous_variant E F4F 7.07e-1 AY.109
V9821 21C>T synonymous_variant ORF10 F7F 7.07e-1 AY.109
V1044 3606G>T missense_variant ORF1ab_pp1a K1202N 1.00e+0 AY.110
V5694 28C>T missense_variant ORF10 P10S 7.07e-1 AY.110
V7322 11100G>T synonymous_variant ORF1ab_pp1a V3700V 7.07e-1 AY.110
V8171 17583G>A synonymous_variant ORF1ab_pp1ab Q5861Q 7.07e-1 AY.118
V5196 302G>T missense_variant ORF8 R101L 7.07e-1 AY.119.2
V6041 1251T>C synonymous_variant ORF1ab_pp1a H417H 6.32e-1 AY.119
V403 1037C>T missense_variant ORF1ab_pp1a T346I 1.00e+0 AY.128
V4100 2533G>T missense_variant S A845S 7.07e-1 AY.128
V5382 404C>T missense_variant N T135I 1.00e+0 AY.128
V8915 2196C>T synonymous_variant S T732T 1.00e+0 AY.128
V1748 7975C>T missense_variant ORF1ab_pp1a H2659Y 7.07e-1 AY.14
V2037 10184C>A missense_variant ORF1ab_pp1a P3395H 7.07e-1 AY.14
V779 2576C>T missense_variant ORF1ab_pp1a A859V 7.07e-1 AY.14
V9716 783A>G synonymous_variant N K261K 7.07e-1 AY.34
V2069 10510C>T missense_variant ORF1ab_pp1a P3504S 7.07e-1 AY.36.1
V6675 5925C>T synonymous_variant ORF1ab_pp1a Y1975Y 7.07e-1 AY.36.1
V2353 12398C>T missense_variant ORF1ab_pp1a A4133V 1.00e+0 AY.39.1.1
V2738 15644G>T missense_variant ORF1ab_pp1ab G5215V 6.32e-1 AY.4.2.2
V3935 1564G>T missense_variant S A522S 7.07e-1 AY.43.6
V785 2606A>G missense_variant ORF1ab_pp1a N869S 1.00e+0 AY.45
V4845 -3A>G upstream_gene_variant ORF7a None 9.11e-1 AY.46.6
V7373 11502C>T synonymous_variant ORF1ab_pp1a S3834S 8.98e-1 AY.46.6
V9531 186G>T synonymous_variant ORF8 V62V 7.54e-1 AY.46.6
V9579 102G>T synonymous_variant N G34G 8.98e-1 AY.46.6
V444 1189C>T missense_variant ORF1ab_pp1a L397F 6.67e-1 AY.4.6
V2421 12911C>T missense_variant ORF1ab_pp1a T4304I 8.68e-1 AY.46
V2918 17146C>T missense_variant ORF1ab_pp1ab R5716C 8.86e-1 AY.46
V4846 -1C>T upstream_gene_variant ORF7a None 9.12e-1 AY.46
V6578 5142C>T synonymous_variant ORF1ab_pp1a I1714I 9.18e-1 AY.46
V9668 525G>T synonymous_variant N G175G 7.38e-1 AY.46
V6518 4737C>T synonymous_variant ORF1ab_pp1a L1579L 7.07e-1 AY.4.8
V5963 684A>G synonymous_variant ORF1ab_pp1a V228V 1.00e+0 AY.59
V6520 4743G>A synonymous_variant ORF1ab_pp1a T1581T 1.00e+0 AY.59
V9164 168T>C synonymous_variant ORF3a F56F 1.00e+0 AY.59
V4562 569C>T missense_variant ORF3a T190I 7.07e-1 AY.74
V2431 13064C>T missense_variant ORF1ab_pp1a T4355I 7.07e-1 AY.80
V3067 18167C>T missense_variant ORF1ab_pp1ab T6056I 1.00e+0 AY.80
V8268 18312C>T synonymous_variant ORF1ab_pp1ab D6104D 1.00e+0 AY.80
V3623 394G>C missense_variant S E132Q 1.00e+0 AY.85
V1081 3722C>T missense_variant ORF1ab_pp1a T1241I 7.07e-1 AY.86
V3804 851C>T missense_variant S T284I 1.00e+0 AY.86
V4478 311C>T missense_variant ORF3a P104L 7.07e-1 AY.86
V4538 515G>A missense_variant ORF3a G172D 7.07e-1 AY.86
V9198 336C>T synonymous_variant ORF3a V112V 7.07e-1 AY.86
V9314 213C>T synonymous_variant M Y71Y 7.07e-1 AY.86
V2672 15013C>T missense_variant ORF1ab_pp1ab H5005Y 1.00e+0 B.1.1.176
V1183 4529C>T missense_variant ORF1ab_pp1a S1510F 1.00e+0 B.1.1.416
V3451 20855G>T missense_variant ORF1ab_pp1ab C6952F 1.00e+0 B.1.1.416
V4950 242C>T missense_variant ORF7a S81L 8.66e-1 B.1.1.416
V702 2188C>T missense_variant ORF1ab_pp1a L730F 1.00e+0 B.1.1.416
V7472 12174C>T synonymous_variant ORF1ab_pp1a P4058P 7.06e-1 B.1.1.416
V9656 474G>T synonymous_variant N V158V 6.11e-1 B.1.1.416
V9761 963C>T synonymous_variant N G321G 1.00e+0 B.1.1.416
V47 -92G>C upstream_gene_variant ORF1ab_pp1a None 1.00e+0 B.1.1.63
V1013 3463G>T missense_variant ORF1ab_pp1a G1155C 7.07e-1 B.1.177.4
V185 245G>A missense_variant ORF1ab_pp1a G82D 7.07e-1 B.1.258.17
V9644 432T>C synonymous_variant N D144D 1.00e+0 B.1.582
V3538 62G>A missense_variant S R21K 1.00e+0 B.1.612
V9644 432T>C synonymous_variant N D144D 1.00e+0 B.1.612
V4129 2806G>A missense_variant S D936N -1.00e+0 B.1.619.1
V8971 2632T>C synonymous_variant S L878L -7.07e-1 B.1.619.1
V7457 12057G>A synonymous_variant ORF1ab_pp1a E4019E 7.07e-1 B.1.621
V7806 14751T>C synonymous_variant ORF1ab_pp1ab D4917D 7.07e-1 B.1.621
V1824 8491C>T missense_variant ORF1ab_pp1a H2831Y 7.07e-1 BA.1.1.11
V8819 1491C>T synonymous_variant S F497F 7.07e-1 BA.1.1.11
V9195 318C>T synonymous_variant ORF3a L106L 1.00e+0 BA.1.1.11
V7348 11355C>T synonymous_variant ORF1ab_pp1a F3785F 7.07e-1 BA.1.1.13
V8893 2028T>C synonymous_variant S T676T 1.00e+0 BA.1.1.16
V4560 555G>T missense_variant ORF3a Q185H 1.00e+0 BA.1.15.2
V4486 329C>T missense_variant ORF3a A110V 7.07e-1 BA.2.3.12
V7435 11874T>C synonymous_variant ORF1ab_pp1a A3958A 7.98e-1 BA.2.3.1
V6553 4956C>T synonymous_variant ORF1ab_pp1a H1652H 1.00e+0 BA.2.31
V7887 15435C>T synonymous_variant ORF1ab_pp1ab F5145F 1.00e+0 BA.2.3.20
V9429 91T>C synonymous_variant ORF7a L31L 1.00e+0 BA.2.38.1
V4288 3778G>A missense_variant S D1260N 1.00e+0 BA.2.38
V7522 12625C>T synonymous_variant ORF1ab_pp1a L4209L 7.07e-1 BA.2.38
V5318 101G>A missense_variant N G34E 7.07e-1 BA.2.48
V7846 15108G>T synonymous_variant ORF1ab_pp1ab T5036T 1.00e+0 BA.2.48
V7613 13290C>T synonymous_variant ORF1ab_pp1ab Y4430Y 1.00e+0 BA.2.50
V1510 6364C>T missense_variant ORF1ab_pp1a L2122F 8.16e-1 BA.2.6
V2961 17482C>T missense_variant ORF1ab_pp1ab P5828S 8.16e-1 BA.2.71
V1054 3628G>A missense_variant ORF1ab_pp1a E1210K 1.00e+0 BA.2.75.2
V8492 20148A>G synonymous_variant ORF1ab_pp1ab E6716E 8.16e-1 BA.2.75.2
V3009 17851C>T missense_variant ORF1ab_pp1ab H5951Y 8.01e-1 BA.4.6.4
V4187 3301C>T missense_variant S H1101Y 8.66e-1 BA.4.6.4
V6232 2544G>A synonymous_variant ORF1ab_pp1a R848R 1.00e+0 BA.4.6.5
V4775 373C>T missense_variant M H125Y 1.00e+0 BA.4.7
V7239 10329C>T synonymous_variant ORF1ab_pp1a N3443N 7.07e-1 BA.5.1.10
V6568 5067G>A synonymous_variant ORF1ab_pp1a K1689K 7.07e-1 BA.5.1.2
V1348 5561C>T missense_variant ORF1ab_pp1a T1854I 7.07e-1 BA.5.1.4
V2170 11033A>G missense_variant ORF1ab_pp1a K3678R 1.00e+0 BA.5.1.4
V7654 13701C>T synonymous_variant ORF1ab_pp1ab Y4567Y 1.00e+0 BA.5.1.4
V398 1009G>A missense_variant ORF1ab_pp1a V337I 7.07e-1 BA.5.1.6
V4844 -4G>T upstream_gene_variant ORF7a None 8.68e-1 BA.5.2.20
V8597 21018A>G synonymous_variant ORF1ab_pp1ab G7006G 7.88e-1 BA.5.2.20
V9642 426A>T synonymous_variant N P142P 8.32e-1 BA.5.2.20
V2639 14610G>T missense_variant ORF1ab_pp1ab K4870N 1.00e+0 BA.5.2.22
V1601 6821C>T missense_variant ORF1ab_pp1a T2274I 7.07e-1 BA.5.2.26
V3882 1348A>G missense_variant S N450D 1.00e+0 BA.5.2.26
V7873 15324C>A synonymous_variant ORF1ab_pp1ab A5108A 1.00e+0 BA.5.2.26
V1730 7858C>T missense_variant ORF1ab_pp1a L2620F 7.07e-1 BA.5.2.28
V6067 1464G>A synonymous_variant ORF1ab_pp1a V488V 7.07e-1 BA.5.2.28
V436 1154C>T missense_variant ORF1ab_pp1a A385V 1.00e+0 BA.5.2.2
V9680 594T>C synonymous_variant N T198T 1.00e+0 BA.5.2.2
V3803 841G>C missense_variant S E281Q 7.07e-1 BA.5.2.34
V7231 10272C>T synonymous_variant ORF1ab_pp1a Y3424Y 7.07e-1 BA.5.2.34
V9707 747A>G synonymous_variant N K249K 9.13e-1 BA.5.2.7
V9792 1131T>C synonymous_variant N D377D 7.07e-1 BA.5.2.9
V1341 5531C>T missense_variant ORF1ab_pp1a T1844I 6.96e-1 BA.5.3.1
V4846 -1C>T upstream_gene_variant ORF7a None 9.46e-1 BA.5.3.1
V4955 251C>T missense_variant ORF7a P84L 7.18e-1 BA.5.3.1
V4187 3301C>T missense_variant S H1101Y 1.00e+0 BA.5.3.2
V8957 2529T>C synonymous_variant S D843D 1.00e+0 BA.5.3.2
V6630 5547C>T synonymous_variant ORF1ab_pp1a D1849D 1.00e+0 BA.5.3.3
V9645 435C>T synonymous_variant N H145H 7.07e-1 BA.5.3.3
V1341 5531C>T missense_variant ORF1ab_pp1a T1844I 1.00e+0 BA.5.3
V4441 249G>T missense_variant ORF3a L83F 7.07e-1 BA.5.3
V4846 -1C>T upstream_gene_variant ORF7a None 8.16e-1 BA.5.3
V4955 251C>T missense_variant ORF7a P84L 8.16e-1 BA.5.3
V6978 8361C>T synonymous_variant ORF1ab_pp1a F2787F 7.07e-1 BA.5.3
V2416 12854C>T missense_variant ORF1ab_pp1a A4285V 1.00e+0 BA.5.7
V6629 5541C>T synonymous_variant ORF1ab_pp1a C1847C 7.07e-1 BA.5.7
V7778 14526C>T synonymous_variant ORF1ab_pp1ab I4842I 8.16e-1 BA.5.7
V1820 8465C>T missense_variant ORF1ab_pp1a S2822F 1.00e+0 BE.1.1.1
V4955 251C>T missense_variant ORF7a P84L 7.07e-1 BE.1.1.1
V6729 6273C>T synonymous_variant ORF1ab_pp1a G2091G 1.00e+0 BE.1.4
V4747 205G>T missense_variant M A69S 1.00e+0 BE.3
V3079 18234G>T missense_variant ORF1ab_pp1ab M6078I 7.07e-1 BE.4.1
V4488 334G>T missense_variant ORF3a V112F 7.07e-1 BE.4.1
V6133 1881C>T synonymous_variant ORF1ab_pp1a V627V 7.07e-1 BE.4.1
V6344 3345T>C synonymous_variant ORF1ab_pp1a L1115L 7.07e-1 BE.4.1
V4766 254C>T missense_variant M A85V 1.00e+0 BF.13
V330 694C>T missense_variant ORF1ab_pp1a R232C 7.07e-1 BF.16
V3698 544A>G missense_variant S K182E 1.00e+0 BF.16
V8515 20364C>T synonymous_variant ORF1ab_pp1ab G6788G 1.00e+0 BF.16
V3889 1379A>G missense_variant S N460S 1.00e+0 BF.18
V5096 79C>T stop_gained ORF8 Q27* 1.00e+0 BF.18
V7755 14385C>T synonymous_variant ORF1ab_pp1ab N4795N 7.38e-1 BF.21
V2071 10511C>T missense_variant ORF1ab_pp1a P3504L 1.00e+0 BF.24
V9720 813A>T synonymous_variant N T271T 7.07e-1 BF.28
V4458 289G>A missense_variant ORF3a V97I 6.70e-1 BF.4
V640 1920G>T missense_variant ORF1ab_pp1a E640D 8.66e-1 BF.4
V3532 55A>C missense_variant S T19P 1.00e+0 BF.7.14
V6655 5766C>T synonymous_variant ORF1ab_pp1a N1922N 1.00e+0 BF.7.14
V3783 767C>T missense_variant S S256L 7.07e-1 BF.7.8
V9006 2904C>T synonymous_variant S S968S 1.00e+0 BM.1.1
V5080 32C>A missense_variant ORF8 T11K 7.07e-1 BN.1.3.1
V2408 12773C>T missense_variant ORF1ab_pp1a T4258I 7.07e-1 BN.1.4
V8707 594T>C synonymous_variant S D198D 1.00e+0 BN.1.4
V6126 1854C>T synonymous_variant ORF1ab_pp1a G618G 6.32e-1 BQ.1.1.10
V6260 2727T>C synonymous_variant ORF1ab_pp1a D909D 7.07e-1 BQ.1.1.24
V6877 7569C>T synonymous_variant ORF1ab_pp1a N2523N 7.07e-1 BQ.1.1.24
V8632 96C>T synonymous_variant S F32F 1.00e+0 BQ.1.1.31
V3608 284C>T missense_variant S T95I 1.00e+0 BQ.1.14
V1440 6128A>G missense_variant ORF1ab_pp1a D2043G 1.00e+0 BQ.1.1.6
V2549 13855C>T missense_variant ORF1ab_pp1ab P4619S 1.00e+0 BQ.1.1.6
V3711 556T>C missense_variant S F186L 1.00e+0 BQ.1.1.6
V5182 254C>T missense_variant ORF8 P85L 1.00e+0 BQ.1.25
V13 -188T>G upstream_gene_variant ORF1ab_pp1a None 1.00e+0 BR.2.1
V4169 3209C>T missense_variant S A1070V 7.07e-1 BR.2.1
V6016 1041G>A synonymous_variant ORF1ab_pp1a E347E 1.00e+0 BR.2.1
V3823 1037G>A missense_variant S R346K 7.05e-1 BT.2
V7744 14335C>T synonymous_variant ORF1ab_pp1ab L4779L 7.05e-1 BT.2
V1799 8347G>T missense_variant ORF1ab_pp1a V2783F 7.07e-1 C.36.3
V813 2708C>T missense_variant ORF1ab_pp1a A903V 1.00e+0 C.36.3
V3090 18304C>T missense_variant ORF1ab_pp1ab L6102F 7.07e-1 CH.1.1.2
V5841 48C>T synonymous_variant ORF1ab_pp1a L16L 8.16e-1 CH.1.1.2
V7371 11493C>T synonymous_variant ORF1ab_pp1a P3831P 1.00e+0 CH.1.1.2
V9247 759C>A synonymous_variant ORF3a S253S 8.16e-1 CH.1.1.2
V3305 19840C>T missense_variant ORF1ab_pp1ab L6614F 8.16e-1 CJ.1
V5393 433C>T missense_variant N H145Y 7.06e-1 CJ.1
V6958 8124C>T synonymous_variant ORF1ab_pp1a N2708N 8.16e-1 CJ.1
V4918 184C>T stop_gained ORF7a Q62* 7.07e-1 CL.1
V7693 13971C>T synonymous_variant ORF1ab_pp1ab Y4657Y 7.07e-1 CL.1
V167 218G>A missense_variant ORF1ab_pp1a R73H 7.07e-1 CP.1
V6364 3531C>T synonymous_variant ORF1ab_pp1a V1177V 7.07e-1 CP.1
V1538 6458C>T missense_variant ORF1ab_pp1a T2153I 1.00e+0 DR.1
V7951 15957G>T synonymous_variant ORF1ab_pp1ab P5319P 1.00e+0 DR.1
V1632 7093A>G missense_variant ORF1ab_pp1a M2365V 7.07e-1 XAZ
V5383 408G>T missense_variant N E136D 7.07e-1 XAZ
V6821 7038C>T synonymous_variant ORF1ab_pp1a I2346I 7.07e-1 XAZ
V9640 420T>C synonymous_variant N N140N 7.07e-1 XAZ
V2518 13640A>G missense_variant ORF1ab_pp1ab D4547G 7.07e-1 XBB.1.9
V2561 13922G>T missense_variant ORF1ab_pp1ab R4641M 7.07e-1 XBB.1.9
V7588 13116C>T synonymous_variant ORF1ab_pp1a V4372V 7.07e-1 XBB.1.9
V7851 15147T>C synonymous_variant ORF1ab_pp1ab N5049N 7.07e-1 XBB.1.9
V2587 14054C>T missense_variant ORF1ab_pp1ab T4685I 7.07e-1 XBB.2
V7578 13032C>T synonymous_variant ORF1ab_pp1a D4344D 7.07e-1 XBB.2
V7987 16197T>C synonymous_variant ORF1ab_pp1ab H5399H 1.00e+0 XBB.2
V4027 2050G>A missense_variant S A684T 7.07e-1 XBK
V6452 4269C>T synonymous_variant ORF1ab_pp1a Y1423Y 7.07e-1 XBK
V933 3163A>G missense_variant ORF1ab_pp1a T1055A 6.31e-1 XBK
V1389 5849C>T missense_variant ORF1ab_pp1a P1950L 1.00e+0 AM.4
V2699 15271G>T missense_variant ORF1ab_pp1ab A5091S 1.00e+0 B.1.1.220
V5652 1238G>T missense_variant N S413I 1.00e+0 B.1.1.220
V4683 172G>A missense_variant E V58I 8.65e-1 BA.1.3
V5600 1093C>T missense_variant N P365S 1.00e+0 BA.1.3
V6862 7434A>G synonymous_variant ORF1ab_pp1a K2478K 6.52e-1 BA.1.3
V6939 8025C>T synonymous_variant ORF1ab_pp1a L2675L 7.06e-1 BA.2.67
V1044 3606G>T missense_variant ORF1ab_pp1a K1202N 1.00e+0 BA.4.1.10
V291 544T>C missense_variant ORF1ab_pp1a Y182H 1.00e+0 BA.4.1.10
V613 1816G>A missense_variant ORF1ab_pp1a V606I 1.00e+0 BA.4.1.10
V6304 3108C>T synonymous_variant ORF1ab_pp1a D1036D 1.00e+0 BA.4.1.10
V7585 13107C>T synonymous_variant ORF1ab_pp1a V4369V 1.00e+0 BA.4.1.10
V9559 363C>T synonymous_variant ORF8 I121I 7.06e-1 BA.4.1.10
V3147 18720G>T missense_variant ORF1ab_pp1ab M6240I 7.06e-1 BA.5.5.3
V4203 3371G>T missense_variant S G1124V 7.06e-1 BA.5.5.3
V8887 1995C>T synonymous_variant S P665P 7.06e-1 BA.5.5.3
V194 252_254delTAT disruptive_inframe_deletion ORF1ab_pp1a M85del 1.00e+0 CK.1.2
V9438 120C>T synonymous_variant ORF7a Y40Y -1.00e+0 CK.1.2
V548 1613C>T missense_variant ORF1ab_pp1a S538L 1.00e+0 CM.2.1
V4218 3440C>T missense_variant S S1147L 1.00e+0 DC.1
V9475 291C>T synonymous_variant ORF7a Y97Y 1.00e+0 XBD
V995 3388C>T missense_variant ORF1ab_pp1a L1130F 1.00e+0 XBD





Manual curation of mutation (221C>T)-related literature from PubMed

Literature was extracted from PubMed with pubmed.mineR and pubmed-mapper, and the results were then filtered manually.
Note: PubMed query: (COVID-19 [Title/Abstract] OR SARS-COV-2 [Title/Abstract]) AND (DNA mutation [Title/Abstract] OR Protein 1-letter mutation [Title/Abstract] OR Protein 3-letter mutation [Title/Abstract]).
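
As an illustration of the query template in the note above, the following minimal Python sketch assembles the search string for this mutation. It assumes that the three mutation placeholders are filled with the DNA notation (221C>T), the 1-letter protein notation (S74L), and the 3-letter protein notation (Ser74Leu) listed in the summary table; the function name `build_pubmed_query` is hypothetical, and the sketch only builds the string. It does not call pubmed.mineR, pubmed-mapper, or PubMed itself.

```python
def build_pubmed_query(dna: str, protein_1: str, protein_3: str) -> str:
    """Assemble a boolean PubMed query following the template in the note.

    Assumption: each placeholder in the template is replaced verbatim by
    the corresponding notation of the mutation.
    """
    topic_terms = ["COVID-19", "SARS-COV-2"]
    mutation_terms = [dna, protein_1, protein_3]

    def any_of(terms):
        # Restrict every term to the title/abstract field and OR them together.
        return "(" + " OR ".join(f"{t} [Title/Abstract]" for t in terms) + ")"

    # A record must match one topic term AND one mutation term.
    return f"{any_of(topic_terms)} AND {any_of(mutation_terms)}"


# Notations for this mutation, taken from the summary table of the record.
query = build_pubmed_query("221C>T", "S74L", "Ser74Leu")
print(query)
```

The resulting string could be pasted into the PubMed search box; any hits would still need the manual filtering step described above.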

DNA level Protein level Paper title Journal name Publication year PubMed ID