Anonymous ID: 5424bc Feb. 9, 2022, 10:13 p.m. No.15591817   🗄️.is 🔗kun   >>1819 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

https://www.frontiersin.org/articles/10.3389/fimmu.2022.801915/full

 

Front. Immunol., 08 February 2022 | https://doi.org/10.3389/fimmu.2022.801915

Are There Hidden Genes in DNA/RNA Vaccines?

Christopher A. Beaudoin1, Martin Bartas2, Adriana Volná3, Petr Pečinka2 and Tom L. Blundell1*

1Department of Biochemistry, Sanger Building, University of Cambridge, Cambridge, United Kingdom

2Department of Biology and Ecology, University of Ostrava, Ostrava, Czechia

3Department of Physics, University of Ostrava, Ostrava, Czechia

 

Due to the fast global spreading of the Severe Acute Respiratory Syndrome Coronavirus – 2 (SARS-CoV-2), prevention and treatment options are direly needed in order to control infection-related morbidity, mortality, and economic losses. Although drug and inactivated and attenuated virus vaccine development can require significant amounts of time and resources, DNA and RNA vaccines offer a quick, simple, and cheap treatment alternative, even when produced on a large scale. The spike protein, which has been shown to be the most antigenic SARS-CoV-2 protein, has been widely selected as the target of choice for DNA/RNA vaccines. Vaccination campaigns have reported high vaccination rates and protection, but numerous unintended effects, ranging from muscle pain to death, have led to concerns about the safety of RNA/DNA vaccines. In parallel to these studies, several open reading frames (ORFs) have been found to overlap SARS-CoV-2 accessory genes, two of which, ORF2b and ORF-Sh, overlap the spike protein sequence. Thus, the presence of these, and potentially other ORFs on SARS-CoV-2 DNA/RNA vaccines, could lead to the translation of undesired proteins during vaccination. Herein, we discuss the translation of overlapping genes in connection with DNA/RNA vaccines. Two mRNA vaccine spike protein sequences, which have been made publicly available, were compared to the wild-type sequence in order to uncover possible differences in putative overlapping ORFs. Notably, the Moderna mRNA-1273 vaccine sequence is predicted to contain no frameshifted ORFs on the positive sense strand, which highlights the utility of codon optimization in DNA/RNA vaccine design to remove undesired overlapping ORFs. Since little information is available on ORF2b or ORF-Sh, we use structural bioinformatics techniques to investigate the structure-function relationship of these proteins. The presence of putative ORFs on DNA/RNA vaccine candidates implies that overlapping genes may contribute to the translation of smaller peptides, potentially leading to unintended clinical outcomes, and that the protein-coding potential of DNA/RNA vaccines should be rigorously examined prior to administration.

 

pt 1

Anonymous ID: 5424bc Feb. 9, 2022, 10:13 p.m. No.15591819   🗄️.is 🔗kun   >>1823 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591817

Introduction

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a positive-sense single-stranded RNA virus that was first described in late 2019 (1). SARS-CoV-2 is phylogenetically related to the causative agent of the 2002 SARS-CoV epidemic and causes many of the same symptoms, such as fever and myalgia (2). Because of the high transmissibility of SARS-CoV-2 and rapid spreading throughout the world, by March 2020, the World Health Organization declared the global outbreak as the COVID-19 pandemic (3). The health and economic-related losses accruing as a result of the pandemic led to the prioritization of prevention and treatment options with the quickest route to safe clinical application (4). Although small molecule inhibitors and inactivated or live attenuated virus vaccine candidates have been used to successfully treat infection by pathogenic viruses, the pipelines to bring these products into clinical use can require significant time and resources with potentially low success rates (5, 6). However, among novel vaccine delivery platforms developed in recent years, DNA and RNA vaccines have become of interest due to their potential to be inexpensively and quickly produced at a large scale (7). Only the nucleotide sequence of the selected antigenic protein is required to begin production, which can be derived from DNA/RNA sequencing of the virus. Thus, DNA/RNA vaccines have been suggested as prime candidates for mitigating COVID-19 transmission.

 

The SARS-CoV-2 genome codes for at least 30 proteins, three of which are exposed on the virion surface and can be recognized by the immune cell system (8–10). The spike protein is a large trimeric glycoprotein (1,273 amino acid long protomers) that protrudes from the virion surface to bind to cell surface receptors on host cells, such as angiotensin converting enzyme II (ACE2), in order to initiate viral entry (11). The large surface area of the spike protein and its role in host cell entry make it attractive as a target for the immune system and clinical treatments, such as drugs and therapeutic antibodies (12). Of note, the spike protein is heavily glycosylated, which helps shield the virus from interactions with antibodies (13). The region with the lowest degree of glycosylation is the receptor-binding domain, which binds to host cell surface proteins to initiate viral entry, and, as a result, is the most antigenic region of the spike protein (14). The other two proteins exposed on the virion surface, the envelope and membrane proteins, are also available for use as antigen targets; however, they are smaller in size and less accessible for protein-protein interactions than the spike protein. Because of the evidence indicating that spike is the most suitable antigenic target for SARS-CoV-2, it has been widely used in vaccine trials.

 

pt2

Anonymous ID: 5424bc Feb. 9, 2022, 10:14 p.m. No.15591823   🗄️.is 🔗kun   >>1826 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591819

Numerous companies and academic institutions across the globe have developed or are currently developing DNA/RNA vaccines for the SARS-CoV-2 spike protein (15). Generally, in the case of DNA vaccines, the full-length SARS-CoV-2 spike protein DNA sequence is inserted into a plasmid, and additional technologies, such as electroporation, can assist in making transfection more efficient (16, 17). The spike protein DNA transfected into the human cell can then be transcribed and translated to create the trimeric spike protein, which then moves to the endoplasmic reticulum and Golgi apparatus for post-translational modification (e.g. signal sequence cleavage, glycosylation) and continues through the secretory route to become anchored to the cell membrane for exposure to the immune system (18, 19). The RNA-based vaccine formulations comprise lipid nanoparticles assembled around mRNA molecules coding for the full-length SARS-CoV-2 spike sequence (20). The transfected mRNA can be directly translated to make the spike protein. The BioNTech/Pfizer and Moderna mRNA vaccines, which have been widely approved by government agencies and administered in several countries, have reported approximately 50-70 and 70-90% effectiveness after 1 and 2 doses, respectively, against the wild-type and alpha variant (B.1.1.7) and 30-60 and 60-90% effectiveness, respectively, against the beta (B.1.351) and gamma (P.1) variants (21, 22). However, additional variants of concern have been noted to provide either further partial or complete immune escape; thus, adapting the sequences may be required over time (23, 24). Prior to COVID-19, no DNA/RNA vaccines had been approved for human use, but, in August 2021, BioNTech and Pfizer received FDA approval for use of their mRNA vaccine (25). Further investigation into nucleic acid-based vaccine delivery platforms may improve effectiveness.

 

Although SARS-CoV-2 DNA/RNA vaccines have been subjected to health and safety testing prior to bulk dissemination, a diverse assortment of both systemic and local (near injection site) side effects, ranging from mild to severe, have been described following vaccination (26, 27). Symptoms resembling those of viral infection (e.g. headache and myalgia), life-threatening conditions (e.g. myocardial injury and thrombosis), and mortalities have been reported in relation to vaccination (28–31). Although some side effects may stem from the delivery modalities, several studies have indicated that the spike protein alone causes adverse effects on host tissues, such as blood brain barrier disruption, neuron fusion, inflammation, and cell senescence (32–35). Although it is difficult to detect the origin of side effects in vaccinated individuals, more investigation of the cellular effects of mRNA vaccines or the expressed protein antigen is warranted to create safer vaccines.

 

pt3

Anonymous ID: 5424bc Feb. 9, 2022, 10:14 p.m. No.15591826   🗄️.is 🔗kun   >>1829 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591823

 


pt 4

Anonymous ID: 5424bc Feb. 9, 2022, 10:15 p.m. No.15591829   🗄️.is 🔗kun   >>1837 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591826

Codon Optimization of DNA/RNA Vaccine Candidates

Although the presence of overlapping genes on the wild-type nucleotide sequence of the spike protein challenges the effectiveness of DNA/RNA vaccines, precautionary steps can be taken to prevent the translation of these smaller, internal ORFs. For example, vaccine nucleotide sequences can be selectively codon optimized, as is normally performed to enhance translation efficiency in host tissues, to remove alternative start codons and internal ribosome entry sites, thus preventing non-specific recognition by ribosomal complexes (67). Codon optimization without consideration of overlapping ORFs, however, can result in either disrupting the existing overlapping ORFs (ORF2b and ORF-Sh in the case of the spike protein vaccines) or spontaneously generating new ORFs. Although most, if not all, DNA/RNA vaccine candidate spike sequences have been reported to be codon optimized for translation in human cells, the spike nucleotide sequences have largely, so far, been kept private by the corresponding company or institution. Interestingly, however, the Moderna mRNA-1273 and Pfizer BNT162b2 vaccine mRNA sequences have been made publicly available (https://github.com/NAalytics; https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/). The posting of these data allows direct comparative analyses between the vaccine-formulated and wild-type spike protein sequences.
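The idea of editing codons to remove an alternative start site can be sketched with a toy example. This is only an illustration under stated assumptions, not the procedure used by any vaccine developer: the 12-nucleotide sequence, the partial codon table `TOY_CODE`, and the function names are all hypothetical. The sequence ATG AAT GGC TAA hides an out-of-frame ATG (A-ATG-GC); swapping AAT for the synonymous AAC removes it without changing the encoded peptide.

```python
# Partial codon table covering only the codons used in this toy example.
TOY_CODE = {"ATG": "M", "AAT": "N", "AAC": "N", "GGC": "G", "GGA": "G", "TAA": "*"}

def translate(seq):
    """Translate an in-frame sequence using the toy table."""
    return "".join(TOY_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))

def out_of_frame_atgs(seq):
    """Positions (0-based) of ATG triplets that are not in the main reading frame."""
    return [i for i in range(len(seq) - 2) if i % 3 != 0 and seq[i:i + 3] == "ATG"]

def remove_out_of_frame_atgs(seq):
    """Greedy pass: accept a synonymous codon swap only if it lowers the ATG count."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for idx in range(len(codons)):
        best = len(out_of_frame_atgs("".join(codons)))
        for alt, aa in TOY_CODE.items():
            if aa != TOY_CODE[codons[idx]] or alt == codons[idx]:
                continue  # not synonymous, or the same codon
            trial = codons[:idx] + [alt] + codons[idx + 1:]
            count = len(out_of_frame_atgs("".join(trial)))
            if count < best:
                codons, best = trial, count
    return "".join(codons)

wild = "ATGAATGGCTAA"
edited = remove_out_of_frame_atgs(wild)
print(out_of_frame_atgs(wild), out_of_frame_atgs(edited))
print(translate(wild) == translate(edited))
```

A real workflow would also have to weigh codon usage, mRNA structure, and non-ATG initiation, none of which this greedy pass considers.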

 

Comparing the nucleotide sequences of the wild-type and vaccine mRNA spike proteins may reveal the extent to which the sequences have been changed during codon optimization, thus potentially altering translation efficiency of the spike protein and overlapping ORFs. Of note, prior to codon optimization, both companies have reported including proline mutations to stabilize and preserve the spike protein structure, thus implying small changes in spike amino acid content as well. Using the EMBOSS Needle pairwise sequence alignment tool, the wild-type spike sequence (NCBI accession: NC_045512) is found to be 68.7% and 45.3% identical to the mRNA-1273 and BNT162b2 vaccine spike sequences (as opposed to the entirety of the mRNA sequence), respectively, and the mRNA-1273 and BNT162b2 spike sequences are 48.6% identical to one another (68). The GC contents of the wild-type, BNT162b2, and mRNA-1273 spike nucleotide sequences, which correlate well with translation efficiency, are 37.3%, 56.9%, and 62.3%, respectively. These alignments reveal that extensive codon optimization was performed during vaccine preparation.
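For readers who want to reproduce this kind of comparison, the two metrics can be sketched as below. This is a minimal sketch, not the EMBOSS implementation: the function names are made up for illustration, the identity function expects two sequences that have already been aligned (with "-" for gaps), and it assumes identity is counted as matching columns divided by total alignment length.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence (DNA or RNA)."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def percent_identity(aligned_a, aligned_b):
    """Percent identity over the full alignment length; gap columns count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

# Toy inputs, not real spike sequences.
print(gc_content("ATGC"))
print(round(percent_identity("ATG-CA", "ATGGCT"), 1))
```

The alignment itself (the hard part) is left to a dedicated tool such as Needle; this only shows how the reported percentages are derived once an alignment exists.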

 

To quantify the degree to which the codon optimization performed on the vaccine mRNA sequences matches that of the human genome amino acid pool, the codon adaptation index (CAI), which has been noted to be an accurate reflector of gene translation, was calculated for all three spike sequences using the COUSIN and CAIcal web servers (69–71). As a reference, calculated CAI values for SARS-CoV-2 genes with regard to human codon usage average around 0.7, and a higher score represents a stronger indication for translation (72, 73). The CAI values for the wild-type, BNT162b2, and mRNA-1273 spike nucleotide sequences are 0.703, 0.715, and 0.981, respectively. While the BNT162b2 vaccine CAI value was slightly increased compared to the wild-type sequence, the mRNA-1273 vaccine CAI value was found to be significantly higher, almost reaching the maximum value. These findings suggest that the codon optimization used on both vaccine sequences has resulted in higher translation potential than the wild-type. Notably, the mRNA-1273 vaccine codon usage seems much more closely aligned with human codon biases, and the sequence contains fewer substituted nucleotides and a higher GC content.
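As a rough sketch of how a CAI value is computed (following the usual definition: each codon gets a weight equal to its frequency divided by the frequency of the most used synonymous codon, and the CAI is the geometric mean of those weights), consider the example below. The counts in `TOY_USAGE` are invented for illustration and are NOT real human codon usage; the web servers named above use full reference tables and may handle edge cases differently.

```python
import math

# HYPOTHETICAL codon counts for two amino acids; not real human usage data.
TOY_USAGE = {
    "N": {"AAC": 60, "AAT": 40},
    "G": {"GGC": 50, "GGA": 25, "GGG": 15, "GGT": 10},
}

def relative_adaptiveness(usage):
    """Weight of each codon = its count / count of the most used synonymous codon."""
    weights = {}
    for codon_counts in usage.values():
        top = max(codon_counts.values())
        for codon, count in codon_counts.items():
            weights[codon] = count / top
    return weights

def cai(seq, weights):
    """Geometric mean of codon weights; codons missing from the table are skipped."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))

weights = relative_adaptiveness(TOY_USAGE)
print(cai("AACGGC", weights))  # only the most used codons
print(cai("AATGGT", weights))  # less used codons give a lower score
```

A sequence built entirely from the most used codon for each amino acid scores 1.0, which is why a value of 0.981 reads as "almost reaching the maximum".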

 

pt5

Anonymous ID: 5424bc Feb. 9, 2022, 10:16 p.m. No.15591837   🗄️.is 🔗kun   >>1841 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591829

Overlapping ORFs on DNA/RNA Vaccine Candidates

Considering the extensive codon optimization performed on the vaccine spike sequences, the comparison of putative ORFs in the wild-type and selected vaccine mRNA sequences may shed light on the protein-coding potential of DNA/RNA sequences used in SARS-CoV-2 vaccines. Thus, in order to examine the differences between the available ORFs on the wild-type, Moderna mRNA-1273, and Pfizer BNT162b2 sequences, putative ORFs of all three nucleotide sequences were detected using the NCBI ORFfinder web server (https://www.ncbi.nlm.nih.gov/orffinder/). Although ORF identification using this tool does not imply translation, an overview of the available reading frames may provide insights into coding potential differences between the wild-type and vaccine candidate sequences. Minimum ORF length was set to the default 75 nucleotides, no alternative initiation codons were allowed, and only “ATG” start sites were considered.
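The search settings described above (ATG-only starts, 75-nucleotide minimum) can be approximated with a short scan like the one below. This is a simplified sketch and not NCBI ORFfinder itself: the function names are made up, the length here counts the stop codon, ORFs that run off the end without a stop are ignored, and only the longest ORF per stop codon is reported. ORFfinder's exact conventions may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement, for scanning the opposite strand of a DNA construct."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_len=75, starts=("ATG",)):
    """Return (start, end, frame) tuples, 0-based half-open, for the three forward frames."""
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in starts:
                start = i  # first start codon after the previous stop
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Toy sequence: two leading bases, then ATG + 24 Ala codons + stop (78 nt ORF).
toy = "CC" + "ATG" + "GCT" * 24 + "TAA"
print(find_orfs(toy))
print(find_orfs(reverse_complement(toy)))
```

Passing a wider `starts` tuple (for example adding "CTG") mimics allowing alternative initiation codons, and calling `find_orfs(reverse_complement(seq))` covers the negative-sense strand.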

 

As shown in Figure 2, ORF2b and ORF-Sh were found in the wild-type sequence; however, both ORFs are absent in both of the mRNA vaccine candidate sequences. The counts, lengths, and sequence identities of predicted ORFs on both mRNA sequences were found to be markedly different from one another and from the wild-type, re-asserting that codon optimization can result in significant changes in the presence of overlapping ORFs. Eleven small overlapping ORFs (27-87 residues long) were discovered using NCBI ORFfinder on the wild-type spike protein sequence, and eight small ORFs (26-52 residues long) were found to overlap the Pfizer BNT162b2 vaccine mRNA sequence. Notably, the Moderna mRNA-1273 vaccine mRNA sequence displayed no overlapping sequences on the positive sense strand, only on the negative sense strand, which can be disregarded when considering mRNA. However, DNA-based vaccines, such as the INO-4800 SARS-CoV-2 spike DNA vaccine from INOVIO Pharmaceuticals, should be assessed for the presence of protein-coding ORFs on the reverse strand (73). Thus, in terms of predicted protein-coding potential, the Moderna mRNA-1273 mRNA vaccine appears to be the more optimized of the two sequences to solely code for the SARS-CoV-2 spike protein. These findings also support the notion that vaccine candidate sequence codons can be reliably edited to remove undesired ORFs. Newly predicted ORFs on the Pfizer BNT162b2 vaccine mRNA sequence, on the other hand, highlight the fact that codon optimization can also lead to the spontaneous generation of novel overlapping ORFs. Of interest is the observation that allowing detection of alternative initiation codons increased the number of predicted ORFs on the positive sense strand of the wild-type (11 to 19 ORFs), BNT162b2 (8 to 25), and mRNA-1273 (0 to 4) sequences. Experimental validation and in-depth genomic analysis and annotations, however, are required to validate the presence or absence of these and other ORFs on the spike protein vaccine candidate sequences.

 

pt 6

Anonymous ID: 5424bc Feb. 9, 2022, 10:16 p.m. No.15591841   🗄️.is 🔗kun   >>1855 >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591837

Additional Steps to Exclude Overlapping ORFs on DNA/RNA Vaccine Sequences

As an alternative to codon optimization, another option for safeguarding against overlapping ORFs in a vaccine candidate is to select short section(s) of the protein sequence that code for the most antigenic regions, as exemplified by the Pfizer BNT162b1 mRNA vaccine that codes for a trimeric construct of the receptor-binding domain of the SARS-CoV-2 spike protein (74). A shorter sequence may have a lower potential to code for other smaller proteins. ORF predictions using the NCBI ORFfinder on the nucleotide sequences corresponding to the receptor-binding domain (nucleotides 999-1569 on wild-type spike sequence) of the wild-type, BNT162b2, and mRNA-1273 spike proteins revealed the presence of three (29-36 aa), two (28 and 44 aa), and zero ORFs on alternative frames, respectively, and three (29-69 aa), six (27-53 aa), and one (170 aa) ORFs, respectively, when considering alternative initiation codons. Although the shortening of the spike sequence reduces the number of overlapping ORFs, the potential for alternative translation still remains.

 

Multimeric vaccine DNA/RNA sequences that include antigenic regions of different viral proteins could also be used to increase immunogenicity while shortening the length of the construct and, thus, controlling for the presence of overlapping ORFs (75–77). For example, the hepatitis C E2 protein scaffold has been used to present the antigenic HIV-1 gp120 variable loop region to promote immunogenicity for potential HIV vaccination (78). Thus, the downsizing of the sequence to include only the most antigenic regions of the spike receptor-binding domain, such as the receptor-binding motif, or domains from other viral proteins to be placed on a codon-optimized protein scaffold may further control for overlapping protein-coding sequences (79). Sequence length and content can further affect the number of overlapping ORFs, but the scrutinization of protein-coding regions nevertheless relies on validating the translation of alternative reading frames.

 

The use of experimental techniques, such as ribosomal profiling or mass spectrometry, on vaccinated patient or laboratory animal samples or pseudovirus-infected tissue cultures may help determine whether the overlapping ORFs are translated and to what degree they are translated compared to the intended protein. Thus, several potential checkpoints can be utilized to control for the translation of small ORFs within DNA/RNA vaccine candidate sequences. Otherwise, unintended proteins could be translated by the host cell, which may lead to side effects resembling the symptoms of viral infection.

 

pt 7

Anonymous ID: 5424bc Feb. 9, 2022, 10:19 p.m. No.15591855   🗄️.is 🔗kun   >>1893 >>2036 >>2124 >>2187 >>2255 >>2284

>>15591841

Conclusions

DNA/RNA vaccines have proven to be an effective way to develop vaccines quickly for emerging pathogens. However, with a new set of solutions comes a new set of problems (80). Although the wild-type SARS-CoV-2 spike protein nucleotide sequence has been found to code for translated overlapping genes, ORF detection predictions on the sequences of two mRNA vaccines reveal that codon optimization has the potential to disrupt non-specific translation. Additional overlapping ORFs can arise during codon optimization; thus, the final sequences should nevertheless be scrutinized for their protein-coding potential. In the case of DNA vaccines and viral vectors, the negative-sense strand should also be checked for its protein-coding potential. Additionally, as variants of concern become known and vaccines are altered to include them, the spontaneous generation of ORFs should be re-assessed. Many precautionary steps have been taken to ensure the safety and efficacy of the mRNA vaccines, including nucleoside modification to reduce inflammatory responses and 5’-capping and polyadenylation tail length optimization to increase mRNA stability and translation (20). Thus, the inclusion of additional steps to ensure that vaccine sequences code solely for the intended protein may also lead to better health and safety outcomes. Measures to check for other adverse effects on host cells, such as those resulting from potential interactions of vaccine nucleotide sequences with host RNAs or proteins, or the host microbiome, may increase efficacy and safety as well (81). More in-depth investigation of these delivery methods may reveal aspects that should be further refined to safeguard against unintended side effects.

 

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

 

Author Contributions

CB, MB, and AV contributed to conception and design of the study. CB, MB, and AV contributed to sequence and structural analyses. All authors contributed to manuscript writing and revision. All authors contributed to the article and approved the submitted version.

 

Funding

TB thanks the Wellcome Trust for support through an Investigator Award (200814/Z/16/Z; 2016-2021). CB was supported by Antibiotic Research UK (PHZJ/687).

 

pt 8 of 8