General Replication Strategies for RNA Viruses
Guest Writer: Dr. Michael Beard
The replication of viral RNA genomes is unique in that the host cell does not contain an RNA-dependent RNA polymerase. To overcome this constraint, the majority of RNA viruses encode their own RNA polymerase, which is either packaged with the virus genome or synthesised shortly after infection.
Single-stranded RNA viruses can be classified into three classical groups according to the polarity of their genomes, which in turn dictates the strategy for viral replication (White and Fenner, 1994). The first group, viruses with (+) sense RNA genomes, includes the Picornaviridae, Flaviviridae, Togaviridae, Caliciviridae and Coronaviridae. The linear RNA genomes of the Picornaviridae and Flaviviridae are polycistronic and act as mRNA that is translated into a viral polyprotein, which is subsequently cleaved into individual viral polypeptides, one of which is an RNA-dependent RNA polymerase. This polymerase uses the (+) sense input RNA as a template for the transcription of (-) sense RNA, which in turn can act as a template for the production of nascent (+) sense RNA. This newly transcribed (+) sense RNA has three possible functions:
(i) mRNA for the translation of further viral proteins,
(ii) a template for production of additional (-) strands, or
(iii) packaged as progeny virus.
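The template copying described above can be illustrated with a minimal sketch. It assumes simple Watson-Crick pairing and a made-up sequence; the function name `copy_strand` is hypothetical and nothing here models a real viral polymerase.

```python
# Watson-Crick pairing rules for RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def copy_strand(template: str) -> str:
    """Return the strand synthesised from `template`.

    Input and output are both written 5'->3', so the product is the
    reverse complement of the template.
    """
    return "".join(PAIR[base] for base in reversed(template))


plus_genome = "AUGGCUUACGGA"               # made-up (+) sense input RNA
minus_strand = copy_strand(plus_genome)    # (-) sense intermediate
progeny_plus = copy_strand(minus_strand)   # nascent (+) sense RNA

print(minus_strand)
print(progeny_plus == plus_genome)
```

Copying twice returns the original sequence, which is why the (-) strand can serve as the template for progeny (+) sense genomes.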
In contrast to the above replication strategies, the Togaviridae and the Coronaviridae use a slightly different mechanism in which subgenomic mRNA species are used for the translation of the structural proteins. Only the 5' two thirds of the Togaviridae (+) sense genome is translated, resulting in the synthesis of a polyprotein that is post-translationally cleaved into the non-structural proteins, one of which is an RNA-dependent RNA polymerase. This polymerase then synthesises full-length (-) sense RNA from which two species of (+) sense RNA are copied:
(i) virion RNA that can be packaged or can act as a template for more (-) strand synthesis
(ii) a smaller subgenomic mRNA species that is translated into a polyprotein from which the structural proteins are derived.
The Coronaviridae replicate using a similar mechanism in which a 3' nested set of overlapping subgenomic (+) sense mRNA molecules is synthesised for the translation of the structural proteins. All the (+) sense RNA viruses share a common theme in that their genomes can act as mRNA, thereby eliminating the need for the virus to package an RNA polymerase.
The second group, comprising the Orthomyxoviruses, Paramyxoviruses, Bunyaviruses, Arenaviruses and Rhabdoviruses, all have (-) sense RNA genomes. The genomes of these viruses must serve two functions: firstly as a template for transcription to generate (+) sense RNA, and secondly as a template for replication. Upon entry into the host cell, the (-) sense genome is transcribed to generate (+) sense monocistronic RNA that serves as mRNA for the production of viral proteins, which initiate genome replication. This (+) sense RNA also serves as a template for the synthesis of (-) strand genomic RNAs. In contrast to the (+) sense RNA viruses described above, (-) strand viruses must package a functional RNA polymerase to initiate transcription of their (-) sense genome.
The Retroviridae make up the third group, in which the RNA genome does not function as a (+) or (-) sense molecule but acts as a template for the production of viral DNA. This is achieved by an RNA-dependent DNA polymerase (reverse transcriptase) that is packaged with the RNA genome. The resulting viral DNA integrates into the host cell genome to provide the template for viral RNA synthesis by host-derived mechanisms.
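The three groups can be summarised as a small lookup table. This is only a sketch of the classification given in the text; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationStrategy:
    genome_acts_as_mrna: bool            # can the incoming genome be translated directly?
    packaged_polymerase: Optional[str]   # enzyme carried in the virion, if any


STRATEGIES = {
    "(+) sense RNA": ReplicationStrategy(True, None),
    "(-) sense RNA": ReplicationStrategy(False, "RNA-dependent RNA polymerase"),
    "retrovirus": ReplicationStrategy(
        False, "RNA-dependent DNA polymerase (reverse transcriptase)"
    ),
}


def must_package_enzyme(group: str) -> bool:
    """True if the virion has to carry its own polymerase into the cell."""
    return STRATEGIES[group].packaged_polymerase is not None


for group in STRATEGIES:
    print(group, must_package_enzyme(group))
```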
This article has been reproduced with the kind permission of its author, Dr. Michael Beard, previously of the Department of Medicine, Division of Infectious Diseases, University of North Carolina.
This work may not be reprinted without the prior knowledge and consent of its author.
In addition to the chemical differences already discussed, RNA is also shorter than DNA and generally single-stranded (this does not apply to some viruses, as we will see later). RNA also functions differently from DNA.
While the only role of DNA is the storage of genetic information, RNA has many different roles to fulfil. There are several types of RNA which perform these different functions.
Ribosomal RNA (rRNA) forms complexes with protein to make ribosomes, the site of protein synthesis within the cytoplasm of the cell.
Messenger RNA (mRNA) carries the information recorded in DNA from the nucleus to the cytoplasm of the cell.
Small nuclear RNA (snRNA) is involved in pre-mRNA splicing.
Heterogeneous nuclear RNA (hnRNA) is the primary transcript of the eukaryotic enzyme RNA polymerase II. hnRNA is the precursor of all mRNA and is often called "pre-mRNA" prior to the removal of introns.
Small nucleolar RNA (snoRNA) is found in the cell's nucleolus where it processes and methylates rRNA.
Transfer RNA (tRNA) carries amino acids to nascent polypeptide chains synthesised on the ribosomes.
DNA exists within our cells as chromosomes. Chromosomes are single molecules which contain regions that carry the information to produce or "encode" proteins or RNA molecules. These regions are called genes and they are the most basic functional genetic units in our chromosomes. We have approximately 35 000 genes, some of which are expressed continuously ("house-keeping genes"), some only when the cell is undergoing certain processes or only in cells that have matured in a particular way, and some in response to an environmental stimulus. The transcription start site defines a gene. Sequences "before" or 5' to the start site are called upstream, and those after or 3' are called downstream. Pseudogenes, remnants of duplicated genes that no longer function because of mutation, are sometimes found in humans.
When considering all of our DNA, including the genes and the many other sequences which do not encode proteins, we are talking about our genome. This name also applies to viruses, although a viral genome has much less DNA (or RNA) than a human genome.
A cistron is the smallest unit of DNA that can encode a protein. A cistron does not include any regulatory or non-coding sequences.
Prokaryotic cells generally group closely related genes, and genes that are activated or inactivated at the same time, near to each other. The genes together with their controlling elements are called operons and may be transcribed as a single mRNA which is polycistronic, or capable of encoding several proteins. Polycistronic messenger RNA (mRNA) consists of gene sequences separated by intercistronic sequences. Preceding the first gene is a leader sequence and following the last gene is a trailer sequence. The DNA between prokaryotic genes is called intergenic DNA.
Eukaryotic cells organise their genome very differently. DNA encoding a gene's precursor mRNA (pre-mRNA) is organised into regions called exons (EXpressed sequences) which may be spread across thousands of nucleotide base pairs (bp). The areas between exons in a gene are called introns (INtervening sequences).
Introns are not removed by luck, but with the aid of sequence specific splicing signals. Most introns start (5') with the sequence GU and end (3') with an AG which are referred to as the splice donor and splice acceptor sites. Another important sequence is the branch site located 20-50 base pairs upstream (5') of the splice acceptor site and containing a conserved A.
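The three signals just described (a 5' GU, a 3' AG and a branch-site A 20-50 positions upstream of the AG) can be turned into a naive scan. This is a toy sketch on a made-up sequence; the function name `candidate_introns` is hypothetical, and real splice-site recognition depends on far more sequence context than these three rules.

```python
import re


def candidate_introns(pre_mrna: str, min_branch: int = 20, max_branch: int = 50):
    """Return (start, end) slices that begin with GU, end with AG and
    contain an A between min_branch and max_branch positions upstream
    of the AG."""
    donors = [m.start() for m in re.finditer("GU", pre_mrna)]
    acceptors = [m.start() for m in re.finditer("AG", pre_mrna)]
    hits = []
    for d in donors:
        for a in acceptors:
            if a <= d + 1:
                continue  # the acceptor must lie downstream of the donor
            # Window in which a branch-site A is allowed to sit.
            lo = max(d + 2, a - max_branch)
            hi = a - min_branch + 1
            if "A" in pre_mrna[lo:hi]:
                hits.append((d, a + 2))
    return hits


exon1, exon2 = "CCCUCC", "CCUCCC"
intron = "GU" + "C" * 10 + "A" + "C" * 25 + "AG"
pre_mrna = exon1 + intron + exon2

print(candidate_introns(pre_mrna))
```

On this toy sequence the only candidate returned is the slice covering the inserted intron.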
Five small nuclear RNA molecules (snRNA) and their proteins form a complex called the spliceosome. When snRNAs are associated with proteins they are known as small nuclear ribonucleoproteins (snRNP; "snurps"). The five snRNPs which form the spliceosome are called U1, U2, U4, U5 and U6. The splice donor site is attached to the branch site to form a lasso or "lariat". Through an enzymatic process the intron is then removed and the exons joined together.
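The end result of splicing, introns removed and exons joined, can be sketched as simple string surgery. The coordinates and sequence are invented for illustration, and the function name `splice` is hypothetical; the lariat chemistry itself is not modelled.

```python
def splice(pre_mrna: str, introns):
    """Remove half-open (start, end) intron spans and join the flanking exons."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)


pre_mrna = "AAAGUCCCAGUUU"   # toy transcript: exon AAA, intron GUCCCAG, exon UUU
mature = splice(pre_mrna, [(3, 10)])
print(mature)
```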
As with many things in biology, there is more than one way for introns to be spliced. Another form of intron removal involving a spliceosome is called alternative splicing and is shown below. This relies upon alternative splice sites within exons. This process can produce more than one protein because the same pre-mRNA can be spliced in different ways. Interestingly, eukaryotes carry a lot of DNA that does not appear to encode any protein. This is often called junk DNA.
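A minimal sketch of how alternative splice sites within an exon multiply the possible products: each exon is given as a list of the forms it can take, and every combination is one mature mRNA. The sequences and the function name `isoforms` are made up for illustration.

```python
from itertools import product


def isoforms(exon_variants):
    """Join one form of each exon, in order, for every possible combination."""
    return ["".join(choice) for choice in product(*exon_variants)]


# Exon 2 has an alternative splice site inside it, so it can be
# included in full ("GAUUAC") or truncated ("GAU").
exon_variants = [["AUGGCC"], ["GAUUAC", "GAU"], ["UGA"]]

for mrna in isoforms(exon_variants):
    print(mrna)
```

Two mRNAs result from the single transcript, which is the sense in which alternative splicing can yield more than one protein.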
But intron removal can occur in the absence of a spliceosome, or in fact, any protein-based enzyme at all. These introns are removed by self-splicing and rely upon the action of catalytic RNA molecules called ribozymes. Self-splicing introns are divided into two groups based on the chemistry behind the splicing. Group I introns are found in protozoa, fungal mitochondria, bacteriophage T4 and bacteria. Group II introns exist in mitochondrial and chloroplast genes (plastids).
The region of mRNA that encodes the protein is called the coding sequence (cds) and is a duplicate of the exon region of the DNA since the introns are removed from the mRNA. Human genes are usually monocistronic meaning that each protein is translated from a single mRNA.
Regulatory sequences on the DNA called enhancers, permit the binding of proteins that control gene expression. Enhancer sequences may be kilobase pairs away from the exons.
The transmissible spongiform encephalopathies (TSEs) are a group of invariably fatal neurodegenerative diseases found in a wide range of mammals. The disease is found naturally in many ruminants (scrapie, bovine spongiform encephalopathy-BSE), deer (chronic wasting disease) and mink (transmissible mink encephalopathy), as well as humans (see later). The disease can also be experimentally transmitted to rodents, pigs and primates (6). TSEs are characterised by long incubation times (in humans these can exceed 30 years), and an infected individual will usually show some signs of progressive ataxia, dysarthria, dysphagia, nystagmus, myoclonus and/or dementia. The time from onset of symptoms to death is highly variable (in humans it ranges from a few months to 10 years) (6).
THE INFECTIOUS AGENT
The TSEs are novel in that they are currently believed to be caused by abnormal folding of a host-encoded protein, with no nucleic acid component (1), although this hypothesis remains controversial. The protein has been named the prion protein (PrP); the normal form of the protein is termed PrPC (Cellular) and the disease form PrPSC (SCrapie). It is currently believed that the PrPSC form of the protein can arise spontaneously, and that it can then go on to auto-catalyse the conversion of PrPC to PrPSC (1).
The two forms of the protein have some different properties (2): PrPC is anchored to the cell membrane by a glycosylphosphatidylinositol (GPI) anchor, whilst PrPSC accumulates in endosomes; PrPSC accumulates in diseased individuals in plaque deposits in the brain, and is partially resistant to proteolytic digestion with proteinase K; and PrPSC is highly resistant to most sterilising procedures, being inactivated neither by agents such as UV light (3) nor by autoclaving (4). However, the two proteins seem to have the same post-translational modifications, and cannot be distinguished with monoclonal antibodies (5). In addition, there are no in vitro assays which can be used to determine PrPSC biological activity. Much of the research in this area has therefore concentrated on the primary sequence of the PrP gene, as well as the use of transgenic animals carrying different alleles of the gene.
THE PrP GENE
A schematic of the PrP gene is shown in Figure 1. The gene is c. 750 base pairs (bp) in length, coding for a c. 250 amino acid (aa) protein, with the following domains (7,8,9): the N-terminal 22 aas encode a cleaved signal peptide involved in transport of the protein to the cell surface; the C-terminal 26 aas encode a signal sequence that is cleaved in the Golgi when the GPI anchor is added; there are two glycosylation sites, and two cysteine (C) residues involved in intra-molecular disulfide bonding; finally, there is a region in the N-terminal half of the gene which encodes a series of G-P rich octapeptide repeats. In humans, a number of pathogenic polymorphisms have been described, which are responsible for the inherited forms of this disease (see later).
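The lengths quoted above are consistent with three bases per codon, and they imply a rough size for the processed protein. The short calculation below uses only the approximate figures from the text and ignores the stop codon; the variable names are illustrative.

```python
CODING_BP = 750        # approximate length of the gene given in the text
N_SIGNAL_AA = 22       # N-terminal signal peptide, cleaved
C_SIGNAL_AA = 26       # C-terminal signal sequence, cleaved on GPI addition

primary_aa = CODING_BP // 3                             # one amino acid per codon
mature_aa = primary_aa - N_SIGNAL_AA - C_SIGNAL_AA      # after both cleavages

print(primary_aa)
print(mature_aa)
```

Under these approximations the translated product is about 250 aa and the processed protein about 200 aa.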
There are four human diseases classified as TSEs: Creutzfeldt-Jakob disease (CJD), Gerstmann-Sträussler syndrome (GSS), fatal familial insomnia (FFI) and kuru (9). The latter is confined to the Fore tribe of Papua New Guinea (PNG), and is caused by cannibalistic rituals, specifically the preparation and eating of human brains. Since the widespread cessation of cannibalism in PNG, kuru has declined dramatically, and is believed to have been wiped out. Cases which still arise are thought to be due to the long incubation time of the disease, in people who engaged in cannibalism earlier this century. CJD is the most common TSE diagnosed in humans, and falls into three categories: iatrogenic, inherited and sporadic.
Iatrogenic cases are extremely rare. They occur when contaminated material is transplanted (eg cornea or dura mater transplants), or when instruments used in neuro-invasive procedures are contaminated (eg depth electrodes). Most of the cases are due to batches of contaminated growth hormone prepared from human cadavers (10,11). Sporadic CJD has an incidence rate of c. 1/million people/year, world wide. No correlation between sporadic CJD and populations that may be considered high risk (eg abattoir workers, shepherds) has been observed. It is currently believed that sporadic CJD arises through the spontaneous conversion of PrPC to PrPSC in an individual (2,6). Inherited CJD is caused by pathogenic polymorphisms in the human PrP gene. Such genetic lesions tend to be dominant but of variable penetrance, although it is possible, given the long incubation times of the disease, that asymptomatic people with a particular genetic lesion are dying of old age prior to onset of disease. A large number of different polymorphisms have been described for different lineages (12,15). These include point mutations, such as P102L (13), as well as extra or fewer octapeptide repeats in the G-P rich region (12). GSS is an inherited disease similar to inherited CJD. Indeed, the former can be considered a sub-class of the latter.
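To give a feel for what an incidence of about one case per million people per year means, the sketch below scales it to a population. The population figure is a hypothetical round number chosen for illustration, not a value from the text.

```python
CASES_PER_MILLION_PER_YEAR = 1    # approximate sporadic CJD incidence from the text
population = 60_000_000           # hypothetical population, for illustration only

expected_cases_per_year = (population // 1_000_000) * CASES_PER_MILLION_PER_YEAR
print(expected_cases_per_year)
```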
In addition to pathogenic mutations, humans also have a neutral polymorphism, M129V. This polymorphism has been shown to have an effect on both sporadic and inherited CJD. People who are M/M homozygous and have the D178N mutation suffer from FFI, whereas those who have M/V or V/V and the D178N mutation suffer from typical CJD (14). FFI is characterised by progressive insomnia and torpor, with quite different symptoms to typical CJD. Also, it has been shown that people who are homozygous (M/M or V/V) are more susceptible to sporadic CJD than people who are heterozygous at this locus (15).
THE BSE OUTBREAK
Changes in the processing of cow and sheep carcasses for rendering into bone meal (protein supplement for cattle) in the UK in 1981/82 are the most likely cause of the BSE epidemic (17). Solvent extraction and certain heating steps were removed, and it is theorised that existing scrapie and BSE agents were no longer being inactivated, but instead were being passed back into the food chain (17). Whether BSE can or will transmit to humans remains unknown at this time, with estimates of human infection ranging from 0 to 10 million. Ten cases of "atypical" CJD have been described in the literature (16). These cases were unusual in that they affected young people (median age 32). The Spongiform Encephalopathy Advisory Committee (SEAC) said that these cases were most likely caused by BSE, and on the strength of this the EC banned all British beef imports (the UK exported 400 000 head of cattle in 1995). However, much more research needs to be done before the transmission of BSE to humans is proven.
Construction of a Lentivector
To construct a delivery vector, certain essential cis-acting sequences must be retained within the retroviral vector genome. These include:
The packaging signal sequence (Ψ) which ensures the encapsidation of the vector RNA into virions.
Elements that are necessary in the reverse transcription process:
Primer binding site (PBS) - which binds the tRNA primer of reverse transcription
Terminal repeat (R) sequences - to guide reverse transcriptase between the RNA strands during DNA synthesis
Purine-rich region 5' of the 3' LTR - which acts as the priming site for synthesis of the second (+) DNA strand
Specific sequences near the ends of the LTRs that are necessary for the integration of the proviral vector into the chromosome of the host cell.
The most common retroviral vector designs use the LTR of the virus backbone and an internal promoter to drive the expression of the foreign gene (52). This approach gives rise to the phenomenon of "promoter suppression": when selection is applied for one gene from multiple transcription units, the expression of the other gene can be reduced or lost completely.
An internal ribosome entry site (IRES) sequence may be used instead of an internal promoter in a vector with two or more foreign genes to avoid the potential for promoter suppression. An IRES permits multiple proteins to be produced from a single vector without alternative splicing or multiple transcription units, hence increasing the stability of the transferred gene (53).
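The essential cis-acting elements listed at the start of this section can be treated as a checklist when drafting a vector design. The sketch below is purely illustrative: the element labels, the function name `missing_elements` and the example design are hypothetical and do not correspond to any real vector.

```python
# Cis-acting elements the text says must be retained in the vector genome.
REQUIRED_CIS_ELEMENTS = (
    "packaging signal",
    "primer binding site",
    "terminal repeat (R)",
    "purine-rich region",
    "LTR integration sequences",
)


def missing_elements(design):
    """Return the required elements absent from a proposed design."""
    return [e for e in REQUIRED_CIS_ELEMENTS if e not in design]


# Hypothetical draft design that has lost its primer binding site.
draft = {
    "packaging signal",
    "terminal repeat (R)",
    "purine-rich region",
    "LTR integration sequences",
    "IRES",
    "foreign gene",
}

print(missing_elements(draft))
```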
The essential minimum packaging signal (Ψ) sequence of lentiviruses is still unclear. It is generally accepted to be in a region between the 5' LTR splice donor and the gag start codon.
Addition of 5' gag sequences to the vector backbone has been shown to increase packaging efficiency (54,55). The 3' env gene fragment encompassing the Rev response element (RRE) was reported to enhance the encapsidation of vectors when placed upstream of the heterologous genes (56,57).
Inclusion of this fragment allows accumulation of unspliced vector RNA in the presence of Rev protein and enhances the transduction of recombinant vectors (56).
By incorporating the above information, the vector design, shown below, has the potential to overcome the current low titer production and transduction efficiency of lentiviral vectors.
Ideally, a recombinant lentiviral vector would be the best gene transfer system for noncycling cells. It has promising clinical applications, especially for cystic fibrosis gene therapy.
Lentivirus-based Vectors for CF Gene Therapy
Lentiviruses are part of the family Retroviridae. Like other retroviruses, they are enveloped viruses that carry a core of RNA encoding their genetic information. The structure of a lentivirus genome is shown in the figure below.
Figure 1. Schematic representation of lentiviral genome. LTR (long terminal repeats) - contains viral promoter and enhancer. SD - splicing donor. Ψ - packaging signal. Gag - codes for virion core. Pol - generates reverse transcriptase, endonuclease, and protease. Env - codes for envelope protein.
Apart from boasting some of the best properties of current retrovirus vectors for gene delivery (see previous section), lentiviruses are the only retroviruses able to integrate into the chromosome of non-dividing cells. Gene transfer vectors based on HIV-1 have been shown to transduce non-dividing cells effectively in vitro (46).
Several research groups have described HIV-based vectors; unfortunately, the virus titers are extremely low (47,48). Recently, a much higher production of vector stock was achieved when an amphotropic envelope protein was substituted for the endogenous HIV envelope protein in trans (46). The amphotropic envelope protein, derived from either murine leukemia virus (MLV) or vesicular stomatitis virus G (VSV-G), broadened the tropism of the vector.
The use of the VSV-G envelope protein generates several problems. There is the possible production of pseudovirions, in which other RNAs are encapsidated instead of recombinant virus (49). The virus titer can be over-estimated, as recombinant vectors, empty vectors and pseudovirions can all be neutralized by anti-VSV antiserum.
There is no stable packaging cell line for VSV-G envelope proteins as these proteins are cytotoxic (50). The use of the MLV envelope protein generates no such problems. However, in the reported experiment, the virus titer was fourfold lower than when the VSV-G envelope was used.
Another hurdle to overcome in developing a novel recombinant lentiviral vector is the current unavailability of a stable packaging cell line. Constitutive expression of lentiviral proteins has been reported to be cytotoxic (51). The use of inducible packaging cells or a cell line with a high tolerance limit may overcome this problem.