Advances in Nucleases Used for Genome Editing
- 1. Department of Microbiology and Molecular Biology, Brigham Young University, USA
Abstract
Genome editing is an exciting technology that allows for specific manipulation of complex genomes. While the original tools for genome manipulation had low efficiencies, the genome editing tools discovered in the past fifteen years have been widely studied, and great efforts have been made to improve their efficiency. This article summarizes how Zinc Finger Nuclease (ZFN), Transcription Activator-Like Effector Nuclease (TALEN) and Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)/Cas9 nucleases work, as well as the most meaningful advances achieved in the development of these technologies.
Keywords
TALEN; CRISPR/Cas9; Zinc finger nuclease; Gene editing; Gene therapy
CITATION
Solis-Leal A, Berges BK (2016) Advances in Nucleases Used for Genome Editing. JSM Biochem Mol Biol 3(2): 1017.
ABBREVIATIONS
ZFN: Zinc Finger Nuclease; TALEN: Transcription Activator-Like Effector Nuclease; CRISPR: Clustered Regularly Interspaced Short Palindromic Repeat; DNA: Deoxyribonucleic Acid; DSBs: Double-Strand Breaks; HDR: Homology-Directed Repair; NHEJ: Non-Homologous End Joining; InDels: Insertions and Deletions; ZF: Zinc Finger; bp: Base Pair; aa: Amino Acid; RVDs: Repeat Variable Di-residues; 5mC: 5-Methylated Cytosines; RNA: Ribonucleic Acid; crRNA: CRISPR RNA; tracrRNA: Trans-Activating CRISPR RNA; gRNA: Guide RNA; PAM: Protospacer Adjacent Motif
INTRODUCTION
Gene editing is the engineering of DNA mutations, which can include gene deletions, gene insertions, or gene modifications. While the technology required to efficiently edit short DNA sequences has existed for decades, engineering specific mutations in complex or large genomes has been challenging. For over 40 years the gene editing field has developed new strategies to engineer such mutations more efficiently [1]. In that time, substantial advances have been made, and new editing tools and strategies have been discovered and improved. Nucleases such as CRISPR/Cas9, TALEN and ZFN make targeted modifications of complex genomes possible, and this technology has been used extensively to produce genetically engineered model organisms to study gene functions and other biological processes [2]. As the technology continues to develop, these systems have a promising future as therapies for a wide variety of diseases such as cancer [3], HIV-1 [4], cystic fibrosis [5] and Duchenne muscular dystrophy [6], among many other disorders.
Genome modifications
The above nucleases produce double-strand breaks (DSBs) in DNA that lead to the activation of one of two pathways: homology-directed repair (HDR) or non-homologous end joining (NHEJ) [7]. Further, these enzymes can be engineered to target a single site in very complex genomes (e.g., the human haploid genome is 3 × 10^9 bp). While the HDR mechanism uses a template DNA strand from the homologous chromosome as a correct copy to repair the DSB, the NHEJ system re-ligates the cleaved ends and randomly inserts and deletes nucleotides, resulting in mutations referred to as indels. Thus, the HDR pathway produces a faithful copy of the original DNA while NHEJ is an error-prone system [8,9].
As our understanding of these DNA repair mechanisms has grown, so has our ability to develop more effective ways to edit complex genomes. Gene knockouts are achieved by cleaving the coding region of a gene; when indels are produced, they can shift the reading frame of the coding sequence, resulting in gene inactivation [10]. Likewise, this technique may be used to completely remove a gene by targeting both of its ends for DSBs; if the outer ends are joined together, the intervening gene is removed [2].
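The frameshift argument can be illustrated with a minimal sketch. The coding sequence below is hypothetical and chosen only for illustration; the point is that a deletion whose length is not a multiple of three changes every downstream codon, whereas a 3 bp deletion removes one codon and leaves the rest in frame.

```python
def causes_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def apply_deletion(cds: str, start: int, length: int) -> str:
    """Delete `length` bases from `cds`, beginning at 0-based index `start`."""
    return cds[:start] + cds[start + length:]


def codons(cds: str) -> list[str]:
    """Split a coding sequence into complete codons, dropping trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical toy coding sequence: ATG GCT GAA ACC TGA
toy_cds = "ATGGCTGAAACCTGA"

one_bp_deletion = apply_deletion(toy_cds, start=4, length=1)
three_bp_deletion = apply_deletion(toy_cds, start=3, length=3)

print(codons(toy_cds))
print(codons(one_bp_deletion))    # downstream codons are all altered
print(codons(three_bp_deletion))  # one codon lost, frame preserved
```

In this toy case the 1 bp deletion also destroys the in-frame stop codon, which is one way an indel can inactivate a gene product.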
HDR is also an effective tool to insert genes into a complex genome. By introducing foreign linear DNA into a cell in the presence of a DSB, the new DNA can serve as a template for HDR, thus inserting desirable genes into specific sites in the genome [11]. This exogenous DNA may be introduced through viral vectors, plasmids or even single-stranded oligonucleotides [12,13].
Genome editing tools
In order to activate these DNA repair pathways at specific sites in complex DNA molecules, precise locations need to be targeted so as not to affect any other sequence in the genome. Whereas restriction enzymes typically find their short target sequence many times in a complex genome and therefore induce many cuts, molecular engineering is achieving greater efficiency of on-target cleavage and decreasing off-target effects at related DNA sequences. Off-target effects could introduce DNA mutations leading to cancer, so the use of highly specific, targeted nucleases is critical. The following targeted nucleases have been discovered and/or developed in recent years and will be discussed below: Zinc Finger Nuclease (ZFN), Transcription Activator-Like Effector Nuclease (TALEN) and Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)/Cas9.
Zinc Finger nucleases
ZFNs consist of two different domains: a His2-Cys2 zinc finger (ZF) DNA-binding domain and a catalytic domain that contains a FokI nuclease. Each ZF has a ββα structure which, in the presence of a zinc atom, uses its α-helix to bind between 3 and 4 DNA base pairs (bp). Typically a ZFN contains between 3 and 6 ZFs, thus recognizing longer (and hence more specific) DNA target sequences [14].
The FokI nuclease is a non-specific DNA-cleaving domain that, upon dimerization, is able to produce a DSB in DNA. Therefore, a pair of ZFNs is engineered to bind to the two opposite strands of DNA in such a way that the two FokI domains meet over the sequence lying between the two DNA sequences bound by the distinct ZFNs. Each ZF binding site in the targeted DNA region is separated from the other ZF binding site by a spacer sequence of 5 to 7 bp [15].
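Why longer recognition sites are more specific can be shown with a rough calculation. The sketch below assumes, purely for illustration, a genome of independent, equally frequent bases (real genomes are not random) and 3 bp recognized per finger. Under that assumption a site of n bp is expected to occur about G / 4^n times in a genome of G bp. The function names are hypothetical.

```python
HUMAN_HAPLOID_GENOME_BP = 3e9  # genome size quoted in the text


def zfn_pair_site_length(fingers_left: int, fingers_right: int,
                         bp_per_finger: int = 3) -> int:
    """Total bp recognized by a ZFN pair (the 5 to 7 bp spacer is not recognized)."""
    return (fingers_left + fingers_right) * bp_per_finger


def expected_random_matches(site_length_bp: int,
                            genome_size_bp: float = HUMAN_HAPLOID_GENOME_BP) -> float:
    """Expected occurrences of one specific site in a random-sequence genome.

    Assumes each position matches with probability 4 ** -site_length_bp.
    """
    return genome_size_bp / 4 ** site_length_bp


single_zfn_bp = 3 * 3                       # one 3-finger ZFN: 9 bp
pair_bp = zfn_pair_site_length(3, 3)        # two 3-finger ZFNs: 18 bp

print(single_zfn_bp, expected_random_matches(single_zfn_bp))
print(pair_bp, expected_random_matches(pair_bp))
```

Under these assumptions a single 9 bp site is expected thousands of times in the genome, while the 18 bp composite site of a dimerizing pair is expected less than once, which is consistent with the requirement for FokI dimerization improving specificity.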
Targeting specific DNA sequences with ZFs is essential to the use of this tool. Though designing ZFs with unique specificity was not completely successful [16], different selection methods are providing new ways to select for ZFs with higher specificity [17]. One of these selection methods identifies ZFs that recognize specific sequences and allows multiple ZF DNA-binding domains to be combined, creating multi-fingered enzymes that recognize longer DNA sequences and hence cut more specifically. This method has been named OPEN [18], and it is a very promising way to target unique sequences in eukaryotic genomes. Although ZFNs and the other nucleases mentioned in this article are highly promising, introducing the genes encoding these nucleases into all of the cells that need to be genetically modified is still a major challenge.
TALENs
TALENs are endonucleases formed by a TALE domain and a FokI domain. The TALE domain is the DNA-binding region and is formed by repeats of a 33-35 amino acid sequence that is highly conserved except at amino acids 12 and 13, which are called repeat variable di-residues (RVDs). Different RVDs provide high specificity for particular target nucleotides. DNA cleavage is performed by a FokI nuclease domain (as in ZFNs), which upon dimerization produces a DSB, activating the NHEJ or the HDR system [19]. Years of research have enhanced the activity and increased the specificity of TALENs. A set of FokI heterodimer mutants, ELD (Q486E, I499L and N496D) and KKR (E490K, I538K and H537R), were designed to work as an exclusive pair and show increased cleavage site specificity [20]. In addition, the Sharkey mutations in FokI (S418P and K441E) show an editing efficiency three- to six-fold higher than wild-type FokI in ZFNs [21], and also provided promising results in TALEN constructs in Xenopus tropicalis [22]. These two sets of mutations (Sharkey and ELD-KKR) have been shown to increase editing efficiency when combined [22].
TALEN improvements are not restricted to the FokI nuclease; other improvements concern the TALE scaffold. Both its structure and its length can be modified to enhance activity, sensitivity and specificity [23-25]. Mutations in the TALE domain can enhance TALEN activity, or even permit it in certain scenarios; one report showed that DNA binding specificity may be enhanced by mutating 3 or 7 cationic amino acids to glutamine [23]. In addition, 5-methylated cytosines (5mC) restrict TALEN activity if they are found in the target sequence. However, such nucleotides can still be targeted if the amino acids responsible for binding cytosine, His-Asp, are changed to Asn-Gly or even to a single asparagine, making this TALE repeat 33 amino acids long. This modification allows the formation of a special RVD loop that promotes 5mC binding [24,25].
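The one-repeat-per-nucleotide logic of the TALE domain can be sketched as a simple lookup. Only the His-Asp (HD) assignment for cytosine and the Asn-Gly (NG) or single-Asn (written N* here) substitutions for 5mC come from the text above; the NI, NN and NG assignments for A, G and T are the commonly cited RVD code and are included as an assumption. The function name and the way methylated positions are passed in are hypothetical.

```python
# Commonly cited RVD code (assumption: not taken from this article, and
# simplified; for example, NN is also reported to bind A).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvds(target: str,
                methylated_positions: frozenset[int] = frozenset(),
                methyl_rvd: str = "NG") -> list[str]:
    """Return one RVD per target nucleotide.

    `methylated_positions` holds 0-based indices of 5-methylated cytosines;
    those repeats get `methyl_rvd` ("NG" or "N*") instead of "HD".
    """
    rvds = []
    for index, base in enumerate(target.upper()):
        if index in methylated_positions:
            if base != "C":
                raise ValueError(f"position {index} is {base}, not a cytosine")
            rvds.append(methyl_rvd)
        else:
            rvds.append(RVD_FOR_BASE[base])
    return rvds


print(design_rvds("TCGA"))
print(design_rvds("TCGA", methylated_positions=frozenset({1})))
print(design_rvds("TCGA", methylated_positions=frozenset({1}), methyl_rvd="N*"))
```

A real design would also need to account for the scaffold, repeat number and FokI pairing discussed in this section; the sketch covers only the RVD choice.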
Original TALE monomer residues are similar except at the RVD, and the N- and C-terminal domains are respectively 287 aa and 231 aa long. However, a variety of TALE constructs with differing lengths have shown higher DNA cleavage efficiency; these second-generation TALENs are known as Goldy, Sunny and Platinum TALENs. Goldy TALENs have an N-terminus of 158 aa, followed by the repeat/binding domain, then 63 aa in the C-terminus [26]. Sunny TALENs have an N-terminal region of 207 aa and a C-terminal region of 63 aa. They also carry a single mutation, P11H, in the C-terminus [27]. Platinum TALENs have either an N-terminal region of 136 aa and a C-terminal region of 63 aa, or an N-terminal region of 153 aa and a C-terminal region of 47 aa. They also show non-RVD variants (Ala-Asp, Asp-Ala, Asp-Asp, or Glu-Ala) in the 4th and 32nd aa positions [28].
CRISPR/Cas9
Although the CRISPR/Cas9 type II system was discovered as a novel type of prokaryotic adaptive immune system [29], it has since been adapted by researchers as an excellent genome editing tool. This system was found in Escherichia coli, where DNA sequences showed a pattern of repeats separated by a spacer sequence (also called protospacer) [30]. Subsequent research determined that small pieces of foreign DNA (usually from plasmids or bacteriophages) are integrated into the DNA region between the repeats [29]. This pattern of repeats, known as CRISPR, is transcribed into two different RNAs: the pre-CRISPR RNA (pre-crRNA), which is transcribed from the protospacer DNA sequence, and the trans-activating CRISPR RNA (tracrRNA), which binds to the pre-crRNA and directs production of the mature crRNAs via RNase III activity. The complex of the tracrRNA and crRNA is called the guide RNA (gRNA), and it binds the Cas9 nuclease. A DSB is produced when this complex uses the gRNA to recognize an 8-12 bp sequence [31] that complements the protospacer, as well as a 3 bp sequence (NGG) called the protospacer adjacent motif (PAM) [32]. Because the PAM is not found next to the protospacer in the bacterial genomic DNA, it is key to distinguishing foreign DNA and cleaving it.
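The PAM requirement can be made concrete with a short search for candidate target sites. The following is a minimal sketch, assuming a 20 nt protospacer immediately 5' of an NGG PAM (the 20 nt length is the full-length guide mentioned later in this article). The function names are hypothetical, and the sketch ignores everything a real guide-design tool would score, such as off-target similarity.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(seq: str, guide_len: int = 20) -> list[tuple[str, int, str, str]]:
    """List (strand, start, protospacer, PAM) for every protospacer + NGG match.

    `start` is a 0-based index on the strand that was scanned; for "-" it is
    measured along the reverse complement, not along the input sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


# Hypothetical toy target: 20 A's followed by a TGG PAM.
for site in find_cas9_sites("A" * 20 + "TGG"):
    print(site)
```

Scanning both strands matters because a usable PAM can lie on either strand of the target DNA.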
By modifying the protospacer sequence, desired sequences can be targeted through the CRISPR/Cas9 system. Additionally, Cas9 could be engineered to recognize different PAMs, thus increasing the number of sequences that could be targeted. This, together with a specific mutation (D1135E) in Streptococcus pyogenes Cas9 (SpCas9) that makes Cas9 recognize longer PAMs, helps to increase its specificity [33]. However, a major concern while using this tool has been the risk of off-target cutting of DNA, and hence non-targeted mutations. In order to overcome this issue, the following strategies have been developed.
First, shortening the 5’ end of the protospacer sequence in the gRNA from 20 to 17 bp has helped to increase the specificity of the CRISPR/Cas9 system [34]. Second, reducing the amount of Cas9 in cells has decreased off-target mutagenesis, but has also decreased the efficiency of desired mutations [35]. Third, a specific mutation in Cas9 (D10A) modifies the RuvC nuclease domain so that it produces single-strand breaks instead of DSBs. Thus, a DSB may be produced by targeting the two opposite strands at sites close enough to each other (about 100 bp). This strategy lowers the risk of off-target cleavage [36]. Another possibility is to fuse a mutated Cas9 that does not cleave DNA to a FokI nuclease. This system requires two gRNA binding sites 17 bp apart from each other, so that the two FokI subunits dimerize and produce a DSB [37].
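Two of these strategies reduce to simple sequence arithmetic, sketched below under stated assumptions. The guide is written 5' to 3' so that truncation removes bases from the PAM-distal end, and the roughly 100 bp limit for paired nickases is taken from the text as an approximate threshold rather than a hard rule. The function names and the way sites are represented are hypothetical.

```python
def truncate_guide(guide: str, new_len: int = 17) -> str:
    """Shorten a guide (written 5' to 3') by removing bases from its 5' end."""
    if not 1 <= new_len <= len(guide):
        raise ValueError("new_len must be between 1 and the guide length")
    return guide[len(guide) - new_len:]


def nickase_pair_compatible(strand_a: str, pos_a: int,
                            strand_b: str, pos_b: int,
                            max_distance_bp: int = 100) -> bool:
    """Check that two nick sites lie on opposite strands and close together.

    Positions are coordinates on the same reference sequence; the default
    100 bp limit is the approximate figure given in the text.
    """
    opposite_strands = strand_a != strand_b
    close_enough = abs(pos_a - pos_b) <= max_distance_bp
    return opposite_strands and close_enough


# Hypothetical 20 nt guide.
full_guide = "GACGT" + "A" * 15
print(truncate_guide(full_guide))
print(nickase_pair_compatible("+", 1000, "-", 1060))
print(nickase_pair_compatible("+", 1000, "+", 1060))
```

The second check captures only the geometric condition; whether a given pair actually produces a DSB also depends on factors not modeled here.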
It has been proposed that positively charged groups in the groove of Cas9 that binds single-stranded DNA are responsible for off-target cleavage, because they stabilize binding even if the protospacer sequence does not exactly match the target sequence. Thus, off-target cutting occurs when the strength of Cas9 binding to the non-target DNA strand exceeds the force of DNA re-hybridization. Individual alanine substitutions at 32 positively charged amino acids, and subsequent combinations of these mutants, have been shown to markedly increase specificity [31].
CONCLUSION
Efforts to improve activity, sensitivity and specificity are making these nucleases more reliable and increasing their potential for gene editing. However, off-target cleavage remains one of the major concerns with this technology, and the task of reducing it is not yet complete, meaning that risks still exist with gene editing. Although gene editing technologies have so far been used mainly to genetically engineer model organisms to advance our understanding of biological systems, they have great potential to treat human genetic diseases, cancer, and infectious diseases once the techniques are better developed.
ACKNOWLEDGEMENTS
A Technology Transfer Grant from the Brigham Young University College of Life Sciences supported this work.