1 Introduction
In August 2015, the first successful application of genomic editing to a human embryo was announced. This was not only a milestone on the long road to curing genetic diseases, but also the start of an intense debate about the ethics, morality, and legality of deliberately modifying the human genome.
The discovery of recombinant DNA in the 1970s proved that genomic DNA could be modified to incorporate a DNA fragment originating from another organism. Two years later, scientists organized a conference at Asilomar, California, United States, to discuss the ethical and moral implications and to decide how and under what conditions the new technology should be used (Berg et al., 1975a). Forty years later, CRISPR/Cas9 genomic editing brings us close not only to therapies for genetic diseases but also to the redesign of human beings and many other organisms around us. To extract the most beneficial aspects of the technology, we need to understand how it works, what its strengths and limitations are, and what can be done to make it safer.
1.1 Definitions and Context
Genome editing signifies the intentional change of the DNA sequence in a genome by replacing, inserting, or deleting one or more nucleotides. The term is related to, and sometimes used interchangeably with, “gene editing,” which means acting on one gene to modify its sequence. Related to these is the term “gene therapy,” meaning the correction of a genetic mutation by replacing the mutated gene with a functional copy or by adding a functional copy to the genome (Naldini, 2015). Gene therapy has a broader meaning than genome editing, and the term has been used for decades in connection with etiologic genetic treatments.
A genome comprises the whole genetic information included in the DNA molecules of a cell, an organelle, or an organism (Strachan and Read, 2011). Therefore we can talk about the nuclear genome, the mitochondrial genome, or the human genome as a whole. The human nuclear genome contains more than 3 billion letters, or bases, distributed in 23 pairs of DNA molecules. The mitochondrial genome is much smaller (about 16,500 base pairs in humans), with fewer genes and much more variability than the nuclear genome (Strachan and Read, 2011). When we refer to the genome of a whole organism, like a human or mouse genome, we generally have in mind the nuclear genome. The 23 pairs of DNA molecules in the nucleus of a human cell are associated with proteins and packed into as many pairs of chromosomes. The first 22 pairs of chromosomes are named autosomes, and they are identical, pairwise, in both genders, while the other two chromosomes, designated X and Y, are present in different combinations in men and women: XY and XX, respectively (Strachan and Read, 2011). Genes are discrete units found along the DNA molecules, containing the information needed to synthesize a protein or a functional RNA molecule. Even though regulatory RNA molecules are also encoded by genes, it is widely accepted that whenever we refer to the coding part of the genome, we consider only the protein-coding genes. Most of the DNA (about 98%) is noncoding, and only about 2% encodes proteins. Protein-coding genes have a discontinuous structure, made up of exons (the coding part) and introns, noncoding spacers of variable length between the exons, where regulatory elements and sometimes other genes are most frequently located. Each gene depends on nearby (cis) and more distant (trans) elements, which can be accessed by protein complexes to regulate the onset, rate, and timing of its expression (Strachan and Read, 2011).
The sequence in the coding part of a gene is translated into the amino acid sequence of a protein, with a set of three bases corresponding to one amino acid. Any mistake (mutation) in the DNA sequence is likely to be translated differently if it alters the way the code is read. Every DNA molecule is copied only once per cell division, and the new copy is distributed to the new cell. This is a very precise process because of both the polymerase's proofreading mechanism and other repair and correction mechanisms (Miyabe et al., 2011). However, at times, a letter is misplaced and a variation is generated. Occasionally, physical, chemical, or biological factors can also alter the genome, requiring the intervention of repair mechanisms. The consequences vary from none to the complete alteration of a protein, depending on where the mutation is located (Strachan and Read, 2011). Evolution itself is the result of an accumulation of genetic changes leading to survival, development, or extinction. When a mutation is present in the germ cells, meaning the sperm, the egg, and their precursors, it can be transmitted to the offspring and can cause a genetic disease. Because all the cells of the new organism are derived from a single initial cell, resulting from fertilization, this mutation will also be present in all nucleated cells. Every individual inherits half of the genome (one autosomal chromosome of each pair plus one sex chromosome) from each parent. In some diseases, the presence of a single mutation, on one copy of the gene, is enough for pathogenicity (dominant diseases), while in others both copies need to be altered (recessive diseases); the X chromosome can also be affected, generating a different gender-based distribution of the disease in the family. Occasionally, a mutation occurs in one of the cells of an organ after the organism has been formed.
If the cell is actively proliferating, it will generate a clone with the same genome; if the cell is quiescent, the change will remain without consequences. This type of variant is called a somatic mutation and will not be inherited by the offspring, as opposed to germline mutations, which are present in germ cells and can therefore be transmitted. Cancers, for example, start with a somatic mutation activating one or more genes involved in cell proliferation. Not all genetic changes are detrimental; some are benign, or silent, as they do not alter any protein or other regulatory molecule. It is usual to refer to pathogenic genetic changes as mutations and to nonpathogenic ones as polymorphisms (Strachan and Read, 2011). Some DNA variants can either provide interindividual variation (e.g., hair or eye color) or support a better adaptation to the environment.
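The relationship between three-base codons, amino acids, and the varying consequences of a single-letter change can be illustrated with a short Python sketch. It assumes the standard nuclear genetic code and a made-up 15-base coding sequence; the function names `translate` and `classify_substitution` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, written with the bases in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding DNA sequence, three bases at a time, until a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_substitution(coding_dna, position, new_base):
    """Classify a single-base substitution (0-based position) by its effect on the codon."""
    mutated = coding_dna[:position] + new_base + coding_dna[position + 1:]
    codon_start = position - position % 3
    before = CODON_TABLE[coding_dna[codon_start:codon_start + 3]]
    after = CODON_TABLE[mutated[codon_start:codon_start + 3]]
    if before == after:
        return "silent"      # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"    # premature stop, truncated protein
    return "missense"        # a different amino acid is incorporated


if __name__ == "__main__":
    reference = "ATGGAGGAGAAGTAA"  # invented sequence, for illustration only
    print(translate(reference))                      # MEEK
    print(classify_substitution(reference, 5, "A"))  # GAG -> GAA: silent
    print(classify_substitution(reference, 4, "T"))  # GAG -> GTG: missense
    print(classify_substitution(reference, 9, "T"))  # AAG -> TAG: nonsense
```

The sketch only considers substitutions within a coding sequence; insertions or deletions that shift the reading frame, and changes in noncoding regulatory regions, are outside its scope.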
Given the severe consequences of pathogenic genetic changes, scientists have looked for ways to correct them. DNA modification of higher organisms is inspired by natural protection mechanisms present in bacteria and in simpler eukaryotes such as yeast (Fernandez et al., 2017). The story started decades ago and followed the accumulation of knowledge about enzymes able to recognize and process DNA, in parallel with the discovery of how a genome maintains its sequence and repairs itself after injury or polymerase errors.
1.2 Recombinant DNA Technology: The Basis for DNA Modification
The history of genomic editing started in 1972, when Paul Berg and his team obtained the first recombinant DNA molecule ex vivo (Jackson et al., 1972) from two viral sources: the simian virus SV40 and a lambda phage. Shortly afterward, Stanley Cohen and Herbert Boyer demonstrated that a DNA molecule resulting from the combination of DNA from two different sources can be functional and can be replicated in a host organism (Cohen et al., 1973). This technology made use of restriction enzymes, which are part of the bacterial immune system and serve to recognize and eliminate the DNA of infecting viruses (phages). A characteristic of restriction enzymes is that they specifically recognize a short DNA sequence, irrespective of its origin, and then precisely cut both DNA strands, always at the same location, producing two ends between which another DNA molecule treated in the same way can be inserted. The resultant DNA molecule can be introduced into a host organism, where it can be replicated independently of the host genome and provide new characteristics to the cell (Cohen et al., 1973). The DNA providing the new feature may contain a gene or just the coding part of a gene, and it is called the insert, while the other component, containing the elements required for self-replication and control of gene expression, is called the vector. The vector can be a modified plasmid (bacterial DNA located outside the bacterium's main genome, usually responsible for resistance to chemicals or other functions), a virus, or a complex, engineered DNA molecule (Strachan and Read, 2011). This system allows for a limited size of the DNA insert, ranging from thousands to tens of thousands of nucleotides, depending on the vector used (Strachan and Read, 2011).
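The cut-and-insert logic described above can be sketched in a few lines of Python. The example uses EcoRI, a well-known restriction enzyme that recognizes GAATTC and cuts after the first G, as a stand-in for any restriction enzyme. Everything else is an assumption made for illustration: the sequences are invented, only one strand is modeled (the complementary strand and the staggered ends are ignored), and the function names `digest` and `clone` are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1   # top strand is cut after the first G (G^AATTC)


def digest(dna, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Cut a linear DNA sequence at every recognition site; return the fragments."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments


def clone(vector, donor):
    """Insert the donor fragment flanked by two sites into a vector with one site."""
    vector_parts = digest(vector)
    donor_parts = digest(donor)
    if len(vector_parts) != 2:
        raise ValueError("vector must contain exactly one recognition site")
    if len(donor_parts) != 3:
        raise ValueError("donor must contain exactly two recognition sites")
    left, right = vector_parts
    insert = donor_parts[1]  # the fragment between the two cuts
    # Ends produced by the same enzyme are compatible, so they can be joined.
    return left + insert + right


if __name__ == "__main__":
    vector = "CCCGGGGAATTCTTTAAA"        # invented vector with one site
    donor = "TTGAATTCATGGAGTAAGAATTCAA"  # invented donor, insert between two sites
    print(digest(vector))
    print(clone(vector, donor))
```

Because both molecules are cut by the same enzyme, the recognition site is regenerated at each junction of the recombinant molecule. In practice the insert can also ligate in the opposite orientation, which this one-strand sketch does not model.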
Recombinant DNA obtained from viral vectors modified to contain human genes was later used to integrate new DNA molecules into the mammalian genome, either in vitro, to perform functional studies, or in vivo, to test gene therapy or to introduce markers that could identify modified organisms. Restriction enzymes require a specific sequence that can be found anywhere in the genome, within genes or in the intergenic space. This is a disadvantage for applications of genomic editing in eukaryotes, particularly mammals, because it does not allow the targeting of a single gene or region.
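A back-of-the-envelope calculation shows why a short recognition sequence cannot single out one gene. The sketch below assumes, purely for illustration, that the four bases occur with equal frequency and independently of each other, which is not true of real genomes, and it uses the figure of about 3 billion bases given earlier for the human nuclear genome. The six-base site length and both function names are assumptions, not values from the text.

```python
HUMAN_GENOME_BP = 3_000_000_000  # approximate size, as stated in Section 1.1


def expected_site_count(genome_length, site_length):
    """Expected occurrences of one specific site in a random sequence (one strand)."""
    positions = genome_length - site_length + 1
    return positions / 4 ** site_length


def shortest_unique_site_length(genome_length):
    """Shortest site length whose expected number of occurrences falls below one."""
    length = 1
    while 4 ** length <= genome_length:
        length += 1
    return length


if __name__ == "__main__":
    # A six-base site is expected roughly once every 4**6 = 4096 bases.
    print(round(expected_site_count(HUMAN_GENOME_BP, 6)))
    print(shortest_unique_site_length(HUMAN_GENOME_BP))
```

Under these simplifying assumptions, a six-base site is expected at several hundred thousand positions in a genome of this size, whereas a recognition sequence of roughly 16 bases or more would be expected to occur less than once by chance.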
1.3 Genome Editing
The capacity to induce targeted modifications in the genome of higher eukaryotes evolved p...