DNA, short for deoxyribonucleic acid, is a universal carrier of hereditary information. In all life forms (viruses, bacteria, fungi, plants, and animals) it carries important instructions for the design of the organism. Not only does it carry information; it is also a molecule designed so that it may be accurately copied to the next generation. DNA is built from simple units, referred to as nucleotides, that are joined to form very long molecules. Each nucleotide contains one of four different nitrogenous bases: adenine, thymine, cytosine, or guanine, abbreviated A, T, C, and G, respectively. It is the sequence of these bases that forms the actual genetic message. Thus, the information in DNA may be expressed as a long sequence of the letters A, T, C, and G; for an example, see Figure 1.1.
Figure 1.1 Portion of the human genome. Letters A, T, C, and G represent the DNA bases adenine, thymine, cytosine, and guanine, respectively. This page has 6,400 bases. It would take 500,000 pages like this to cover the full human genome. This would correspond to more than 1,600 books assuming one book contains 300 pages. The magnifying glass indicates that the object of research concerning the human genome is to associate sequence elements with biological functions. Scientists use experimental methods, as well as computational methods, to do this; it is a work in progress.
We refer to the complete genetic material of an organism as its genome. The human genome comprises an astounding three billion letters. An important milestone in biomedical research was reached in 2001 when, for the first time, a draft of the human genome was presented and the sequence of letters could be read. A small fraction of the human genome is shown in Figure 1.1. Consider the whole genome printed as a physical book. Figure 1.1 contains a total of 6,400 bases. You would need on the order of 500,000 pages like this to cover the full human genome. That would correspond to more than 1,600 books, each with 300 pages. For more on printing the human genome on paper, see Figure 1.2.
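The page and book estimates can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are assumptions introduced here, and the three input figures are the approximate values quoted in the text.

```python
# Approximate figures quoted in the text (not exact measurements).
GENOME_BASES = 3_000_000_000   # roughly three billion letters
BASES_PER_PAGE = 6_400         # bases shown on one page such as Figure 1.1
PAGES_PER_BOOK = 300           # book length assumed in the text


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer division rounded up, so a partly filled page or book counts."""
    return -(-numerator // denominator)


# Unrounded estimate: pages needed for the whole genome, then books.
pages = ceil_div(GENOME_BASES, BASES_PER_PAGE)
books = ceil_div(pages, PAGES_PER_BOOK)

# The text rounds the page count to 500,000 before counting books.
rounded_pages = 500_000
books_from_rounded_pages = ceil_div(rounded_pages, PAGES_PER_BOOK)

print(pages, books, books_from_rounded_pages)
```

With these inputs the unrounded page count is 468,750, which the text rounds to "on the order of 500,000". The figure of more than 1,600 books follows from the rounded page count (about 1,667 books); the unrounded count gives about 1,563.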
Figure 1.2 Human genome printed on paper. Scientists at the University of Leicester printed the whole human genome on paper. It resulted in 130 book volumes that would take 95 years to read. (Published under CC BY-SA 2.0.)
The issues addressed by this textbook are related to the three-billion-letter sequence of the human genome. How are we to make sense of this vast information? What different biological signals are contained in the DNA? Are some regions of the sequence more important than others? What happens when the sequence of letters in DNA is changed? In molecular biology laboratories, scientists have carried out experiments to address these questions. In addition, as changes or mutations in DNA are natural components of evolution, nature has itself carried out experiments over billions of years that may guide us in understanding the relationship between genetic information and biological function. For instance, mutations in DNA can give rise to specific inherited diseases as well as cancer. What are the changes in the DNA sequence that cause such deleterious effects?
To answer these questions, we need to understand the organization of the human genome, as well as the different functional sequence elements in that genome. The flow of genetic information is central to this understanding. DNA specifies which RNA molecules are to be made. One subclass of these RNAs is processed to form messenger RNA (mRNA) molecules. These mRNA molecules in turn act as templates for the production of proteins. Another abundant class of RNA molecules has functions other than specifying proteins. Throughout this elaborate flow of genetic information, which includes the copying of DNA sequences to RNA, RNA processing, and the synthesis of protein using mRNA, specific nucleotide sequences have distinct functions.
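The flow from DNA to RNA to protein can be sketched in code. The example below is a deliberately simplified illustration, not a model of the cell: it ignores RNA processing, assumes the input is the coding strand read from its first base, and uses the standard genetic code. The function names and the toy input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the RNA copy of a DNA coding strand (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_dna = "ATGGCCAAGTGA"          # hypothetical sequence, for illustration only
toy_mrna = transcribe(toy_dna)    # "AUGGCCAAGUGA"
toy_protein = translate(toy_mrna)
print(toy_mrna, toy_protein)
```

Here the toy sequence is read as the codons AUG, GCC, AAG, and UGA, giving the amino acids methionine (M), alanine (A), and lysine (K) before the stop codon ends translation.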
In essence, this book explores the information in the human genome and the important biological signals that it contains. It illustrates various functions of DNA sequences. Examples include protein-coding sequences and sequences that regulate the flow of genetic information. For each of the different sequence elements, the relationship between sequence and function is illustrated with disorders that have a genetic background.
The theme of this book is the information contained within the human genome as outlined in Chapter 1. As a first element of information, we consider regions in the genome that specify the proteins to be made. Proteins are molecules built from amino acids, and the sequence of amino acids is determined by the sequence of nucleotides in the genome. Regions specifying proteins make up only a minute portion of the entire genome but are nevertheless significant.
We first consider how amino acid sequences are related to inherited disorders. As an example, we discuss the disorder sickle cell anemia. It is caused by a mutation that replaces the amino acid glutamic acid with valine in the protein hemoglobin. There are multiple reasons why we discuss this particular disorder in some detail in the first chapters of the book. Early studies of sickle cell anemia were based on nonmolecular clinical observations, and it seemed likely that the disease was inherited. As research progressed, we eventually obtained a detailed molecular understanding of sickle cell anemia, from the changes in DNA to detailed structural information about the hemoglobin protein. Sickle cell anemia is historically significant because it was the first disease in which a genetic change was shown to be associated with a well-defined change in a protein molecule. This finding gave an early clue as to the power of molecular medicine. In addition, very few inherited disorders have been as thoroughly examined as sickle cell anemia, and information about the disease is still being collected today. This knowledge has a significant medical impact, which is why we also return to this disorder in other contexts such as gene therapy (Chapter 14).
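A single changed letter in DNA can be enough to replace one amino acid with another. The sketch below illustrates this with the well-known sickle cell change, in which the codon GAG (glutamic acid) becomes GTG (valine). The short DNA fragment, the reduced codon dictionary, and the function names are assumptions made for illustration; the fragment is not presented as a verified excerpt of the hemoglobin gene.

```python
# Only the codons needed for this toy example (standard genetic code).
CODONS = {"CCT": "Pro", "GAG": "Glu", "GTG": "Val"}


def split_codons(dna: str) -> list[str]:
    """Split a DNA coding sequence into complete three-letter codons."""
    usable_length = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable_length, 3)]


def amino_acid_changes(normal: str, mutant: str) -> list[tuple[int, str, str]]:
    """List (codon position, normal amino acid, mutant amino acid) differences."""
    changes = []
    pairs = zip(split_codons(normal), split_codons(mutant))
    for position, (normal_codon, mutant_codon) in enumerate(pairs, start=1):
        before, after = CODONS[normal_codon], CODONS[mutant_codon]
        if before != after:
            changes.append((position, before, after))
    return changes


normal_fragment = "CCTGAGGAG"   # illustrative fragment: Pro-Glu-Glu
mutant_fragment = "CCTGTGGAG"   # one letter changed: A -> T in the second codon
changes = amino_acid_changes(normal_fragment, mutant_fragment)
print(changes)
```

The two fragments differ in a single base, yet the comparison reports that the second codon now specifies valine instead of glutamic acid, which is the kind of change described above for hemoglobin.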
The Discovery of Sickle Cell Anemia
Abotutuo. Chwechweechwe. Nwiiwii. Nuiduidui. These are all names of a disease common in Western Africa, a disease we now also know as sickle cell anemia. Its history in Africa may be traced as far back as the seventeenth century. Classification of the disease was difficult on this continent because the symptoms were closely related to those of other diseases in tropical areas. It was not until the early twentieth century that sickle cell anemia was first described in a medical publication. The affected individual was Walter Clement Noel.
Noel was born in 1884 on a large estate on Grenada. At this time, this island was a British colony. Noel was from a wealthy black family. He suffered from sickle cell anemia but was still able to attend school, and he completed his undergraduate studies in 1904. The same y...