Bioinformatics for Beginners
eBook - ePub

Genes, Genomes, Molecular Evolution, Databases and Analytical Tools

  1. 238 pages
  2. English

About this book

Bioinformatics for Beginners: Genes, Genomes, Molecular Evolution, Databases and Analytical Tools provides a coherent and friendly treatment of bioinformatics for any student or scientist within biology who has not routinely performed bioinformatic analysis. The book discusses the relevant principles needed to understand the theoretical underpinnings of bioinformatic analysis and demonstrates, with examples, targeted analysis using freely available web-based software and publicly available databases. Eschewing non-essential information, the work focuses on principles and hands-on analysis, also pointing to further study options.
- Avoids non-essential coverage, yet fully describes the field for beginners
- Explains the molecular basis of evolution to place bioinformatic analysis in biological context
- Provides useful links to the vast resource of publicly available bioinformatic databases and analysis tools
- Contains over 100 figures that aid in concept discovery and illustration

Bioinformatics for Beginners is written by Supratim Choudhuri and is catalogued under Biological Sciences and Genetics & Genomics.

Chapter 1

Fundamentals of Genes and Genomes*

This chapter briefly discusses the structure and function of genes and genomes. Some topics covered here are not usually discussed in textbooks of molecular biology. The obvious starting point is the double-helical structure of DNA. The discussion of hydrogen bonding and the standard base-pairing principle is extended to include Hoogsteen hydrogen bonding and triple helix formation. The importance of intron phase in alternative splicing is discussed in detail; it lays the foundation for understanding exon shuffling during genome evolution, discussed in Chapter 2. Various types of noncoding RNAs (ncRNAs), such as small ncRNA, long ncRNA, competing endogenous RNA, and circular RNA, are highlighted. The chemical basis of the instability of RNA is also discussed. The relationship between protein function and the location of amino acids in the polypeptide chain is explained with examples. Some important features of the human genome and the characterization of its functional elements by the Encyclopedia of DNA Elements (ENCODE) project are highlighted. A discussion of the epigenetic modification of the genome is also included.

Keywords

chirality; DNA structure; ENCODE; epigenetics; gene structure; Hoogsteen H-bonding; human genome; intrinsically disordered proteins; intron phase; noncoding RNA; triple helix

1.1 Biological Macromolecules, Genomics, and Bioinformatics

Genetic information is stored in the cell in the form of biological macromolecules, such as nucleic acids and proteins. The genetic information not only drives the functioning of the whole organism, but also drives the evolutionary engine. Thus, an understanding of the molecular basis of life is fundamental to understanding how genetic information shapes life and drives its evolution. The following discussion captures some fundamental aspects of the structure and function of genes and genomes with special notes (in boxes) on the applications of this information.

1.2 DNA as the Universal Genetic Material

With some exceptions, deoxyribonucleic acid (DNA) is the universal genetic material. In some viruses, termed RNA viruses, RNA is the genetic material. The term ribovirus is used for viruses with single- and double-stranded RNA genomes, including retroviruses, which are RNA-based for a portion of their life cycle.1
Among the RNA viruses, retroviruses are well known; they include the human immunodeficiency virus (HIV), which causes AIDS. Retroviruses are unique because their life cycle includes both RNA and DNA versions of their genome. A complete retrovirus particle contains an RNA genome. The RNA genome encodes protein products that are necessary for converting the single-stranded RNA genome into a double-stranded DNA genome and for its subsequent integration into the host genome. One such protein product of the retroviral genome is the reverse transcriptase (RT) enzyme. Upon entry into the cell, the reverse transcriptase is produced from the viral RNA genome using the host cellular machinery. The RT then copies the single-stranded RNA genome into a single-stranded DNA, which in turn serves as the template for a second DNA strand, yielding a double-stranded viral DNA genome. The double-stranded viral DNA genome is referred to as the provirus; it is incorporated into the host genome, from where it keeps directing the production of more retrovirus particles with single-stranded RNA genomes.
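The flow of sequence information described above (single-stranded RNA, then single-stranded DNA, then double-stranded DNA) can be sketched in a few lines of Python. This is only a sequence-level illustration, not a model of the enzymology; the function names and the six-base example genome are hypothetical and chosen for this sketch.

```python
# Sequence-level sketch of reverse transcription (illustrative only).
# Function names and the example genome below are hypothetical.

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_dna(rna: str) -> str:
    """Return the DNA strand complementary to an RNA, written 5'->3'.

    The new strand is antiparallel to the template, so the template is
    read in reverse while each base is complemented.
    """
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def second_strand_dna(dna: str) -> str:
    """Return the DNA strand complementary to a DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


genome_rna = "AUGGCU"                      # hypothetical RNA genome fragment
minus_dna = first_strand_dna(genome_rna)   # single-stranded DNA copy
plus_dna = second_strand_dna(minus_dna)    # completes the double strand

print("RNA genome      5'-" + genome_rna + "-3'")
print("first DNA strand 5'-" + minus_dna + "-3'")
print("second strand    5'-" + plus_dna + "-3'")
```

Note that the second DNA strand carries the same sequence as the original RNA, with T in place of U.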

1.3 DNA Double Helix

The structure of the DNA double helix and its building blocks are described in all biology textbooks. Here, some other aspects are also highlighted, including the information in Box 1.1. DNA is a double-stranded right-handed helix; the two strands are complementary because of complementary base pairing, and antiparallel because the two strands have opposite 5′−3′ orientations (Figure 1.1A). The diameter of the helical DNA molecule is 20 Å (=2 nm). The helical conformation of DNA creates alternating major and minor grooves (Figure 1.1B).
Box 1.1
1. The major grooves in DNA can bind proteins. This is an important property of DNA structure because the major grooves in the upstream regulatory regions of a gene bind transcription-regulatory proteins. For example, for Zn-finger transcription factors, each Zn finger recognizes and binds to a specific trinucleotide sequence in the major groove of DNA.2
2. Any double-stranded nucleic acid (whether DNA double strand, DNA–RNA hybrid double strand, or RNA–RNA double strand) is antiparallel in nature. The complementary and antiparallel nature of double-stranded nucleic acids is an important property to remember while designing synthetic oligonucleotides for hybridization (probes or primers).
3. By convention, nucleic acid (DNA or RNA) sequence is written 5′→3′ from left to right, such as 5′-ATGTAAGCAC-3′. If the 5′→3′ designation is not mentioned, it is assumed that the sequence has been written in a 5′→3′ direction, following convention.
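Points 2 and 3 of Box 1.1 have a direct practical consequence: an oligonucleotide that hybridizes to a given strand must be its reverse complement, and both sequences are written 5′→3′. A minimal Python sketch, assuming an unambiguous DNA alphabet (A, C, G, T); the function name is illustrative:

```python
# Reverse complement of a DNA sequence written 5'->3'.
# Assumes only the four unambiguous bases A, C, G, T.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, also written 5'->3'."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


target = "ATGTAAGCAC"          # the example sequence from Box 1.1
probe = reverse_complement(target)
print("target 5'-" + target + "-3'")
print("probe  5'-" + probe + "-3'")
```

Applying the function twice returns the original sequence, which is a quick sanity check when designing probes or primers.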

Figure 1.1 DNA structure.
(A) Two nucleotides of the DNA double helix, showing their antiparallel orientation, two H-bonds between A and T and three H-bonds between G and C; (B) the DNA double helix showing the major and minor grooves as well as the diameter of the molecule; (C) the convention of classifying the two sides of the phosphodiester bond and the products generated from their cleavage; (D) the front side (Watson–Crick edge) and the back side (Hoogsteen edge) of a purine; (E) how Hoogsteen H-bonding aids in the formation of the triple helix (see Section 1.3.3); (F) the anti and the syn conformations of bases around the N-glycosidic bond.
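Because each A–T pair contributes two hydrogen bonds and each G–C pair contributes three (Figure 1.1A), the number of Watson–Crick hydrogen bonds in a fully paired duplex follows directly from the base composition of one strand. The sketch below assumes perfect complementary pairing and an unambiguous DNA alphabet; the function name is illustrative.

```python
# Count Watson-Crick hydrogen bonds in a perfectly paired DNA duplex,
# given one strand. Assumes every base is paired with its complement.

H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand: str) -> int:
    """Return the total number of H-bonds holding the duplex together.

    Raises KeyError if the strand contains a character other than A, C, G, T.
    """
    return sum(H_BONDS_PER_PAIR[base] for base in strand.upper())


strand = "ATGTAAGCAC"   # 6 A/T positions and 4 G/C positions
print(count_hydrogen_bonds(strand))   # 6*2 + 4*3 = 24
```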

1.3.1 Structural Units of DNA

DNA is composed of structural units called nucleotides (deoxyribonucleotides). Each nucleotide is composed of a pentose sugar (2′-deoxy-D-ribose); one of the four nitrogenous bases—adenine (A), thymine (T), guanine (G), or cytosine (C); and a phosphate. The pentose sugar has five carbon atoms and they are numbered 1′ (1-prime) through 5′ (5-prime). The base is attached to the 1′ carbon atom of the sugar, and the phosphate is attached to the 5′ carbon atom (Figure 1.1A). The sugar and base form a nucleoside, whereas nucleoside plus phosphate makes a nucleotide. Hence, nucleoside=sugar+base, whereas nucleotide=sugar+base+phosphate. Table 1.1 shows the naming of nucleosides and nucleotides. Each nucleotide in DNA (as well as in RNA) has one replaceable hydrogen, which is what makes the DNA (and RNA) acidic.
Table 1.1
Naming of Nucleosides and Nucleotides
Base | Nucleoside (base+sugar) | Nucleotide (base+sugar+phosphate)
Adenine | Deoxyadenosine (sugar=deoxyribose) | Deoxyadenylic acid OR deoxyadenosine monophosphate
Guanine | Deoxyguanosine (sugar=deoxyribose) | Deoxyguanylic acid OR deoxyguanosine monophosphate
Cytosine | Deoxycytidine (sugar=deoxyribose) | Deoxycytidylic acid OR deoxycytidine monophosphate
Thymine | Deoxythymidine (sugar=deoxyribose) | Deoxythymidylic acid OR deoxythymidine monophosphate
Uracil (in RNA) | Uridine (in RNA) (sugar=ribose) | Uridylic acid OR uridine monophosphate
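The relationship nucleoside=sugar+base and nucleotide=sugar+base+phosphate, together with the names in Table 1.1, can be captured as a small lookup. This is only a convenience sketch; the dictionary layout and function name are illustrative choices, and the names themselves are taken from the table.

```python
# Lookup of nucleoside and nucleotide names from Table 1.1.
# Each entry: base -> (sugar, nucleoside, nucleotide).

NAMES = {
    "adenine": ("deoxyribose", "deoxyadenosine", "deoxyadenosine monophosphate"),
    "guanine": ("deoxyribose", "deoxyguanosine", "deoxyguanosine monophosphate"),
    "cytosine": ("deoxyribose", "deoxycytidine", "deoxycytidine monophosphate"),
    "thymine": ("deoxyribose", "deoxythymidine", "deoxythymidine monophosphate"),
    "uracil": ("ribose", "uridine", "uridine monophosphate"),  # RNA only
}


def describe(base: str) -> str:
    """Spell out how a base builds up into its nucleoside and nucleotide."""
    key = base.lower()
    sugar, nucleoside, nucleotide = NAMES[key]
    return (
        f"{key} + {sugar} = {nucleoside}; "
        f"{nucleoside} + phosphate = {nucleotide}"
    )


for base in NAMES:
    print(describe(base))
```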

1.3.2 Linkage between Nucleotides

The nucleotides are joined by 5′–3′ phosphodiester linkage; that is, the 5′-phosphate of a nucleotide is linked to the 3′-OH of the preceding nucleotide by a phosphodiester linkage. In a linear DNA molecule, the 5′-end has a free phosphate and the 3′-end has a free OH group (Figure 1.1A). Each phosphodiester bond has two sides: a 3′-side that is linked to the 3′-end of the preceding nucleotide, and a 5′-side that is linked to 5′-end ...

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. Preface
  7. Acknowledgment
  8. Chapter 1. Fundamentals of Genes and Genomes
  9. Chapter 2. Fundamentals of Molecular Evolution
  10. Chapter 3. Genomic Technologies
  11. Chapter 4. The Beginning of Bioinformatics
  12. Chapter 5. Data, Databases, Data Format, Database Search, Data Retrieval Systems, and Genome Browsers
  13. Chapter 6. Sequence Alignment and Similarity Searching in Genomic Databases: BLAST and FASTA
  14. Chapter 7. Additional Bioinformatic Analyses Involving Nucleic-Acid Sequences
  15. Chapter 8. Additional Bioinformatic Analyses Involving Protein Sequences
  16. Chapter 9. Phylogenetic Analysis
  17. Index