eBook - ePub

Bioinformatics for Beginners

Name: Bioinformatics for Beginners
ISBN: 9780124105102

Genes, Genomes, Molecular Evolution, Databases and Analytical Tools

Supratim Choudhuri

238 pages
English

About this book

Bioinformatics for Beginners: Genes, Genomes, Molecular Evolution, Databases and Analytical Tools provides a coherent and friendly treatment of bioinformatics for any student or scientist within biology who has not routinely performed bioinformatic analysis. The book discusses the relevant principles needed to understand the theoretical underpinnings of bioinformatic analysis and demonstrates, with examples, targeted analysis using freely available web-based software and publicly available databases. Eschewing non-essential information, the work focuses on principles and hands-on analysis, also pointing to further study options.

- Avoids non-essential coverage, yet fully describes the field for beginners
- Explains the molecular basis of evolution to place bioinformatic analysis in biological context
- Provides useful links to the vast resource of publicly available bioinformatic databases and analysis tools
- Contains over 100 figures that aid in concept discovery and illustration

Publisher: Academic Press
Year: 2014
Print ISBN: 9780124104716
eBook ISBN: 9780124105102
Topic: Biological Sciences
Subtopic: Genetics & Genomics

Index

Biological Sciences

Chapter 1

Fundamentals of Genes and Genomes*

This chapter briefly discusses the structure and function of genes and genomes. Some topics covered here are not usually discussed in textbooks of molecular biology. The natural starting point is the double-helical structure of DNA. The discussion on hydrogen bonding and the standard base-pairing principle is extended to include Hoogsteen hydrogen bonding and triple helix formation. The importance of intron phase in alternative splicing is discussed in detail; it lays the foundation for understanding exon shuffling during genome evolution, discussed in Chapter 2. Various types of noncoding RNAs (ncRNAs), such as small ncRNA, long ncRNA, competing endogenous RNA, and circular RNA, are highlighted. The chemical basis of the instability of RNA is also discussed. The relationship between protein function and the location of amino acids in the polypeptide chain is explained with examples. Some important features of the human genome and the characterization of its functional elements by the Encyclopedia of DNA Elements (ENCODE) project are highlighted. A discussion on the epigenetic modification of the genome is also included.

Keywords

chirality; DNA structure; ENCODE; epigenetics; gene structure; Hoogsteen H-bonding; human genome; intrinsically disordered proteins; intron phase; noncoding RNA; triple helix

Outline

1.1 Biological Macromolecules, Genomics, and Bioinformatics

1.2 DNA as the Universal Genetic Material

1.3 DNA Double Helix

1.3.1 Structural Units of DNA

1.3.2 Linkage between Nucleotides

1.3.3 Base-Pairing Rules, Double Helix, and Triple Helix

1.3.4 Single-Stranded DNA

1.3.5 Base Sequence and the Genetic Code

1.4 Conformations of DNA

1.5 Typical Eukaryotic Gene Structure

1.5.1 Transcribed Region

1.5.1.1 Intron-Splicing Signals

1.5.1.2 Effect of Intron Phase on Alternative Splicing

1.5.1.3 Evolution of Introns

1.5.2 5′-Flanking Region of Transcribed Genes

1.5.3 3′-Flanking Region of Transcribed Genes

1.6 Mutations in the DNA Sequence

1.7 Some Features of RNA

1.7.1 Instability of mRNA

1.7.2 5′- and 3′-Untranslated Regions of mRNA

1.7.3 Secondary Structures in RNA

1.8 Coding Versus Noncoding RNA

1.8.1 Small Noncoding RNA, Long Noncoding RNA, Competing Endogenous RNA, and Circular RNA

1.9 Protein Structure and Function

1.9.1 Configuration and Chirality of Amino Acids

1.9.2 Ionic Character of Amino Acids

1.9.3 Relationship between Protein Function and the Location of Amino Acids in the Polypeptide Chain

1.9.4 Linkage between Amino Acids—The Peptide Bond

1.9.5 Four Levels of Protein Structure

1.9.6 Acidic and Basic Proteins

1.9.7 Nonstandard Amino Acids in Polypeptide Chains

1.10 Genome Structure and Organization

1.10.1 The Structure of a Representative Genome—The Human Genome

1.10.2 Functional Sequence Elements in the Genome

1.10.2.1 Promoters

1.10.2.2 Enhancers

1.10.2.3 Locus Control Regions

1.10.2.4 Insulators

1.10.3 Epigenetic Modifications of the Genome Can Edit the Language Written in the DNA Sequence and Add an Extra Layer of Complexity in Genome Expression

1.10.3.1 Histone Code

1.10.3.2 The Dynamics of Epigenetic Changes

1.10.4 Lessons Learned from the Second Phase of the ENCODE Project about the DNA Elements in the Human Genome and its Epigenetic Modifications

References

1.1 Biological Macromolecules, Genomics, and Bioinformatics

Genetic information is stored in the cell in the form of biological macromolecules, such as nucleic acids and proteins. The genetic information not only drives the functioning of the whole organism, but also drives the evolutionary engine. Thus, an understanding of the molecular basis of life is fundamental to understanding how genetic information shapes life and drives its evolution. The following discussion captures some fundamental aspects of the structure and function of genes and genomes with special notes (in boxes) on the applications of this information.

1.2 DNA as the Universal Genetic Material

With some exceptions, deoxyribonucleic acid (DNA) is the universal genetic material. In some viruses, termed RNA viruses, RNA is the genetic material. The term ribovirus is used for viruses with single- and double-stranded RNA genomes, including retroviruses, which are RNA-based for a portion of their life cycle.¹

Among the RNA viruses, retroviruses are well known; they include the human immunodeficiency virus (HIV), which causes AIDS. Retroviruses are unique because their life cycle includes both RNA and DNA versions of their genome. A complete retrovirus particle contains an RNA genome. The RNA genome encodes protein products that are necessary for converting the single-stranded RNA genome into a double-stranded DNA genome and for its subsequent integration into the host genome. One such protein product is the reverse transcriptase (RT) enzyme. RT is packaged in the virus particle and enters the cell together with the viral RNA genome. The RT first copies the single-stranded RNA genome into a complementary single-stranded DNA, which then serves as the template for synthesis of the second DNA strand, producing a double-stranded viral DNA genome. Once integrated into the host genome, the double-stranded viral DNA is referred to as the provirus; it is transcribed by the host cellular machinery to produce more retrovirus particles with single-stranded RNA genomes.
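
The sequence relationships in reverse transcription can be illustrated with a minimal Python sketch. It treats each strand as a plain string and ignores the enzymology entirely (priming, template degradation, and integration are not modeled). The example genome and the function names are hypothetical and chosen only for illustration.

```python
# Toy model of reverse transcription: sequence relationships only.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_dna(rna: str) -> str:
    """DNA strand complementary to an RNA template, written 5'->3'."""
    # The new strand is antiparallel to the template, so read the
    # template backwards while complementing each base.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def second_strand_dna(first_strand: str) -> str:
    """DNA strand complementary to the first DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand))


rna_genome = "AUGGCUUAA"  # hypothetical 9-nucleotide "genome"
minus_strand = first_strand_dna(rna_genome)
plus_strand = second_strand_dna(minus_strand)

print(minus_strand)  # TTAAGCCAT
print(plus_strand)   # ATGGCTTAA
```

In this toy model the second DNA strand carries the same sequence as the RNA genome, with T in place of U, which is why the double-stranded DNA copy preserves the information of the original RNA.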

1.3 DNA Double Helix

The structure of the DNA double helix and its building blocks are described in standard biology textbooks. Here, some additional aspects are highlighted, including the information in Box 1.1. DNA is a double-stranded right-handed helix; the two strands are complementary because of complementary base pairing, and antiparallel because they have opposite 5′→3′ orientations (Figure 1.1A). The diameter of the helical DNA molecule is 20 Å (=2 nm). The helical conformation of DNA creates alternating major and minor grooves (Figure 1.1B).

Box 1.1

1. The major grooves in DNA can bind proteins. This is an important property of DNA structure because the major grooves in the upstream regulatory regions of a gene bind transcription-regulatory proteins. For example, for Zn-finger transcription factors, each Zn finger recognizes and binds to a specific trinucleotide sequence in the major groove of DNA.²

2. Any double-stranded nucleic acid (whether DNA double strand, DNA–RNA hybrid double strand, or RNA–RNA double strand) is antiparallel in nature. The complementary and antiparallel nature of double-stranded nucleic acids is an important property to remember while designing synthetic oligonucleotides for hybridization (probes or primers).

3. By convention, nucleic acid (DNA or RNA) sequence is written 5′→3′ from left to right, such as 5′-ATGTAAGCAC-3′. If the 5′→3′ designation is not mentioned, it is assumed that the sequence has been written in a 5′→3′ direction, following convention.
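
Points 2 and 3 of the box can be combined into a short Python sketch. Because the two strands are complementary and antiparallel, an oligonucleotide that hybridizes to a target is the reverse complement of that target when both are written 5′→3′. The helper names below are hypothetical, and the check covers only perfect complementarity of unambiguous A/C/G/T sequences; real probe or primer design also considers melting temperature, secondary structure, and specificity.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3' by convention."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_probe(probe: str, target: str) -> bool:
    """True if probe (5'->3') pairs with target (5'->3') at every position."""
    return probe.upper() == reverse_complement(target).upper()


target = "ATGTAAGCAC"               # example sequence from the box, 5'->3'
probe = reverse_complement(target)  # oligonucleotide that would hybridize

print(probe)                            # GTGCTTACAT
print(is_perfect_probe(probe, target))  # True
```

Taking the reverse complement twice returns the original sequence, which is a convenient sanity check when handling both strands.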

Figure 1.1 **DNA structure.**
(A) Two nucleotides of the DNA double helix, showing their antiparallel orientation, two H-bonds between A and T and three H-bonds between G and C; (B) the DNA double helix showing the major and minor grooves as well as the diameter of the molecule; (C) the convention of classifying the two sides of the phosphodiester bond and the products generated from their cleavage; (D) the front side (Watson–Crick edge) and the back side (Hoogsteen edge) of a purine; (E) how Hoogsteen H-bonding aids in the formation of the triple helix (see Section 1.3.3); (F) the *anti* and the *syn* conformations of bases around the N-glycosidic bond.

1.3.1 Structural Units of DNA

DNA is composed of structural units called nucleotides (deoxyribonucleotides). Each nucleotide is composed of a pentose sugar (2′-deoxy-D-ribose); one of the four nitrogenous bases—adenine (A), thymine (T), guanine (G), or cytosine (C); and a phosphate. The pentose sugar has five carbon atoms, numbered 1′ (1-prime) through 5′ (5-prime). The base is attached to the 1′ carbon atom of the sugar, and the phosphate is attached to the 5′ carbon atom (Figure 1.1A). The sugar and base form a nucleoside, whereas nucleoside plus phosphate makes a nucleotide. Hence, nucleoside = sugar + base, whereas nucleotide = sugar + base + phosphate. Table 1.1 shows the naming of nucleosides and nucleotides. Each nucleotide in DNA (as well as in RNA) has one replaceable (ionizable) hydrogen on its phosphate group, which is what makes DNA (and RNA) acidic.

Table 1.1

Naming of Nucleosides and Nucleotides

Base	Nucleoside (base+sugar)	Nucleotide (base+sugar+phosphate)
Adenine	Deoxyadenosine (sugar=deoxyribose)	Deoxyadenylic acid OR deoxyadenosine monophosphate
Guanine	Deoxyguanosine (sugar=deoxyribose)	Deoxyguanylic acid OR deoxyguanosine monophosphate
Cytosine	Deoxycytidine (sugar=deoxyribose)	Deoxycytidylic acid OR deoxycytidine monophosphate
Thymine	Deoxythymidine (sugar=deoxyribose)	Deoxythymidylic acid OR deoxythymidine monophosphate
Uracil (in RNA)	Uridine (in RNA) (sugar=ribose)	Uridylic acid OR uridine monophosphate
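
The naming pattern in Table 1.1 can be written as a small Python lookup. The nucleoside names are copied from the table; the suffix rule in `acid_name` is only a pattern read off these five entries, not a general nomenclature rule, and both function names are hypothetical.

```python
# Base -> nucleoside names, copied from Table 1.1.
NUCLEOSIDE_BY_BASE = {
    "adenine": "deoxyadenosine",
    "guanine": "deoxyguanosine",
    "cytosine": "deoxycytidine",
    "thymine": "deoxythymidine",
    "uracil": "uridine",  # RNA only; the sugar is ribose
}


def monophosphate_name(base: str) -> str:
    """Nucleotide name of the form '<nucleoside> monophosphate'."""
    return f"{NUCLEOSIDE_BY_BASE[base.lower()]} monophosphate"


def acid_name(base: str) -> str:
    """Nucleotide name of the form '<stem>ylic acid'."""
    nucleoside = NUCLEOSIDE_BY_BASE[base.lower()]
    if nucleoside.endswith("osine"):   # purine nucleosides in the table
        stem = nucleoside[: -len("osine")]
    else:                              # pyrimidine nucleosides end in "idine"
        stem = nucleoside[: -len("ine")]
    return f"{stem}ylic acid"


for base in NUCLEOSIDE_BY_BASE:
    print(base, "->", acid_name(base), "OR", monophosphate_name(base))
```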

1.3.2 Linkage between Nucleotides

The nucleotides are joined by 5′–3′ phosphodiester linkage; that is, the 5′-phosphate of a nucleotide is linked to the 3′-OH of the preceding nucleotide by a phosphodiester linkage. In a linear DNA molecule, the 5′-end has a free phosphate and the 3′-end has a free OH group (Figure 1.1A). Each phosphodiester bond has two sides: a 3′-side that is linked to the 3′-end of the preceding nucleotide, and a 5′-side that is linked to 5′-end ...

Cover image
Title page
Table of Contents
Copyright
Dedication
Preface
Acknowledgment
Chapter 1. Fundamentals of Genes and Genomes
Chapter 2. Fundamentals of Molecular Evolution
Chapter 3. Genomic Technologies
Chapter 4. The Beginning of Bioinformatics
Chapter 5. Data, Databases, Data Format, Database Search, Data Retrieval Systems, and Genome Browsers
Chapter 6. Sequence Alignment and Similarity Searching in Genomic Databases: BLAST and FASTA
Chapter 7. Additional Bioinformatic Analyses Involving Nucleic-Acid Sequences
Chapter 8. Additional Bioinformatic Analyses Involving Protein Sequences
Chapter 9. Phylogenetic Analysis
Index
