Population Genomics with R
eBook - ePub

  1. 378 pages
  2. English

About this book

Population Genomics with R presents a multidisciplinary approach to the analysis of population genomics. The methods treated cover a large number of topics, from traditional population genetics to large-scale genomics with high-throughput sequencing data. Several dozen R packages are examined and integrated to provide a coherent software environment with a wide range of computational, statistical, and graphical tools. Small examples are used to illustrate the basics, and published data are used as case studies. Readers are expected to have a basic knowledge of biology, genetics, and statistical inference methods. Graduate students and postdoctoral researchers will find resources to analyze their population genetic and genomic data and to design new studies.

The first four chapters review the basics of population genomics, data acquisition, and the use of R to store and manipulate genomic data. Chapter 5 treats the exploration of genomic data, an important issue when analyzing large data sets. The other five chapters cover linkage disequilibrium, population genomic structure, geographical structure, past demographic events, and natural selection. These chapters include supervised and unsupervised methods, admixture analysis, an in-depth treatment of multivariate methods, and advice on how to handle GIS data. The analysis of natural selection, a traditional issue in evolutionary biology, has seen a revival with modern population genomic data. All chapters include exercises. Supplemental materials are available online (http://ape-package.ird.fr/PGR.html).

Population Genomics with R is written by Emmanuel Paradis.

1 Introduction

1.1 Heredity, Genetics, and Genomics

One of the greatest achievements of biology during the twentieth century was the discovery of the mechanisms of heredity. One can hardly imagine all the theories formulated over the many centuries before this discovery. Today, the double-helix structure of DNA is an icon of science, and DNA now has a wide range of technological and commercial applications.
Heredity and its associated concepts are deeply rooted in the history of mankind. The emergence of agriculture in different parts of the world between 10,000 and 5,000 years ago clearly interacted with knowledge of the heredity of some plants and animals. For thousands of years, breeders have observed the consequences of heredity in the domesticated forms of these species. In the nineteenth century, the scientific investigation of heredity took a significant turn with the generalization of microscopic observations, the formulation of the laws of heredity by Mendel, and Miescher’s discovery of “nuclein,” later renamed nucleic acids. An often overlooked feature of the history of genetics is that it took almost eight decades after Miescher’s discovery to demonstrate that DNA is the physical support of heredity, and even the brilliant experiments by Avery and his colleagues did not convince some geneticists, who thought that heredity was coded by proteins [52]. Population genetics therefore originated well before the physical support of heredity was identified.
Historical Landmarks: Heredity, Genetics, and Genomics
1866: Mendel publishes his laws of heredity [184].
1869: Miescher discovers DNA [47].
1944: Avery et al. demonstrate that DNA is the support of heredity [10].
1953: Watson and Crick discover the double helix structure of DNA [290].
1961: Crick et al. decipher the genetic code [44].
1973: Gilbert and Maxam publish the first DNA sequencing data [95].
1984: Discovery of microsatellites [295].
1996: First high-throughput sequencing technology [237].
2001: First human genome published [127].
2010: Completion of the first phase of the 1000 Genomes Project [270].
During the twentieth century, the methods used by biologists to study heredity, and later DNA, progressively increased in power (see Chap. 2). The growth of high-throughput sequencing technologies has been a very significant factor in the development of population genomics. Over the last decade, genomics has gained considerable importance both as a scientific field and as a subject of societal interest. This development has also impacted the field of population genetics.
This book adopts the following definitions. Population genetics is the study of the variation in genotypes among individuals across space and time, including the forces behind this variation. Genomics is the study of the structure and functions of genomes. Population genomics is similar to population genetics but applied to a very large number of loci, usually across the whole genome of a species. Thus, population genomics can be seen as a “scaled-up” version of population genetics, dealing with anything from a large number of loci to the whole genome of the species of interest [20].
Historical Landmarks: Population Genetics
1930: Publication of Fisher’s Genetical Theory of Natural Selection [77].
1949: Publication of Wright’s paper on population genetic structure [303].
1955: Kimura’s paper on allele fixation under genetic drift [142].
1966: Empirical studies show the importance of molecular variation in natural populations [107, 160].
1982: Kingman publishes three founding papers on the coalescent [147].
2005: Publication of the sequentially Markov coalescent facilitating the analysis of genomic data with recombination [182].

1.2 Principles of Population Genomics

This section starts with an explanation of the units used in this book. The biological meanings of some terms used here (bases, double-stranded, …) are explained in the following subsection.

1.2.1 Units

The basic unit of the genome is the base, the part of the nucleotide that is variable; its symbol is ‘b’. Genomes can be small or (very) big, so it is common to use prefixes borrowed from the International System of Units to express the size of a genome or the length of a DNA sequence:
one kilobase=...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Preface
  8. Symbol Description
  9. 1 Introduction
  10. 2 Data Acquisition
  11. 3 Genomic Data in R
  12. 4 Data Manipulation
  13. 5 Data Exploration and Summaries
  14. 6 Linkage Disequilibrium and Haplotype Structure
  15. 7 Population Genetic Structure
  16. 8 Geographical Structure
  17. 9 Past Demographic Events
  18. 10 Natural Selection
  19. A Installing R Packages
  20. B Compressing Large Sequence Files
  21. C Sampling of Alleles in a Population
  22. D Glossary
  23. Bibliography
  24. Index