Computational Exome and Genome Analysis

Peter N. Robinson, Rosario Michael Piro, Marten Jager

  1. 557 pages
  2. English

About This Book

Exome and genome sequencing are revolutionizing medical research and diagnostics, but the computational analysis of the data has become an extremely heterogeneous and often challenging area of bioinformatics. Computational Exome and Genome Analysis provides a practical introduction to all of the major areas in the field, enabling readers to develop a comprehensive understanding of the sequencing process and the entire computational analysis pipeline.

Information

Year
2017
ISBN
9781351650816

Contents


Preface
Contributors
PART I Introduction

CHAPTER 1 ▪ Introduction: Whole Exome and Genome Sequencing
CHAPTER 2 ▪ NGS Technology
CHAPTER 3 ▪ Illumina Technology
CHAPTER 4 ▪ Obtaining WES/WGS Data for This Book
PART II Raw Data Processing

CHAPTER 5 ▪ FASTQ Format
CHAPTER 6 ▪ Raw Data: Quality Control
CHAPTER 7 ▪ Trimming
PART III Alignment

CHAPTER 8 ▪ Alignment: Mapping Reads to the Reference Genome
CHAPTER 9 ▪ SAM/BAM Format
CHAPTER 10 ▪ Postprocessing the Alignment
CHAPTER 11 ▪ Alignment Data: Quality Control
PART IV Variant Calling

CHAPTER 12 ▪ Variant Calling and Quality-Based Filtering
CHAPTER 13 ▪ Variant Call Format (VCF)
CHAPTER 14 ▪ Jannovar
CHAPTER 15 ▪ Variant Annotation
CHAPTER 16 ▪ Variant Calling: Quality Control
CHAPTER 17 ▪ Integrative Genomics Viewer (IGV): Visualizing Alignments and Variants
CHAPTER 18 ▪ De Novo Variants
CHAPTER 19 ▪ Structural Variation
PART V Variant Filtering

CHAPTER 20 ▪ Pedigree and Linkage Analysis
CHAPTER 21 ▪ Intersection Analysis and Rare Variant Association Studies
CHAPTER 22 ▪ Variant Frequency Analysis
CHAPTER 23 ▪ Variant Pathogenicity Prediction
PART VI Prioritization

CHAPTER 24 ▪ Variant Prioritization
CHAPTER 25 ▪ Prioritization by Random Walk Analysis
CHAPTER 26 ▪ Phenotype Analysis
CHAPTER 27 ▪ Exomiser and Genomiser
CHAPTER 28 ▪ Medical Interpretation
PART VII Cancer

CHAPTER 29 ▪ A (Very) Short Introduction to Cancer
CHAPTER 30 ▪ Somatic Variants in Cancer
CHAPTER 31 ▪ Tumor Evolution and Sample Purity
CHAPTER 32 ▪ Driver Mutations and Mutational Signatures
Appendix A ▪ Hints and Answers
References
Index

Preface


The knowledge that we convey in this book draws from complementary sources and experiences. The Robinson Lab at the Institute for Medical Genetics and Human Genetics of the Charité University Hospital in Berlin, Germany, established computational pipelines for the analysis of whole-exome sequencing (WES) and (later) whole-genome sequencing (WGS) data starting in the early days of WES. Rosario Piro spent important years of his scientific career at the German Cancer Research Center (DKFZ), Heidelberg, which has a longstanding record of achievements in cancer genomics.
The Robinson lab published one of the first exome-based gene discovery papers in Nature Genetics in 2010, the identification of disease-causing mutations in PIGV by identity-by-descent filtering of exome sequence data [226]. We subsequently contributed to the characterization as disease genes of PIGO [224], PGAP2 [225], IL21R [219], TTC8 [140], and PGAP3 [169]. We have additionally used WES analysis to address questions in hereditary cardiomyopathy [377], Vici syndrome [112], and other diseases [215].
The lab has developed the Human Phenotype Ontology [146, 209, 372] (HPO), which is being used by many of the major WES/WGS translational research projects, such as Genomics England’s 100,000 Genomes Project, the Wellcome Trust Sanger Institute’s DECIPHER and Deciphering Developmental Disorders (DDD) projects, the National Institutes of Health (NIH) Undiagnosed Diseases Network, and many others [214]. We have developed software that exploits HPO-based phenotypic similarity algorithms to prioritize genes and variants in WES and WGS analysis [373, 399, 403], as we shall examine in several chapters of this book. We have additionally developed the annotation tool Jannovar [184], and contributed to a range of other algorithms designed for the analysis of quality control of WES/WGS data [161, 162], as well as algorithms for ChIP-seq [159], ChIP-nexus [158], NGS-based T-cell receptor profiling [230], and RNA-seq [182].
Rosario Piro’s research at the DKFZ contributed to the remarkable finding that all secretory meningiomas harbor the same single nucleotide change in the pluripotency-related transcription factor KLF4 [363], a finding that was independently and simultaneously confirmed by another research group [80]. A fusion gene (NAB2-STAT6) was identified in meningeal hemangiopericytoma and solitary fibrous tumors that allows these tumors to be distinguished from anaplastic meningiomas by simple STAT6 immunohistochemistry [391], which had an immediate impact on cancer diagnostics [70, 391]. The integration of DNA sequencing and epigenomic data led to the finding that poor-prognosis hindbrain ependymomas harbor an extremely low mutation rate and are instead characterized by a CpG island methylator phenotype (CIMP), leading to transcriptional silencing of differentiation genes [271].
Marten Jäger established and implemented the exome and genome pipelines in the Robinson lab at the Institute for Medical Genetics and Human Genetics of the Charité University Hospital. His research has concentrated on methods for exome and genome bioinformatics including approaches to VCF annotation and variant calling with non-linear genome assemblies, a method for composite transcriptome assembly of RNA-seq data, and contributions to many other research projects.
All three authors held courses on exome analysis in the Bioinformatics Department of the Free University of Berlin, out of which this book grew.
This book, therefore, reflects the experience gained over multiple years in establishing first an Illumina Genome Analyzer (GAIIX) and later a HiSeq 1500 and other devices in Berlin, as well as in working with HiSeq 2000 and MiSeq sequencing data in Heidelberg. It also reflects the experience of developing software for phenotype analysis that is being used internationally by groups involved in translational genomics. A major rule both in Heidelberg and Berlin was “eyeball the data”, and we place an emphasis in this book on understanding how data is represented in the various file formats common in WES/WGS analysis, and how to recognize errors and artifacts (or indeed, how to recognize high-quality data). With several exceptions, this book does not attempt to explain in detail the algorithms or statistics involved in WES/WGS analysis, but instead presents a practical guide to the many steps of WES/WGS analysis with intuitive algorithmic explanations and pointers to the literature.

A NOTE ON HOW TO USE THIS BOOK

Bi...
