Chapter 1
Introduction
Frans J. de Bruijn
In this first volume of the Handbook, metagenomics is introduced, together with computer-assisted analysis, information on consortia and databases, and a number of complementary methods, such as microarrays, metatranscriptomics, metaproteomics, metabolomics, phenomics (the “omics”), and single-cell analysis.
Part 1, “Background Chapters,” contains a number of chapters on nonmetagenomic methods, such as different genomic fingerprinting techniques and their analysis and level of resolution, as well as the first approach to metagenomics (Chapter 2). All these methods are still used today.
In Part 2, “The Species Concept,” several experts examine the parameters for calling something a new species and provide suggestions to authors on when it is proper to call a novel isolate [operational taxonomic unit (OTU)] a new species. The recommendations of two expert meetings on the topic are summarized in another chapter in this part, which describes the 70% DNA–DNA hybridization level as essential in the species concept. This discussion is very relevant to all phylogenetic studies in both volumes of the Handbook.
In Part 3, metagenomics is introduced and a number of practical parameters of this technique are outlined. An introduction to metagenomics and the other “omics” is presented in Chapter 14. Three subsequent chapters deal with the 16S rRNA gene as phylogenetic marker and also examine the pitfalls of its use. Three chapters describe the impact of next-generation sequencing on metagenomics, examine its accuracy and quality of reads, and review the potential and challenges of environmental shotgun sequences for studying the hidden world of microbes. Metagenomics can involve (a) the generation and analysis of clone libraries which can be screened for particular properties and (b) random sequencing of metagenomic DNA. The former is discussed in an article on vector tools and functional screening of metagenomic libraries (see also Parts 6 and 7, Vol. II). The latter is used in many other articles in the Handbook. The remaining articles in this section introduce various technical aspects of metagenomics, as well as novel approaches such as gene-targeted metagenomics, using homing endonuclease restriction and marker insertion for phylogenetic studies, finding integrons, arrayOme- and tRNAcc-facilitated mobilome discovery, and improved serial analysis of V1 ribosomal sequence tags (SARST-V1) to study bacterial diversity. A plethora of other studies in various habitats are presented in Volume II of this Handbook.
In Part 4, some consortia and databases are discussed, including the Metacontrol consortium focusing on the metagenomics of suppressive soils, the Terragenome consortium, which aims to provide a metagenomic shotgun and phosmid sequencing analysis of a “reference” soil, and the Argentinian BIOSPAS consortium aimed at bringing together a group of scientists employing metagenomic and associated approaches. This is followed by a description of the Human Gut Microbiome Initiative (HGMI) and the related Human Microbiome Project (HMP). Chapter 36 in this part describes the Ribosomal Database Project, an irreplaceable source for phylogenetic studies, using the rRNA genes as target (see Chapter 15, Vol. I). The final chapter in this part describes the Metagenomics RAST server, a public resource for automated phylogenetic and functional analysis of metagenomes.
In Part 5, a smorgasbord of computer programs essential for the analysis of (meta)genomic data is presented. Clearly, computer-assisted analysis is a crucial component of every metagenomic project. Progress in the field depends on creating programs and databases for ever-growing datasets, and this can be the limiting factor for large metagenomic, transcriptomic, proteomic, and metabolomic projects. It is equal in importance to the development of higher-throughput novel sequencing methods (see Chapter 18, Vol. I). The authors in Part 5, as well as all other authors, have been asked to highlight the programs and web sites used in their chapters; therefore, in addition to the limited number of programs highlighted in Part 5, a wealth of further information and other programs can be found in the chapters in Volumes I and II.
In Part 6, a number of complementary approaches to metagenomics are presented, including metagenomics approaches in systems biology, the use of stable isotope probing, and subtractive hybridization.
In Part 6A the use of microarrays, including phylochips and geochips and metagenomic arrays, is discussed and examples in different habitats, such as NASA rocket cleanrooms, are given. This part also contains a chapter on phenotypic arrays or “phenomics,” another “omic” technique, which can reveal the metabolic capacity of microbes in microplates.
In Part 6B, some examples of metatranscriptomic analysis are presented, which permit a glimpse into the metagene expression profile in various environments, such as the symbiotic protist community in Reticulitermes and comparative day and night metatranscriptomics of microbial communities in the North Pacific. In addition, a “double RNA” approach is presented to simultaneously assess the structure and function of microbial communities, and one chapter on the metatranscriptomics of eukaryotes is included.
In Part 6C, metaproteomics approaches are highlighted, and examples are presented on the proteomics of microbial stress responses, the metaproteomic analysis of Chesapeake Bay microbial communities, high-throughput proteomics in cyanobacteria, and global proteomic analysis of the chromate response in Arthrobacter.
In Part 6D, metabolomics is highlighted, which requires more sophisticated tools such as mass spectrometry. Examples include (a) two chapters that review the small molecule dimension and high-resolution tools to monitor bacterial growth on a molecular level, (b) one chapter on metabolomics in plants, where the metabolomics techniques are well established, and (c) a chapter on metabolite identification, pathways and “omic” integration using databases and other tools.
In Part 6E a highly specialized complementary approach is described, namely the isolation and use of single cells for metagenomic and other analysis.
None of the parts described above are comprehensive. They mainly give brief insight into what one can do in addition to metagenomics to extract more functional data from the system under study, in order to answer the following questions: “Who is there?” and “What are they doing?” An attempt was made to select studies in very different habitats, and a variety of approaches are highlighted. This is continued and expanded upon in Volume II.
Part 1
BACKGROUND CHAPTERS
Chapter 2
DNA Reassociation Yields Broad-Scale Information on Metagenome Complexity and Microbial Diversity
Vigdis L. Torsvik and Lise Øvreås
2.1 Introduction
2.1.1 Evolution and Development of Diversity
There are close relationships between microbial evolution, diversity, and ecology. Prokaryotic organisms have evolved through 3.8 billion years [Rosing, 1999] in response to varying geological, geochemical, and climatic conditions. For approximately half of life's history, they resided alone on Earth. Due to their great metabolic flexibility, short generation time, and ability to exchange genes over deep phylogenetic barriers, their ability to adapt and evolve is superior. This means that virtually every (micro)environment on Earth with physical–chemical conditions that can sustain life is occupied by prokaryotic organisms [see Vol. II]. It is therefore not surprising that the biodiversity on Earth is dominated by these organisms, which constitute two of the three primary domains of life, the Archaea and Bacteria [Woese, 1987; Woese and Fox, 1977]. Their ecological impact is huge, because ecosystem processes are to a large extent regulated by microbial communities. To understand complex ecosystem functioning, it is important to identify the primary drivers of microbial diversity and community structure. According to ecological theories, relationships between ecosystem functioning and diversity can partly be explained by the resource heterogeneity hypothesis and the “insurance hypothesis” [Yachi and Loreau, 1999]. The insurance hypothesis suggests that high diversity protects communities from unstable environmental conditions because the presence of diverse subpopulations not only increases the range of conditions in which the community as a whole can succeed, but also ensures long-term attainment of the community [Boles et al., 2004].
2.1.2 Methodological Advances, Discoveries, and Issues that Promoted Exploring the Environmental Community DNA
Before the introduction of molecular methods in microbial ecology, it was only possible to study the composition and diversity of microbial communities by investigating cultivated isolates. This traditional reductionist approach has limited our understanding of microbial ecology. In a holistic approach, the microorganisms in a community have been treated as one “black box.” The aims were to (a) measure collective variables like biomass, population sizes, process rates, and diversity of cultured microorganisms and (b) integrate these to better understand microbial ecosystems. This approach was hampered by the lack of conceptual models linking biomasses, rate of functions, and diversity to the underlying controlling factors. During the 1970s, methods for direct counts of microorganisms using fluorescence microscopy were developed [Hobbie et al., 1977]. It was then realized that the microbial biomass in natural environments was orders of magnitude higher than previously anticipated: one gram of soil or sediment could harbor more than 10¹⁰ cells. It was demonstrated that there was a difference of 2–3 orders of magnitude between the numbers of microorganisms estimated by direct counts and by colony-forming units (cfu) [Fægri et al., 1977]. A main question was why there was such a discrepancy. One assumption was that the majority of the microorganisms observed in natural environments like soils and sediments were inactive and that those growing in the laboratory represented the active populations. To investigate this, a fractionated centrifugation method for separating the bacteria from soil was developed. By microscopic counts it was estimated that the bacterial fractions contained 50–80% of the bacteria present in the soil samples and that no eukaryotic cells were present.
Respiration was used to measure the activity in the bacterial fraction, and the specific oxygen uptake rates (qO2) calculated on the basis of microscopic counts ranged from 3 to 300 μl O2 mg⁻¹ dry weight h⁻¹, indicating that most of the microbial cells observed in the microscope were metabolically active [Fægri et al., 1977]. Furthermore, the amount of DNA in the bacterial fractions (washed with sodium pyrophosphate to remove extracellular DNA) corresponded to an average DNA content per microscopically counted cell of 8.4 fg (1 fg = 10⁻¹⁵ g). This is approximately the same as in Escherichia coli cells in stationary growth phase [Ritz et al., 1997; Torsvik and Goksoyr, 1978]. It was therefore concluded that virtually all the cells observed in the microscope were viable and belonged to the metabolically active microbial community. A main issue was then whether the cultured bacterial isolates were representative of the total environmental community or whether they constituted a small, exotic subpopulation of microorganisms that could easily be “domesticated” and grown in the laboratory.
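The two figures quoted above, the gap of 2 to 3 orders of magnitude between direct counts and cfu and the average DNA content of 8.4 fg per cell, can be put into more familiar units with a little arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the average mass of 650 g/mol per base pair of double-stranded DNA is an assumed textbook approximation that does not come from this chapter.

```python
# Illustrative arithmetic only; not part of the original studies.

AVOGADRO = 6.022e23            # molecules per mole
ASSUMED_BP_MASS = 650.0        # ASSUMPTION: average g/mol per base pair of dsDNA

def culturable_fraction(direct_count: float, cfu: float) -> float:
    """Fraction of microscopically counted cells that also form colonies."""
    return cfu / direct_count

def dna_mass_fg_to_base_pairs(mass_fg: float,
                              bp_mass: float = ASSUMED_BP_MASS) -> float:
    """Convert a DNA mass in femtograms to an approximate number of base pairs."""
    grams = mass_fg * 1e-15    # 1 fg = 1e-15 g
    return grams / bp_mass * AVOGADRO

# Direct count of about 1e10 cells per gram of soil (figure from the text).
direct_count = 1e10
for orders in (2, 3):
    cfu = direct_count / 10 ** orders   # cfu lower by 2 or 3 orders of magnitude
    frac = culturable_fraction(direct_count, cfu)
    print(f"{orders} orders of magnitude -> {frac:.1%} of counted cells form colonies")

# Average DNA content per counted cell (figure from the text).
bp = dna_mass_fg_to_base_pairs(8.4)
print(f"8.4 fg DNA per cell is roughly {bp / 1e6:.1f} million base pairs")
```

Under these assumptions, a gap of 2 to 3 orders of magnitude means that only about 0.1 to 1% of the counted cells form colonies, and 8.4 fg of DNA corresponds to several million base pairs per cell. The second figure shifts slightly if a different average base-pair mass is assumed.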
Early in the 1980s, ideas emerged that led to a revolution and paradigm shift in microbial ecology. The basic idea was that if it was possible to retrieve DNA from the entire microbial community, this DNA would in principle contain genetic information about nearly all the organisms in the community, including both cultured and uncultured microorganisms. Major problems were (a) the lack of methods for extracting ultrapure DNA from “dirty” samples like soil and sediments and (b) finding tools to analyze and interpret the inform...