- © The Mineralogical Society Of America
The genomic era has provided us with hundreds of complete microbial genome sequences (as of July 12, 2005, an estimated 266 microbial genomes had been completed and an additional 730 were in progress; see http://www.GenomesOnline.org) (Mongodin et al. 2005a). Collectively, the sequencing of individual genomes and whole communities has revealed a level of genetic diversity and complexity that was previously unappreciated (Venter et al. 2004; Mongodin et al. 2005b). This is particularly evident when the results of these endeavors are related to the study of the physiological processes and metabolic capabilities of both individual species and community members from a range of environments. Often, species are found to harbor the genetic material for metabolic pathways that had not been identified or tested in the laboratory setting, and it has become increasingly evident that we are some distance away from understanding the tremendous biological, physiological and metabolic diversity and potential that clearly exist in the microbial world.
The chemical process of sequencing allows for the determination of the primary structure of a region of DNA (the main information carrier in a cell). The result of this process is the exact order of the four nucleotide building blocks (adenine, cytosine, guanine and thymine, abbreviated A, C, G and T, respectively) that make up the DNA region in question. Completing the entire genome sequence of an organism thus provides a comprehensive representation of the DNA sequence of the organism under study and of its genome structure, including the presence of chromosomes and, in the case of prokaryotes, the presence of plasmids.
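To make the idea of primary structure concrete, the following minimal Python sketch treats a sequenced DNA region as a string over the four-letter alphabet A, C, G, T and tallies how often each base occurs. The function names (`validate`, `base_composition`) and the example sequence are hypothetical and chosen purely for illustration; they are not taken from any sequencing pipeline discussed here.

```python
from collections import Counter

# The four nucleotide building blocks, by their one-letter abbreviations.
DNA_ALPHABET = "ACGT"


def validate(seq: str) -> str:
    """Return the sequence in upper case, or raise if it contains
    anything other than A, C, G or T."""
    seq = seq.upper()
    unexpected = set(seq) - set(DNA_ALPHABET)
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return seq


def base_composition(seq: str) -> dict:
    """Count the occurrences of each base in a validated sequence."""
    counts = Counter(validate(seq))
    # Report all four bases, including any that never occur.
    return {base: counts.get(base, 0) for base in DNA_ALPHABET}


if __name__ == "__main__":
    # Hypothetical 15-base region, for illustration only.
    region = "ATGGCGTACGCTTGA"
    print(base_composition(region))
```

The sketch deliberately ignores ambiguity codes (such as N for an undetermined base) that real sequence data often contain; accepting them would require widening the assumed alphabet.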
GENOME SEQUENCING AND ASSEMBLY
Upon completing the sequence of a microbial genome, a thorough analysis of the genetic data should follow (detailed in Fig. 1). This process typically begins with the identification of all open reading frames (ORFs). A variety of ORF finding …