|
Genome Biology 2010
The rare biosphere: sorting out fact from fiction
DOI: 10.1186/gb-2010-11-s1-i19
Abstract: Numerous theories and mechanisms that could account for the existence and persistence of rare biosphere members compete with explanations that invoke sequencing or clustering artifacts. Even with sequencing error rates below 0.005 per nucleotide position, the common method of generating operational taxonomic units (OTUs), i.e., multiple sequence alignment followed by complete-linkage clustering, significantly increases the number of predicted OTUs and inflates richness estimates. A novel Single Linkage Preclustering (SLP) strategy, applied to short hypervariable regions of ribosomal RNAs, accurately recovered the predicted complexity of 'mock' microbial communities with a known number of rRNA operons. The strategy first identifies sequences that are likely to have arisen by error, using nearest neighbor clustering of pairwise sequence distances. The most abundant sequence in each precluster, together with the number of sequences in that precluster, then serves as input to average neighbor clustering using MOTHUR. When this approach is applied to sequences obtained from multiple microbial communities, the OTU-based descriptions of microbial population structures under different ecological regimes and the global distribution patterns of OTUs reinforce the credibility of the 'rare biosphere' as revealed through deep sequencing efforts.
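The preclustering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes reads are already aligned to equal length, uses a simple mismatch fraction as the pairwise distance, and treats single-linkage preclusters as connected components under a hypothetical `max_dist` threshold. The function names `mismatch_fraction` and `single_linkage_precluster` are invented for illustration. The output, one representative sequence and one read count per precluster, is the kind of input the abstract says is passed on to average neighbor clustering in MOTHUR; that later step is not shown.

```python
from collections import Counter
from itertools import combinations


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of differing positions (assumes pre-aligned, equal-length reads)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_precluster(reads, max_dist):
    """Group reads by single linkage; return (representative, read_count) pairs.

    max_dist is a hypothetical threshold, not a value taken from the abstract.
    """
    counts = Counter(reads)          # abundance of each unique sequence
    uniques = list(counts)
    parent = list(range(len(uniques)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Single linkage: any pair within max_dist joins the two groups.
    for i, j in combinations(range(len(uniques)), 2):
        if mismatch_fraction(uniques[i], uniques[j]) <= max_dist:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for i, seq in enumerate(uniques):
        groups.setdefault(find(i), []).append(seq)

    preclusters = []
    for members in groups.values():
        representative = max(members, key=lambda s: counts[s])  # most abundant
        total_reads = sum(counts[s] for s in members)
        preclusters.append((representative, total_reads))
    # Largest preclusters first.
    return sorted(preclusters, key=lambda t: -t[1])


if __name__ == "__main__":
    toy_reads = ["ACGTACGTAC"] * 5 + ["ACGTACGTAA"] + ["TTTTTTTTTT"] * 3
    for rep, n in single_linkage_precluster(toy_reads, max_dist=0.1):
        print(rep, n)
```

In the toy input, the single read that differs from the abundant sequence at one position is absorbed into that sequence's precluster rather than being counted as a separate low-abundance unit, which is the effect the abstract attributes to preclustering.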
|