|
- 2018
GePMI: A statistical model for personal intestinal microbiome identification
DOI: 10.1038/s41522-018-0065-2
Abstract: Overview of GePMI. (1) Each metagenomic sequencing dataset is processed into a k-mer set. (2) Each k-mer set is reduced to a subset of size m using the MinHash function, so that the Jaccard similarity of two k-mer sets can be approximated by their MinHash similarity. (3) Each sample is then compared with samples from unrelated individuals to generate a similarity distribution, which can be fitted by a beta distribution. (4) A query sample can be tested against each distribution; if its p value is below a threshold, it is assigned to the sample with that distribution. (5) When testing against multiple distributions, p values are adjusted to control the false discovery rate.
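The five steps in the overview can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GePMI implementation: all function names are hypothetical, the sketch is a bottom-m MinHash built on a truncated BLAKE2b hash, the beta distribution is fitted by the method of moments, the p value is taken as the upper tail (a query is assumed to match when its similarity is unusually high relative to unrelated samples), and the false discovery rate adjustment is assumed to be Benjamini-Hochberg. The abstract does not specify any of these choices.

```python
import hashlib
import math


def kmer_set(seq, k):
    """Step 1: all length-k substrings of a sequence, as a set."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash64(kmer):
    # Assumed hash: deterministic 64-bit value from truncated BLAKE2b.
    return int.from_bytes(hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")


def sketch(kmers, m):
    """Step 2: bottom-m MinHash sketch, i.e. the m smallest hash values."""
    return sorted({_hash64(x) for x in kmers})[:m]


def minhash_similarity(sk_a, sk_b, m):
    """Estimate the Jaccard similarity of two sets from their bottom-m sketches."""
    a, b = set(sk_a), set(sk_b)
    merged = sorted(a | b)[:m]  # m smallest hashes of the union
    if not merged:
        return 0.0
    return sum(1 for x in merged if x in a and x in b) / len(merged)


def fit_beta_moments(values):
    """Step 3: method-of-moments beta fit (assumes >= 2 distinct values in (0, 1))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def beta_upper_tail(x, a, b, steps=20000):
    """Step 4: P(X >= x) for X ~ Beta(a, b), via crude midpoint-rule integration."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = (1.0 - x) / steps
    total = 0.0
    for i in range(steps):
        t = x + (i + 0.5) * width
        total += math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))
    return min(1.0, total * width)


def benjamini_hochberg(pvalues):
    """Step 5: Benjamini-Hochberg adjusted p values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one would compute `minhash_similarity` between a reference sample and each unrelated sample, pass those similarities to `fit_beta_moments`, score a query with `beta_upper_tail`, and finally run `benjamini_hochberg` over the p values obtained across all reference distributions. A production version would more likely rely on a statistics library for the beta fit and tail probability than on the numerical integration shown here.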
|