%0 Journal Article
%T GePMI: A statistical model for personal intestinal microbiome identification
%A Huazhe Lou
%A Ron Shamir
%A Rui Jiang
%A Ting Chen
%A Ying Wang
%A Zicheng Wang
%J NPJ Biofilms and Microbiomes
%D 2018
%R 10.1038/s41522-018-0065-2
%X Overview of GePMI. (1) Each metagenomic sequencing dataset is processed into a k-mer set. (2) Each k-mer set is hashed into a subset of size m using the MinHash function, so that the Jaccard similarity of any two k-mer sets can be approximated by their MinHash similarity. (3) Each sample is then compared with samples from unrelated individuals to generate a similarity distribution, which can be fitted by a beta distribution. (4) A query sample can be tested against each distribution. If its p value is below a threshold, it is assigned to the sample with that distribution. (5) When testing against multiple distributions, p values are adjusted to control the false discovery rate.
%U https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6123480/
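
Steps (1), (2), (3) and (5) of the overview can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the BLAKE2b hash, the bottom-m form of MinHash, the method-of-moments beta fit, and the Benjamini-Hochberg adjustment are all assumptions chosen for illustration; the record does not say which hash, fitting procedure, or FDR procedure GePMI uses. The p value in step (4) is left out because the beta tail probability needs an incomplete beta function (for example `scipy.stats.beta.sf`), which the standard library does not provide.

```python
import hashlib
import statistics


def kmer_set(seq, k):
    """Step 1: the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash64(kmer):
    # Stable 64-bit hash (assumed choice; any well-mixed hash would do).
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_sketch(kmers, m):
    """Step 2: bottom-m sketch, i.e. the m smallest hash values."""
    return sorted(_hash64(x) for x in kmers)[:m]


def minhash_similarity(sketch_a, sketch_b, m):
    """Estimate the Jaccard similarity from two bottom-m sketches."""
    union = sorted(set(sketch_a) | set(sketch_b))[:m]
    if not union:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in union if h in shared) / len(union)


def fit_beta_moments(similarities):
    """Step 3: method-of-moments estimate of beta parameters (alpha, beta)."""
    mu = statistics.mean(similarities)
    var = statistics.variance(similarities)
    common = mu * (1 - mu) / var - 1  # requires var < mu * (1 - mu)
    return mu * common, (1 - mu) * common


def benjamini_hochberg(pvalues):
    """Step 5: Benjamini-Hochberg adjusted p values, in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    a = minhash_sketch(kmer_set("ACGTACGT", 3), m=100)
    b = minhash_sketch(kmer_set("ACGTTT", 3), m=100)
    print(minhash_similarity(a, b, m=100))
    print(fit_beta_moments([0.1, 0.2, 0.3]))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

When a k-mer set has fewer than m elements, the sketch holds every hash and the estimate equals the exact Jaccard similarity. For real metagenomic data m is much smaller than the set, and the estimate is only approximate.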