|
BMC Systems Biology 2011
DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases

Abstract: Using a compiled dataset containing 1,614 associations between 671 domains and 1,145 disease phenotypes, we demonstrate the effectiveness of the proposed approach through three large-scale leave-one-out cross-validation experiments (random control, simulated linkage interval, and genome-wide scan), evaluated by three criteria (precision, mean rank ratio, and AUC score). Through a series of permutation tests, we further show that the proposed approach is robust to the parameters involved and to the underlying domain-domain interaction network. Having assessed the validity of this approach, we show the possibility of ab initio inference of domain-disease associations and gene-disease associations, and we illustrate the strong agreement between our inferences and the evidence from genome-wide association studies for four common diseases (type 1 diabetes, type 2 diabetes, Crohn's disease, and breast cancer). Finally, we provide a pre-calculated genome-wide landscape of associations between 5,490 protein domains and 5,080 human diseases and offer free access to this resource.

The proposed approach effectively ranks susceptible domains near the top of the candidate list, and it is robust to the parameters involved. The ab initio inference of domain-disease associations shows strong agreement with the evidence provided by genome-wide association studies. The predicted landscape provides a comprehensive understanding of associations between domains and human diseases.

Over the past few decades, traditional gene-mapping approaches such as family-based linkage analysis [1,2] and population-based association studies [3,4] have achieved remarkable success in pinpointing genes that are responsible for human inherited diseases [5,6].
Nevertheless, these traditional methods are either only capable of linking diseases with genetic regions that typically contain dozens to hundreds of genes, or usually require carefully selected candidate genes that are biologically
|