|
BMC Genetics 2011
Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat
Abstract: We investigated various Bayesian artificial neural network (ANN) architectures for predicting phenotypes in two data sets: milk production in Jersey cows and yield of inbred lines of wheat. For the Jerseys, predictor variables were derived from pedigree and molecular marker (35,798 single nucleotide polymorphisms, SNPs) information on 297 individual cows. The wheat data represented 599 lines, each genotyped with 1,279 markers. The ability to predict fat, milk and protein yield was low when using pedigrees, but it was better when SNPs were employed, irrespective of the ANN trained. Predictive ability was even better in wheat because the trait was a mean, as opposed to an individual phenotype in cows. Non-linear neural networks outperformed a linear model in predictive ability in both data sets, but more clearly in wheat.
Results suggest that neural networks may be useful for predicting complex traits using high-dimensional genomic information, a situation where the number of unknowns exceeds sample size. ANNs can capture non-linearities adaptively, which may be useful when prediction of phenotypes is crucial.
Challenges in the study of associations between genomic variables (e.g., molecular markers) and complex phenotypes include the possible existence of cryptic relationships that may not be amenable to parametric statistical modeling, as well as the high dimensionality of the data, illustrated by the growing number of single nucleotide polymorphisms, now close to 10 million in humans (http://www.genome.gov/11511175). These associations have been investigated primarily using naive single-marker regressions and, more recently, with Bayesian linear regression models of various types [1-3]; these, however, almost invariably assume additive inheritance and typically ignore interactions and non-linearity.
Taking these phenomena into account may enhance the ability to predict outcomes, which is relevant in genome-assisted management of livestock and plants and in indivi
|