%0 Journal Article
%T Using the Pareto principle in genome-wide breeding value estimation
%A Xijiang Yu
%A Theo HE Meuwissen
%J Genetics Selection Evolution
%D 2011
%I BioMed Central
%R 10.1186/1297-9686-43-35
%X Genomic selection (GS) is currently being adopted by the dairy cattle breeding industries around the world [1]. Genome-wide breeding value (GWEBV) prediction plays a pivotal role in this new technology. Its accuracy depends on the statistical methods used, the genome, the population structure, and trait heritability. GWEBV estimation methods are categorized based on the assumptions of their prior distributions of marker effects. Genome-wide BLUP (GBLUP) methods, e.g. [2], assume a normal prior distribution with a constant variance for all marker loci. In Bayesian methods, a more flexible prior distribution of SNP effects can be applied that allows for a few SNP with very large effects whilst most effects are small or even zero. However, Bayesian methods often use Markov chain Monte Carlo (MCMC) algorithms, which make them computationally demanding. Meuwissen et al. [2] proposed BayesB for the estimation of SNP effects, which assumes that a fraction (1 - π) of the SNP have no effect and a fraction π of the SNP have an effect with a t-distributed prior that is thicker-tailed than the normal distribution, i.e. it allows for a few SNP with very large effects and many SNP with small ones. Based on the work of [3] and [4] suggesting that normal priors give similar results to t-distributed priors, Luan et al. [5] used a mixture of two normal distributions as a prior, with probability π of the SNP effects coming from a normal distribution with a large variance and with probability (1 - π) from a normal distribution with a small variance. This was also justified by the observation that in practice GBLUP yields high accuracy [6], which suggests that the best predictions are obtained if the SNP with small effects are not neglected. This mixture prior distribution has two unknown parameters: π and the variance of the SNP with large effects.
These parameters are difficult to estimate partly because the true distribution of the SNP effects is probably not a mixture of two normal distributions. The Pareto
%U http://www.gsejournal.org/content/43/1/35