BMC Bioinformatics 2008
Merging microsatellite data: enhanced methodology and software to combine genotype data for linkage and association analysis

Abstract: Notably, MicroMerge v2 includes a new one-to-one alignment option that creates merged pedigree and locus files that can be handled by most genetic analysis software. Other features in MicroMerge v2 enhance the following aspects of control: 1) optimizing the algorithm for different merging scenarios, such as data sets with very different sample sizes or multiple data sets, 2) merging small data sets when a reliable set of allele frequencies is available, and 3) improving the quantity and 4) quality of merged data. We present results from simulated and real microsatellite genotype data sets, and conclude with an association analysis of three familial dyslipidemia (FD) study samples genotyped at different laboratories. Independent analysis of each FD data set did not yield consistent results, but analysis of the merged data sets identified strong association at locus D11S2002.

The MicroMerge v2 features will enable merging for a variety of genotype data sets, which in turn will facilitate meta-analyses that increase the power of association analysis.

Association studies for complex diseases can require thousands of samples to detect genes with small effect [1]. A minimum of 1000 to 1500 samples has been suggested for genes conferring a 1–8% increase in susceptibility to a disease, and the failure of many complex disease studies has been attributed to insufficient sample size [1-3]. Because increasing the sample size for a genetic analysis improves statistical power, and therefore the ability to detect a genetic effect, complex disease studies are increasingly collaborative. Collaborative genetic studies often distribute genotyping among several different laboratories.
If the genotyping for a family-based linkage study is distributed by assigning complete families to each laboratory (i.e., all DNA samples for a particular family are genotyped at the same laboratory), logarithm of the odds (lod) scores can be computed for each data set separately and then simply added to achieve