%0 Journal Article %T Sequencing genes in silico using single nucleotide polymorphisms %A Xinyi Zhang %A Bo Zhang %A Shuying Li %A Xin Huang %A John A Hansen %A Lue Zhao %J BMC Genetics %D 2012 %I BioMed Central %R 10.1186/1471-2156-13-6 %X To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes.Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate genes for more detailed functional and mechanistic studies.The recent advent of SNP array based high throughput genotyping technologies has stimulated a wave of genetic association studies, including genome-wide association studies (GWAS) [1], and has led to a number of discovered and validated associations, which are fully catalogued on the website http://www.genome.gov/gwastudies/ webcite. While many of the disease associations were found with single nucleotide polymorphisms (SNPs) in non-coding regions, some are found with SNPs within or nearby known functional genes. To m %K In silico %K SNPs %K 1000 Genomes Project %K multi-allelic gene %K imputation %U http://www.biomedcentral.com/1471-2156/13/6