Inferring genotyping error rates from genotyped trios
Luke Jostins
Quantitative Biology , 2011,
Abstract: Genotyping errors are known to influence the power of both family-based and case-control studies in the genetics of complex disease. Estimating genotyping error rate in a given dataset can be complex, but when family information is available error rates can be inferred from the patterns of Mendelian inheritance between parents and offspring. I introduce a novel likelihood-based method for calculating error rates from family data, given known allele frequencies. I apply this to an example dataset, demonstrating a low genotyping error rate in genotyping data from a personal genomics company.
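As a rough illustration of how Mendelian inheritance constrains error rates, the sketch below computes the likelihood of an observed trio at one biallelic SNP and picks the error rate by grid search. It is not the paper's method: the Hardy-Weinberg prior on parental genotypes, the symmetric error model (a miscalled genotype is equally likely to be either wrong value), the search grid, and all function names are assumptions made for this example.

```python
import math
from itertools import product

# Genotypes are coded as the number of alternate alleles: 0, 1 or 2.

def hwe_prior(p):
    """Genotype probabilities under Hardy-Weinberg for alt-allele frequency p."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]

def child_probs(g_father, g_mother):
    """Mendelian probabilities of the child's genotype given the true parental genotypes."""
    tf, tm = g_father / 2.0, g_mother / 2.0  # chance each parent transmits the alt allele
    return [(1 - tf) * (1 - tm), tf * (1 - tm) + (1 - tf) * tm, tf * tm]

def obs_prob(observed, true, e):
    """Assumed error model: correct call with prob 1 - e, each wrong call with prob e / 2."""
    return 1.0 - e if observed == true else e / 2.0

def trio_likelihood(observed, p, e):
    """P(observed father, mother, child genotypes), summed over all true genotypes."""
    o_f, o_m, o_c = observed
    prior = hwe_prior(p)
    total = 0.0
    for g_f, g_m in product(range(3), repeat=2):
        cp = child_probs(g_f, g_m)
        for g_c in range(3):
            total += (prior[g_f] * prior[g_m] * cp[g_c]
                      * obs_prob(o_f, g_f, e)
                      * obs_prob(o_m, g_m, e)
                      * obs_prob(o_c, g_c, e))
    return total

def estimate_error_rate(trios, freqs, grid=None):
    """Grid-search maximum likelihood estimate of e; assumes 0 < p < 1 at every SNP."""
    if grid is None:
        grid = [i / 1000 for i in range(1, 500)]
    def loglik(e):
        return sum(math.log(trio_likelihood(t, p, e)) for t, p in zip(trios, freqs))
    return max(grid, key=loglik)
```

With an error rate of zero, a Mendelian-inconsistent trio such as `(0, 0, 2)` has likelihood zero, so any such observation pushes the estimate above the bottom of the grid.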
Reverse engineering a gene network using an asynchronous parallel evolution strategy
Luke Jostins, Johannes Jaeger
BMC Systems Biology , 2010, DOI: 10.1186/1752-0509-4-17
Abstract: Here, we present synchronous and asynchronous versions of the piES algorithm, and apply them to a real reverse engineering problem: inferring parameters in the gap gene network. We find that the asynchronous piES exhibits very little communication overhead, and shows significant speed-up for up to 50 nodes: the piES running on 50 nodes is nearly 10 times faster than the best serial algorithm. We compare the asynchronous piES to pLSA on the same test problem, measuring the time required to reach particular levels of residual error, and show that it converges much faster than pLSA across all optimisation conditions tested. Our results demonstrate that the piES is consistently faster and more reliable than the pLSA algorithm on this problem, and scales better with increasing numbers of nodes. In addition, the piES is especially well suited to further improvements and adaptations. Firstly, the algorithm's fast initial descent speed and high reliability make it a good candidate for use in a global/local search hybrid algorithm. Secondly, it has the potential to be used as part of a hierarchical evolutionary algorithm, which takes advantage of modern multi-core computing architectures. The driving aim of systems biology is to understand complex regulatory systems. A powerful tool for this is reverse engineering, a top-down approach in which we use data to infer parameter values for a model of an entire system. This differs from the traditional bottom-up approach of building up the larger picture through individually measured simple interactions. Many methods have been developed for reverse engineering of gene regulatory networks, most of which are based on expression data from gene expression microarrays. However, most of these approaches do not consider temporal or spatial aspects of gene expression. 
Examples of this are methods that infer regulatory modules from expression data across different experimental conditions [1,2], or methods based on (sta
Latent variable model selection for Gaussian conditional random fields
Benjamin Frot,Luke Jostins,Gil McVean
Mathematics , 2015,
Abstract: We consider the problem of learning a conditional Gaussian graphical model in the presence of latent variables. Building on recent advances in this field, we suggest a method that decomposes the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this estimator and show that it is well-behaved in the high-dimensional regime as well as "sparsistent" (i.e. capable of recovering the graph structure). We then describe a proximal gradient algorithm which is able to fit the model to thousands of variables. Through extensive simulations, we illustrate the conditions required for identifiability and show that there is a wide range of situations in which this model performs significantly better than its counterparts. We also show how this problem is relevant to some of the challenges faced by instrumental variable methods.
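For readers unfamiliar with proximal gradient methods, the two proximal operators usually paired with a sparse-plus-low-rank decomposition are written out below. This is a generic sketch that assumes an entrywise l1 penalty on the sparse part S and a nuclear-norm penalty on the low-rank part L; the abstract does not state the exact penalties or constraints used in the paper, and the symbols S, L, \lambda and \gamma are chosen here for illustration.

```latex
% Soft-thresholding: proximal operator of \lambda \|S\|_1, applied entrywise
\[
  \bigl[\operatorname{prox}_{\lambda \|\cdot\|_1}(S)\bigr]_{ij}
    = \operatorname{sign}(S_{ij}) \, \max\bigl(|S_{ij}| - \lambda,\, 0\bigr)
\]

% Singular value thresholding: proximal operator of \gamma \|L\|_*,
% where L = U \operatorname{diag}(\sigma_1, \dots, \sigma_r) V^{\top} is an SVD of L
\[
  \operatorname{prox}_{\gamma \|\cdot\|_*}(L)
    = U \operatorname{diag}\bigl(\max(\sigma_k - \gamma,\, 0)\bigr) V^{\top}
\]
```

Each proximal gradient iteration takes a gradient step on the smooth part of the objective and then applies these operators to the corresponding blocks, which shrinks small entries of S to zero and small singular values of L to zero.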
Using Genetic Prediction from Known Complex Disease Loci to Guide the Design of Next-Generation Sequencing Experiments
Luke Jostins, Adam P. Levine, Jeffrey C. Barrett
PLOS ONE , 2013, DOI: 10.1371/journal.pone.0076328
Abstract: A central focus of complex disease genetics after genome-wide association studies (GWAS) is to identify low frequency and rare risk variants, which may account for an important fraction of disease heritability unexplained by GWAS. A profusion of studies using next-generation sequencing are seeking such risk alleles. We describe how already-known complex trait loci (largely from GWAS) can be used to guide the design of these new studies by selecting cases, controls, or families who are most likely to harbor undiscovered risk alleles. We show that genetic risk prediction can select unrelated cases from large cohorts who are enriched for unknown risk factors, or multiply-affected families that are more likely to harbor high-penetrance risk alleles. We derive the frequency of an undiscovered risk allele in selected cases and controls, and show how this relates to the variance explained by the risk score, the disease prevalence and the population frequency of the risk allele. We also describe a new method for informing the design of sequencing studies using genetic risk prediction in large partially-genotyped families using an extension of the Inside-Outside algorithm for inference on trees. We explore several study design scenarios using both simulated and real data, and show that in many cases genetic risk prediction can provide significant increases in power to detect low-frequency and rare risk alleles. The same approach can also be used to aid discovery of non-genetic risk factors, suggesting possible future utility of genetic risk prediction in conventional epidemiology. Software implementing the methods in this paper is available in the R package Mangrove.
YFitter: Maximum likelihood assignment of Y chromosome haplogroups from low-coverage sequence data
Luke Jostins,Yali Xu,Shane McCarthy,Qasim Ayub,Richard Durbin,Jeff Barrett,Chris Tyler-Smith
Quantitative Biology , 2014,
Abstract: Low-coverage short-read resequencing experiments have the potential to expand our understanding of Y chromosome haplogroups. However, the uncertainty associated with these experiments means that haplogroups must be assigned probabilistically to avoid false inferences. We propose an efficient dynamic programming algorithm that can assign haplogroups by maximum likelihood, and represent the uncertainty in assignment. We apply this to both genotype and low-coverage sequencing data, and show that it can assign haplogroups accurately and with high resolution. The method is implemented as the program YFitter, which can be downloaded from http://sourceforge.net/projects/yfitter/
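One way to picture maximum likelihood assignment on a haplogroup tree is the sketch below. It is not YFitter's actual algorithm or interface: it assumes each non-root node has a precomputed log-likelihood of the reads at the markers on its incoming branch under the derived and the ancestral state, and that a sample in haplogroup h is derived on every branch from the root to h and ancestral everywhere else. All names are hypothetical.

```python
def score_haplogroups(children, root, ll_derived, ll_ancestral):
    """Return (best_node, scores) where scores[node] is the log-likelihood of that haplogroup.

    children:     dict mapping each node to a list of its child nodes
    ll_derived:   dict mapping each non-root node to the log-likelihood of the data
                  at its branch markers if the sample carries the derived alleles
    ll_ancestral: same, if the sample carries the ancestral alleles
    """
    # Score of the root: every branch in the tree is in the ancestral state.
    base = sum(ll_ancestral.values())
    scores = {}
    stack = [(root, base)]
    while stack:
        node, score = stack.pop()
        scores[node] = score
        for child in children.get(node, []):
            # Moving down one branch flips only that branch from ancestral to derived,
            # so each node's score is its parent's score plus one difference term.
            gain = ll_derived[child] - ll_ancestral[child]
            stack.append((child, score + gain))
    best = max(scores, key=scores.get)
    return best, scores
```

Because each node reuses its parent's score, the whole tree is scored in a single pass, and the spread of `scores` around the maximum gives a simple view of how uncertain the assignment is.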
A Birth Cohort Analysis of First Employment Spells
Luke Ignaczak
Applied Mathematics (AM) , 2014, DOI: 10.4236/am.2014.511159

Abstract: The duration of the first employment spell of workers across five different birth cohorts is investigated using pooled data from the 15th and 20th cycles of the Canadian General Social Survey. These retrospective surveys contain information that spans well over the last half of the 20th century. The data are benchmarked against the Labour Force Survey to emphasize the distinct nature of employment spells vis-a-vis job tenures as commonly used in the literature. Overall, this paper contributes to the debate on employment stability by analyzing the differences between job and employment durations and showing that successive cohorts of workers have had increasingly shorter first employment durations. The analysis finds cohort effects that play a significant role in explaining declining employment tenure. The cohort effects can be seen as a proxy for a number of socio-economic factors that affect the hazard of separation from employment. Separate analysis is completed for men and women by birth cohort. This pattern of declining tenure has occurred for both men and women, but the decline has been far more prominent for men. For men, macroeconomic factors affect the hazard more strongly in more recent cohorts, which is consistent with recessionary periods generating decreasing employment stability across cohorts. For women, cohort effects are consistent with the increasing generosity of maternity leave provisions through Unemployment Insurance.

Imputation-Based Meta-Analysis of Severe Malaria in Three African Populations
Gavin Band,Quang Si Le,Luke Jostins,Matti Pirinen,Katja Kivinen,Muminatou Jallow,Fatoumatta Sisay-Joof,Kalifa Bojang,Margaret Pinder,Giorgio Sirugo,David J. Conway,Vysaul Nyirongo,David Kachala,Malcolm Molyneux,Terrie Taylor,Carolyne Ndila,Norbert Peshu,Kevin Marsh,Thomas N. Williams,Daniel Alcock,Robert Andrews,Sarah Edkins,Emma Gray,Christina Hubbart,Anna Jeffreys,Kate Rowlands,Kathrin Schuldt,Taane G. Clark,Kerrin S. Small,Yik Ying Teo,Dominic P. Kwiatkowski,Kirk A. Rockett,Jeffrey C. Barrett,Chris C. A. Spencer,Malaria Genomic Epidemiological Network
PLOS Genetics , 2013, DOI: 10.1371/journal.pgen.1003509
Abstract: Combining data from genome-wide association studies (GWAS) conducted at different locations, using genotype imputation and fixed-effects meta-analysis, has been a powerful approach for dissecting complex disease genetics in populations of European ancestry. Here we investigate the feasibility of applying the same approach in Africa, where genetic diversity, both within and between populations, is far more extensive. We analyse genome-wide data from approximately 5,000 individuals with severe malaria and 7,000 population controls from three different locations in Africa. Our results show that the standard approach is well powered to detect known malaria susceptibility loci when sample sizes are large, and that modern methods for association analysis can control the potential confounding effects of population structure. We show that the pattern of association around the haemoglobin S allele differs substantially across populations due to differences in haplotype structure. Motivated by these observations, we consider new approaches to association analysis that might prove valuable for multicentre GWAS in Africa: we relax the assumptions of SNP-based fixed-effects analysis; we apply Bayesian approaches to allow for heterogeneity in the effect of an allele on risk across studies; and we introduce a region-based test to allow for heterogeneity in the location of causal alleles.
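The fixed-effects meta-analysis mentioned in the abstract is commonly computed by inverse-variance weighting, which can be sketched as follows. This is the textbook estimator rather than the paper's pipeline; the function name and the example numbers are made up for illustration.

```python
import math

def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects estimate for one SNP across studies.

    betas:           per-study effect estimates (for example, log odds ratios)
    standard_errors: per-study standard errors of those estimates
    Returns the combined effect, its standard error, and the z-score.
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se, beta / se

# Hypothetical effect estimates from three study sites.
beta, se, z = fixed_effects_meta([0.20, 0.35, 0.10], [0.10, 0.15, 0.12])
```

The estimator assumes that every study measures the same underlying effect. That assumption is what the heterogeneity-aware approaches described in the abstract are designed to relax.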
Correlated Individual Differences and Choice Prediction
Luke Lindsay
Games , 2011, DOI: 10.3390/g2010016
Abstract: This note briefly summarizes the consequences of adding correlated individual differences to the best baseline model in the Games competition, I-SAW. I find evidence that the traits of an individual are correlated, but refining I-SAW to capture these correlations does not significantly improve the model’s accuracy when predicting average behavior.
L'Intermédialité au Grand Siècle (?) ou la pratique des intermèdes sous le règne de Louis XIV
Luke Arnason
Synergies Canada , 2012,
Abstract: The concept of intermediality seems at odds with our view of classical aesthetics, which we generally consider to be characterized by unity rather than multiplicity. Yet the Grand Siècle invented several intermedial genres, including Corneille's machine plays, Molière's comédie-ballets, and Lully's operas. To elucidate this apparent paradox, I propose to examine the use of the intermède, a paratheatrical ornament that could make any play intermedial. By studying the etymology of the term and the history of its practice, I will attempt to establish its true dramatic function. It will then be possible to develop a typology of its use, highlighting the way in which intermediality could influence the meaning, or at least the reception, of a play. This study should demonstrate that theatrical practice in the Grand Siècle was freer and more varied than the printed plays and theoretical texts lead us to believe. It will also show that the corpus of plays with intermèdes is not confined to the repertoire of Molière's comédie-ballets.
Marcabru in Motion: ‘Dire vos vuoill ses duptanssa’ in chansonniers A and C, and in Matfre Ermengaud’s Breviari d’amor
Luke Sunderland
Glossator : Practice and Theory of the Commentary , 2011,
Abstract: The poems by Marcabru (fl. 1130-50) and Matfre Ermengaud (d. 1322) illustrate the intertextual nature of the Occitan tradition. This paper compares three versions of Marcabru's text: the version in chansonnier A, the longer version in C, and Matfre's citation. Whereas version A appears "broadly biographical or personal," version C, almost twice as long, is "less of a poem about Marcabru’s life, more of an attempt to define love in all its attractions and horrors." Finally, Matfre's Breviari uses the poem first to express an opinion and then to testify against itself. This epistemological meditation is central to the preoccupations of both Matfre and Marcabru.
