Search Results: 1 - 10 of 20610 matches for "James McInerney"
All listed articles are free for downloading (OA Articles)
Horizontal gene transfer drives extreme physiological change in Haloarchaea
Chris Creevey, James McInerney
Quantitative Biology, 2012
Abstract: The haloarchaea are aerobic, heterotrophic, photophosphorylating prokaryotes, whose supposed closest relatives and ancestors, the methanogens, are CO2-reducing, anaerobic chemolithotrophs. Using two available haloarchaeal genomes, we first confirmed the methanogenic ancestry of the group and then investigated the individual haloarchaeal genes whose phylogenetic signal conflicts with this relationship. We found that almost half of the genes about which we can make strong statements have bacterial ancestry and are likely the result of multiple horizontal transfer events. Furthermore, their functions relate specifically to the phenotypic changes required for a chemolithotroph to become a heterotroph. If this phylogenetic relationship is correct, it implies that the development of the haloarchaeal phenotype was among the most extreme changes in cellular physiology fuelled by horizontal gene transfer.
Recurring cluster and operon assembly for Phenylacetate degradation genes
Fergal J Martin, James O McInerney
BMC Evolutionary Biology, 2009, DOI: 10.1186/1471-2148-9-36
Abstract: We have selected an exemplar well-characterised biochemical pathway, the phenylacetate degradation pathway, and we show that its complex history is only compatible with a model where a selective advantage accrues from moving genes closer together. This selective pressure is likely to be reasonably weak and only twice in our dataset of 102 genomes do we see independent formation of a complete cluster containing all the catabolic genes in the pathway. Additionally, de novo clustering of genes clearly occurs repeatedly, even though recombination should result in the random dispersal of such genes in their respective genomes. Interspecies gene transfer has frequently replaced in situ copies of genes resulting in clusters that have similar content but very different evolutionary histories. Our model for cluster formation in prokaryotes, therefore, consists of a two-stage selection process. The first stage is selection to move genes closer together, either because of macromolecular crowding, chromatin relaxation or transcriptional regulation pressure. This proximity opportunity sets up a separate selection for co-transcription. The aerobic degradation of phenylacetic acid in E. coli K12 occurs via a series of five reactions, involving eleven catabolic paa genes [1], two of which are distant paralogs, with the rest showing no sequence homology (figure 1). The first step of the pathway is catalysed by the product of the paaK gene, a CoA ligase that catalyses the conversion of phenylacetate into phenylacetyl-CoA. The second step involves a ring-oxygenase complex formed from the gene products of paaABCDE. This heteromer converts phenylacetyl-CoA into 2'-OH-phenylacetyl-CoA. The third step, where 2'-OH-phenylacetyl-CoA is converted to 3-hydroxyadipyl-CoA, is jointly catalysed by paaJ, paaG and paaZ. The fourth step sees the conversion of 3-hydroxyadipyl-CoA by paaF and paaH to β-ketoadipyl-CoA. The final step is catalysed by paaJ, which converts β-ketoadipyl-CoA to succinyl-CoA,
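The step-by-step pathway described in the abstract can be restated compactly as a data structure. The sketch below only re-encodes the gene-to-step assignments and intermediates exactly as given above; it is illustrative, not an annotated pathway database entry.

```python
# Illustrative restatement of the five-step phenylacetate (paa) degradation
# pathway described in the abstract (E. coli K12). Gene assignments and
# intermediates are taken verbatim from the text above.
PAA_PATHWAY = [
    {"step": 1, "genes": ["paaK"],
     "reaction": "phenylacetate -> phenylacetyl-CoA (CoA ligase)"},
    {"step": 2, "genes": ["paaA", "paaB", "paaC", "paaD", "paaE"],
     "reaction": "phenylacetyl-CoA -> 2'-OH-phenylacetyl-CoA (ring-oxygenase complex)"},
    {"step": 3, "genes": ["paaJ", "paaG", "paaZ"],
     "reaction": "2'-OH-phenylacetyl-CoA -> 3-hydroxyadipyl-CoA"},
    {"step": 4, "genes": ["paaF", "paaH"],
     "reaction": "3-hydroxyadipyl-CoA -> beta-ketoadipyl-CoA"},
    {"step": 5, "genes": ["paaJ"],
     "reaction": "beta-ketoadipyl-CoA -> succinyl-CoA"},
]

if __name__ == "__main__":
    for step in PAA_PATHWAY:
        print(step["step"], "+".join(step["genes"]), ":", step["reaction"])
```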
Learning Periodic Human Behaviour Models from Sparse Data for Crowdsourcing Aid Delivery in Developing Countries
James McInerney, Alex Rogers, Nicholas R. Jennings
Computer Science, 2013
Abstract: In many developing countries, half the population lives in rural locations, where access to essentials such as school materials, mosquito nets, and medical supplies is restricted. We propose an alternative method of distribution (to standard road delivery) in which the existing mobility habits of a local population are leveraged to deliver aid, which raises two technical challenges in the areas optimisation and learning. For optimisation, a standard Markov decision process applied to this problem is intractable, so we provide an exact formulation that takes advantage of the periodicities in human location behaviour. To learn such behaviour models from sparse data (i.e., cell tower observations), we develop a Bayesian model of human mobility. Using real cell tower data of the mobility behaviour of 50,000 individuals in Ivory Coast, we find that our model outperforms the state of the art approaches in mobility prediction by at least 25% (in held-out data likelihood). Furthermore, when incorporating mobility prediction with our MDP approach, we find a 81.3% reduction in total delivery time versus routine planning that minimises just the number of participants in the solution path.
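The abstract does not spell out the mobility model; the sketch below is a minimal stand-in showing how a periodic (hour-of-week) location model with Dirichlet smoothing could be fit to sparse cell-tower observations and used for prediction. The multinomial formulation, constants, and variable names are assumptions for illustration, not the paper's actual Bayesian model.

```python
import numpy as np

# Minimal sketch (not the paper's model): each individual gets a periodic
# profile over cell towers, one multinomial per hour-of-week slot, smoothed
# with a symmetric Dirichlet prior to cope with sparse observations.
N_TOWERS = 200          # assumed number of cell towers
N_SLOTS = 7 * 24        # one slot per hour of the week (the periodicity)
ALPHA = 0.5             # Dirichlet smoothing pseudo-count (assumed)

def fit_profile(observations):
    """observations: list of (hour_of_week, tower_id) sightings."""
    counts = np.zeros((N_SLOTS, N_TOWERS))
    for slot, tower in observations:
        counts[slot, tower] += 1
    # Posterior mean of a Dirichlet-multinomial per time slot.
    return (counts + ALPHA) / (counts.sum(axis=1, keepdims=True) + ALPHA * N_TOWERS)

def predict_location(profile, hour_of_week):
    """Most likely tower for this individual at the given hour of the week."""
    return int(np.argmax(profile[hour_of_week]))

# Toy usage: someone seen at tower 3 on weekday mornings, tower 7 in the evening.
obs = [(9, 3), (33, 3), (57, 3), (20, 7), (44, 7)]
profile = fit_profile(obs)
print(predict_location(profile, 9), predict_location(profile, 20))
```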
The Population Posterior and Bayesian Inference on Streams
James McInerney, Rajesh Ranganath, David M. Blei
Statistics, 2015
Abstract: Many modern data analysis problems involve inferences from streaming data. However, streaming data is not easily amenable to the standard probabilistic modeling approaches, which assume that we condition on finite data. We develop population variational Bayes, a new approach for using Bayesian modeling to analyze streams of data. It approximates a new type of distribution, the population posterior, which combines the notion of a population distribution of the data with Bayesian inference in a probabilistic model. We study our method with latent Dirichlet allocation and Dirichlet process mixtures on several large-scale data sets.
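As a concrete illustration of the idea, the sketch below applies a population-posterior-style streaming update to the simplest possible conjugate model (a Gaussian mean with known variance): each minibatch from the stream is rescaled to a fixed population size before a stochastic update of the approximate posterior's natural parameters. The model choice, hyperparameters, and step-size schedule are assumptions for illustration; the paper develops the method for latent Dirichlet allocation and Dirichlet process mixtures.

```python
import numpy as np

# Minimal sketch of a population-posterior-style streaming update for a
# conjugate Gaussian-mean model (known observation variance). Each minibatch
# is rescaled to a fixed "population size" ALPHA, and the natural parameters
# of the approximate posterior are updated with a decaying step size.
ALPHA = 10_000           # assumed population size hyperparameter
OBS_VAR = 1.0            # known observation variance
PRIOR_MEAN, PRIOR_VAR = 0.0, 10.0

# Natural parameters of a Gaussian: eta1 = mu / var, eta2 = -1 / (2 * var)
eta1 = PRIOR_MEAN / PRIOR_VAR
eta2 = -0.5 / PRIOR_VAR

rng = np.random.default_rng(0)
for t in range(1, 201):
    minibatch = rng.normal(loc=2.5, scale=np.sqrt(OBS_VAR), size=32)  # the stream
    scale = ALPHA / len(minibatch)
    # Target natural parameters if this minibatch represented the whole population.
    target1 = PRIOR_MEAN / PRIOR_VAR + scale * minibatch.sum() / OBS_VAR
    target2 = -0.5 / PRIOR_VAR - scale * len(minibatch) / (2 * OBS_VAR)
    rho = (t + 10) ** -0.7                                            # Robbins-Monro step size
    eta1 = (1 - rho) * eta1 + rho * target1
    eta2 = (1 - rho) * eta2 + rho * target2

post_var = -0.5 / eta2
print("posterior mean ~", eta1 * post_var, "posterior variance ~", post_var)
```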
The tree of genomes: An empirical comparison of genome-phylogeny reconstruction methods
Angela McCann, James A Cotton, James O McInerney
BMC Evolutionary Biology, 2008, DOI: 10.1186/1471-2148-8-312
Abstract: We confirm a previous suggestion that this method has a systematic bias. We show that no two methods produce the same results, and most current methods of inferring genome phylogenies produce results that are significantly different from those of other methods. We conclude that genome phylogenies need to be interpreted differently, depending on the method used to construct them. Hundreds of genome sequencing projects have been completed [1], providing us with an abundant source of data to reconstruct phylogenetic relationships, but also with some novel problems in interpreting these data. The evolutionary history of any genome includes elements of gene duplication, gene loss, lineage sorting and horizontal transfer of genes, all of which have the ability to confound phylogeny reconstruction [2-4]. Against this background, a variety of genome-phylogeny methods have been developed. These vary in their approach, the input data they require and the interpretation of the result. However, to date, no study has been carried out that asks whether these methods are picking out fundamentally different signals or whether they are more or less finding the same tree. Current genome-level phylogeny methods can be split into two categories – sequence-based methods and gene-content methods. Analyses of sequence evolution predate gene-content methods simply because data for individual genes were available before data for completed genomes. Ubiquitously distributed ribosomal RNA (rRNA) genes have usually been used as surrogates for larger samples of individual genomes. These particular genes are popular for phylogenetic studies due to their plenitude, universally conserved structure and apparent resistance to horizontal gene transfer (HGT) [5]. In contrast, some methods are designed to include information from the evolutionary history of several individual genes. The supertree approach, for instance, involves the creation of individual trees from gene families and the amalgamation of these into one fi
New approaches for unravelling reassortment pathways
Victoria Svinti, James A Cotton, James O McInerney
BMC Evolutionary Biology, 2013, DOI: 10.1186/1471-2148-13-1
Abstract: Background: Every year the human population encounters epidemic outbreaks of influenza, and history reveals recurring pandemics that have had devastating consequences. The current work focuses on the development of a robust algorithm for detecting influenza strains that have a composite genomic architecture. These influenza subtypes can be generated through a reassortment process, whereby a virus can inherit gene segments from two different types of influenza particles during replication. Reassortant strains are often not immediately recognised by the adaptive immune system of the hosts and hence may be the source of pandemic outbreaks. Owing to their importance in public health and their infectious ability, it is essential to identify reassortant influenza strains in order to understand the evolution of this virus and describe reassortment pathways that may be biased towards particular viral segments. Phylogenetic methods have traditionally been used to identify reassortant viruses. In many studies up to now, the assumption has been that if two phylogenetic trees differ, it is because reassortment has caused them to be different. While phylogenetic incongruence may be caused by real differences in evolutionary history, it can also be the result of phylogenetic error. Therefore, we wish to develop a method for distinguishing between topological inconsistency that is due to confounding effects and topological inconsistency that is due to reassortment. Results: The current work describes the implementation of two approaches for robustly identifying reassortment events. The algorithms rest on the idea of significance of difference between phylogenetic trees or phylogenetic tree sets, and on subtree pruning and regrafting operations, which mimic the effect of reassortment on tree topologies. The first method is based on a maximum likelihood (ML) framework (MLreassort) and the second implements a Bayesian approach (Breassort) for reassortment detection. We focus on reassortment events that are found by both methods. We test both methods on a simulated dataset and on a small collection of real viral data isolated in Hong Kong in 1999. Conclusions: The nature of segmented viral genomes presents many challenges with respect to disease. The algorithms developed here can effectively identify reassortment events in small viral datasets and can be applied not only to influenza but also to other segmented viruses. Owing to the computational demands of comparing tree topologies, further development in this area is necessary to allow their application to larger datasets.
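To make the notion of topological inconsistency concrete, the sketch below compares two segment trees (written as nested tuples) by their clades, a simple Robinson-Foulds-style incongruence count. This only illustrates detecting conflicting tree topologies; it is not the MLreassort or Breassort tests described above, which additionally assess whether the conflict is statistically significant.

```python
# Minimal sketch: count clades present in one segment tree but not the other
# (a Robinson-Foulds-style symmetric difference on rooted trees). Trees are
# nested tuples of strain labels; this illustrates topological incongruence
# only, not the significance tests in the abstract.
def clades(tree):
    """Return the set of non-trivial clades (frozensets of leaf labels)."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf
            return frozenset([node])
        leaves = frozenset()
        for child in node:
            leaves |= walk(child)
        found.add(leaves)
        return leaves

    walk(tree)
    return {c for c in found if len(c) > 1}

def incongruence(tree_a, tree_b):
    """Number of clades found in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))

# Toy example: the HA-segment tree groups strains A and B, while the
# NA-segment tree groups B with C, a pattern consistent with reassortment.
ha_tree = ((("A", "B"), "C"), "D")
na_tree = ((("B", "C"), "A"), "D")
print(incongruence(ha_tree, na_tree))   # > 0, i.e. the topologies conflict
```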
Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models
Shaun Mahony, James O McInerney, Terry J Smith, Aaron Golden
BMC Bioinformatics, 2004, DOI: 10.1186/1471-2105-5-23
Abstract: This work explores a new approach to gene prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. While its raw accuracy rate can be lower than that of other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods. Computational gene prediction methods have yet to achieve perfect accuracy, even in the relatively simple prokaryotic genomes. Problems in gene prediction centre on the fact that many protein families remain uncharacterised. As a result, it seems that only approximately half of an organism's genes can be confidently predicted on the basis of homology to other known genes [1-3], so ab initio prediction methods are usually employed to identify many protein-coding regions of DNA. Currently, the most popular prokaryotic gene-prediction methods, such as GeneMark.hmm [4] and Glimmer2 [5], are based on probabilistic Markov models that aim to predict each base of a DNA sequence using a number of preceding bases in the sequence. These methods are undoubtedly very successful, with published sensitivity rates between 90% and 99% for most prokaryotic genomes. However, as the sensitivity rates of the methods rise, specificity generally tends to fall, and while the application of sophisticated post-processing rules can correct many false-positive predictions, no method has yet achieved 100% accuracy. This is especially the case in the more complex eukaryotic gene-finding problem, where less than 80% of exons in anonymous genomic sequences are correctly predicted by current methods [2,6-8]. For the foreseeable future it does not seem that the ex
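As an illustration of the codon-usage signal RescueNet relies on, the sketch below computes relative synonymous codon usage (RSCU) for an open reading frame: the observed count of each codon divided by the count expected if all synonymous codons for that amino acid were used equally. This is a generic RSCU calculation, not RescueNet's SOM-based classifier.

```python
from collections import Counter

# Standard genetic code (sense codons only; stop codons omitted).
CODON_TO_AA = {
    'TTT': 'F', 'TTC': 'F', 'TTA': 'L', 'TTG': 'L', 'CTT': 'L', 'CTC': 'L',
    'CTA': 'L', 'CTG': 'L', 'ATT': 'I', 'ATC': 'I', 'ATA': 'I', 'ATG': 'M',
    'GTT': 'V', 'GTC': 'V', 'GTA': 'V', 'GTG': 'V', 'TCT': 'S', 'TCC': 'S',
    'TCA': 'S', 'TCG': 'S', 'AGT': 'S', 'AGC': 'S', 'CCT': 'P', 'CCC': 'P',
    'CCA': 'P', 'CCG': 'P', 'ACT': 'T', 'ACC': 'T', 'ACA': 'T', 'ACG': 'T',
    'GCT': 'A', 'GCC': 'A', 'GCA': 'A', 'GCG': 'A', 'TAT': 'Y', 'TAC': 'Y',
    'CAT': 'H', 'CAC': 'H', 'CAA': 'Q', 'CAG': 'Q', 'AAT': 'N', 'AAC': 'N',
    'AAA': 'K', 'AAG': 'K', 'GAT': 'D', 'GAC': 'D', 'GAA': 'E', 'GAG': 'E',
    'TGT': 'C', 'TGC': 'C', 'TGG': 'W', 'CGT': 'R', 'CGC': 'R', 'CGA': 'R',
    'CGG': 'R', 'AGA': 'R', 'AGG': 'R', 'GGT': 'G', 'GGC': 'G', 'GGA': 'G',
    'GGG': 'G',
}

def rscu(orf):
    """Relative synonymous codon usage: observed codon count divided by the
    mean count over all synonymous codons for the same amino acid."""
    codons = [orf[i:i + 3].upper() for i in range(0, len(orf) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TO_AA)
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TO_AA[codon]] += n
    values = {}
    for codon, aa in CODON_TO_AA.items():
        synonyms = sum(1 for a in CODON_TO_AA.values() if a == aa)
        if aa_totals[aa]:
            values[codon] = counts[codon] * synonyms / aa_totals[aa]
    return values

# Toy usage: a short hypothetical coding sequence.
print(rscu("ATGCTGCTGCTTAAACGTCGT"))
```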
Modeling User Exposure in Recommendation
Dawen Liang, Laurent Charlin, James McInerney, David M. Blei
Computer Science, 2015
Abstract: Collaborative filtering analyzes user preferences for items (e.g., books, movies, restaurants, academic papers) by exploiting the similarity patterns across users. In implicit feedback settings, all the items, including the ones that a user did not consume, are taken into consideration. But this assumption does not accord with the common sense understanding that users have a limited scope and awareness of items. For example, a user might not have heard of a certain paper, or might live too far away from a restaurant to experience it. In the language of causal analysis, the assignment mechanism (i.e., the items that a user is exposed to) is a latent variable that may change for various user/item combinations. In this paper, we propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. The exposure is modeled as a latent variable and the model infers its value from data. In doing so, we recover one of the most successful state-of-the-art approaches as a special case of our model, and provide a plug-in method for conditioning exposure on various forms of exposure covariates (e.g., topics in text, venue locations). We show that our scalable inference algorithm outperforms existing benchmarks in four different domains both with and without exposure covariates.
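To make the latent-exposure idea concrete, the sketch below computes, for one user-item pair with no observed click, the posterior probability that the user was nevertheless exposed to the item, under a simple generative assumption: exposure is Bernoulli, and the click given exposure follows a Gaussian matrix-factorization likelihood. The prior, likelihood, and parameter values are illustrative assumptions, not the paper's fitted model.

```python
import numpy as np
from scipy.stats import norm

# Minimal sketch of exposure-aware collaborative filtering for one (user, item)
# pair with an unconsumed item (y = 0). Exposure a is a latent Bernoulli
# variable; given exposure, the click follows a Gaussian factorization
# likelihood. All numbers below are illustrative assumptions.
mu_exposure = 0.3                     # prior probability the user saw the item
theta_u = np.array([0.8, -0.2, 0.5])  # user latent factors (assumed)
beta_i = np.array([0.6, 0.1, 0.4])    # item latent factors (assumed)
noise_std = 1.0                       # likelihood noise (assumed)

y = 0.0                               # observed: the user did not consume the item
lik_exposed = norm.pdf(y, loc=theta_u @ beta_i, scale=noise_std)
lik_unexposed = 1.0 if y == 0.0 else 0.0   # an unexposed user cannot consume

# Posterior probability that the user was exposed despite not consuming.
p_exposed = mu_exposure * lik_exposed / (
    mu_exposure * lik_exposed + (1 - mu_exposure) * lik_unexposed)
print(round(p_exposed, 3))
```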
Dynamic Poisson Factorization
Laurent Charlin, Rajesh Ranganath, James McInerney, David M. Blei
Computer Science, 2015, DOI: 10.1145/2792838.2800174
Abstract: Models for recommender systems use latent factors to explain the preferences and behaviors of users with respect to a set of items (e.g., movies, books, academic papers). Typically, the latent factors are assumed to be static and, given these factors, the observed preferences and behaviors of users are assumed to be generated without order. These assumptions limit the explorative and predictive capabilities of such models, since users' interests and item popularity may evolve over time. To address this, we propose dPF, a dynamic matrix factorization model based on the recent Poisson factorization model for recommendations. dPF models the time-evolving latent factors with a Kalman filter and the actions with Poisson distributions. We derive a scalable variational inference algorithm to infer the latent factors. Finally, we demonstrate dPF on 10 years of user click data from arXiv.org, one of the largest repositories of scientific papers and a formidable source of information about the behavior of scientists. Empirically, we show performance improvements over both static and more recently proposed dynamic recommendation models. We also provide a thorough exploration of the inferred posteriors over the latent variables.
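The generative structure described above can be sketched as follows: per-time-slice latent factors drift as a Gaussian random walk (the Kalman-filter dynamics), and counts are drawn from a Poisson whose rate comes from the user-item factor inner product. The dimensions, drift scale, and the softplus link used to keep rates positive are illustrative assumptions rather than the paper's exact specification.

```python
import numpy as np

# Minimal generative sketch of a dynamic Poisson factorization model:
# user/item factors follow a Gaussian random walk over time, and click counts
# are Poisson with rate given by a positive transform of their inner product.
rng = np.random.default_rng(1)
n_users, n_items, n_factors, n_times = 5, 8, 3, 4
drift = 0.1                                      # random-walk step size (assumed)

theta = rng.normal(size=(n_users, n_factors))    # user factors at t = 0
beta = rng.normal(size=(n_items, n_factors))     # item factors at t = 0

def softplus(x):
    return np.log1p(np.exp(x))                   # keeps Poisson rates positive

counts = np.zeros((n_times, n_users, n_items), dtype=int)
for t in range(n_times):
    if t > 0:                                    # Kalman-style latent dynamics
        theta = theta + drift * rng.normal(size=theta.shape)
        beta = beta + drift * rng.normal(size=beta.shape)
    rates = softplus(theta @ beta.T)
    counts[t] = rng.poisson(rates)

print(counts.sum(axis=(1, 2)))                   # total activity per time slice
```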
Multicanonical Stochastic Variational Inference
Stephan Mandt, James McInerney, Farhan Abrol, Rajesh Ranganath, David Blei
Computer Science, 2014
Abstract: Stochastic variational inference (SVI) enables approximate posterior inference with large data sets for otherwise intractable models, but like all variational inference algorithms it suffers from local optima. Deterministic annealing, which we formulate here for the generic class of conditionally conjugate exponential family models, uses a temperature parameter that deterministically deforms the objective, and reduces this parameter over the course of the optimization to recover the original variational set-up. A well-known drawback of annealing approaches is the choice of the annealing schedule. We therefore introduce multicanonical variational inference (MVI), a variational algorithm that operates at several annealing temperatures simultaneously. This algorithm gives us adaptive annealing schedules. Compared to the traditional SVI algorithm, both approaches find improved predictive likelihoods on held-out data, with MVI being close to the best-tuned annealing schedule.
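The tempering idea can be illustrated in the simplest latent-variable setting: in a deterministically annealed E-step for a Gaussian mixture, responsibilities are raised to the power 1/T and renormalised, so high temperatures flatten the objective and T = 1 recovers the standard update. The mixture model and schedule below are illustrative assumptions and use plain EM rather than SVI; the paper formulates annealing (and its multicanonical extension) for stochastic variational inference in conditionally conjugate exponential-family models.

```python
import numpy as np
from scipy.stats import norm

# Minimal sketch of deterministic annealing in the E-step of a two-component
# Gaussian mixture: responsibilities are tempered by 1/T and T is reduced over
# iterations. MVI, by contrast, would run several temperatures in parallel.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(2, 1, 200)])

means, weights = np.array([-0.1, 0.1]), np.array([0.5, 0.5])
for T in np.linspace(5.0, 1.0, 30):                           # annealing schedule
    # Tempered E-step: log joint scaled by 1/T flattens the responsibilities.
    log_resp = (np.log(weights) + norm.logpdf(x[:, None], means, 1.0)) / T
    resp = np.exp(log_resp - log_resp.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: standard weighted updates.
    weights = resp.mean(axis=0)
    means = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)

print(np.round(means, 2), np.round(weights, 2))
```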