Safe Reasoning Over Ontologies
Genady Grabarnik, Aaron Kershenbaum
Computer Science , 2009
Abstract: As ontologies proliferate and automatic reasoners become more powerful, the problem of protecting sensitive information becomes more serious. In particular, as facts can be inferred from other facts, it becomes increasingly likely that information included in an ontology, while not itself deemed sensitive, may be able to be used to infer other sensitive information. We first consider the problem of testing an ontology for safeness defined as its not being able to be used to derive any sensitive facts using a given collection of inference rules. We then consider the problem of optimizing an ontology based on the criterion of making as much useful information as possible available without revealing any sensitive facts.
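The safeness test described in this abstract can be illustrated with a small forward-chaining sketch. The abstract does not specify the rule language or the algorithm, so everything here is an assumption made for illustration: facts are plain strings, each inference rule is a hypothetical (premises, conclusion) pair, and the names `closure` and `is_safe` are invented for this example.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Assumed rule shape: if every premise is known, the conclusion can be inferred.
Rule = Tuple[FrozenSet[str], str]


def closure(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Return every fact derivable from `facts` by repeatedly applying `rules`."""
    derived = set(facts)
    changed = True
    while changed:  # stop once a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and premises <= derived:
                derived.add(conclusion)
                changed = True
    return derived


def is_safe(facts: Iterable[str], rules: List[Rule], sensitive: Set[str]) -> bool:
    """An ontology is treated as safe if no sensitive fact is derivable from it."""
    return closure(facts, rules).isdisjoint(sensitive)


# Hypothetical example: neither "a" nor "b" is sensitive, but together they leak "secret".
rules = [(frozenset({"a", "b"}), "c"), (frozenset({"c"}), "secret")]
sensitive = {"secret"}
unsafe_ontology = {"a", "b"}
safe_ontology = {"a"}
```

Under these assumptions, `is_safe(unsafe_ontology, rules, sensitive)` is false while `is_safe(safe_ontology, rules, sensitive)` is true, which mirrors the optimization question in the abstract: which facts can be published without making a sensitive fact derivable.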
Discovery of protein-protein interactions using a combination of linguistic, statistical and graphical information
James W Cooper, Aaron Kershenbaum
BMC Bioinformatics , 2005, DOI: 10.1186/1471-2105-6-143
Abstract: This paper reports a scalable method for the discovery of protein-protein interactions in Medline abstracts, using a combination of text analytics, statistical and graphical analysis, and a set of easily implemented rules. Applying these techniques to 12,300 abstracts, a precision of 0.61 and a recall of 0.97 were obtained (f = 0.74), and when allowing for two-hop and three-hop relations discovered by graphical analysis, the precision was 0.74 (f = 0.83). This combination of linguistic and statistical approaches appears to provide the highest precision and recall thus far reported in detecting protein-protein relations using text analytic approaches. Scientists in molecular biology find that a significant technique for studying protein function is through the study of protein-protein interactions. While the actual experimental study of such interactions remains the most important manner of obtaining these data, the number of protein-protein interactions reported in the literature is substantial and growing rapidly. There are a number of tabulations of these interactions, such as that provided by the Munich Institute for Protein Sequence (MIPS); these tabulations are of necessity incomplete. To address this problem, we have been developing a group of biology-specific computational annotators that work in conjunction with our group's text analytic software, for the discovery of protein-protein relations in text. In this paper, we undertook a study that utilizes a combination of computational linguistics, statistics and domain-specific rules to detect protein-protein interactions in a set of Medline abstracts. The system we describe here is particularly appealing because it can be used both to find known interactions and to find interactions not yet tabulated. According to the National Library of Medicine, Medline contains over 11 million abstracts, with about 40,000 being added each month.
Thus, having a scalable, robust system for protein interaction discovery provides a
Clique-Finding for Heterogeneity and Multidimensionality in Biomarker Epidemiology Research: The CHAMBER Algorithm
Richard A. Mushlin, Stephen Gallagher, Aaron Kershenbaum, Timothy R. Rebbeck
PLOS ONE , 2009, DOI: 10.1371/journal.pone.0004862
Abstract: Background Commonly-occurring disease etiology may involve complex combinations of genes and exposures resulting in etiologic heterogeneity. We present a computational algorithm that employs clique-finding for heterogeneity and multidimensionality in biomedical and epidemiological research (the “CHAMBER” algorithm). Methodology/Principal Findings This algorithm uses graph-building to (1) identify genetic variants that influence disease risk and (2) predict individuals at risk for disease based on inherited genotype. We use a set-covering algorithm to identify optimal cliques and a Boolean function that identifies etiologically heterogeneous groups of individuals. We evaluated this approach using simulated case-control genotype-disease associations involving two- and four-gene patterns. The CHAMBER algorithm correctly identified these simulated etiologies. We also used two population-based case-control studies of breast and endometrial cancer in African American and Caucasian women considering data on genotypes involved in steroid hormone metabolism. We identified novel patterns in both cancer sites that involved genes that sulfate or glucuronidate estrogens or catecholestrogens. These associations were consistent with the hypothesized biological functions of these genes. We also identified cliques representing the joint effect of multiple candidate genes in all groups, suggesting the existence of biologically plausible combinations of hormone metabolism genes in both breast and endometrial cancer in both races. Conclusions The CHAMBER algorithm may have utility in exploring the multifactorial etiology and etiologic heterogeneity in complex disease.
Modelling Transmission of Vector-Borne Pathogens Shows Complex Dynamics When Vector Feeding Sites Are Limited
Arik Kershenbaum, Lewi Stone, Richard S. Ostfeld, Leon Blaustein
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0036730
Abstract: The relationship between species richness and the prevalence of vector-borne disease has been widely studied with a range of outcomes. Increasing the number of host species for a pathogen may decrease infection prevalence (dilution effect), increase it (amplification), or have no effect. We derive a general model, and a specific implementation, which show that when the number of vector feeding sites on each host is limiting, the effects on pathogen dynamics of host population size are more complex than previously thought. The model examines vector-borne disease in the presence of different host species that are either competent or incompetent (i.e. that cannot transmit the pathogen to vectors) as reservoirs for the pathogen. With a single host species present, the basic reproduction ratio R0 is a non-monotonic function of the population size of host individuals (H), i.e. a value of H exists that maximises R0. Surprisingly, if H exceeds this value, a reduction in host population size may actually increase R0. Extending this model to a two-host species system, incompetent individuals from the second host species can alter this maximising value of H, which may reverse the effect on pathogen prevalence of host population reduction. We argue that when vector-feeding sites on hosts are limiting, the net effect of increasing host diversity might not be correctly predicted using simple frequency-dependent epidemiological models.
A global model of malaria climate sensitivity: comparing malaria response to historic climate data based on simulation and officially reported malaria incidence
Edlund Stefan, Davis Matthew, Douglas Judith V, Kershenbaum Arik
Malaria Journal , 2012, DOI: 10.1186/1475-2875-11-331
Abstract: Background The role of the Anopheles vector in malaria transmission and the effect of climate on Anopheles populations are well established. Models of the impact of climate change on the global malaria burden now have access to high-resolution climate data, but malaria surveillance data tends to be less precise, making model calibration problematic. Measurement of malaria response to fluctuations in climate variables offers a way to address these difficulties. Given the demonstrated sensitivity of malaria transmission to vector capacity, this work tests response functions to fluctuations in land surface temperature and precipitation. Methods This study of regional sensitivity of malaria incidence to year-to-year climate variations used an extended Macdonald Ross compartmental disease model (to compute malaria incidence) built on top of a global Anopheles vector capacity model (based on 10 years of satellite climate data). The predicted incidence was compared with estimates from the World Health Organization and the Malaria Atlas. The models and denominator data used are freely available through the Eclipse Foundation’s Spatiotemporal Epidemiological Modeller (STEM). Results Although the absolute scale factor relating reported malaria to absolute incidence is uncertain, there is a positive correlation between predicted and reported year-to-year variation in malaria burden with an averaged root mean square (RMS) error of 25% comparing normalized incidence across 86 countries. Based on this, the proposed measure of sensitivity of malaria to variations in climate variables indicates locations where malaria is most likely to increase or decrease in response to specific climate factors. Bootstrapping measures the increased uncertainty in predicting malaria sensitivity when reporting is restricted to national level and an annual basis. 
Results indicate a potential 20x improvement in accuracy if data were available at the level of ISO 3166-2 national subdivisions and with monthly time sampling. Conclusions The high spatial resolution possible with state-of-the-art numerical models can identify regions most likely to require intervention due to climate changes. Higher-resolution surveillance data can provide a better understanding of how climate fluctuations affect malaria incidence and improve predictions. An open-source modelling framework, such as STEM, can be a valuable tool for the scientific community and provide a collaborative platform for developing such models.
The Encoding of Individual Identity in Dolphin Signature Whistles: How Much Information Is Needed?
Arik Kershenbaum, Laela S. Sayigh, Vincent M. Janik
PLOS ONE , 2013, DOI: 10.1371/journal.pone.0077671
Abstract: Bottlenose dolphins (Tursiops truncatus) produce many vocalisations, including whistles that are unique to the individual producing them. Such “signature whistles” play a role in individual recognition and maintaining group integrity. Previous work has shown that humans can successfully group the spectrographic representations of signature whistles according to the individual dolphins that produced them. However, attempts at using mathematical algorithms to perform a similar task have been less successful. A greater understanding of the encoding of identity information in signature whistles is important for assessing similarity of whistles and thus social influences on the development of these learned calls. We re-examined 400 signature whistles from 20 individual dolphins used in a previous study, and tested the performance of new mathematical algorithms. We compared the measure used in the original study (correlation matrix of evenly sampled frequency measurements) to one used in several previous studies (similarity matrix of time-warped whistles), and to a new algorithm based on the Parsons code, used in music retrieval databases. The Parsons code records the direction of frequency change at each time step, and is effective at capturing human perception of music. We analysed similarity matrices from each of these three techniques, as well as a random control, by unsupervised clustering using three separate techniques: k-means clustering, hierarchical clustering, and an adaptive resonance theory neural network. For each of the three clustering techniques, a seven-level Parsons algorithm provided better clustering than the correlation and dynamic time warping algorithms, and was closer to the near-perfect visual categorisations of human judges. Thus, the Parsons code captures much of the individual identity information present in signature whistles, and may prove useful in studies requiring quantification of whistle similarity.
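The idea of recording only the direction of frequency change at each time step can be sketched as follows. This is the basic three-symbol Parsons code (up, down, repeat), not the seven-level variant the abstract reports as performing best; the function name `parsons_code`, the `tolerance` parameter, and the sample frequencies are assumptions made for illustration, not values from the study.

```python
from typing import List


def parsons_code(freqs: List[float], tolerance: float = 0.0) -> str:
    """Encode a frequency contour as U (up), D (down) or R (repeat) per time step.

    `tolerance` is a hypothetical dead band: changes no larger than it count as R.
    """
    symbols = []
    for prev, curr in zip(freqs, freqs[1:]):
        diff = curr - prev
        if diff > tolerance:
            symbols.append("U")
        elif diff < -tolerance:
            symbols.append("D")
        else:
            symbols.append("R")
    return "".join(symbols)


# Hypothetical whistle contour in Hz, sampled at evenly spaced time steps.
contour = [8000.0, 9000.0, 9000.0, 7000.0, 7500.0]
code = parsons_code(contour)  # "URDU"
```

Because the code discards absolute frequency and step size, two whistles with the same contour shape but different pitch ranges map to the same string, which is one plausible reason such an encoding suits comparisons of contour shape. A multi-level variant would presumably also distinguish the size of each change, but the abstract does not give its thresholds.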
The bearing of biological fitness in humans and crops upon the emergence and spread of agriculture
Aaron Rottenberg
Natural Science (NS) , 2013, DOI: 10.4236/ns.2013.54A001
Abstract: Past studies discussing the origins of agriculture have mainly emphasized changes in environmental and human-behavior factors as possible explanations for the shift from foraging to farming. This paper focuses on how an increase in the biological fitness of both farmers and crops enabled the rapid evolution and success of farmers and agriculture. It is shown that the first plants under domestication achieved their superior fitness mainly as a consequence of some of their genetic and life-history traits. This led these species to be extensively integrated into human subsistence and eventually to dominate the farmers’ fields. Concurrently, the first farmers gained their enhanced fitness by producing food surplus and by acquiring extra social prestige and power, while materializing the tendency to higher reproduction rate, and eventually to the expansion of farming populations. The unbreakable dependence between high-fitness crops and high-fitness man, namely their coevolution, is a key issue and a promising research area in the understanding of the human story and the origins of agriculture.
Altered Fire Regimes and the Persistence of Quaking Aspen in the Rocky Mountains: A Literature Review
Aaron Rosenblum
Open Journal of Forestry (OJF) , 2015, DOI: 10.4236/ojf.2015.55050
Abstract: The persistence of quaking aspen (Populus tremuloides Michx.) is of significant importance to land managers in the Rocky Mountain region. Fire suppression in the past century has been implicated as a mechanism influencing aspen population dynamics, as aspen are generally considered an early seral disturbance adapted species. The heterogeneity of aspen community assemblages and fire regimes makes it difficult to discern what the result of fire suppression has been at large spatial and temporal scales. Decision makers should investigate the questions at hand at the stand level in their study location to best determine the mechanisms at play, as well as consider future potential changes to the system.
A Bioeconomic Model for Sustainable Grazing of Old World Bluestem under Uncertainty
Aaron Benson, Cody Zilverberg
Natural Resources (NR) , 2013, DOI: 10.4236/nr.2013.44044
Abstract: "WW B. Dahl", a perennial old world bluestem (OWB) grass, has been promoted as a forage suitable for dryland grazing. Dryland grazing of OWB is, however, inherently risky economically and ecologically, and may not be sustainable while remaining profitable. In this paper we develop a biological and economic single-season model of dryland grazing given production and price uncertainty, and identify a stocking rate that maximizes expected net revenue, subject to a sustainability constraint. We then simulate the distribution of net revenues, and find that the probability of loss is greater than 35%, and median profit is roughly $30/ha.
The Phonetics of Multiple Vowel Lengthening in Japanese
Shigeto Kawahara, Aaron Braver
Open Journal of Modern Linguistics (OJML) , 2013, DOI: 10.4236/ojml.2013.32019
Abstract: Many languages exploit a short vs. long lexical contrast in vowels. In most, if not all, of these languages, the contrast is binary. In Japanese, however, speakers can lengthen vowels to express emphasis, and multiple degrees of lengthening can be used to express different degrees of emphasis. This paper offers the first experimental documentation of this emphatic vowel lengthening phenomenon. The current results demonstrate that, among the seven speakers recorded, at least a few speakers show six levels of distinction in duration, and all but one speaker showed a steady linear correlation between duration and level of emphasis. We conclude that Japanese speakers have articulatory control that allows them to make very fine-grained durational distinctions, which go beyond mere binary short vs. long distinctions.
