Discovering gene annotations in biomedical text databases
Ali Cakmak, Gultekin Ozsoyoglu
BMC Bioinformatics , 2008, DOI: 10.1186/1471-2105-9-143
Abstract: In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products from article abstracts in PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and to produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. Compared with "exact matching", the semantic pattern matching framework provides a more flexible pattern matching scheme, with the advantage of locating approximate pattern occurrences with similar semantics. The relatively low recall of our pattern-based approach may be improved either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data or, alternatively, by lowering the statistical enrichment threshold for applications that put more value on higher recall. The number of published molecular biology and genomics research articles has been increasing at a fast rate.
Advancements in computationa
On Using the Research-Pyramid Model to Enhance Literature Digital Libraries
Sulieman Bani-Ahmad, Gultekin Ozsoyoglu
Information Technology Journal , 2010,
Abstract: We validate the research pyramid model of research evolution. Moreover, we propose and evaluate two algorithms to identify research pyramids. Finally, we improve publication scores in terms of accuracy and separability via publications' research pyramids. Accurately ranking publications enables users to aggregate pertinent results quickly and easily. Studies show that citation-based publication-importance functions, e.g., PageRank and Citation Count, are extremely skewed and have accuracy problems. Based on the notion of research pyramids, we propose an a priori technique to assign more effective and accurate publication importance scores. We show that the proposed technique provides more accurate and significantly less skewed publication scores than citation-based techniques. Our experiments show a 16-25% improvement in search output accuracy, measured for the top-k search results.
PathCase-SB architecture and database design
Ali Cakmak, Xinjian Qi, Sarp A Coskun, Mitali Das, En Cheng, A Cicek, Nicola Lai, Gultekin Ozsoyoglu, Z Ozsoyoglu
BMC Systems Biology , 2011, DOI: 10.1186/1752-0509-5-188
Abstract: PathCase Systems Biology (PathCase-SB) is built and released. The PathCase-SB database provides data and an API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate data of selected biological data sources on the web (currently, BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database. PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world. There are many computer science applications that integrate the data of different data sources, and build new tools that are otherwise difficult or impossible to build. PathCase Systems Biology (PathCase-SB) [1] brings together, under a single database environment, metabolic pathways data and systems biology models, and provides new or expanded browsing, querying, visualization, and simulation capabilities in order to help with systems biology modeling and analysis, all made possible by the integrated environment. Note that PathCase-SB, which builds on PathCase [2], is not a model or pathways data source, and it does not curate systems biology models. In this paper, we describe architecture and database design issues encountered in PathCase-SB's design and implementation, and present the current design of the PathCase-SB architecture and database.
The user interfaces of PathCase-SB are described in detail in another study (Coskun et al: PathCase-SB: Integrating Data Sources and Providing Tools
ADEMA: An Algorithm to Determine Expected Metabolite Level Alterations Using Mutual Information
A. Ercument Cicek, Ilya Bederman, Leigh Henderson, Mitchell L. Drumm, Gultekin Ozsoyoglu
PLOS Computational Biology , 2013, DOI: 10.1371/journal.pcbi.1002859
Abstract: Metabolomics is a relatively new “omics” platform, which analyzes a discrete set of metabolites detected in bio-fluids or tissue samples of organisms. It has been used in a diverse array of studies to detect biomarkers and to determine activity rates for pathways based on changes due to disease or drugs. Recent improvements in analytical methodology and large sample throughput allow for creation of large datasets of metabolites that reflect changes in metabolic dynamics due to disease or a perturbation in the metabolic network. However, current methods of comprehensive analyses of large metabolic datasets (metabolomics) are limited, unlike other “omics” approaches where complex techniques for analyzing coexpression/coregulation of multiple variables are applied. This paper discusses the shortcomings of current metabolomics data analysis techniques, and proposes a new multivariate technique (ADEMA) based on mutual information to identify expected metabolite level changes with respect to a specific condition. We show that ADEMA better predicts De Novo Lipogenesis pathway metabolite level changes in samples with Cystic Fibrosis (CF) than prediction based on the significance of individual metabolite level changes. We also applied ADEMA's classification scheme on three different cohorts of CF and wildtype mice. ADEMA was able to predict whether an unknown mouse has a CF or a wildtype genotype with 1.0, 0.84, and 0.9 accuracy for each respective dataset. ADEMA results had up to 31% higher accuracy as compared to other classification algorithms. In conclusion, ADEMA advances the state-of-the-art in metabolomics analysis, by providing accurate and interpretable classification results.
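The abstract above names mutual information as the basis of ADEMA. As a rough illustration of that measure only (this is not the ADEMA algorithm; the function name and the discretized "up"/"down" levels are assumptions made for the example), mutual information between two discrete variables can be estimated from co-occurrence counts:

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate mutual information (in bits) between two discrete sequences.

    xs and ys are equally long sequences of hashable labels, e.g. a
    discretized metabolite level per sample and a genotype per sample.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)            # marginal counts of x
    count_y = Counter(ys)            # marginal counts of y
    count_xy = Counter(zip(xs, ys))  # joint counts of (x, y)
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        ratio = c * n / (count_x[x] * count_y[y])
        mi += p_joint * math.log2(ratio)
    return mi


# Hypothetical toy data: the level tracks the genotype perfectly.
levels = ["up", "up", "down", "down"]
genotypes = ["CF", "CF", "WT", "WT"]
print(mutual_information(levels, genotypes))
```

A value near zero means the level carries little information about the condition, while larger values indicate a stronger dependence.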
PathCase-SB: integrating data sources and providing tools for systems biology research
Sarp A Coskun, Xinjian Qi, Ali Cakmak, En Cheng, A Cicek, Lei Yang, Rishiraj Jadeja, Ranjan K Dash, Nicola Lai, Gultekin Ozsoyoglu, Zehra Ozsoyoglu
BMC Systems Biology , 2012, DOI: 10.1186/1752-0509-6-67
Abstract: PathCase Systems Biology (PathCase-SB) is built and released. This paper describes PathCase-SB user interfaces developed to date. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate systems biology models data and metabolic network data of selected biological data sources on the web (currently, BioModels Database and KEGG, respectively), and to provide more powerful and/or new capabilities via the new web-based integrative framework. Each of the current four PathCase-SB interfaces, namely, the Browser, Visualization, Querying, and Simulation interfaces, has expanded and new capabilities as compared with the original data sources. PathCase-SB is already available on the web and is being used by researchers across the globe. Integrating selected data from multiple data sources, with the goals of expanding the capabilities of the original data sources and allowing new tool-building opportunities, is a common theme in many fields of computer science. PathCase Systems Biology (PathCase-SB) [1,2] is such a site, released in Aug. 2010, that brings together the data of (i) systems biology data sources, e.g., BioModels Database [3-5], and (ii) pathways data sources, e.g., KEGG [6-9], with the goal of providing additional capabilities and tools made possible by the integration. By pathways, we refer to the metabolic pathways defined by biochemists and found in biochemistry textbooks (such as [10]) and atlases (such as [11]).
In this paper, we describe the current functionality (i.e., the currently available user interfaces) of PathCase-SB, which provides a database-enabled integrative framework and tools towards effective and efficient systems biology model development and simulation for mechanistic understanding of the behavior of complex biological systems. PathCase-SB is web-based and has multiple inte
Investigation of the Effects of Profile Shift in Helical Gear Mechanisms with Analytical and Numerical Methods
Gultekin Karadere, Ilhan Yilmaz
World Journal of Mechanics (WJM) , 2018, DOI: 10.4236/wjm.2018.85015
Abstract: In this paper, the effects of profile shift in cylindrical helical gear mechanisms have been investigated with numerical and analytical calculations. A mathematical model for computer simulation of gears has been designed and the numerical calculations have been carried out. Analytical calculations have been made for a selected mechanism at different profile shift coefficients with a purpose-built Excel program. The analytical calculations of the same mechanism have been verified by using ANSYS 14.5. The results of the analytical and numerical solutions have been compared with respect to the profile shift coefficients.
The Attitudes of Preschool Teacher Candidates Studying Through Distance Education Approach Towards Teaching Profession And Their Perception Levels of Teaching Competency
The Turkish Online Journal of Distance Education , 2006,
Abstract: The purpose of this study is to determine the attitudes of preschool teacher candidates studying through distance education towards the teaching profession, and to determine their perception levels of teaching competency. The population and sample of the study were the senior students of the Anadolu University Open Education Faculty Preschool Teacher Training Undergraduate Program. The study was conducted with 957 teacher candidates. A survey was used as the data collection instrument to measure the attitudes of teacher candidates towards the teaching profession and to determine their perception levels of teaching competency. The study revealed that the attitudes of teacher candidates towards the teaching profession are quite positive, and their perception levels of teaching competency are very good. Moreover, the teacher candidates consider the program they are enrolled in beneficial for gaining teaching competencies.
Quality of Distance Education in Turkey: Preschool Teacher Training Case
Mehmet Gultekin
International Review of Research in Open and Distance Learning , 2009,
Abstract: Distance education is used for teacher training at different levels and fields in Turkey. Launched in the 2000-2001 academic year and still applied by Anadolu University, the Pre-School Teacher Training Program is one of those programs offered by distance education. This study aims to evaluate Anadolu University’s Preschool Teacher Training Program in Turkey by obtaining student opinions. A total of 1,026 senior students enrolled in the Preschool Education major at the Open Education Faculty of Anadolu University participated in the survey. A questionnaire to determine the opinions of students on the program was used as a means of data collection. Means (X) and standard deviations (SD) were employed to analyze the survey data. The results showed that although the teacher candidates study at a good level, they do not have a good record of watching the television programs. The results also revealed that the opinions of teacher candidates about the textbooks, television programs, teaching practices, and academic assistance services are positive.
Determination of the intrinsic scatter in the M-sigma and M-L relations
Kayhan Gultekin
Physics , 2009,
Abstract: We report on recently derived improved versions of the relations between supermassive black hole mass (M_BH) and host-galaxy bulge velocity dispersion (sigma) and luminosity (L) (the M-sigma and M-L relations), based on ~50 M_BH measurements and ~20 upper limits. Particular attention is paid to recovery of the intrinsic scatter (epsilon_0) in both relations. The scatter was found to be significantly larger than estimated in most previous studies. The large scatter requires revision of the local black hole mass function, and it implies that there may be substantial selection bias in studies of the evolution of the M-sigma and M-L relations. When only considering ellipticals, the scatter appears to decrease. These results appear to be insensitive to a wide range of assumptions about the measurement errors and the distribution of intrinsic scatter. We also report on the effects on the fits of culling the sample according to the resolution of the black hole's sphere of influence.
One Size Does not Fit All: When to Use Signature-based Pruning to Improve Template Matching for RDF graphs
Shi Qiao, Z. Meral Ozsoyoglu
Computer Science , 2015,
Abstract: Signature-based pruning is broadly accepted as an effective way to improve query performance of graph template matching on general labeled graphs. Most existing techniques which utilize signature-based pruning claim its benefits on all datasets and queries. However, the effectiveness of signature-based pruning varies greatly among different RDF datasets and is highly related to their dataset characteristics. We observe that the performance benefits from signature-based pruning depend not only on the size of the RDF graphs, but also on the underlying graph structure and the complexity of queries. This motivates us to propose a flexible RDF querying framework, called RDF-h, which selectively utilizes signature-based pruning by evaluating the characteristics of RDF datasets and query templates. The scalability and efficiency of RDF-h are demonstrated in experimental results using both real and synthetic datasets. Keywords: RDF, Graph Template Matching, Signature-based Pruning
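To make the idea of signature-based pruning in the abstract above concrete, here is a minimal generic sketch. It is not the RDF-h framework: the signature definition (the set of incoming and outgoing predicates of a vertex), the function names, and the toy triples are all assumptions chosen for illustration. A data vertex is kept as a candidate for a query vertex only if its signature contains the query vertex's signature.

```python
from collections import defaultdict


def vertex_signatures(triples):
    """Map each vertex to the set of (direction, predicate) pairs on its edges."""
    signatures = defaultdict(set)
    for subject, predicate, obj in triples:
        signatures[subject].add(("out", predicate))
        signatures[obj].add(("in", predicate))
    return dict(signatures)


def candidates(query_signature, data_signatures):
    """Return data vertices whose signature contains the query signature.

    Vertices that fail this containment test cannot match the query
    vertex, so they are pruned before any structural matching is tried.
    """
    return sorted(
        vertex
        for vertex, signature in data_signatures.items()
        if query_signature <= signature
    )


# Hypothetical toy RDF graph as (subject, predicate, object) triples.
data = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
]
signatures = vertex_signatures(data)

# Query vertex ?x with pattern: ?x knows ?y . ?x worksAt ?z
query_x = {("out", "knows"), ("out", "worksAt")}
print(candidates(query_x, signatures))
```

Building and checking signatures has its own cost, so pruning helps only when it removes many candidates, which is in line with the abstract's observation that the benefit depends on the dataset and the query.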
