Protein interaction sentence detection using multiple semantic kernels

Abstract: We show that combinations of semantic kernels lead to statistically significant improvements in recognition rates and receiver operating characteristic (ROC) scores over the plain Gaussian kernel when applied to a well-known labelled collection of abstracts. The proposed kernel composition method also allows us to automatically infer the most discriminative kernels.

The results from this paper indicate that using semantic information from unlabelled text, and combinations of such information, can be valuable for classification of short texts such as PPI sentences. This study, however, is only a first step in the evaluation of semantic kernels and probabilistic multiple kernel learning in the context of PPI detection. The method described herein is modular and can be applied with a variety of feature types, kernels, and semantic models in order to facilitate full extraction of interacting proteins.

Proteins are the principal engine enabling chemical reactions in a cell and, as such, are of great interest to biologists studying life at the molecular level. Part of the proteins' functionality depends on their interactions with each other. Information about these interactions is paramount to the understanding of pathologies, diseases, and treatments. The principal observations of interactions are made through biological experiments [1], whose results are reported in peer-reviewed biomedical journal articles. Researchers then find protein-protein interactions (PPIs) through various search engines that index these articles. In text, a PPI is a relation between two protein entities linked by an action descriptor, which is usually either a verb or a present (-ing) or past (-ed) participial adjective (e.g. activate, activating, activated). Such a relationship is difficult to express as a query; therefore, current state-of-the-art search engines are not well suited for this task.
In addition, ad hoc query-based searches are more appropriate for temporary informat