oalib

Search Results: 1 - 10 of 325457 matches for " S Cenk Sahinalp "
All listed articles are free for downloading (OA Articles)
The intelligence in developing systems for molecular biology
S Cenk Sahinalp
Genome Biology , 2007, DOI: 10.1186/gb-2007-8-1-301
Abstract: The 900 or so participants at the Annual International Conference on Intelligent Systems for Molecular Biology last August were treated to talks on topics ranging from sequence analysis, structural bioinformatics, and comparative genomics through to proteomics and systems biology. It was evident that interest in RNA, especially non-coding RNA (ncRNA), is growing, with quite a few talks on locating and predicting the structure of small (and not so small) ncRNAs. As well as such relatively new topics, the classic problem of discovering sequence motifs and assessing their significance seems to be re-emerging, especially in the context of new applications. As the biological problems scientists aim to address become more complex, the mathematical principles and computational tools being developed to solve them must become more sophisticated. The conference showed that not only are computer science and mathematics being applied to solving key problems in molecular biology, but these problems are inspiring the development of new computer science, and, to a certain degree, new mathematics. Sequence analysis was still the theme running through most talks. Its application outside DNA and proteins was illustrated by Kiyoko Aoki-Kinoshita (Kyoto University, Japan), who described motif discovery in carbohydrate sugar chains (glycans), the third major class of macromolecules. Starting from a single monosaccharide, many glycans have a tree-like structure consisting of branching chains with various combinations of monosaccharides. Aoki-Kinoshita described a profile Markov model using a probabilistic sibling-dependent tree (PST) that aims to recognize glycan motifs, which are basically paths on their tree representation. The model has been tested successfully on both synthetic glycans and glycan data from the KEGG GLYCAN database, accessed from http://www.genome.jp/kegg/glycan. Eugene Fratkin (Stanford University, Palo Alto, USA) described a combinatorial technique for finding
Fast prediction of RNA-RNA interaction
Raheleh Salari, Rolf Backofen, S Cenk Sahinalp
Algorithms for Molecular Biology , 2010, DOI: 10.1186/1748-7188-5-5
Abstract: In this paper we present a novel algorithm to accurately predict the minimum free energy structure of RNA-RNA interaction under the most general type of interactions studied in the literature. Moreover, we introduce a fast heuristic method to predict the specific (multiple) binding sites of two interacting RNAs. We verify the performance of our algorithms for joint structure and binding site prediction on a set of known interacting RNA pairs. Experimental results show our algorithms are highly accurate and outperform all competitive approaches. Regulatory non-coding RNAs (ncRNAs) play an important role in gene regulation. Studies on both prokaryotic and eukaryotic cells show that such ncRNAs usually bind to their target mRNA to regulate the translation of corresponding genes. Many regulatory RNAs such as microRNAs and small interfering RNAs (miRNAs/siRNAs) are very short and have full sequence complementarity to the targets. However some of the regulatory antisense RNAs are relatively long and are not fully complementary to their target sequences. They exhibit their regulatory functions by establishing stable joint structures with target mRNA initiated by one or more loop-loop interactions. In this paper we present an efficient method for the RNA-RNA interaction prediction (RIP) problem with multiple binding domains. Alkan et al. [1] proved that RIP, in its general form, is an NP-complete problem and provided algorithms for predicting specific types of interactions and two relatively simple energy models - under which RIP is polynomial time solvable. We focus on the same type of interactions, which to the best of our knowledge, are the most general type of interactions considered in the literature; however the energy model we use is the joint structure energy model recently presented by Chitsaz et al. [2] which is more general than the one used by Alkan et al. In what follows below, we first describe a combinatorial algorithm to compute the minimum free energy joint str
Pair HMM based gap statistics for re-evaluation of indels in alignments with affine gap penalties: Extended Version
Alexander Schönhuth, Raheleh Salari, S. Cenk Sahinalp
Quantitative Biology , 2010,
Abstract: Although computationally aligning sequences is a crucial step in the vast majority of comparative genomics studies, our understanding of alignment biases still needs to be improved. To infer true structural or homologous regions, computational alignments need further evaluation. It has been shown that the accuracy of aligned positions can drop substantially, in particular around gaps. Here we focus on re-evaluation of score-based alignments with affine gap penalty costs. We exploit their relationships with pair hidden Markov models and develop efficient algorithms by which to identify gaps that are significant in terms of length and multiplicity. We evaluate our statistics with respect to the well-established structural alignments from SABmark and find that indel reliability substantially increases with their significance, in particular in worst-case twilight zone alignments. This indicates that our statistics can reliably complement other methods, which mostly focus on the reliability of match positions.
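For readers unfamiliar with the term, the affine gap penalty mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method or their pair-HMM statistics: the function names and all scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`) are assumptions chosen for illustration. Under an affine model, a gap run of length k costs `gap_open + (k - 1) * gap_extend`, so one long gap is cheaper than several short ones of the same total length.

```python
def affine_alignment_score(row_a, row_b, match=1, mismatch=-1,
                           gap_open=-5, gap_extend=-1):
    """Score an existing pairwise alignment under an affine gap model.

    The first column of a gap run costs gap_open; each further column
    of the same run costs gap_extend. Scoring values are illustrative.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    prev_gap = None  # row that carried a gap in the previous column
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("column with gaps in both rows")
        if x == "-" or y == "-":
            cur = "a" if x == "-" else "b"
            # Continuing a run in the same row extends it; otherwise open.
            score += gap_extend if cur == prev_gap else gap_open
            prev_gap = cur
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score


def gap_runs(row):
    """Return the lengths of the maximal gap runs in one aligned row."""
    runs, k = [], 0
    for ch in row:
        if ch == "-":
            k += 1
        elif k:
            runs.append(k)
            k = 0
    if k:
        runs.append(k)
    return runs
```

The list returned by `gap_runs` gives the gap lengths and their number in one row, which are the two quantities (length and multiplicity) the abstract says its significance statistics address.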
smyRNA: A Novel Ab Initio ncRNA Gene Finder
Raheleh Salari, Cagri Aksay, Emre Karakoc, Peter J. Unrau, Iman Hajirasouliha, S. Cenk Sahinalp
PLOS ONE , 2009, DOI: 10.1371/journal.pone.0005433
Abstract: Background: Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by means of establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs on genome sequences has proven to be a hard task; in particular, past attempts at ab initio ncRNA search mostly failed, with the exception of tools that can identify microRNAs. Methodology/Principal Findings: We present a very general ab initio ncRNA gene finder that exploits differential distributions of sequence motifs between ncRNAs and background genome sequences. Conclusions/Significance: Our method, once trained on a set of ncRNAs from a given species, can be applied to genome sequences of other organisms to find not only ncRNAs homologous to those in the training set but also others that potentially belong to novel (and perhaps unknown) ncRNA families. Availability: http://compbio.cs.sfu.ca/taverna/smyrna
Not All Scale-Free Networks Are Born Equal: The Role of the Seed Graph in PPI Network Evolution
Fereydoun Hormozdiari, Petra Berenbrink, Nataša Pržulj, S. Cenk Sahinalp
PLOS Computational Biology , 2007, DOI: 10.1371/journal.pcbi.0030118
Abstract: The (asymptotic) degree distributions of the best-known “scale-free” network models are all similar and are independent of the seed graph used; hence, it has been tempting to assume that networks generated by these models are generally similar. In this paper, we observe that several key topological features of such networks depend heavily on the specific model and the seed graph used. Furthermore, we show that starting with the “right” seed graph (typically a dense subgraph of the protein–protein interaction network analyzed), the duplication model captures many topological features of publicly available protein–protein interaction networks very well.
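To make the role of the seed graph concrete, here is a minimal sketch of a generic duplication-style growth process. It is an illustration under stated assumptions, not the exact model analysed in the paper: the function name `grow_by_duplication`, the edge-retention probability `p`, the parent-link probability `q`, and the integer node labels are all hypothetical.

```python
import random


def grow_by_duplication(seed_edges, target_nodes, p=0.4, q=0.1, rng_seed=0):
    """Grow an undirected graph from a seed graph by node duplication.

    Each step copies a uniformly chosen existing node: the copy keeps
    each of the parent's edges with probability p and is linked to the
    parent itself with probability q. p and q are illustrative values.
    Nodes are assumed to be integers.
    """
    if not seed_edges:
        raise ValueError("seed graph needs at least one edge")
    rng = random.Random(rng_seed)
    adj = {}
    for u, v in seed_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    next_id = max(adj) + 1
    while len(adj) < target_nodes:
        parent = rng.choice(sorted(adj))
        new = next_id
        next_id += 1
        adj[new] = set()
        # Inherit each of the parent's edges independently.
        for nb in sorted(adj[parent]):
            if rng.random() < p:
                adj[new].add(nb)
                adj[nb].add(new)
        # Optionally connect the copy to its parent.
        if rng.random() < q:
            adj[new].add(parent)
            adj[parent].add(new)
    return adj
```

Calling it with two different seeds of similar size, for example a triangle `[(0, 1), (1, 2), (0, 2)]` and a path `[(0, 1), (1, 2)]`, and comparing properties such as the number of isolated nodes or the clustering of the resulting graphs is one way to explore the seed dependence the abstract describes.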
Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data
Can Alkan,Mario Ventura,Nicoletta Archidiacono,Mariano Rocchi,S. Cenk Sahinalp,Evan E Eichler
PLOS Computational Biology , 2007, DOI: 10.1371/journal.pcbi.0030181
Abstract: The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%–5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.
Sparsification of RNA structure prediction including pseudoknots
Mathias Möhl, Raheleh Salari, Sebastian Will, Rolf Backofen, S Cenk Sahinalp
Algorithms for Molecular Biology , 2010, DOI: 10.1186/1748-7188-5-39
Abstract: In this paper, we introduce sparsification to significantly speed up the dynamic programming approaches for pseudoknotted RNA structure prediction, which also lowers the space requirements. Although sparsification has been applied to a number of RNA-related structure prediction problems in the past few years, we provide the first application of sparsification to pseudoknotted RNA structure prediction specifically and to handling gapped fragments more generally - which has a much more complex recursive structure than other problems to which sparsification has been applied. We analyse how to sparsify four pseudoknot structure prediction algorithms, among those the most general method available (the Rivas-Eddy algorithm) and the fastest one (Reeder-Giegerich algorithm). In all algorithms the number of "candidate" substructures to be considered is reduced. Our experimental results on the sparsified Reeder-Giegerich algorithm suggest a linear speedup over the unsparsified implementation. Recently discovered catalytic and regulatory RNAs [1,2] exhibit their functionality due to specific secondary and tertiary structures [3,4]. The vast majority of computational analyses of non-coding RNAs have been restricted to nested secondary structures, neglecting pseudoknots - which are "among the most prevalent RNA structures" [5]. For example, Xayaphoummine et al. [6] estimated that up to 30% of the base pairs in G+C-rich sequences form pseudoknots. However, the general problem of pseudoknotted RNA structure prediction is NP-hard. As a result, a number of approaches have been introduced for handling restricted classes of pseudoknots [7-13]. Condon et al. [14] give an overview of their structure classes and the algorithm-specific restrictions, and Möhl et al. [15] develop a general framework showing that all these algorithms follow a general scheme, which they use for efficient alignment of pseudoknotted RNA. The most general algorithm (with respect to the pseudoknot classes handled) among
Mirroring co-evolving trees in the light of their topologies
Iman Hajirasouliha, Alexander Schönhuth, David Juan, Alfonso Valencia, S. Cenk Sahinalp
Computer Science , 2011,
Abstract: Determining the interaction partners among protein/domain families poses hard computational problems, in particular in the presence of paralogous proteins. Available approaches aim to identify interaction partners among protein/domain families through maximizing the similarity between trimmed versions of their phylogenetic trees. Since maximization of any natural similarity score is computationally difficult, many approaches employ heuristics to maximize the distance matrices corresponding to the tree topologies in question. In this paper we devise an efficient deterministic algorithm which directly maximizes the similarity between two leaf labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question. Our algorithm is significantly faster than those methods based on distance matrix comparison: 1 minute on a single processor vs. 730 hours on a supercomputer. Furthermore we have advantages over the current state-of-the-art heuristic search approach in terms of precision as well as a recently suggested overall performance measure for mirrortree approaches, while incurring only acceptable losses in recall. A C implementation of the method demonstrated in this paper is available at http://compbio.cs.sfu.ca/mirrort.htm
Joint Inference of Genome Structure and Content in Heterogeneous Tumour Samples
Andrew McPherson,Andrew Roth,Gavin Ha,Sohrab P. Shah,Cedric Chauve,S. Cenk Sahinalp
Computer Science , 2015,
Abstract: For a genomically unstable cancer, a single tumour biopsy will often contain a mixture of competing tumour clones. These tumour clones frequently differ with respect to their genomic content (copy number of each gene) and structure (order of genes on each chromosome). Modern bulk genome sequencing mixes the signals of tumour clones and contaminating normal cells, complicating inference of genomic content and structure. We propose a method to unmix tumour and contaminating normal signals and jointly predict genomic structure and content of each tumour clone. We use genome graphs to represent tumour clones, and model the likelihood of the observed reads given clones and mixing proportions. Our use of haplotype blocks allows us to accurately measure allele specific read counts, and infer allele specific copy number for each clone. The proposed method is a heuristic local search based on applying incremental, locally optimal modifications of the genome graphs. Using simulated data, we show that our method predicts copy counts and gene adjacencies with reasonable accuracy.
deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data
Andrew McPherson,Fereydoun Hormozdiari,Abdalnasser Zayed,Ryan Giuliany,Gavin Ha,Mark G. F. Sun,Malachi Griffith,Alireza Heravi Moussavi,Janine Senz,Nataliya Melnyk,Marina Pacheco,Marco A. Marra,Martin Hirst,Torsten O. Nielsen,S. Cenk Sahinalp,David Huntsman,Sohrab P. Shah
PLOS Computational Biology , 2011, DOI: 10.1371/journal.pcbi.1001138
Abstract: Gene fusions created by somatic genomic rearrangements are known to play an important role in the onset and development of some cancers, such as lymphomas and sarcomas. RNA-Seq (whole transcriptome shotgun sequencing) is proving to be a useful tool for the discovery of novel gene fusions in cancer transcriptomes. However, algorithmic methods for the discovery of gene fusions using RNA-Seq data remain underdeveloped. We have developed deFuse, a novel computational method for fusion discovery in tumor RNA-Seq data. Unlike existing methods that use only unique best-hit alignments and consider only fusion boundaries at the ends of known exons, deFuse considers all alignments and all possible locations for fusion boundaries. As a result, deFuse is able to identify fusion sequences with demonstrably better sensitivity than previous approaches. To increase the specificity of our approach, we curated a list of 60 true positive and 61 true negative fusion sequences (as confirmed by RT-PCR), and have trained an adaboost classifier on 11 novel features of the sequence data. The resulting classifier has an estimated value of 0.91 for the area under the ROC curve. We have used deFuse to discover gene fusions in 40 ovarian tumor samples, one ovarian cancer cell line, and three sarcoma samples. We report herein the first gene fusions discovered in ovarian cancer. We conclude that gene fusions are not infrequent events in ovarian cancer and that these events have the potential to substantially alter the expression patterns of the genes involved; gene fusions should therefore be considered in efforts to comprehensively characterize the mutational profiles of ovarian cancer transcriptomes.
Copyright © 2008-2017 Open Access Library. All rights reserved.