BMC Bioinformatics 2009
Screening non-coding RNAs in transcriptomes from neglected species using PORTRAIT: case study of the pathogenic fungus Paracoccidioides brasiliensis

Abstract: We propose an algorithm called PORTRAIT, which is suitable for ncRNA analysis of transcriptomes from poorly characterized species. Sequences are translated by software that is resistant to sequencing errors, and the predicted putative proteins, along with their source transcripts, are evaluated for coding potential by a support vector machine (SVM). One of two SVM models is employed: if a putative protein is found, a protein-dependent SVM model is used; if none is found, a protein-independent SVM model is used instead. Only ab initio features are extracted, so that no homology information is needed. We illustrate the use of PORTRAIT by predicting ncRNAs from the transcriptome of the pathogenic fungus Paracoccidioides brasiliensis and five other related fungi. PORTRAIT can be integrated into pipelines, and provides a low computational cost solution for ncRNA detection in transcriptome sequencing projects.

Proteins are recognized as the most important players in cell homeostasis. Due to their importance and relatively straightforward characterization, it is expected that the main focus of transcriptome projects will be transcripts that code for proteins. To meet this demand, several specific computational tools have been created, both for absolute characterization and comparative analysis of these molecules. Only recently has attention begun to turn to those transcripts ignored or rejected by protein-oriented software packages: the so-called non-coding RNAs (ncRNAs). Classical, textbook examples of ncRNAs include ribosomal and transfer RNAs. More recently, other classes have been unveiled, such as microRNAs, siRNAs, piRNAs, asRNAs and the long, mRNA-like ncRNAs, widespread across all domains of life, with evidence of ubiquitous tissue expression in plants and animals [1,2].

Demand is now arising for specific tools for working with these molecules. A combination of new computational tools and advances in biological knowledge allowed for development of specific software for