%0 Journal Article
%T Bio301: A Web-Based EST Annotation Pipeline That Facilitates Functional Comparison Studies
%A Yen-Chen Chen
%A Yun-Ching Chen
%A Wen-Dar Lin
%A Chung-Der Hsiao
%A Hung-Wen Chiu
%A Jan-Ming Ho
%J ISRN Bioinformatics
%D 2012
%R 10.5402/2012/139842
%X In this postgenomic era, a huge volume of information derived from expressed sequence tags (ESTs) has been accumulated for the functional description of gene expression profiles. Comparative studies have become increasingly important to biology researchers. To facilitate these comparative studies, we have constructed a user-friendly EST annotation pipeline with comparison tools on an integrated EST service website, Bio301. Bio301 includes regular EST preprocessing, BLAST similarity search, gene ontology (GO) annotation, statistics reporting, a graphical GO browsing interface, and microarray probe selection tools. In addition, Bio301 is equipped with statistical library comparison functions that use multiple EST libraries, based on GO annotations, for mining meaningful biological information.
1. Motivation
Expressed sequence tags (ESTs) [1] are short DNA sequences (usually 200 to 500 nucleotides long) derived by either unidirectional or bidirectional sequencing of cDNA libraries. The information generated from ESTs has been used not only to identify novel gene transcripts, gene locations, and intron-exon boundaries in human and mouse genome drafts [2, 3] but also to assess gene expression levels in given tissues [4]. The large volume of information generated by the rapidly increasing number of ESTs (59 million EST entries in dbEST in January 2009 alone) provides an excellent resource for comparative studies, so we have constructed an EST service website, Bio301, to facilitate comparative studies based on these EST data. Bio301 is equipped with not only an EST annotation pipeline but also functional comparison tools.
Bio301 has five characteristics considered essential for EST analysis tools that aid functional comparative studies: accurate preprocessing, advanced functional annotation methods, flexibility in comparing multiple EST libraries, retrieval of EST data with respect to the annotation ontology, and an integrated online EST service open to the entire research community. First, Bio301 preprocesses ESTs accurately by cleaning, clustering, and assembling them. These tasks are very important because accurate preprocessing leads to accurate functional annotation, which is crucial for functional comparison studies. Bio301 uses one of the best programs for sequence cleaning, SeqClean (http://compbio.dfci.harvard.edu/tgi/software/). Likewise, Bio301 uses state-of-the-art programs for clustering and assembly, TGICL and CAP3 [5, 6]. Since reference genomes with extensive genome annotation have been shown to be
%U http://www.hindawi.com/journals/isrn.bioinformatics/2012/139842/