%0 Journal Article %T An EST-based analysis identifies new genes and reveals distinctive gene expression features of Coffea arabica and Coffea canephora %A Jorge MC Mondego %A Ramon O Vidal %A Marcelo F Carazzolle %A Eric K Tokuda %A Lucas P Parizzi %A Gustavo GL Costa %A Luiz FP Pereira %A Alan C Andrade %A Carlos A Colombo %A Luiz GE Vieira %A Gonçalo AG Pereira %A Brazilian Coffee Genome Project Consortium %J BMC Plant Biology %D 2011 %I BioMed Central %R 10.1186/1471-2229-11-30 %X Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. We present the first comprehensive genome-wide transcript profile study of C. arabica and C. canephora, which can be freely accessed by the scientific community at http://www.lge.ibi.unicamp.br/coffea.
Our data reveal the presence of species-specific/prevalent genes in coffee that may help to explain particular characteristics of these two crops. The identification of differentially expressed transcripts offers a starting point for the correlation between gene expression profiles and Coffea spp. developmental traits, providing valuable insights for coffee breeding and biotechnology, especially concerning sugar metabolism and stress tolerance. Coffee is the most important agricultural comm %U http://www.biomedcentral.com/1471-2229/11/30