Identification and characterization of novel amphioxus microRNAs by Solexa sequencing

DOI: 10.1186/gb-2009-10-7-r78

We combined Solexa sequencing with computational techniques to identify novel miRNAs in the amphioxus species B. belcheri (Gray). This approach allowed us to identify 113 amphioxus miRNA genes. Among them, 55 were conserved across species and encoded 45 non-redundant mature miRNAs, whereas 58 were amphioxus-specific and encoded 53 mature miRNAs. Validation of our results with microarray and stem-loop quantitative RT-PCR revealed that Solexa sequencing is a powerful tool for miRNA discovery. Analyzing the evolutionary history of amphioxus miRNAs, we found that amphioxus possesses many miRNAs unique to chordates and vertebrates, and these may thus represent key steps in the evolutionary progression from cephalochordates to vertebrates. We also found that amphioxus is more similar to vertebrates than are tunicates with respect to their miRNA phylogenetic histories.

Taken together, our results indicate that Solexa sequencing allows the successful discovery of novel miRNAs from amphioxus with high accuracy and efficiency. More importantly, our study provides an opportunity to decipher how the elaboration of the miRNA repertoire that occurred during chordate evolution contributed to the evolution of the vertebrate body plan.

When the class of RNA regulatory genes known as microRNAs (miRNAs) was discovered it introduced a whole new layer of gene regulation in eukaryotes [1]. Since the discovery of the first miRNA (lin-4) in Caenorhabditis elegans, thousands of miRNAs have been identified experimentally or computationally from a variety of species [1]. miRNAs are currently estimated to comprise 1 to 5% of animal genes and collectively regulate up to 30% of genes, making them one of the most abundant classes of regulators [2]. However, while the importance of miRNAs in animal ontogeny has been rapidly elucidated, their role in phylogeny currently remains largely unknown. Recent studies have provided important clues indicating that these approximately 22-nucleotide non-coding R


