Acta Biophysica Sinica (生物物理学报), 2000
REVERSE-TRANSLATED ALIGNMENT OF EST SEQUENCE WITH PROTEIN SEQUENCE
Abstract:
With the development of high-throughput sequencing techniques, the number of sequences in databases is increasing rapidly, yet most of them are ESTs (Expressed Sequence Tags) of unknown function. Homology alignment is often used to infer the biological function of an EST sequence: all six reading frames of the EST are translated and compared against selected protein databases at the protein level. However, EST sequences contain nearly 5% sequencing errors, and the frameshift errors among them are difficult to handle accurately with traditional alignment. To address most of the possible sequencing errors, our alignment model reverse-translates the protein sequence into a putative nucleotide sequence, which allows direct comparison at the nucleotide level. Such alignment between protein and EST sequences can be more accurate, and the troublesome frameshifts in EST sequences can be identified reliably.
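As an illustration of comparing a protein with an EST at the nucleotide level, the following Python sketch reverse-translates each amino acid into its set of codons under the standard genetic code and runs a small global dynamic-programming alignment in which a frameshift is an explicit, penalised step. It is not the model of this paper, whose details are not given in the abstract: the function name `align`, the scoring values, and the two frameshift moves (one extra EST base, or one base missing from a codon) are assumptions chosen for brevity.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Reverse-translation table: amino acid -> set of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, set()).add(codon)


def align(protein, est, match=1, mismatch=-1, frameshift=-2):
    """Globally align a protein to an EST at the nucleotide level.

    Hypothetical scoring (an assumption, not taken from the paper):
    a codon step scores `match` if the EST triplet encodes the amino
    acid and `mismatch` otherwise; each frameshift step costs `frameshift`.
    Returns (score, events), where events are (kind, est_index) tuples.
    """
    n, m = len(protein), len(est)
    neg = float("-inf")
    score = [[neg] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            cands = []
            if i >= 1 and j >= 3:
                # Regular step: one amino acid against one EST triplet.
                ok = est[j - 3:j] in CODONS.get(protein[i - 1], ())
                s = match if ok else mismatch
                cands.append((score[i - 1][j - 3] + s, "codon", i - 1, j - 3))
            if i >= 1 and j >= 2:
                # One base missing in the EST: the codon covers 2 bases.
                cands.append((score[i - 1][j - 2] + frameshift,
                              "deletion", i - 1, j - 2))
            if j >= 1:
                # One extra base in the EST: skip it.
                cands.append((score[i][j - 1] + frameshift,
                              "insertion", i, j - 1))
            if cands:
                best = max(cands, key=lambda c: c[0])
                score[i][j], back[i][j] = best[0], best[1:]
    if score[n][m] == neg:
        return neg, []  # EST too short to cover the protein
    # Trace back and collect the frameshift steps only.
    events, i, j = [], n, m
    while (i, j) != (0, 0):
        kind, pi, pj = back[i][j]
        if kind != "codon":
            events.append((kind, pj))
        i, j = pi, pj
    events.reverse()
    return score[n][m], events


print(align("MKW", "ATGAAATGG"))   # (3, [])
print(align("MKW", "ATGCAAATGG"))  # (1, [('insertion', 3)])
print(align("MKW", "ATGAATGG"))    # (0, [('deletion', 3)])
```

In the second and third calls a single inserted or deleted base, which would derail a fixed-frame translation of the EST, is reported as one frameshift event at EST offset 3 while the codons on either side still match. A practical scorer would also weigh amino-acid substitutions and local alignment, which this sketch leaves out.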