%0 Journal Article %T Indexing for Large DNA Database Sequences %A S. M. Wohoush & M.H. Saheb %J International Journal of Biometric and Bioinformatics %D 2011 %I Computer Science Journals %X Bioinformatics data consists of a huge amount of information due to the large number of sequences, the very long sequence lengths, and the daily new additions. This data needs to be accessed efficiently for many purposes. What makes one DNA data item distinct from another is its DNA sequence. A DNA sequence is a combination of the four characters A, C, G, and T, and sequences have different lengths. Using a suitable representation of DNA sequences, and a suitable index structure to hold this representation in main memory, leads to efficient processing by accessing the DNA sequences through the index, and reduces the number of disk I/O accesses. I/O operations are still needed at the end to avoid false hits, so we reduce the number of candidate DNA sequences that need to be checked by pruning, which removes the need to search the whole database. We need a suitable index for searching DNA sequences efficiently, with suitable index size and search time. The selection of the relation fields on which the index is built has a big effect on index size and search time. Our experiments use the n-gram wavelet transformation with single-field and multi-field index structures in a relational DBMS environment. Results show the need to consider index size and search time carefully when using indexing. Increasing the window size decreases the number of I/O references. The use of single-field and multiple-field indexing is highly affected by the window size. Increasing the window size leads to better search time with a special type of index using single-field indexing, while the search time is good and nearly the same for most index types when using multiple-field indexing. The storage space needed for the RDBMS index types is almost the same as or greater than the actual data.
%K Large Database %K DNA Sequence %K Index Structure %K Sequence Transformation %K Wavelet Transformation %K RDMS Indexing. %U http://cscjournals.org/csc/manuscript/Journals/IJBB/volume5/Issue4/IJBB-125.pdf