%0 Journal Article
%T G-InforBIO: integrated system for microbial genomics
%A Naoto Tanaka
%A Takashi Abe
%A Satoru Miyazaki
%A Hideaki Sugawara
%J BMC Bioinformatics
%D 2006
%I BioMed Central
%R 10.1186/1471-2105-7-368
%X The G-InforBIO system is a novel tool for genome data management and sequence analysis. The system can import genome data, including annotations and sequences, encoded either as eXtensible Markup Language documents or as formatted text documents, such as DNA Data Bank of Japan and GenBank flat files. The genome database is constructed automatically after importing, and the database can be exported as documents formatted with eXtensible Markup Language or as tab-delimited text. Users can retrieve data from the database by keyword searches, edit annotation data of genes, and process data with G-InforBIO. In addition, information in the G-InforBIO database can be analyzed seamlessly with nine different software programs, including programs for clustering and homology analyses. The G-InforBIO system simplifies genome analyses by integrating several available software programs to allow efficient handling and manipulation of genome data. G-InforBIO is freely available from the download site. The number of microbial genomes for which sequence data are available is increasing each year. Currently, complete nucleotide sequences of more than 300 strains are available in the International Nucleotide Sequence Database (INSD), which includes DDBJ, EMBL, and GenBank [1], and the sequence data are summarized in the portal site, Genome Information Broker (GIB) [2,3]. Genome data are composed primarily of annotation and sequence data, and the large volume of annotation data and long nucleotide sequences must be integrated for effective genome research. Such genome data are used for analyses that include comparisons of genomic structures between closely related species [4,5], phylogenetic analysis [6], and detection of ubiquitous [7,8] and species-specific genes (ORFans) [9,10].
It appears that genomic analyses require high-capacity computers and many programs to study multiple long sequences. Software programs, including Artemis [11], ASAP [12,13], ERGO [14], and GenDB [15], have been deve
%U http://www.biomedcentral.com/1471-2105/7/368