%0 Journal Article %T Clique-based data mining for related genes in a biomedical database %A Tsutomu Matsunaga %A Chikara Yonemori %A Etsuji Tomita %A Masaaki Muramatsu %J BMC Bioinformatics %D 2009 %I BioMed Central %R 10.1186/1471-2105-10-205 %X We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms. Progress in the life sciences has recently been made by integrating biomedical knowledge on numerous genes and formulating hypotheses on the genetic mechanisms underlying various vital phenomena [1,2]. A large variety of genetic and biomedical knowledge on genes has been compiled into databases [3], and is available in electronic forms such as the Online Mendelian Inheritance in Man (OMIM) database [4]. Researchers and physicians formulating hypotheses often need to identify groups of functionally related genes, such as gene families and gene pathways, and this is usually done by simply reading a large number of documents related to the phenomenon of interest [5].
Since such an approach will inevitably result in some relevant literature being overlooked, researchers and physicians need a way that will help them search for related gene sets automatically and comprehensively [6]. Graph-based approaches [7-9] have recently emerged as a method for data mining. A biomedical relational graph is formed by nodes that represent biological entities (e.g., genes/proteins) and edges that represent the associations of those entities. For instance, protein-protein interactions %U http://www.biomedcentral.com/1471-2105/10/205
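
The abstract describes building a graph from hyperlinked pages and enumerating cliques to obtain sets of related genes. As a rough illustration of that idea, the sketch below builds an undirected graph from link pairs and lists its maximal cliques with a Bron-Kerbosch search using pivoting. This is a minimal sketch under stated assumptions, not necessarily the authors' algorithm: the function names `build_graph` and `maximal_cliques` and the page names (`GENE_A`, `DISEASE_X`, and so on) are hypothetical placeholders, not OMIM entries, and treating every hyperlink as an undirected edge is an assumption made here for simplicity.

```python
from typing import Dict, FrozenSet, Iterable, Iterator, Set, Tuple


def build_graph(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (page, linked_page) pairs."""
    graph: Dict[str, Set[str]] = {}
    for a, b in links:
        if a == b:
            continue  # ignore self-links
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph: Dict[str, Set[str]]) -> Iterator[FrozenSet[str]]:
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> Iterator[FrozenSet[str]]:
        if not p and not x:
            yield frozenset(r)  # r cannot be extended, so it is maximal
            return
        # Pick the pivot with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level

    yield from expand(set(), set(graph), set())


# Hypothetical toy link data (placeholder names, not real database pages).
links = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_A", "GENE_C"),
    ("GENE_C", "DISEASE_X"),
    ("DISEASE_X", "GENE_D"),
]

graph = build_graph(links)
cliques = sorted(sorted(c) for c in maximal_cliques(graph))
for clique in cliques:
    print(clique)
```

In this toy graph the three mutually linked pages `GENE_A`, `GENE_B`, and `GENE_C` form one clique, which is the kind of tightly connected group the abstract calls a gene module. Any minimum clique size or other filtering used to define a module is not stated in this excerpt, so none is applied here.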