%0 Journal Article %T Application of Genetic Algorithm for Discovery of Core Effective Formulae in TCM Clinical Data %A Ming Yang %A Josiah Poon %A Shaomo Wang %A Lijing Jiao %A Simon Poon %A Lizhi Cui %A Peiqi Chen %A Daniel Man-Yuen Sze %A Ling Xu %J Computational and Mathematical Methods in Medicine %D 2013 %I Hindawi Publishing Corporation %R 10.1155/2013/971272 %X Research on core and effective formulae (CEF) not only summarizes traditional Chinese medicine (TCM) treatment experience but also helps to reveal the underlying knowledge in the formulation of a TCM prescription. In this paper, CEF discovery from tumor clinical data is discussed. The concepts of confidence, support, and effectiveness of the CEF are defined. A genetic algorithm (GA) is applied to find the CEF in a lung cancer dataset with 595 records from 161 patients. The results comprised 9 CEF with positive fitness values, containing 15 distinct herbs. All of the CEF had relatively high average confidence and support. A herb-herb network was constructed, and it showed that all the herbs in the CEF are core herbs. The dataset was divided into a CEF group and a non-CEF group. The effective proportions of the former group were significantly greater than those of the latter group. A synergy index (SI) was defined to evaluate the interaction between two herbs. Four pairs of herbs had high SI values, indicating synergy between the herbs. All the results agreed with TCM theory, which demonstrates the feasibility of our approach. 1. Introduction Traditional Chinese medicine (TCM) has been developed and practiced in China for thousands of years, and herbal prescription has played a key role in medical treatment. A large number of herbal prescriptions have been recorded over the years, in which valuable TCM knowledge is hidden. It is urgent and critical to analyze these data so that TCM models can be developed in the modernization of this ancient knowledge. 
Although TCM is still in practice and more countries consider it an alternative treatment method [1], the principle of formulating a TCM prescription remains unknown. However, it is a daunting task to analyze such a large dataset manually. The methods of knowledge discovery in databases (KDD) have been suggested as viable approaches. KDD allows TCM researchers to find interesting patterns efficiently, and these patterns may direct further laboratory work that leads to discovery [2]. Many successful projects have been reported. For example, Wang et al. [3] illustrated the use of structural equation modeling (SEM) to explore the diagnosis of suboptimal health status (SHS) and provided evidence for the standardization of TCM patterns. Multilabel learning models [4, 5] were introduced for TCM syndrome identification. Complex networks were built for clinical data mining in TCM [6–8]. Generally, KDD research in TCM has been divided into two main categories. The first one attempts to extend our understanding using existing TCM %U http://www.hindawi.com/journals/cmmm/2013/971272/