%0 Journal Article %T An improved procedure for gene selection from microarray experiments using false discovery rate criterion %A James J Yang %A Mark CK Yang %J BMC Bioinformatics %D 2006 %I BioMed Central %R 10.1186/1471-2105-7-15 %X A combination of test functions is used to estimate the number of differentially expressed genes. A simulation study shows that the proposed method has higher power to detect these genes than other existing methods, while still keeping the FDR under control. The improvement can be substantial if the proportion of truly differentially expressed genes is large. This procedure has also been tested with good results using a real dataset. For a given expected FDR, the method proposed in this paper has better power to pick genes that show differentiation in their expression than two other well-known methods. The development of microarray technologies has created unparalleled opportunities to study the mechanism of disease, monitor disease expression and evaluate effective therapies. Because tens of thousands of genes are investigated simultaneously with a technology that has not yet been perfected, assessing uncertainty in the decision process relies on statistical modelling and theory. One key function of any statistical procedure is to control the rate of erroneous decisions or, in the microarray case, the rate of false discovery of responsible genes. The above concern can be illustrated as a multiple comparisons problem. Suppose we are interested in testing g parameters (μ1, ..., μg) = μ. For each individual parameter μj, the null hypothesis is H0j : μj = 0 and the alternative hypothesis is H1j : μj ≠ 0. This μj can be thought of as the difference in mean expression of the jth gene under two different conditions in a microarray experiment. A conventional method to test each hypothesis is to take a sample and then calculate its p-value from a proper statistical test. 
If the calculated p-value is less than a threshold determined by the significance level of the test, then H0j is rejected. However, when many hypotheses are tested simultaneously, a multiple comparisons procedure (MCP) has to be used to control the error rate [1]. The traditional MCP controls the probability of making any %U http://www.biomedcentral.com/1471-2105/7/15