%0 Journal Article
%T Robust assignment of cancer subtypes from expression data using a uni-variate gene expression average as classifier
%A Martin Lauss
%A Attila Frigyesi
%A Tobias Ryden
%A Mattias Höglund
%J BMC Cancer
%D 2010
%I BioMed Central
%R 10.1186/1471-2407-10-532
%X The basis for the proposed approach is the use of metagenes, instead of collections of individual genes, and a feature selection using AUC values obtained by ROC analysis. Each gene in a data set is assigned an AUC value relative to the tumor class under investigation, and the genes are ranked according to these values. Metagenes are then formed by calculating the mean expression level for an increasing number of ranked genes, and the metagene expression value that optimally discriminates tumor classes in the training set is used for classification of new samples. The performance of the metagene is then evaluated using LOOCV and balanced accuracies. We show that the simple uni-variate gene expression average algorithm performs as well as several alternative algorithms, such as discriminant analysis, and more complex approaches such as SVM and neural networks. The R package rocc is freely available at http://cran.r-project.org/web/packages/rocc/index.html. One of the most promising clinical applications of genome-wide expression studies is the construction of robust and reliable disease classifiers. Correct identification and sub-classification of diseases such as cancer is a prerequisite for proper and efficient treatment. To date, a large number of different algorithms for disease classification have been described. They range in complexity from neural network approaches [1] to the simpler nearest-neighbor classification algorithms [2]. Even though some of the more complex approaches, such as neural networks and self organized maps (SOM) [3], have proved to be very efficient, these methods often rely on the tuning of several parameters and hence are prone to over-fitting.
Furthermore, simple classifiers seem to perform remarkably well when compared to more sophisticated ones [4]. In the present investigation our aim has been to design a simple predictor system useful for cancer subtype classification. Features to be included in the predictor signatures are se
%U http://www.biomedcentral.com/1471-2407/10/532
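
The abstract's procedure (rank genes by AUC, average the top-ranked genes into a metagene, pick the metagene cut-off that best separates the training classes) can be sketched roughly as follows. This is a minimal Python illustration, not the code of the rocc R package. Several details are assumptions that the abstract does not state: the function names are hypothetical, genes are ranked by the distance of their AUC from 0.5, genes with AUC below 0.5 are sign-flipped before averaging, the cut-off is chosen by training-set balanced accuracy, and the number of genes is passed in rather than searched over. The LOOCV evaluation step is left out.

```python
import numpy as np


def auc(values, labels):
    """AUC of one gene for separating class 1 from class 0 (ties count 0.5)."""
    pos = values[labels == 1]
    neg = values[labels == 0]
    diff = pos[:, None] - neg[None, :]  # all positive/negative sample pairs
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def balanced_accuracy(pred, labels):
    """Mean of sensitivity and specificity for 0/1 predictions."""
    sensitivity = np.mean(pred[labels == 1] == 1)
    specificity = np.mean(pred[labels == 0] == 0)
    return float((sensitivity + specificity) / 2)


def fit_metagene(X, y, n_genes):
    """X: samples x genes expression matrix, y: 0/1 class labels."""
    aucs = np.array([auc(X[:, j], y) for j in range(X.shape[1])])
    # Assumption: rank by distance from 0.5, so strongly down-regulated
    # genes are also selected; their sign is flipped before averaging.
    order = np.argsort(-np.abs(aucs - 0.5), kind="stable")[:n_genes]
    signs = np.where(aucs[order] >= 0.5, 1.0, -1.0)
    meta = (X[:, order] * signs).mean(axis=1)  # uni-variate metagene value
    # Candidate cut-offs: midpoints between distinct metagene values.
    s = np.unique(meta)
    cands = (s[:-1] + s[1:]) / 2 if s.size > 1 else s
    scores = [balanced_accuracy((meta > t).astype(int), y) for t in cands]
    best = int(np.argmax(scores))
    return {"genes": order, "signs": signs, "threshold": float(cands[best])}


def predict(model, X):
    """Classify new samples by comparing their metagene value to the cut-off."""
    meta = (X[:, model["genes"]] * model["signs"]).mean(axis=1)
    return (meta > model["threshold"]).astype(int)


# Toy data (invented for illustration): 4 samples x 3 genes.
X_train = np.array([
    [5.0, 1.0, 3.0],
    [6.0, 2.0, 1.0],
    [1.0, 5.0, 2.0],
    [2.0, 6.0, 3.5],
])
y_train = np.array([1, 1, 0, 0])

model = fit_metagene(X_train, y_train, n_genes=2)
X_new = np.array([[5.5, 1.5, 2.0], [1.5, 5.5, 2.0]])
print(model["genes"], model["threshold"], predict(model, X_new))
```

In the approach described above, the number of genes is itself varied ("an increasing number of ranked genes"); one way to mimic that with this sketch would be to call `fit_metagene` for several values of `n_genes` and compare the resulting classifiers under cross-validation.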