[1] | Boulesteix AL, Slawski M (2009) Stability and aggregation of ranked gene lists. Brief Bioinform 10: 556–568.
|
[2] | Ein-Dor L, Zuk O, Domany E (2006) Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer. PNAS 103: 5923–5928.
|
[3] | Boutros PC, Lau SK, Pintilie M, Liu N, Shepherd FA, et al. (2009) Prognostic gene signatures for non-small-cell lung cancer. PNAS 106: 2824–2828.
|
[4] | Lau SK, Boutros PC, Pintilie M, Blackhall FH, Zhu CQ, et al. (2007) Three-Gene Prognostic Classifier for Early-Stage Non Small-Cell Lung Cancer. J Clin Oncol 25: 5562–5569.
|
[5] | Shi W, Tsyganova M, Dosymbekov D, Dezso Z, Nikolskaya T, et al. (2010) The Tale of Underlying biology: Functional Analysis of MAQC-II Signatures. Pharmacogenomics J 10: 310–323.
|
[6] | Haury AC, Gestraud P, Vert JP (2011) The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures. PLoS ONE 6: e28210.
|
[7] | Ioannidis J, Allison D, Ball C, Coulibaly I, Cui X, et al. (2009) Repeatability of published microarray gene expression analyses. Nat Genet 41: 499–505.
|
[8] | Jurman G, Merler S, Barla A, Paoli S, Galea A, et al. (2008) Algebraic stability indicators for ranked lists in molecular profiling. Bioinformatics 24: 258–264.
|
[9] | Slawski M, Boulesteix AL (2012) GeneSelector: Stability and Aggregation of ranked gene lists. Bioconductor 2.9 package version 2.4.0.
|
[10] | Critchlow D (1985) Metric methods for analyzing partially ranked data. LNS 34. Heidelberg: Springer. 242 p.
|
[11] | Diaconis P (1988) Group representations in probability and statistics. Institute of Mathematical Statistics Lecture Notes – Monograph Series Vol. 11. Beachwood, OH: IMS. 198 p.
|
[12] | Lance G, Williams W (1966) Computer programs for hierarchical polythetic classification (“similarity analysis”). Comput J 9: 60–64.
|
[13] | Lance G, Williams W (1967) Mixed-Data Classificatory Programs I - Agglomerative Systems. Aust Comput J 1: 15–20.
|
[14] | Jurman G, Riccadonna S, Visintainer R, Furlanello C (2009) Canberra Distance on Ranked Lists. Agrawal S, Burges C, Crammer K, editors, Proc. Advances in Ranking - NIPS 09 Workshop. pp. 22–27.
|
[15] | Gobbi A (2008) Algebraic and combinatorial techniques for stability algorithms on ranked data. Master’s thesis, University of Trento.
|
[16] | Fagin R, Kumar R, Sivakumar D (2003) Comparing top-k lists. SIAM J Discrete Math 17: 134–160.
|
[17] | Hall P, Schimek M (2008) Inference for the Top-k Rank List Problem. Brito P, editor, Proc. COMPSTAT 08. pp. 433–444.
|
[18] | Schimek M, Budinska E, Kugler K, Lin S (2011) Package “TopKLists” for rank-based genomic data integration. In: Proc IASTED CompBio 2011. ACTA Press, 434–440.
|
[19] | Lin S (2010) Space oriented rank-based data integration. Stat Appl Genet Mol 9: Article 20.
|
[20] | Lin S, Ding J (2009) Integration of ranked lists via Cross Entropy Monte Carlo with applications to mRNA and microRNA studies. Biometrics 65: 9–18.
|
[21] | Bar-Ilan J, Mat-Hassan M, Levene M (2006) Methods for comparing rankings of search engine results. Comput Netw 50: 1448–1463.
|
[22] | Fury W, Batliwalla F, Gregersen P, Li W (2006) Overlapping Probabilities of Top Ranking Gene Lists, Hypergeometric Distribution, and Stringency of Gene Selection Criterion. In: Proc. 28th IEEE-EMBS. IEEE, 5531–5534.
|
[23] | Pearson R (2007) Reciprocal rank-based comparison of ordered gene lists. In: Proc. GENSIP 07. IEEE, 1–3.
|
[24] | Yang X, Sun X (2007) Meta-analysis of several gene lists for distinct types of cancer: A simple way to reveal common prognostic markers. BMC Bioinformatics 8: 118.
|
[25] | Schimek M, Myšičková A, Budinská E (2012) An Inference and Integration Approach for the Consolidation of Ranked Lists. Commun Stat Simulat 41: 1152–1166.
|
[26] | Hall P, Schimek M (2012) Moderate deviation-based inference for random degeneration in paired rank lists. J Amer Statist Assoc. In press.
|
[27] | Guzzetta G, Jurman G, Furlanello C (2010) A machine learning pipeline for quantitative phenotype prediction from genotype data. BMC Bioinformatics 11: S3.
|
[28] | Schowe B, Morik K (2011) Fast-Ensembles of Minimum Redundancy Feature Selection. In: Okun O, Valentini G, Re M, editors. Ensembles in Machine Learning Applications. Volume 373 of Studies in Computational Intelligence. Heidelberg: Springer. pp. 75–95.
|
[29] | Yu L, Han Y, Berens M (2012) Stable Gene Selection from Microarray Data via Sample Weighting. IEEE ACM T Comput Bi 9: 262–272.
|
[30] | Kossenkov A, Vachani A, Chang C, Nichols C, Billouin S, et al. (2011) Resection of Non-Small Cell Lung Cancers Reverses Tumor-Induced Gene Expression Changes in the Peripheral Immune System. Clin Cancer Res 17: 5867–5877.
|
[31] | Desarkar M, Joshi R, Sarkar S (2011) Displacement Based Unsupervised Metric for Evaluating Rank Aggregation. In: Kuznetsov S, Mandal D, Kundu M, Pal S, editors. Pattern Recognition and Machine Intelligence, Volume 6744 of Lecture Notes in Computer Science. Heidelberg: Springer. pp. 268–273.
|
[32] | Soneson C, Fontes M (2012) A framework for list representation, enabling list stabilization through incorporation of gene exchangeabilities. Biostatistics 13: 129–141.
|
[33] | He Z, Yu W (2010) Stable feature selection for biomarker discovery. Comput Biol Chem 34: 215–225.
|
[34] | Corrada D, Viti F, Merelli I, Battaglia C, Milanesi L (2011) myMIR: a genome-wide microRNA targets identification and annotation tool. Brief Bioinform 12(6): 588–600.
|
[35] | The MicroArray Quality Control (MAQC) Consortium (2010) The MAQC-II Project: A comprehensive study of common practices for the development and validation of microarray-based predictive models. Nature Biotech 28: 827–838.
|
[36] | Di Camillo B, Sanavia T, Martini M, Jurman G, Sambo F, et al. (2012) Effect of size and heterogeneity of samples on biomarker discovery: synthetic and real data assessment. PLoS ONE 7: e32200.
|
[37] | Albanese D, Visintainer R, Merler S, Riccadonna S, Jurman G, et al. (2012) mlpy: Machine Learning Python. arXiv:1202.6548.
|
[38] | Kendall M (1962) Rank correlation methods. Griffin Books on Statistics. Duxbury, MA: Griffin Publishing Company.
|
[39] | Diaconis P, Graham R (1977) Spearman’s Footrule as a Measure of Disarray. J Roy Stat Soc B 39: 262–268.
|
[40] | Graham R, Knuth D, Patashnik O (1989) Concrete Mathematics: A Foundation for Computer Science. Boston, MA: Addison Wesley.
|
[41] | Cheon GS, El-Mikkawy MEA (2007) Generalized Harmonic Number Identities And Related Matrix Representation. J Korean Math Soc 44: 487–498.
|
[42] | Simić S (1998) Best possible bounds and monotonicity of segments of harmonic series (II). Mat Vesnik 50: 5–10.
|
[43] | Villarino M (2004) Ramanujan’s Approximation to the n-th Partial Sum of the Harmonic Series. arXiv:math.CA/0402354 v5.
|
[44] | Villarino M (2006) Sharp Bounds for the Harmonic Numbers. arXiv:math.CA/0510585 v3.
|
[45] | Kauers M, Schneider C (2006) Indefinite Summation with Unspecified Summands. Discrete Math 306: 2021–2140.
|
[46] | Kauers M, Schneider C (2006) Application of Unspecified Sequences in Symbolic Summation. In: Proc. ISSAC 06. ACM, 177–183.
|
[47] | Schneider C (2004) Symbolic Summation with Single-Nested Sum Extension. In: Proc. ISSAC 04. ACM, 282–289.
|
[48] | Abramov S, Carette J, Geddes K, Lee H (2004) Telescoping in the context of symbolic summation in Maple. J Symb Comput 38: 1303–1326.
|
[49] | Schneider C (2007) Simplifying Sums in ΠΣ*-Extensions. J Algebra Appl 6: 415–441.
|
[50] | Hoeffding W (1951) A Combinatorial Central Limit Theorem. Ann Math Stat 22: 558–566.
|
[51] | Borda J (1781) Mémoire sur les élections au scrutin. Histoire de l’Académie Royale des Sciences.
|
[52] | Saari D (2001) Chaotic Elections! A Mathematician Looks at Voting. Providence, RI: American Mathematical Society. 159 p.
|
[53] | Setlur S, Mertz K, Hoshida Y, Demichelis F, Lupien M, et al. (2008) Estrogen-dependent signaling in a molecularly distinct subclass of aggressive prostate cancer. J Natl Cancer Inst 100: 815–825.
|
[54] | Sboner A, Demichelis F, Calza S, Pawitan Y, Setlur S, et al. (2010) Molecular sampling of prostate cancer: a dilemma for predicting disease progression. BMC Med Genomics 3: 8.
|
[55] | Dudoit S, Fridlyand J, Speed T (2002) Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. J Am Stat Assoc 97: 77–87.
|
[56] | Pique-Regi R, Ortega A (2006) Block diagonal linear discriminant analysis with sequential embedded feature selection. In: Proc. ICASSP 06. IEEE, volume 5, pp. V–V.
|
[57] | Pique-Regi R, Ortega A, Asgharzadeh S (2005) Sequential Diagonal Linear Discriminant Analysis (SeqDLDA) for Microarray Classification and Gene Identification. In: Proc. CSB 05. IEEE, 112–116.
|
[58] | Bø T, Jonassen I (2002) New feature subset selection procedures for classification of expression profiles. Genome Biol 3: research0017.1–research0017.11.
|
[59] | Cortes C, Vapnik V (1995) Support-Vector Networks. Mach Learn 20:
|
[60] | Cai D, He X, Han J (2008) SRDA: An efficient algorithm for large-scale discriminant analysis. IEEE T Knowl Data En 20: 1–12.
|
[61] | Visintainer R (2008) Feature ranking and classification of molecular data based on discriminant analysis methods. Master’s thesis, University of Trento.
|
[62] | Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene Selection for Cancer Classification using Support Vector Machines. Mach Learn 46: 389–422.
|
[63] | Furlanello C, Serafini M, Merler S, Jurman G (2003) Entropy-Based Gene Ranking without Selection Bias for the Predictive Classification of Microarray Data. BMC Bioinformatics 4: 54.
|
[64] | Baldi P, Brunak S, Chauvin Y, Andersen C, Nielsen H (2000) Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16: 412–424.
|
[65] | Cortes C, Mohri M (2003) AUC optimization vs. error rate minimization. In: Thrun S, Saul L, Schölkopf B, editors, Proc. NIPS 03. volume 16, 169–176.
|
[66] | Calders T, Jaroszewicz S (2007) Efficient AUC Optimization for Classification. Proc. PKDD 07. Heidelberg: Springer. pp. 42–53.
|
[67] | Vanderlooy S, Hüllermeier E (2008) A critical analysis of variants of the AUC. Mach Learn 72: 247–262.
|
[68] | Wang X, Simon R (2011) Microarray-based cancer prediction using single genes. BMC Bioinformatics 12: 391.
|
[69] | Tusher V, Tibshirani R, Chu G (2001) Significance analysis of microarrays applied to the ionizing radiation response. PNAS 98: 5116–5121.
|
[70] | Lönnstedt I, Speed T (2001) Replicated microarray data. Stat Sinica 12: 31–46.
|
[71] | Neter J, Kutner M, Nachtsheim C, Wasserman W (1996) Applied Linear Statistical Models. Columbus, OH: McGraw-Hill/Irwin. 1408 p.
|
[72] | Jeffery I, Higgins D, Culhane A (2006) Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data. BMC Bioinformatics 7: 359.
|
[73] | Smyth G (2003) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3: Article 3.
|
[74] | Xiao Y, Yang YH (2008) Bioconductor’s DEDS package. Available: http://www.bioconductor.org/packages/release/bioc/html/DEDS.html. Accessed 2012 Apr 27.
|
[75] | Gentleman R, Carey V, Bates DM, Bolstad B, Dettling M, et al. (2004) Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol 5(10): R80.
|
[76] | R Development Core Team (2011) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Available: http://www.R-project.org. Accessed 2012 Apr 27.
|
[77] | Yao C, Zhang M, Zou J, Gong X, Zhang L, et al. (2008) Disease prediction power and stability of differential expressed genes. In: Proc. BMEI 2008. IEEE, 265–268.
|
[78] | Chen J, Hsueh HM, Delongchamp R, Lin CJ, Tsai CA (2007) Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data. BMC Bioinformatics 8: 412.
|
[79] | Simon R (2008) Microarray-based expression profiling and informatics. Curr Opin Biotech 16: 26–29.
|
[80] | Storey J (2002) A direct approach to false discovery rates. J Roy Stat Soc B 64: 479–498.
|
[81] | Efron B, Tibshirani R, Storey J, Tusher V (2001) Empirical Bayes Analysis of a Microarray Experiment. J Am Stat Assoc 96: 1151–1160.
|
[82] | Efron B, Tibshirani R (2002) Empirical Bayes Methods, and False Discovery Rates. Genet Epidemiol 23: 70–86.
|
[83] | Efron B, Tibshirani R, Taylor J (2005) The “Miss rate” for the analysis of gene expression data. Biostat 6: 111–117.
|
[84] | Witten D, Tibshirani R (2007) A comparison of fold-change and the t-statistic for microarray data analysis. Technical report, Department of Statistics, Stanford University. Available: http://www-stat.stanford.edu/~tibs/ftp/FCTComparison.pdf. Accessed 2012 Apr 27.
|
[85] | Bousquet O, Elisseeff A (2002) Stability and generalization. J Mach Learn Res 2: 499–526.
|
[86] | Mukherjee S, Niyogi P, Poggio T, Rifkin R (2006) Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Adv Comput Math 25: 161–193.
|
[87] | Kalousis A, Prados J, Hilario M (2005) Stability of feature selection algorithms. In: Proc. ICNC 2007. IEEE, 218–225.
|
[88] | Kuncheva L (2007) A stability index for feature selection. Proc. IASTED 07. Phuket, Thailand: ACTA Press. pp. 390–395.
|
[89] | Zhang L (2007) A Method for Improving the Stability of Feature Selection Algorithm. In: Proc. ICNC 07. IEEE, 715–717.
|
[90] | Křížek P, Kittler J, Hlaváč V (2007) Improving Stability of Feature Selection Methods. In: Kropatsch W, Kampel M, Hanbury A, editors. Proc. CAIP 2007. pp. 929–936.
|
[91] | Xiao Y, Hua J, Dougherty ER (2007) Quantification of the impact of Feature Selection on the Variance of Cross-Validation Error Estimation. EURASIP J Bioinform Syst Biol 2007.
|