Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models based on clinical features has been proposed as a way to improve accuracy. In the current study, we applied a logistic regression (LR) model that combines clinical features with gene co-expression data to improve the prediction of prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models preserve the basic properties of the TSP algorithm while also incorporating clinical features into the prognostic models. Using statistical inference based on iterative cross validation, we demonstrated that LR models that included genes selected by the TSP method predicted prostate cancer progression better than models using clinical variables only or models that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (gene) selection in prognostic models and that our model provides an alternative for predicting prostate cancer progression.
1. Introduction
Prostate cancer (PCa) is the second leading cause of cancer-related deaths among men in the USA [1, 2]. Screening with serum prostate-specific antigen (PSA) has improved the early detection of PCa and has increased the proportion of patients whose disease is curable by prostatectomy [3, 4]. However, 20% to 30% of treated patients will develop a local or metastatic recurrence, which is the most adverse clinical outcome [4]. Thus, from the clinical perspective, it is important to be able to predict which patients will experience a relapse.
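As an illustration only, and not the authors' implementation, the approach summarized in the abstract can be sketched in Python. The sketch assumes the usual TSP score, |P(X_i < X_j | class 1) − P(X_i < X_j | class 0)|, turns each selected pair into a binary indicator, and fits an LR model on the clinical covariates plus those indicators. The function names, the synthetic data, the number of pairs k, the gradient-descent fit, and the L2 penalty are all assumptions made for the example.

```python
import numpy as np


def tsp_scores(expr, y):
    """Return a (genes x genes) matrix of TSP scores.

    Entry (i, j) is |P(X_i < X_j | y = 1) - P(X_i < X_j | y = 0)|.
    expr: (n_samples, n_genes) expression values; y: (n_samples,) 0/1 labels.
    """
    # less[s, i, j] is True when gene i is below gene j in sample s
    less = expr[:, :, None] < expr[:, None, :]
    return np.abs(less[y == 1].mean(axis=0) - less[y == 0].mean(axis=0))


def top_pairs(expr, y, k=1):
    """Pick the k highest-scoring gene pairs (i < j)."""
    scores = tsp_scores(expr, y)
    rows, cols = np.triu_indices(scores.shape[0], k=1)
    order = np.argsort(-scores[rows, cols], kind="stable")[:k]
    return [(int(rows[o]), int(cols[o])) for o in order]


def pair_indicators(expr, pairs):
    """One binary column per pair: 1.0 where X_i < X_j, else 0.0."""
    return np.column_stack(
        [(expr[:, i] < expr[:, j]).astype(float) for i, j in pairs]
    )


def fit_logistic(X, y, lr=0.1, n_iter=2000, l2=1e-2):
    """Fit L2-penalized logistic regression by plain gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w


def predict_proba(w, X):
    """Predicted probability of progression for each row of X."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))


def demo():
    """Synthetic check: plant one rank reversal and recover it."""
    rng = np.random.default_rng(0)
    n, g = 60, 5
    y = np.repeat([0, 1], n // 2)            # hypothetical progression labels
    expr = rng.normal(size=(n, g))           # hypothetical expression matrix
    # Gene 0 < gene 3 in progressors, gene 0 > gene 3 otherwise.
    expr[:, 3] = expr[:, 0] + np.where(y == 1, 1.0, -1.0)
    clinical = rng.normal(size=(n, 2))       # stand-ins for standardized clinical covariates
    pairs = top_pairs(expr, y, k=1)
    X = np.column_stack([clinical, pair_indicators(expr, pairs)])
    w = fit_logistic(X, y)
    accuracy = float(np.mean((predict_proba(w, X) > 0.5) == (y == 1)))
    return pairs, accuracy


if __name__ == "__main__":
    print(demo())
```

The accuracy returned by `demo()` is measured on the training data and serves only as a sanity check. In an evaluation like the iterative cross validation described above, pair selection would need to be repeated inside every training fold so that the held-out samples never influence which genes enter the model.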
Traditional PCa prognosis models are based on clinical features such as pretreatment PSA level, biopsy Gleason score (GS), and clinical stage, but in practice they do not predict disease progression accurately [5]. With the development of microarray technology in recent years, a number of studies have used DNA microarrays to characterize the dynamics of gene expression during PCa progression. Some of these studies have identified tumor expression signatures associated with clinical parameters and outcomes [6–9]. As a result, it is possible to develop clinical models that combine gene signatures identified from microarray data with clinical features to predict which men will progress to the metastatic form of PCa. However, it has been found that
References
[1]
C. S. Grasso, Y. M. Wu, D. R. Robinson et al., “The mutational landscape of lethal castration-resistant prostate cancer,” Nature, vol. 487, pp. 239–243, 2012.
[2]
F. H. Schröder, J. Hugosson, M. J. Roobol et al., “Prostate-cancer mortality at 11 years of follow-up,” The New England Journal of Medicine, vol. 366, pp. 981–990, 2012.
[3]
A. Qaseem, M. J. Barry, T. D. Denberg et al., “Screening for prostate cancer: a guidance statement from the clinical guidelines committee of the American College of Physicians,” Annals of Internal Medicine, vol. 158, no. 10, pp. 761–769, 2013.
[4]
S. M. Dhanasekaran, A. Dash, J. Yu et al., “Molecular profiling of human prostate tissues: insights into gene expression patterns of prostate development during puberty,” The FASEB Journal, vol. 19, no. 2, pp. 243–245, 2005.
[5]
A. Sboner, F. Demichelis, S. Calza et al., “Molecular sampling of prostate cancer: a dilemma for predicting disease progression,” BMC Medical Genomics, vol. 3, article 8, 2010.
[6]
E. LaTulippe, J. Satagopan, A. Smith et al., “Comprehensive gene expression analysis of prostate cancer reveals distinct transcriptional programs associated with metastatic disease,” Cancer Research, vol. 62, no. 15, pp. 4499–4506, 2002.
[7]
J.-H. Luo, Y. P. Yu, K. Cieply et al., “Gene expression analysis of prostate cancers,” Molecular Carcinogenesis, vol. 33, no. 1, pp. 25–35, 2002.
[8]
K. Tamura, M. Furihata, T. Tsunoda et al., “Molecular features of hormone-refractory prostate cancer cells by genome-wide gene expression profiles,” Cancer Research, vol. 67, no. 11, pp. 5117–5125, 2007.
[9]
T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler, “Support vector machine classification and validation of cancer tissue samples using microarray expression data,” Bioinformatics, vol. 16, no. 10, pp. 906–914, 2000.
[10]
C. Peterson and M. Ringner, “Analyzing tumor gene expression profiles,” Artificial Intelligence in Medicine, vol. 28, pp. 59–74, 2003.
[11]
P. Xu, G. N. Brock, and R. S. Parrish, “Modified linear discriminant analysis approaches for classification of high-dimensional microarray data,” Computational Statistics and Data Analysis, vol. 53, no. 5, pp. 1674–1687, 2009.
[12]
U. R. Chandran, C. Ma, R. Dhir et al., “Gene expression profiles of prostate cancer reveal involvement of multiple molecular pathways in the metastatic process,” BMC Cancer, vol. 7, article 64, 2007.
[13]
R. S. Hudson, M. Yi, D. Esposito et al., “MicroRNA-106b-25 cluster expression is associated with early disease recurrence and targets caspase-7 and focal adhesion in human prostate cancer,” Oncogene, vol. 32, no. 35, pp. 4139–4147, 2012.
[14]
E. Martinez and V. Trevino, “Modelling gene expression profiles related to prostate tumor progression using binary states,” Theoretical Biology and Medical Modelling, vol. 10, article 37, 2013.
[15]
S. Feng, O. Dakhova, C. J. Creighton, and M. M. Ittmann, “The endocrine fibroblast growth factor FGF19 promotes prostate cancer progression,” Cancer Research, vol. 73, no. 8, pp. 2551–2562, 2012.
[16]
D. Geman, C. d'Avignon, D. Q. Naiman, and R. L. Winslow, “Classifying gene expression profiles from pairwise mRNA comparisons,” Statistical Applications in Genetics and Molecular Biology, vol. 3, no. 1, article 19, 2004.
[17]
J. Zhu and T. Hastie, “Classification of gene microarrays by penalized logistic regression,” Biostatistics, vol. 5, no. 3, pp. 427–443, 2004.
[18]
L. Shen and E. C. Tan, “Dimension reduction-based penalized logistic regression for cancer classification using microarray data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 2, no. 2, pp. 166–175, 2005.
[19]
J. G. Liao and K.-V. Chin, “Logistic regression for disease classification using microarray data: model selection in a large p and small n case,” Bioinformatics, vol. 23, no. 15, pp. 1945–1951, 2007.
[20]
A. C. Tan, D. Q. Naiman, L. Xu, R. L. Winslow, and D. Geman, “Simple decision rules for classifying human cancers from gene expression profiles,” Bioinformatics, vol. 21, no. 20, pp. 3896–3904, 2005.
[21]
L. Xu, A. C. Tan, D. Q. Naiman, D. Geman, and R. L. Winslow, “Robust prostate cancer marker genes emerge from direct integration of inter-study microarray data,” Bioinformatics, vol. 21, no. 20, pp. 3905–3911, 2005.
[22]
Y. Zhang, J. Szustakowski, and M. Schinke, “Bioinformatics analysis of microarray data,” Methods in Molecular Biology, vol. 573, pp. 259–284, 2009.
[23]
R. Ummanni, S. Teller, H. Junker et al., “Altered expression of tumor protein D52 regulates apoptosis and migration of prostate cancer cells,” FEBS Journal, vol. 275, no. 22, pp. 5703–5713, 2008.