The Cox proportional hazards regression model has become the traditional choice for modeling survival data in medical studies. To introduce flexibility into the Cox model, several smoothing methods may be applied, and approaches based on splines are the most frequently considered in this context. To better understand the effect that each continuous covariate has on the outcome, results can be expressed in terms of spline-based hazard ratio (HR) curves, taking a specific covariate value as reference. Despite the potential advantages of using spline smoothing methods in survival analysis, there is currently no analytical method in the R software to choose the optimal degrees of freedom in multivariable Cox models (with two or more nonlinear covariate effects). This paper describes an R package, called smoothHR, that allows the computation of pointwise estimates of the HRs, and their corresponding confidence limits, for continuous predictors introduced nonlinearly. In addition, the package provides functions for automatically choosing the degrees of freedom in multivariable Cox models. The package is available from the R homepage. We illustrate the use of the key functions of the smoothHR package using data from a study on breast cancer and data on acute coronary syndrome from Galicia, Spain.

1. Introduction

An important aim in longitudinal medical studies is to study the possible effect of a set of prognostic factors on the course of a disease. In many of these studies, some of the prognostic factors may be continuous and the functional form of their effects may be unknown. A classical approach for studying these effects is the Cox regression model (Cox [1], Kalbfleisch and Prentice [2]). One possible approach allowing for nonlinear effects in the Cox model is to express the hazard as an additive Cox model (see, e.g., Hastie and Tibshirani [3], Gray [4], Huang et al. [5], and Huang and Liu [6]).
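In symbols, the additive Cox model and the hazard ratio curve relative to a reference value can be sketched as follows. The notation (the smooth functions f_j, the reference value x_ref, the baseline hazard λ_0) is assumed here for illustration and is not taken from the paper. The last line shows one common Wald-type construction of pointwise confidence limits, which is not necessarily the one implemented in the package.

```latex
% Additive Cox model: each continuous covariate X_j enters through a smooth function f_j
\lambda(t \mid X_1, \dots, X_p) = \lambda_0(t) \exp\Big( \sum_{j=1}^{p} f_j(X_j) \Big)

% Hazard ratio curve for covariate X_j relative to a reference value x_ref
\mathrm{HR}_j(x, x_{\mathrm{ref}}) = \exp\big( f_j(x) - f_j(x_{\mathrm{ref}}) \big),
\qquad \mathrm{HR}_j(x_{\mathrm{ref}}, x_{\mathrm{ref}}) = 1

% Assumed Wald-type pointwise confidence limits, built on the log-hazard scale
\exp\Big( \hat f_j(x) - \hat f_j(x_{\mathrm{ref}})
  \pm z_{1-\alpha/2}\, \widehat{\mathrm{se}}\big[ \hat f_j(x) - \hat f_j(x_{\mathrm{ref}}) \big] \Big)
```

By construction the curve equals 1 at the reference value, so every other point reads as the hazard at x relative to the hazard at x_ref, with the remaining covariates held fixed.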
In this paper, we use natural cubic regression splines (de Boor [7]) and penalized splines (P-splines; Eilers and Marx [8]) to reflect the nature of continuous covariate effects in the additive Cox model. One of the most commonly used measures of such an effect is the hazard ratio (HR) function. Cadarso-Suárez et al. [9] proposed a flexible method for constructing smooth hazard ratio curves with confidence limits, which allows the results to be expressed in a manner that is standard in clinical survival studies. The authors suggest the use of an additive Cox model in which the effects of continuous predictors on the log hazard are modeled nonlinearly using P-splines. This paper
References
[1]
D. Cox, “Regression models and life tables (with discussion),” Journal of the Royal Statistical Society B, vol. 34, pp. 187–220, 1972.
[2]
J. Kalbfleisch and R. Prentice, The Statistical Analysis of Failure Time Data, John Wiley & Sons, New York, NY, USA, 1980.
[3]
T. Hastie and R. Tibshirani, “Exploring the nature of covariate effects in the proportional hazards model,” Biometrics, vol. 46, no. 4, pp. 1005–1016, 1990.
[4]
R. Gray, “Flexible methods for analyzing survival data using splines, with application to breast cancer prognosis,” Journal of the American Statistical Association, vol. 87, pp. 942–951, 1992.
[5]
J. Z. Huang, C. Kooperberg, C. J. Stone, and Y. K. Truong, “Functional ANOVA modeling for proportional hazards regression,” Annals of Statistics, vol. 28, no. 4, pp. 961–999, 2000.
[6]
J. Z. Huang and L. Liu, “Polynomial spline estimation and inference of proportional hazards regression models with flexible relative risk form,” Biometrics, vol. 62, no. 3, pp. 793–802, 2006.
[7]
C. de Boor, A Practical Guide to Splines, Springer, New York, NY, USA, 2001.
[8]
P. H. C. Eilers and B. D. Marx, “Flexible smoothing with B-splines and penalties,” Statistical Science, vol. 11, no. 2, pp. 89–121, 1996.
[9]
C. Cadarso-Suárez, L. Meira-Machado, T. Kneib, and F. Gude, “Flexible hazard ratio curves for continuous predictors in multi-state models: an application to breast cancer data,” Statistical Modelling, vol. 10, no. 3, pp. 291–314, 2010.
[10]
H. Akaike, “A new look at the statistical model identification,” IEEE Transactions on Automatic Control, vol. 19, no. 6, pp. 716–723, 1974.
[11]
C. M. Hurvich, J. S. Simonoff, and C.-L. Tsai, “Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion,” Journal of the Royal Statistical Society B, vol. 60, no. 2, pp. 271–293, 1998.
[12]
G. Schwarz, “Estimating the dimension of a model,” The Annals of Statistics, vol. 6, pp. 461–464, 1978.
[13]
C. T. Volinsky and A. E. Raftery, “Bayesian information criterion for censored survival models,” Biometrics, vol. 56, no. 1, pp. 256–262, 2000.
[14]
T. Therneau and P. Grambsch, Modeling Survival Data: Extending the Cox Model, Springer, New York, NY, USA, 2000.
[15]
C. A. Struthers and J. D. Kalbfleisch, “Misspecified proportional hazard models,” Biometrika, vol. 73, no. 2, pp. 363–369, 1986.
[16]
G. Anderson and T. Fleming, “Model misspecification in proportional hazards regression,” Biometrika, vol. 82, pp. 527–541, 1995.
[17]
T. Kneib and L. Fahrmeir, “A mixed model approach for geoadditive hazard regression,” Scandinavian Journal of Statistics, vol. 34, no. 1, pp. 207–228, 2007.
[18]
S. Durrleman and R. Simon, “Flexible regression models with cubic splines,” Statistics in Medicine, vol. 8, no. 5, pp. 551–561, 1989.
[19]
J. H. Friedman, “Multivariate adaptive regression splines,” The Annals of Statistics, vol. 19, no. 1, pp. 1–67, 1991.
[20]
P. Royston and M. K. B. Parmar, “Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects,” Statistics in Medicine, vol. 21, no. 15, pp. 2175–2197, 2002.
[21]
U. S. Govindarajulu, D. Spiegelman, S. W. Thurston, B. Ganguli, and E. A. Eisen, “Comparing smoothing techniques in Cox models for exposure-response relationships,” Statistics in Medicine, vol. 26, no. 20, pp. 3735–3752, 2007.
[22]
S. Wood, Generalized Additive Models: An Introduction With R, Chapman & Hall, London, UK, 2006.
[23]
M. Tsujitani and Y. Tanaka, “Analysis of heart transplant survival data using generalized additive models,” Computational and Mathematical Methods in Medicine, vol. 2013, Article ID 609857, 7 pages, 2013.
[24]
R Development Core Team, “R: A Language and Environment for Statistical Computing,” R Foundation for Statistical Computing, Vienna, Austria, 2012, http://www.R-project.org/.
[25]
P. M. Grambsch and T. M. Therneau, “Proportional hazards tests and diagnostics based on weighted residuals,” Biometrika, vol. 81, no. 3, pp. 515–526, 1994.
[26]
B. Cid-Alvarez, F. Gude, C. Cadarso-Suarez et al., “Admission and fasting plasma glucose for estimating risk of death of diabetic and nondiabetic patients with acute coronary syndrome: nonlinearity of hazard ratios and time-dependent comparison,” American Heart Journal, vol. 158, no. 6, pp. 989–997, 2009.