Lung cancer is one of the leading causes of death worldwide, with an estimated 2.1 million cases in 2018. To analyze the risk factors behind lung cancer survival, this paper employs two main models: the Kaplan-Meier estimator and the Cox proportional hazards model [1]. The log-rank test and the Wald test are used to test whether an association between a factor and survival exists, as discussed in detail in later parts of the paper. The aim is to identify the most influential factors for the survival probability of lung cancer patients. In summary, cancer stage is consistently a significant factor for lung cancer survival, and time has to be taken into account when analyzing the survival rate of patients in our data sample, which is drawn from TCGA. Further study is required to improve the treatment of lung cancer, as our data sample might not represent the overall condition of patients diagnosed with lung cancer; in addition, more appropriate and advanced models should be employed to capture in detail the factors that affect the survival rate of lung cancer patients.
References
[1]
Cox, D.R. (2018) Analysis of Survival Data. Chapman and Hall/CRC, London.
[2]
WHO (2018) Cancer. https://www.who.int/news-room/fact-sheets/detail/cancer
[3]
ACS (2019) What Is Lung Cancer? https://www.cancer.org/content/cancer/en/cancer/lung-cancer/about/what-is.html
[4]
Furrukh, M. (2013) Tobacco Smoking and Lung Cancer: Perception-Changing Facts. Sultan Qaboos University Medical Journal, 13, 345. https://doi.org/10.12816/0003255
[5]
Popper, H.H. (2016) Progression and Metastasis of Lung Cancer. Cancer and Metastasis Reviews, 35, 75-91. https://doi.org/10.1007/s10555-016-9618-0
[6]
NCI (2017) Cancer Genome Research and Precision Medicine. https://www.cancer.gov/about-nci/organization/ccg/cancer-genomics-overview
[7]
Stewart, B.W. and Kleihues, P. (2003) World Cancer Report.
[8]
NIH (2019) The Cancer Genome Atlas Program. https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga
[9]
Campbell, J.D., et al. (2016) Distinct Patterns of Somatic Genome Alterations in Lung Adenocarcinomas and Squamous Cell Carcinomas. Nature Genetics, 48, 607. https://doi.org/10.1038/ng.3564
[10]
Kaplan, E.L. and Meier, P. (1958) Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53, 457-481. https://doi.org/10.1080/01621459.1958.10501452
[11]
Collett, D. (2015) Modelling Survival Data in Medical Research. Chapman and Hall/CRC, London.
[12]
Bland, J.M. and Altman, D.G. (2004) The Log Rank Test. British Medical Journal, 328, 1073. https://doi.org/10.1136/bmj.328.7447.1073
[13]
Fahrmeir, L., et al. (2013) Regression: Models, Methods and Applications. Springer Science & Business Media, New York. https://doi.org/10.1007/978-3-642-34333-9_2
[14]
Goel, M.K., Khanna, P. and Kishore, J. (2010) Understanding Survival Analysis: Kaplan-Meier Estimate. International Journal of Ayurveda Research, 1, 274. https://doi.org/10.4103/0974-7788.76794