Non-small cell lung cancer (NSCLC) has two major subtypes: adenocarcinoma (AC) and squamous cell carcinoma (SCC). The diagnosis and treatment of NSCLC are hindered by limited knowledge of the pathogenic mechanisms of its subtypes, so the molecular mechanisms related to AC and SCC need to be investigated. In this work, we improved the logic analysis algorithm to mine the sufficient and necessary conditions for the presence states (presence or absence) of phenotypes. We applied our method to AC and SCC specimens and identified lower and higher logic relationships between genes and the two subtypes of NSCLC. The discovered relationships were independent of the specimens selected, and their significance was validated by statistical testing. Compared with two earlier methods (the non-negative matrix factorization method and the relevance analysis method), the current method achieved a higher recall rate and classification accuracy on NSCLC and normal specimens. We obtained a set of biomarkers. Among them, some genes have already been used in practice to distinguish AC from SCC, and six other genes were newly discovered biomarkers for distinguishing the subtypes. Furthermore, NKX2-1 has been considered a molecular target for the targeted therapy of AC, and other genes may be novel molecular targets. Gene ontology analysis showed that two biological processes (‘epidermis development’ and ‘cell adhesion’) were closely related to the tumorigenesis of NSCLC subtypes. More generally, the current method could be extended to other complex diseases to distinguish subtypes and to detect molecular targets for targeted therapy.
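To make the idea of sufficient and necessary conditions concrete, the following is a minimal sketch and not the improved logic analysis algorithm itself. It assumes that gene expression and phenotype presence have already been binarized per specimen, and it scores sufficiency and necessity with simple conditional frequencies. The gene names, the data, and the helper functions `sufficiency`, `necessity`, and `logic_or` are all hypothetical. The OR combination of two genes is used here as one assumed example of a higher logic relationship.

```python
def sufficiency(gene, phenotype):
    """Fraction of specimens with the gene 'on' that also show the phenotype."""
    hits = [p for g, p in zip(gene, phenotype) if g]
    return sum(hits) / len(hits) if hits else 0.0


def necessity(gene, phenotype):
    """Fraction of specimens showing the phenotype in which the gene is 'on'."""
    hits = [g for g, p in zip(gene, phenotype) if p]
    return sum(hits) / len(hits) if hits else 0.0


def logic_or(a, b):
    """Element-wise OR of two binary gene states (an assumed higher logic form)."""
    return [int(x or y) for x, y in zip(a, b)]


# Hypothetical binarized data over six specimens:
# 1 = gene expressed / phenotype present, 0 = otherwise.
phenotype = [1, 1, 1, 0, 0, 0]
genes = {
    "geneA": [1, 1, 1, 0, 0, 0],
    "geneB": [1, 0, 1, 0, 0, 0],
    "geneC": [0, 1, 0, 0, 0, 0],
}

# Lower logic: each gene scored against the phenotype on its own.
lower = {
    name: (sufficiency(state, phenotype), necessity(state, phenotype))
    for name, state in genes.items()
}

# Higher logic: geneB OR geneC scored against the phenotype.
b_or_c = logic_or(genes["geneB"], genes["geneC"])
higher = (sufficiency(b_or_c, phenotype), necessity(b_or_c, phenotype))

for name, (suf, nec) in lower.items():
    print(f"{name}: sufficiency={suf:.2f}, necessity={nec:.2f}")
print(f"geneB OR geneC: sufficiency={higher[0]:.2f}, necessity={higher[1]:.2f}")
```

In this toy data, geneB and geneC are each sufficient but not necessary for the phenotype, while their OR combination is both. This is the kind of relationship that a pairwise analysis alone would miss.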