Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining

DOI: 10.4236/ijis.2017.71002, PP. 9-23

Keywords: Sequential Pattern, Breast Cancer, DNA, PrefixSpan, Lift Ratio

Abstract:

This paper proposes a sequential pattern discovery method for a Deoxyribonucleic Acid (DNA) sequence database in order to identify cancer. In cancer, the DNA that encodes the amino acids of gene P53 is mutated, which alters the formation of P53. Sequential pattern discovery is a data mining process that extracts knowledge about series of events whose sequences recur with a certain frequency and therefore form a pattern. PrefixSpan is the method proposed to find patterns in the DNA sequence database. As a result, various DNA sequence patterns are selected. The patterns with high similarity are used as biomarkers to identify breast cancer. The first performance measure, the average support value, is 0.8, which means that the selected sequence patterns occur frequently. The second measure is confidence; all of the confidence values are 1. The last performance measure is the lift ratio, which averages more than 1, meaning that the sequence items composing a pattern have high dependency and relatedness. Furthermore, the selected patterns, applied as biomarkers, achieve an accuracy of 100%.
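To make the measures named in the abstract concrete, the following is a minimal sketch of PrefixSpan-style mining together with support, confidence, and lift for a sequential rule. It is not the authors' implementation. It assumes each sequence is a plain string of single-letter symbols, that a pattern matches when its items appear in order (gaps allowed), and that lift is computed as confidence divided by the support of the consequent. The toy sequences and the function names `prefixspan`, `support`, and `rule_measures` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: count} for every subsequence found in at least
    `min_support` sequences (simplified PrefixSpan, single-item elements)."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, offset) pairs: the suffix of
        # each sequence that remains after matching `prefix`.
        counts = defaultdict(int)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                counts[item] += 1  # counted once per sequence
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project on the first occurrence of `item` in each suffix.
            next_projected = []
            for sid, start in projected:
                pos = sequences[sid].find(item, start)
                if pos != -1:
                    next_projected.append((sid, pos + 1))
            mine(pattern, next_projected)

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return results


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a subsequence."""
    hits = sum(is_subsequence(pattern, seq) for seq in sequences)
    return hits / len(sequences)


def rule_measures(antecedent, consequent, sequences):
    """Support, confidence, and lift of the rule antecedent -> consequent,
    where the full pattern is the antecedent followed by the consequent."""
    supp_rule = support(antecedent + consequent, sequences)
    confidence = supp_rule / support(antecedent, sequences)
    lift = confidence / support(consequent, sequences)
    return supp_rule, confidence, lift


# Hypothetical toy data, not taken from the paper's P53 database.
toy_sequences = ["MEEPQ", "MEPQS", "MEEQS", "AMEPQ"]
patterns = prefixspan(toy_sequences, min_support=3)
measures = rule_measures(("M", "E"), ("P", "Q"), toy_sequences)
```

Under these assumptions, a lift above 1 indicates that the consequent follows the antecedent more often than its overall frequency would suggest, which is the sense of "dependency and relatedness" used in the abstract.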

