Machine-Learning-Based Pattern Recognition of Marker Gases for Urinary Diseases
Abstract:
Urinary diseases such as bladder cancer and prostate cancer pose a serious threat to global health. This study targets four volatile organic compounds (VOCs) that serve as marker gases for urinary diseases: toluene, ethylbenzene, isopropanol, and pentanal. Conventional gas-sensor features that correlate with these four gases were selected from the multi-sensor signals collected by an electronic nose and used to build a four-class VOC classification model. Principal component analysis (PCA) was applied to reduce the dimensionality of the samples, and three classifiers were then evaluated: K-nearest neighbors (KNN), support vector machine (SVM), and random forest (RF), which reached accuracies of 88%, 85%, and 91%, respectively. Finally, the classifiers were combined pairwise by stacking (KNN with SVM, KNN with RF, and SVM with RF). Stacking improved accuracy noticeably; the best-performing pair, SVM with RF, reached 97%. The results indicate that the stacked SVM and RF model can classify the four marker VOCs accurately, laying groundwork for early screening and non-invasive detection of major urinary diseases.
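The workflow summarized in the abstract (PCA, then KNN, SVM, and RF, then pairwise stacking) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for electronic-nose features, and the number of principal components, all hyperparameters, and the logistic-regression meta-learner are hypothetical choices that the abstract does not specify. The accuracies it prints will therefore not match the reported 88%, 85%, 91%, and 97%.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic data standing in for e-nose sensor features,
# with four classes (one per marker VOC).
X, y = make_classification(
    n_samples=400, n_features=24, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def base_models():
    """Fresh, unfitted copies of the three classifiers (hyperparameters assumed)."""
    return {
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(kernel="rbf", random_state=0),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    }


def with_pca(clf):
    """Standardize, reduce with PCA (5 components is an assumption), then classify."""
    return make_pipeline(StandardScaler(), PCA(n_components=5), clf)


scores = {}

# Single classifiers on the PCA-reduced features.
for name, clf in base_models().items():
    scores[name] = with_pca(clf).fit(X_train, y_train).score(X_test, y_test)

# Pairwise stacking: KNN+SVM, KNN+RF, SVM+RF.
for a, b in combinations(base_models(), 2):
    models = base_models()
    stack = StackingClassifier(
        estimators=[(a, models[a]), (b, models[b])],
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
    )
    scores[f"{a}+{b}"] = with_pca(stack).fit(X_train, y_train).score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Placing the scaler and PCA inside the pipeline means they are fitted on the training split only, so no information from the test split leaks into the dimensionality reduction.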