A Review of the Applications of Generative Models in Chest X-Ray Images
Abstract:
The rapid development of generative models in medical imaging has provided new technical means for synthesizing, editing, and enhancing chest X-ray (CXR) images. This paper systematically reviews research progress on, and applications of, generative adversarial networks (GANs), variational autoencoders (VAEs), and diffusion models for CXR images. VAEs model the data distribution through latent variables and perform robustly in disease detection and image reconstruction, but the images they generate are often blurry. GANs produce highly realistic images and are widely used to address data scarcity and for cross-modal image synthesis, but their training instability and tendency toward mode collapse remain to be addressed. Diffusion models, which generate images through stepwise denoising, show the potential to surpass GANs in image quality and diversity and have become a current research focus. The paper further analyzes the state of research in this field, summarizes innovative applications of generative models in data augmentation, image generation, and image editing, and compares the strengths and limitations of the different techniques. Although generative models have achieved notable results in improving diagnostic efficiency and protecting data privacy, they still face ethical and legal challenges as well as limited model generalization. Future research should focus on multimodal generation, the design of privacy-preserving frameworks, and the disentanglement of pathological features, in order to advance the practical clinical application of generative models. This review offers a technical reference and directions for researchers in the field of medical imaging.