%0 Journal Article
%T Image Segmentation Method Based on Encoder-Decoder Architecture and Potts Model
%A 张同德
%A 潘振宽
%A 魏伟波
%A 王烨然
%J Journal of Image and Signal Processing
%P 213-223
%@ 2325-6745
%D 2025
%I Hans Publishing
%R 10.12677/jisp.2025.142020
%X Deep learning-driven image segmentation has proven highly effective in fields such as medical imaging and autonomous driving. However, its black-box decision process leaves model selection and hyperparameter tuning without theoretical guidance, and it relies heavily on large datasets and substantial computing power. Variational model-based methods, by contrast, are often limited to local feature extraction and tend to neglect global context, yet by combining global statistical regularities with local smoothness constraints they offer mathematical interpretability and robustness to noise and artifacts. This paper therefore proposes a U-Net-like architecture obtained by unfolding the Potts model, with the aim of improving segmentation accuracy and robustness. Unlike the traditional U-Net, the proposed method introduces Potts model-based regularization blocks in the downsampling and upsampling paths to strengthen region consistency and edge preservation. The Potts model is solved with the Half-Quadratic Splitting (HQS) method and combined with a Fields of Experts (FoE) regularization term. Trainable Discrete Cosine Transform (DCT)-Gaussian convolutions are used to learn the gradient operators, and the Soft Thresholding Formula (STF) serves as the activation function. In addition, a Transformer structure is added at the deepest layer of the network to capture global context and handle long-range dependencies, further improving segmentation results. Experimental results show that the proposed model learns features effectively with few parameters and limited training data while improving segmentation accuracy. This study offers a new perspective on image segmentation and highlights the potential of architectures that combine deep neural networks with classical variational models.
%K Potts Model
%K Deep Learning
%K Variational Network
%K Image Segmentation
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=110637