%0 Journal Article
%T Image Segmentation Method Based on Encoder-Decoder Architecture and Potts Model
%A 张同德
%A 潘振宽
%A 魏伟波
%A 王烨然
%J Journal of Image and Signal Processing
%P 213-223
%@ 2325-6745
%D 2025
%I Hans Publishing
%R 10.12677/jisp.2025.142020
%X Deep learning-driven image segmentation has proven highly effective in fields such as medical imaging and autonomous driving. However, its black-box decision making leaves model selection and hyperparameter tuning without theoretical guidance, and it relies heavily on large datasets and substantial computing power. Variational model-based methods, by contrast, are often limited to local feature extraction and tend to neglect global context, yet by combining global statistical regularities with local smoothness constraints they offer advantages in mathematical interpretability and in resistance to noise and artifacts. This paper therefore proposes a U-Net-like architecture obtained by unfolding the Potts model, with the aim of improving segmentation accuracy and robustness. Unlike the traditional U-Net, the method introduces Potts model-based regularization blocks in the downsampling and upsampling paths to strengthen region consistency and edge preservation. The Potts model is solved with the Half-Quadratic Splitting (HQS) method and combined with a Fields of Experts (FoE) regularization term; trainable Discrete Cosine Transform (DCT)-Gaussian convolutions are used to learn the gradient operators, and the Soft Thresholding Formula (STF) serves as the activation function. In addition, a Transformer structure is added at the deepest layer of the network to capture global contextual information and handle long-range dependencies, further improving segmentation results. Experimental results show that the proposed model learns features effectively with few parameters and small datasets while improving segmentation accuracy. This study offers a new perspective on image segmentation and demonstrates the broad potential of architectures that combine deep neural networks with classical variational models.
%K Potts Model
%K Deep Learning
%K Variational Network
%K Image Segmentation
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=110637