%0 Journal Article
%T Transformer-Based Video Frame Interpolation with MB Mask Guidance
%A 石明光
%A 王晓红
%A 马春运
%J Software Engineering and Applications
%P 201-216
%@ 2325-2278
%D 2025
%I Hans Publishing
%R 10.12677/sea.2025.142019
%X To improve the generation quality of flow-based video frame interpolation methods in changing regions, we propose a novel two-stage video frame interpolation framework that guides the refinement of intermediate frames under the constraint of optical flow motion information. Because capturing long-range correlations improves the accuracy of optical flow estimation, we propose BWT-FlowNet, an optical flow estimation network that integrates a bi-level window Transformer with a content-aware mechanism to capture long-range spatial-temporal interactions in video sequences. The motion information in the optical flow is then used to predict a Motion Boundary Mask (MB Mask), which helps the network focus on content-changing regions while refining the intermediate frames. We also develop a Motion Boundary-Aware Refinement Net (MBAR Net) to refine the intermediate frames. Pyramid MB Masks are used in the sub-layers of MBAR Net to highlight motion regions. In addition, a Mask Perceptual Loss is introduced to constrain content-changing regions effectively, improving the quality of the predicted frames. Experiments show that the proposed method achieves excellent performance on several public benchmarks.
%K Video Frame Interpolation
%K Optical Flow Estimation
%K Mask
%K Transformer
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=111077