|
- 2018
FPGA-based hardware implementation of the new edge-directed interpolation algorithm. DOI: 10.3785/j.issn.1008-973X.2018.11.022 Abstract: New edge-directed interpolation (NEDI) is an image super-resolution algorithm with high computational complexity and long software run times. To address this, a scalable hardware design for NEDI based on Cholesky decomposition was proposed. Cholesky decomposition was used to simplify the complex matrix inversion in NEDI, and a low-latency fixed-point divider based on the Goldschmidt algorithm was designed to accelerate the inversion. Multicycle computation was used to hide the wait time caused by data dependencies and to reduce hardware resource utilization. Because the core calculation of NEDI is invariant across window sizes, the scalable core circuit was designed with fixed resources and the corresponding expansion circuit with variable resources, which further reduced hardware resource usage. The design was implemented on a field programmable gate array (FPGA). Experimental results indicated that the critical path delay of the scalable NEDI hardware was 7.007 ns and that its operating frequency exceeded 100 MHz. Compared with results computed by software on a PC, the results of the scalable NEDI hardware circuit had a maximum error of 0.1%, and the calculation speed was 51 times that of the PC software.
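The two numerical techniques named in the abstract can be illustrated with a small software sketch. This is not the authors' circuit: the Q16 fixed-point format, the four Goldschmidt iterations, the use of `math.sqrt` for the diagonal terms, and the function names `goldschmidt_div` and `cholesky_solve` are all assumptions made for illustration. The sketch solves a symmetric positive-definite system R a = r (the form of the NEDI normal equations) by Cholesky factorization, so that no explicit matrix inverse is formed and every division goes through the iterative divider.

```python
import math

FRAC = 16            # assumed number of fractional bits (Q16 fixed point)
ONE = 1 << FRAC      # fixed-point representation of 1.0


def goldschmidt_div(n, d, iterations=4):
    """Approximate n / d (d > 0) with a fixed-point Goldschmidt iteration."""
    num = round(n * ONE)
    den = round(d * ONE)
    if den <= 0:
        raise ValueError("divisor must be positive and representable")
    # Scale the divisor by a power of two into [0.5, 1.0).
    shift = den.bit_length() - FRAC
    den = den >> shift if shift >= 0 else den << -shift
    for _ in range(iterations):
        f = 2 * ONE - den            # correction factor F = 2 - D
        num = (num * f) >> FRAC      # N <- N * F
        den = (den * f) >> FRAC      # D <- D * F, converges towards 1.0
    # Undo the power-of-two scaling applied to the divisor.
    return (num / ONE) * 2.0 ** (-shift)


def cholesky_solve(R, r, div=goldschmidt_div):
    """Solve R a = r for symmetric positive-definite R via R = L L^T."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = R[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)       # assumption: square root taken in software
        for i in range(j + 1, n):
            t = R[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = div(t, L[j][j])
    # Forward substitution: L z = r
    z = [0.0] * n
    for i in range(n):
        z[i] = div(r[i] - sum(L[i][k] * z[k] for k in range(i)), L[i][i])
    # Back substitution: L^T a = z
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = div(z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n)), L[i][i])
    return a
```

For example, `cholesky_solve([[4.0, 2.0], [2.0, 3.0]], [2.0, 1.0])` returns values close to `[0.5, 0.0]`. The Goldschmidt iteration uses only multiplications and a subtraction per step, which is one reason it suits a low-latency hardware divider; the accuracy of this sketch is limited by the assumed 16 fractional bits and by truncation in each multiply.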
|