Abstract:
In this paper, we propose a collection of approximations for the 8-point discrete cosine transform (DCT) based on integer functions. The approximations are obtained systematically, and several existing approximations are identified as particular cases. The resulting approximations are compared with the exact DCT and assessed in the context of JPEG-like image compression.
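
As a rough illustration of the idea, the sketch below builds the exact 8-point DCT matrix, scales it, and applies an integer function entry by entry. The scaling factor `alpha = 2.0`, the choice of rounding as the integer function, and all function names are assumptions made for this example; the abstract does not state which functions or parameters the paper uses.

```python
import math

def dct_matrix(n=8):
    """Exact orthonormal n-point DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero."""
    magnitude = math.floor(abs(x) + 0.5)
    return magnitude if x >= 0 else -magnitude

def integer_approximation(matrix, alpha, int_func):
    """Apply int_func to every entry of alpha * matrix (illustrative helper)."""
    return [[int(int_func(alpha * entry)) for entry in row] for row in matrix]

C = dct_matrix()
# Assumed parameters: alpha = 2.0 with rounding; math.floor or math.ceil
# could be passed instead to explore other candidates.
T = integer_approximation(C, 2.0, round_half_away)

for row in T:
    print(row)
```

With these assumed parameters every entry of `T` lies in {-1, 0, 1}, so multiplying by `T` needs only additions and subtractions. A low-complexity matrix obtained this way is generally not orthonormal by itself and would typically be paired with a diagonal scaling that can be absorbed into the quantization step of a JPEG-like codec.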

Abstract:
The multimedia market's demand for low-energy video processing systems, such as those based on HEVC, has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures, physically realized as digital prototype circuits using FPGA technology, and mapped to 45 nm CMOS technology.

Abstract:
An algebraic integer (AI) based time-multiplexed row-parallel architecture and two final-reconstruction step (FRS) algorithms are proposed for the implementation of the bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between row-column transforms, leading to an 8$\times$8 2-D DCT which is entirely free of quantization errors in the AI basis. As a result, the user-selectable accuracy in the FRS allows each of the 64 coefficients to have its precision set independently of the others, avoiding the leakage of quantization noise between channels that occurs in published DCT designs. The proposed FRS uses two approaches based on (i) optimized Dempster-Macleod multipliers and (ii) expansion factor scaling. This architecture enables low-noise, high-dynamic-range applications in digital video processing that require full control of the finite-precision computation of the 2-D DCT. The proposed architectures and FRS techniques are experimentally verified and validated using hardware implementations that are physically realized on an FPGA chip. Six designs, for 4- and 8-bit input word sizes, using the two proposed FRS schemes, have been designed, simulated, physically implemented and measured. The maximum clock rate and block rate achieved among 8-bit input designs are 307.787 MHz and 38.47 MHz, respectively, implying a pixel rate of 8$\times$307.787 MHz $\approx$ 2.462 GHz if eventually embedded in a real-time video-processing system. The equivalent frame rate is about 1187.35 Hz for an image size of 1920$\times$1080. All implementations are functional on a Xilinx Virtex-6 XC6VLX240T FPGA device.

Abstract:
A low-complexity digital VLSI architecture for the computation of an algebraic integer (AI) based 8-point Arai DCT algorithm is proposed. AI encoding schemes for exact representation of the Arai DCT transform, based on a particularly sparse 2-D AI representation, are reviewed, leading to the proposed novel architecture based on a new final reconstruction step (FRS) having lower complexity and higher accuracy compared to the state-of-the-art. This FRS is based on an optimization derived from expansion factors that leads to small integer constant-coefficient multiplications, which are realized with common sub-expression elimination (CSE) and Booth encoding. The reference circuit [1] as well as the proposed architectures for two expansion factors α* = 4.5958 and α′ = 167.2309 are implemented. The proposed circuits show 150% and 300% improvements in the number of DCT coefficients having error ≤ 0.1% compared to [1]. The three designs were realized using 40 nm CMOS Xilinx Virtex-6 FPGAs and also synthesized using 65 nm CMOS general-purpose standard cells from TSMC. Post-synthesis timing analysis of the 65 nm CMOS realizations at 900 mV for all three designs of the 8-point DCT core for 8-bit inputs shows potential real-time operation at a 2.083 GHz clock frequency, leading to a combined throughput of 2.083 billion 8-point Arai DCTs per second. The expansion-factor designs show a 43% reduction in area (A) and a 29% reduction in dynamic power (PD) for FPGA realizations. An 11% reduction in area is observed for the ASIC design for α* = 4.5958, together with an 8% reduction in total power (PT). Our second ASIC design, having α′ = 167.2309, shows marginal improvements in area and power compared to our reference design but at significantly better accuracy.

The integer-part (floor) function is a common function. Its form is simple, its properties are distinctive, and it is widely used in problems involving limits, derivatives, and integrals. Using the properties of the floor function, we establish a mapping from the set of real numbers to the set of integers for the problem of exchanging empty beer bottles for beer, converting any real number into an integer in order to determine how to buy beer more economically. By transforming the bottle-exchange problem of a supermarket promotion into a mathematical model, we derive several results from the properties of the floor function. We then explore and extend the model theoretically and obtain general conclusions. Finally, we apply these conclusions to simplify practical problems.
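
To make the bottle-exchange model concrete, here is a minimal sketch under an assumed promotion rule: `bottles_per_beer` empty bottles can be exchanged for one full beer, with no borrowing of bottles. The rule, the function names, and the closed form shown are illustrative assumptions, since the abstract does not state the exact promotion or the paper's formulas. Under this rule, each exchange reduces the number of empties by `bottles_per_beer - 1`, which gives the floor-function expression used in the second function.

```python
def beers_by_simulation(bought, bottles_per_beer):
    """Drink every beer and keep exchanging empties until too few remain."""
    total = bought
    empties = bought
    while empties >= bottles_per_beer:
        new_beers = empties // bottles_per_beer
        total += new_beers
        # Leftover empties plus the empties from the newly obtained beers.
        empties = empties % bottles_per_beer + new_beers
    return total

def beers_by_floor(bought, bottles_per_beer):
    """Closed form via floor division: n + floor((n - 1) / (k - 1)) for n >= 1."""
    if bought == 0:
        return 0
    return bought + (bought - 1) // (bottles_per_beer - 1)

# Both functions assume bottles_per_beer >= 2.
print(beers_by_simulation(10, 3), beers_by_floor(10, 3))
```

The simulation and the closed form agree for every input under the assumed rule, so the floor expression can replace the step-by-step exchange when comparing purchase options.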