The paper presents a unified hybrid architecture to compute the integer inverse discrete cosine transform (IDCT) of multiple modern video codecs: AVS, H.264/AVC, VC-1, and HEVC (under development). Based on the symmetric structure of the matrices and the similarity in matrix operations, we develop a generalized “decompose and share” algorithm to compute the IDCT. The algorithm is then applied to the four video standards. The hardware-sharing approach maximizes circuit reuse during the computation. The architecture is designed with only adders and shifters, which reduces the hardware cost significantly. The design is implemented on FPGA and later synthesized in 0.18 µm CMOS technology. The results meet the requirements of advanced video coding applications.

1. Introduction

In recent years, different video applications have adopted different video standards, such as H.264/AVC [1], VC-1 [2], and AVS [3]. To improve coding efficiency further, a joint collaborative team on video coding (JCT-VC) is drafting a next-generation video coding standard, known tentatively as High Efficiency Video Coding (HEVC or H.265) [4]. The target bit rate is half of that of H.264/AVC. In addition, the draft proposes several other techniques to reduce the complexity of the encoder, such as improved intrapicture coding and simpler VLC coefficients [5]. As a result of these new features, experts predict that HEVC will dominate the future multimedia market.

To meet the present and future demands of different multimedia applications, it becomes necessary to develop a unified video decoder that can support all popular video standards on a single platform. In recent years, there has been growing interest in developing multistandard inverse transform architectures for advanced multimedia applications. However, most of them do not support AVS, the video codec developed by the Chinese government that became the core technology of China Mobile Multimedia Broadcasting (CMMB) [6].
None of the existing works supports HEVC; though it is not finalized yet, considering the future prospects of HEVC [7], it is important to start exploring possible hardware implementations of the transform unit discussed in the draft. In this paper, we present a new generalized algorithm and its hardware implementation of an 8 × 8 IDCT architecture. The scheme is based on matrix decomposition with sparse matrices and offset computations. These sparse matrices are derived so that they can be reused the maximum number of times when decoding different inverse matrices. All multipliers
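The two ideas named above, exploiting matrix symmetry and using only adders and shifters, can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the 4-point matrix `TOY` is a small symmetric integer matrix chosen purely for illustration, and the names `mul_shift_add`, `direct`, and `butterfly` are hypothetical helpers introduced here.

```python
def mul_shift_add(x, c):
    """Multiply x by a non-negative integer constant c using only shifts and adds."""
    acc = 0
    bit = 0
    while c:
        if c & 1:
            acc += x << bit  # add the shifted copy for each set bit of c
        c >>= 1
        bit += 1
    return acc


# Illustrative 4-point integer transform matrix (an assumption, not from the paper).
# Even rows are symmetric and odd rows are antisymmetric about the centre.
TOY = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def direct(x):
    """Reference result: plain matrix-vector product with multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in TOY]


def butterfly(x):
    """Same product using the symmetry: shared sums/differences, shifts, and adds."""
    a0, a1 = x[0] + x[3], x[1] + x[2]  # shared by the symmetric (even) rows
    b0, b1 = x[0] - x[3], x[1] - x[2]  # shared by the antisymmetric (odd) rows
    return [a0 + a1, (b0 << 1) + b1, a0 - a1, b0 - (b1 << 1)]


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(direct(sample), butterfly(sample))
    print(mul_shift_add(7, 12))  # 12 = 8 + 4, so two shifted copies are added
```

In this toy case the butterfly form needs eight additions or subtractions and two shifts, with no multiplier, and the intermediate sums `a0`, `a1`, `b0`, `b1` are each reused by two output rows.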
References
[1]
ITU-T Rec, “H.264/ISO/IEC 14496-10 AVC,” 2003.
[2]
“Standard for Television: VC-1 Compressed Video Bitstream Format and Decoding Process,” SMPTE 421M, 2006.
[3]
GB/T 20090.1, “Information technology - Advanced coding of audio and video – Part 1: System,” Chinese AVS standard.
[4]
G. J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding (HEVC),” in Applications of Digital Image Processing XXXIII, vol. 7798 of Proceedings of SPIE, August 2010.
[5]
K. Ugur, K. Andersson, A. Fuldseth et al., “High performance, low complexity video coding and the emerging HEVC standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 12, pp. 1688–1697, 2010.
[6]
C. C. Ju, Y. C. Chang, C. Y. Cheng et al., “A full-HD 60fps AVS/H.264/VC-1/MPEG-2 video decoder for digital home applications,” in International Symposium on VLSI Design, Automation and Test (VLSI-DAT '11), pp. 117–120, April 2011.
[7]
Joint Collaborative Team – Video Coding, CE10: Core transform design for HEVC, JCTVC-G495, Geneva, Switzerland, 2011.
[8]
S. Lee and K. Cho, “Architecture of transform circuit for video decoder supporting multiple standards,” Electronics Letters, vol. 44, no. 4, pp. 274–276, 2008.
[9]
S. Kim, H. Chang, S. Lee, and K. Cho, “VLSI design to unify IDCT and IQ circuit for multistandard video decoder,” in 12th International Symposium on Integrated Circuits (ISIC '09), pp. 328–331, December 2009.
[10]
H. Qi, Q. Huang, and W. Gao, “A low-cost very large scale integration architecture for multistandard inverse transform,” IEEE Transactions on Circuits and Systems II, vol. 57, no. 7, pp. 551–555, 2010.
[11]
S. Lee and K. Cho, “Circuit implementation for transform and quantization operations of H.264/MPEG-4/VC-1 video decoder,” in International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS '07), pp. 102–107, September 2007.
[12]
K. A. Wahid, M. Martuza, M. Das, and C. McCrosky, “Efficient hardware implementation of integer cosine transforms for multiple video codecs,” Journal of Real-Time Image Processing. In press.
[13]
G. Liu, “An area-efficient IDCT architecture for multiple video standards,” in 2nd International Conference on Information Science and Engineering (ICISE '10), pp. 3518–3522, December 2010.
[14]
C. P. Fan and G. A. Su, “Efficient low-cost sharing design of fast 1-D inverse integer transform algorithms for H.264/AVC and VC-1,” IEEE Signal Processing Letters, vol. 15, pp. 926–929, 2008.
[15]
C. Fan and G. Su, “Fast algorithm and low-cost hardware-sharing design of multiple integer transforms for VC-1,” IEEE Transactions on Circuits and Systems II, vol. 56, pp. 788–792, 2009.
[16]
D. Zhou, Z. You, J. Zhu et al., “A 1080p@60fps multi-standard video decoder chip designed for power and cost efficiency in a system perspective,” in Symposium on VLSI Circuits, pp. 262–263, June 2009.
[17]
C. P. Fan and Y. L. Lin, “Implementations of low-cost hardware sharing architectures for fast 8 × 8 and 4 × 4 integer transforms in H.264/AVC,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 90, no. 2, pp. 511–516, 2007.
[18]
Y. C. Chao, S. T. Wei, C. H. Kao, B. D. Liu, and J. F. Yang, “An efficient architecture of multiple 8×8 transforms for H.264/AVC and VC-1 decoders,” in 1st International Conference on Green Circuits and Systems (ICGCS '10), pp. 595–598, June 2010.
[19]
Y. Li, Y. He, and S. Mei, “A highly parallel joint VLSI architecture for transforms in H.264/AVC,” Journal of Signal Processing Systems, vol. 50, no. 1, pp. 19–32, 2008.