This work addresses region-of-interest (ROI) coding, a desirable feature for future applications based on scalable video coding (SVC), which is an extension of the H.264/MPEG-4 AVC standard. Owing to rapid technological progress, a wide range of heterogeneous devices can now be used to view a variety of video content. Devices such as smartphones and tablets are mostly resource-limited, which makes it difficult for them to display high-quality content. Usually, the displayed video content contains one or more ROIs, which should be adaptively selected from the preencoded scalable video bitstream. Thus, an efficient scalable ROI video coding scheme is proposed in this work, enabling the extraction of the desired regions-of-interest and the adaptive setting of the desired ROI location, size, and resolution. In addition, an adaptive bit-rate control is provided for region-of-interest scalable video coding. The performance of the presented techniques is demonstrated and compared with the joint scalable video model reference software (JSVM 9.19), showing significant bit-rate savings as a tradeoff for a relatively low PSNR degradation.
1. Introduction
Recently, significant changes have taken place in the content distribution network industry. The availability of cheaper and more powerful devices (such as smartphones and tablets, which can play, create, and transmit video content on various mobile networks) places unprecedented demands on high-capacity, low-latency communication paths. The falling cost of digital video cameras, along with the development of user-generated video sites (e.g., Vimeo, YouTube), has stimulated the new user-generated content sector. Growing premium content, coupled with advanced video technologies such as Internet TV, will replace conventional technologies (e.g., cable or satellite TV) in the near future [1].
In this context, high-definition, highly interactive networked media applications pose challenges to network operators. The variety of end-user devices with different capabilities, ranging from smartphones with relatively small displays and restricted processing power to high-end PCs with high-definition displays, has stimulated a significant interest in effective technologies for providing video content in various spatial formats, employing limited computational complexity resources and operating under low bit-rates [2]. Much of the attention in the field of video adaptation is currently directed to the scalable video coding (SVC) extension [3] of
References
[1]
T. Spangler, “Lured by online video, digital broadcasts, more cable TV customers are cutting their service,” 2008, http://www.multichannel.com/article/85964-Cover_Story_Breaking_Free.php.
[2]
D. Grois, E. Kaminsky, and O. Hadar, “Optimization methods for H.264/AVC video coding,” in The Handbook of MPEG Applications: Standards in Practice, chapter 7, pp. 175–204, John Wiley and Sons, 2011.
[3]
H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103–1120, 2007.
[4]
T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003.
[5]
D. Grois and O. Hadar, “Recent trends in online multimedia education for heterogeneous end-user devices based on Scalable Video Coding,” in Proceedings of the IEEE Global Engineering Education Conference (EDUCON '13), pp. 1141–1146, March 2013.
[6]
D. Grois, E. Kaminsky, and O. Hadar, “Dynamically adjustable and scalable ROI video coding,” in Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB '10), Shanghai, China, March 2010.
[7]
M.-J. Chen, M.-C. Chi, C.-T. Hsu, and J.-W. Chen, “ROI Video coding based on H.263+ with robust skin-color detection technique,” IEEE Transactions on Consumer Electronics, vol. 49, no. 3, pp. 724–730, 2003.
[8]
P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. I511–I518, December 2001.
[9]
Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” in Proceedings of the Computational Learning Theory (Eurocolt ’95), pp. 23–37, Springer, 1995.
[10]
L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254–1259, 1998.
[11]
D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of non-rigid objects using mean shift,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 142–149, Hilton Head Island, South Carolina, June 2000.
[12]
P. Lambert, D. de Schrijver, D. van Deursen, W. de Neve, Y. Dhondt, and R. van de Walle, “A real-time content adaptation framework for exploiting ROI scalability in H.264/AVC,” Advanced Concepts for Intelligent Vision Systems, vol. 4179, pp. 442–453, 2006.
[13]
F. Manerba, J. Benois-Pineau, R. Leonardi, and B. Mansencal, “Multiple moving object detection for fast video content description in compressed domain,” Eurasip Journal on Advances in Signal Processing, vol. 2008, Article ID 231930, 15 pages, 2008.
[14]
C. Käs and H. Nicolas, “Compressed domain indexing of scalable H.264/SVC streams,” Signal Processing, vol. 24, no. 6, pp. 484–498, 2009.
[15]
C. Hanfeng, Z. Yiqiang, and Q. Feihu, “Rapid object tracking on compressed video,” in Proceedings of the 2nd IEEE Pacific Rim Conference on Multimedia, pp. 1066–1071, October 2001.
[16]
W. Zeng, J. Du, W. Gao, and Q. Huang, “Robust moving object segmentation on H.264/AVC compressed video using the block-based MRF model,” Real-Time Imaging, vol. 11, no. 4, pp. 290–299, 2005.
[17]
W. You, M. S. H. Sabirin, and M. Kim, “Moving object tracking in H.264/AVC bitstream,” in Multimedia Content Analysis and Mining, vol. 4577, pp. 483–492, Springer, Heidelberg, Germany, 2007.
[18]
V. Thilak and C. D. Creusere, “Tracking of extended size targets in H.264 compressed video using the probabilistic data association filter,” in Real-Time Image and Video Processing (EUSIPCO '04), Proceedings of SPIE, pp. 281–284, September 2004.
[19]
H.264/AVC, Draft ITU-T Rec. and Final Draft Intl. Std. of Joint Video Spec. (H.264/AVC), Joint Video Team, Doc. JVT-G050, 2003.
[20]
“Applications and requirement for scalable video coding,” JVT ISO/IEC JTC1/SC29/WG11 Doc. N6880, Hong-Kong, China, 2005.
[21]
T. Wiegand, et al., “ISO/IEC, 14496-10:200X/Amd. 3 Part 10: Advanced Video Coding—AMENDMENT 3: Scalable Video Coding Joint Draft ITU-T Rec. H.264/ISO/IEC, 14496-10/Amd. 3 Scalable video coding,” Joint Video Team Doc. JVT-X201, July 2007.
T. Wiegand, G. Sullivan, J. Reichel, H. Schwarz, and M. Wien, “Joint draft 8 of SVC amendment,” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q. 6 9 (JVT-U201), Hangzhou, China, October 2006.
[24]
H. Chen, Z. Han, R. Hu, and R. Ruan, “Adaptive FMO selection strategy for error resilient H.264 coding,” in Proceedings of the International Conference on Audio, Language and Image Processing (ICALIP '08), pp. 868–872, Shanghai, China, July 2008.
[25]
Z. Lu, et al., “CE8: ROI-based scalable video coding,” JVT-O308, Busan, Korea, April 2005.
[26]
T. C. Thang, et al., “Spatial scalability of multiple ROIs in surveillance video,” JVT-O037, Busan, Korea, April 2005.
[27]
Z. Lu, “Perceptual region-of-interest (ROI) based Scalable Video Coding,” JVT-O056, Busan, Korea, April 2005.
[28]
M. Shoaib and A. Cai, “Efficient residual prediction with error concealment in extended spatial scalability,” in Proceedings of the 9th Annual Wireless Telecommunications Symposium (WTS '10), April 2010.
[29]
E. Francois and J. Vieron, “Extended spatial scalability: a generalization of spatial scalability for non dyadic configurations,” in Proceedings of the IEEE International Conference on Image Processing (ICIP '06), pp. 169–172, October 2006.
[30]
Y. Hu, D. Rajan, and L.-T. Chia, “Detection of visual attention regions in images using robust subspace analysis,” Journal of Visual Communication and Image Representation, vol. 19, no. 3, pp. 199–216, 2008.
[31]
L. Liu, S. Zhang, X. Ye, and Y. Zhang, “Error resilience schemes of H.264/AVC for 3G conversational video services,” in Proceedings of the 5th International Conference on Computer and Information Technology (CIT '05), pp. 657–661, Binghamton, New York, USA, September 2005.
[32]
O. Ndili and T. Ogunfunmi, “On the performance of a 3D flexible macroblock ordering for H.264/AVC,” in Proceedings of the International Conference on Consumer Electronics (ICCE '06), pp. 37–38, January 2006.
[33]
H. K. Arachchi, W. A. C. Fernando, S. Panchadcharam, and W. A. R. J. Weerakkody, “Unequal error protection technique for ROI based H.264 video coding,” in Proceedings of the Canadian Conference on Electrical and Computer Engineering (CCECE '06), pp. 2033–2036, Ottawa, Canada, May 2006.
[34]
D. Grois, E. Kaminsky, and O. Hadar, “Adaptive bit-rate control for region-of-interest scalable video coding,” in Proceedings of the 26th Convention of Electrical and Electronics Engineers in Israel (IEEEI '10), pp. 761–765, Eilat, Israel, November 2010.
[35]
Z. Li, F. Pan, K. P. Lim, G. Feng, X. Lin, and S. Rahardja, “Adaptive basic unit layer rate control for JVT,” Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q. 6) Doc. JVT-G012, Pattaya, Thailand, 2003.
[36]
Z. G. Li, W. Yao, S. Rahardja, and S. Xie, “New framework for encoder optimization of scalable video coding,” in Proceedings of the IEEE Workshop on Signal Processing Systems (SiPS '07), pp. 527–532, October 2007.
[37]
D. Grois and O. Hadar, “Recent advances in Region-of-Interest coding,” in Recent Advances on Video Coding, J. Del Ser Lorente, Ed., pp. 49–76, 2011.
[38]
D. Grois and O. Hadar, “Advances in Region-of-Interest video and image processing,” in Multimedia Networking and Coding, R. A. Farrugia and C. J. Debono, Eds., pp. 76–123, IGI Global, 2012.
[39]
D. Grois and O. Hadar, “Region-of-Interest processing and coding techniques: overview of recent trends and directions,” in Intelligent Multimedia Technologies For Networking Applications: Techniques and Tools, D. Kanellopoulos, Ed., pp. 126–155, IGI Global, 2013.
[40]
Y. Liu, Z. G. Li, and Y. C. Soh, “Rate control of H.264/AVC scalable extension,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 1, pp. 116–121, 2008.
[41]
L. Xu, S. Ma, D. Zhao, and W. Gao, “Rate control for scalable video model,” in Proceedings of the Visual Communications and Image Processing Conference, pp. 525–534, July 2005.
[42]
T. Anselmo and D. Alfonso, “Constant Quality Variable Bit-Rate control for SVC,” in Proceedings of the 11th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS '10), April 2010.
[43]
H. Roodaki, H. R. Rabiee, and M. Ghanbari, “Rate-distortion optimization of scalable video codecs,” Signal Processing, vol. 25, no. 4, pp. 276–286, 2010.
[44]
J. Ribas-Corbera, P. A. Chou, and S. L. Regunathan, “A generalized hypothetical reference decoder for H.264/AVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 674–687, 2003.
[45]
K.-P. Lim, G. Sullivan, and T. Wiegand, “Text description of joint model reference encoding methods and decoding concealment methods,” Study of ISO/IEC, 14496-10 and ISO/IEC, 14496-5/AMD6 and Study of ITU-T Rec. H.264 and ITU-T Rec. H.264.2, in Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Doc. JVT-O079, Busan, Korea, April 2005.
[46]
T. Chiang and Y.-Q. Zhang, “A new rate control scheme using quadratic rate distortion model,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 1, pp. 246–250, 1997.
[47]
E. Kaminsky, D. Grois, and O. Hadar, “Dynamic computational complexity and bit allocation for optimizing H.264/AVC video compression,” Journal of Visual Communication and Image Representation, vol. 19, no. 1, pp. 56–74, 2008.
[48]
T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, “Rate-constrained coder control and comparison of video coding standards,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 688–703, 2003.
[49]
D. Grois, E. Kaminsky, and O. Hadar, “Buffer control in H.264/AVC applications by implementing dynamic complexity-rate-distortion analysis,” in Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB '09), May 2009.
[50]
D. Grois, E. Kaminsky, and O. Hadar, “ROI adaptive scalable video coding for limited bandwidth wireless networks,” in Proceedings of the IFIP Wireless Days (WD '10), Venice, Italy, October 2010.
[51]
D. Grois and O. Hadar, “Efficient adaptive bit-rate control for Scalable Video Coding by using computational complexity-rate-distortion analysis,” in Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB '11), Nuremberg, Germany, June 2011.
[52]
D. Grois and O. Hadar, “Complexity-aware adaptive spatial pre-processing for ROI scalable video coding with dynamic transition region,” in Proceedings of the 18th IEEE International Conference on Image Processing (ICIP '11), pp. 741–744, Brussels, Belgium, September 2011.