Remote dynamic three-dimensional (3D) scene reconstruction recovers the moving structure of a 3D scene at a remote terminal from a color video and its corresponding depth maps. It has shown great potential for telepresence applications such as remote monitoring and remote medical imaging. In this setting, video-rate capture and high resolution are both crucial for a good depth map, yet these two requirements conflict in current depth sensors. Recent works therefore transmit only the high-resolution color video to the terminal, where the scene depth is reconstructed by estimating motion vectors from the video, typically with propagation-based methods, to achieve video-rate depth reconstruction. In most remote transmission systems, however, only a compressed color video stream is available; the video restored from the stream suffers quality losses, so the extracted motion vectors are too inaccurate for depth reconstruction. In this paper, we propose a precise and robust scheme for dynamic 3D scene reconstruction that uses the compressed color video stream together with its inaccurate motion vectors. Our method rectifies the motion vectors by analyzing and compensating for their quality losses, the absence of motion vectors in spatially predicted blocks, and their dislocation in near-boundary regions. This rectification allows the depth maps to be recovered at both video rate and high resolution at the terminal side, reducing the system's compression and transmission cost. Our experiments validate that the proposed scheme is robust for depth map and dynamic scene reconstruction over long propagation distances, even at high compression ratios, outperforming benchmark approaches by at least 3.3950 dB in quality for remote applications.
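To make the propagation idea concrete, the following is a minimal sketch (not the authors' rectification method) of how a key-frame depth map can be warped to a subsequent frame using per-block motion vectors extracted from the video stream; the block size, the `(dy, dx)` vector layout, and the boundary clamping are illustrative assumptions.

```python
import numpy as np

def propagate_depth(depth_key, motion_vectors, block=8):
    """Propagate a key-frame depth map to the next frame using
    per-block motion vectors.

    depth_key      : (H, W) depth map of the key frame.
    motion_vectors : (H//block, W//block, 2) integer (dy, dx) per block,
                     pointing from the current block to its source
                     location in the key frame.
    """
    h, w = depth_key.shape
    depth_next = np.zeros_like(depth_key)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = motion_vectors[by // block, bx // block]
            # Clamp the source block to the image bounds; a real system
            # would instead rectify near-boundary vectors, as the paper does.
            sy = min(max(by + dy, 0), h - block)
            sx = min(max(bx + dx, 0), w - block)
            depth_next[by:by + block, bx:bx + block] = \
                depth_key[sy:sy + block, sx:sx + block]
    return depth_next
```

Inaccurate motion vectors from a heavily compressed stream would copy depth from the wrong source blocks here, which is exactly the error the proposed rectification targets before propagation.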