oalib
Perceptual Video Quality Measurement Based on Generalized Priority Model
S. Bharath, S. Jaganath, J. Prakash
International Journal of Computer Science and Mobile Computing, 2013
Abstract: We consider factors not only within a packet but also in its vicinity, to account for possible temporal and spatial masking effects. We apply our visibility model to packet prioritization for a video stream: when the network becomes congested at an intermediate router, the router can choose which packets to drop so that the visual quality of the video is minimally degraded. To show the effectiveness of our visibility model and its corresponding packet prioritization method, we run experiments comparing our perceptual-quality-based approach with the existing drop-tail, hint-track, and mean-square-error priority methods. The results show that our prioritization method produces videos of higher perceptual quality under different network conditions. Our model was developed using data from high-encoding-rate videos and designed for high-quality video sent over a mostly reliable network; however, the experiments show that the model remains valid at different encoding rates.
Optimization of the Block-level Bit Allocation in Perceptual Video Coding based on MINMAX
Chao Wang, Xuanqin Mou, Lei Zhang
Computer Science, 2015
Abstract: In video coding, the encoder is expected to adaptively select the encoding parameters (e.g., the quantization parameter) to optimize the bit allocation to different sources under a given constraint. However, in hybrid video coding, the dependency between sources makes bit allocation optimization highly complex, especially at the block level, and existing optimization methods mostly focus on frame-level bit allocation. In this paper, we propose a macroblock (MB) level bit allocation method based on the minimum maximum (MINMAX) criterion, which has acceptable encoding complexity for offline applications. An iterative algorithm, namely maximum distortion descend (MDD), is developed to reduce quality fluctuation among MBs within a frame, where the Structure SIMilarity (SSIM) index is used to measure the perceptual distortion of MBs. Our extensive experimental results on benchmark video sequences show that the proposed method can greatly enhance the encoding performance in terms of both bit savings and perceptual quality improvement.
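The MINMAX idea in the abstract above (keep lowering the largest block distortion rather than the average) can be illustrated with a small greedy sketch. This is not the authors' MDD algorithm: the function name `minmax_allocate`, the per-block `complexities`, and the toy distortion model `complexity / bits` are all assumptions made for illustration, and the abstract's SSIM-based distortion is not reproduced here.

```python
def minmax_allocate(complexities, total_bits, step=1):
    """Greedy MINMAX-style allocation (illustrative sketch only).

    Assumption: block i has distortion complexities[i] / bits[i].
    Each iteration gives `step` more bits to the block whose
    distortion is currently the largest.
    """
    n = len(complexities)
    bits = [step] * n                  # every block starts with one step
    remaining = total_bits - step * n
    if remaining < 0:
        raise ValueError("budget too small to give every block one step")

    def distortion(i):
        return complexities[i] / bits[i]

    while remaining >= step:
        worst = max(range(n), key=distortion)  # block with maximum distortion
        bits[worst] += step
        remaining -= step
    return bits


if __name__ == "__main__":
    # Hypothetical complexities for four macroblocks and a 40-unit budget.
    print(minmax_allocate([1.0, 4.0, 2.0, 8.0], total_bits=40))
```

Under this toy model the harder blocks end up with proportionally more bits, which narrows the spread of per-block distortion within the frame.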
A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration
Steven van de Par, Armin Kohlrausch, Richard Heusdens, Jesper Jensen
EURASIP Journal on Advances in Signal Processing, 2005, DOI: 10.1155/asp.2005.1292
Abstract: Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.
Novel rate distortion optimization strategy based on perceptual properties of texture and luminance
Yu Like (俞力克), Dai Feng (代锋), Zhang Yongdong (张勇东), Lin Shouxun (林守勋)
中国图象图形学报 (Journal of Image and Graphics), 2012
Abstract: RDO (rate-distortion optimization) plays an important role in video coding systems and has a great effect on coding efficiency. The most widely used RDO strategy models distortion with MSE or similar metrics, which do not reflect subjective quality well. To improve perceptual quality, a novel perceptual distortion model is first proposed that takes the perceptual properties of texture and luminance into consideration. Based on this perceptual distortion model, a TL-RDO (texture and luminance based RDO) strategy is proposed that adjusts the Lagrangian multiplier dynamically according to visual perception. The simulation results show that TL-RDO achieves higher coding efficiency than the well-known QP-RDO. Moreover, it has low computational cost compared with other perceptual RDO strategies, which makes it suitable for real-time systems.
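As a rough illustration of Lagrangian mode decision with a perceptually adjusted multiplier, the sketch below picks the candidate minimizing J = D + λ'·R, where λ' is the base multiplier times a scale factor. The function `choose_mode`, the `perceptual_scale` argument, and the candidate numbers are hypothetical; the abstract does not say how TL-RDO derives its adjustment from texture and luminance.

```python
def choose_mode(candidates, lam, perceptual_scale=1.0):
    """Return the name of the candidate with the lowest RD cost.

    candidates: iterable of (name, distortion, rate_bits) tuples.
    lam: base Lagrangian multiplier.
    perceptual_scale: hypothetical factor standing in for a
        perception-driven adjustment of the multiplier.
    """
    effective_lam = lam * perceptual_scale

    def rd_cost(candidate):
        _, distortion, rate_bits = candidate
        return distortion + effective_lam * rate_bits

    return min(candidates, key=rd_cost)[0]


if __name__ == "__main__":
    # Made-up (mode, distortion, rate) triples for one block.
    modes = [("skip", 100.0, 10), ("inter", 40.0, 60), ("intra", 20.0, 120)]
    for scale in (0.25, 1.0, 4.0):
        print(scale, choose_mode(modes, lam=1.0, perceptual_scale=scale))
```

A smaller scale makes rate cheaper, so the decision leans toward low-distortion modes; a larger scale favors low-rate modes.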
Q-STAR: A Perceptual Video Quality Model Considering Impact of Spatial, Temporal, and Amplitude Resolutions
Yen-Fu Ou, Yuanyi Xue, Yao Wang
Computer Science, 2012
Abstract: In this paper, we investigate the impact of spatial, temporal and amplitude resolution (STAR) on the perceptual quality of a compressed video. Subjective quality tests were carried out on a mobile device. Seven source sequences are included in the tests, and for each source sequence we have 27 test configurations generated by the JSVM encoder (3 QP levels, 3 spatial resolutions, and 3 temporal resolutions), resulting in a total of 189 processed video sequences (PVSs). Videos coded at different spatial resolutions are displayed at the full screen size of the mobile platform. Subjective data reveal that the impact of spatial resolution (SR), temporal resolution (TR) and quantization stepsize (QS) can each be captured by a function with a single content-dependent parameter. The joint impact of SR, TR and QS can be accurately modeled by the product of these three functions with only three parameters. We further find that the quality decay rates with SR and QS are each independent of TR, and likewise, the decay rate with TR is independent of SR and QS. However, there is a significant interaction between the effects of SR and QS. The overall quality model is further validated on five other datasets with very high accuracy. The complete model correlates well with the subjective ratings, with a Pearson Correlation Coefficient (PCC) of 0.991.
Rate Model for Compressed Video Considering Impacts Of Spatial, Temporal and Amplitude Resolutions and Its Applications for Video Coding and Adaptation
Zhan Ma, Hao Hu, Meng Xu, Yao Wang
Computer Science, 2012
Abstract: In this paper, we investigate the impacts of spatial, temporal and amplitude resolution (STAR) on the bit rate of a compressed video. We propose an analytical rate model in terms of the quantization stepsize, frame size and frame rate. Experimental results reveal that the increase of the video rate as each individual resolution increases follows a power function. Hence, the proposed model expresses the rate as the product of power functions of the quantization stepsize, frame size and frame rate, respectively. The proposed rate model is analytically tractable, requiring only four content-dependent parameters. We also propose methods for predicting the model parameters from content features that can be computed from the original video. Simulation results show that model-predicted rates fit the measured data very well, with high Pearson correlation (PC) and small relative root mean square error (RRMSE). The same model function works for different coding scenarios (including scalable and non-scalable video, temporal prediction using either hierarchical B or IPPP structure, etc.) with very high accuracy (average PC > 0.99), but the values of the model parameters differ. Using the proposed rate model and the quality model introduced in a separate work, we show how to optimize the STAR for a given rate constraint, which is important for both encoder rate control and scalable video adaptation. Furthermore, we demonstrate how to order the spatial, temporal and amplitude layers of a scalable video in a rate-quality optimized way.
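The "product of power functions" structure described in the abstract above can be sketched as follows. The normalization by `q_min`, `s_max` and `t_max`, the function name `star_rate`, and all numeric values are assumptions for illustration; the abstract states only that the model is a product of power functions with four content-dependent parameters (taken here to be `r_max`, `a`, `b`, `c`).

```python
def star_rate(q, s, t, r_max, a, b, c, q_min, s_max, t_max):
    """Bit rate as a product of three power functions (sketch).

    q: quantization stepsize, s: frame size, t: frame rate.
    r_max: rate at the finest setting (q_min, s_max, t_max).
    a, b, c: exponents for stepsize, frame size and frame rate.
    The normalized form is an assumption, not quoted from the paper.
    """
    return (
        r_max
        * (q / q_min) ** (-a)   # coarser quantization lowers the rate
        * (s / s_max) ** b      # smaller frames lower the rate
        * (t / t_max) ** c      # lower frame rate lowers the rate
    )


if __name__ == "__main__":
    # Hypothetical parameters; real values would be fitted per sequence.
    params = dict(r_max=1000.0, a=1.2, b=0.9, c=0.6,
                  q_min=16.0, s_max=1.0, t_max=30.0)
    for t in (30.0, 15.0, 7.5):
        print(t, star_rate(q=32.0, s=0.25, t=t, **params))
```

Because the three factors are separable, each exponent can be fitted independently by varying one resolution while holding the other two fixed.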
Error Resilient Video Transmission over Wireless Networks Based on Wyner-Ziv Coding of Motion Regions
Tao Sheng, Jingli Zhou, Zhengbing Hu
Journal of Networks, 2009, DOI: 10.4304/jnw.4.9.905-912
Abstract: The multipath fading and shading of wireless networks usually lead to the loss or corruption of video packets, which results in significant video quality degradation. Existing approaches with forward error correction (FEC) or error concealment are unable to provide the desired robustness in video transmission. In this work, we develop a novel motion-based Wyner-Ziv coding (MWZC) scheme by leveraging distributed source coding (DSC) ideas for error robustness. The MWZC scheme is based on the fact that motion regions of a given video frame are particularly important to both objective and perceptual video quality and hence should be given preferential Wyner-Ziv coding based embedded protection. To achieve high coding efficiency, we determine the underlying motion regions based on a rate-distortion model. Within the framework of the H.264/AVC specification, motion region determination can be efficiently implemented using Flexible Macroblock Ordering (FMO) and Data Partitioning (DP). The bit stream consists of two parts: the systematic portion generated from the conventional H.264/AVC bit stream and the supplementary bit stream generated by the proposed feedback-free rate allocation algorithm for Wyner-Ziv coding of motion regions. Experimental results demonstrate that the proposed scheme significantly outperforms both decoder-based error concealment (DBEC) and conventional FEC with DBEC approaches.
The Dynamic Range Paradox: A Central Auditory Model of Intensity Change Detection
Andrew J.R. Simpson, Joshua D. Reiss
PLOS ONE, 2013, DOI: 10.1371/journal.pone.0057497
Abstract: In this paper we use empirical loudness modeling to explore a perceptual sub-category of the dynamic range problem of auditory neuroscience. Humans are able to reliably report perceived intensity (loudness), and discriminate fine intensity differences, over a very large dynamic range. It is usually assumed that loudness and intensity change detection operate upon the same neural signal, and that intensity change detection may be predicted from loudness data and vice versa. However, while loudness grows as intensity is increased, improvement in intensity discrimination performance does not follow the same trend and so dynamic range estimations of the underlying neural signal from loudness data contradict estimations based on intensity just-noticeable difference (JND) data. In order to account for this apparent paradox we draw on recent advances in auditory neuroscience. We test the hypothesis that a central model, featuring central adaptation to the mean loudness level and operating on the detection of maximum central-loudness rate of change, can account for the paradoxical data. We use numerical optimization to find adaptation parameters that fit data for continuous-pedestal intensity change detection over a wide dynamic range. The optimized model is tested on a selection of equivalent pseudo-continuous intensity change detection data. We also report a supplementary experiment which confirms the modeling assumption that the detection process may be modeled as rate-of-change. Data are obtained from a listening test (N = 10) using linearly ramped increment-decrement envelopes applied to pseudo-continuous noise with an overall level of 33 dB SPL. Increments with half-ramp durations between 5 and 50,000 ms are used. The intensity JND is shown to increase towards long duration ramps (p < 10⁻⁶). From the modeling, the following central adaptation parameters are derived: a central dynamic range of 0.215 sones, 95% central normalization, and a central loudness JND constant of 5.5×10⁻⁵ sones per ms. Through our findings, we argue that loudness reflects peripheral neural coding, and the intensity JND reflects central neural coding.
A Perceptual Coding Method Based on the Compression of Luma Coefficients
YU Li (喻莉), GUO Shan (郭姗), XU Shilin (徐士麟), ZHOU Gang (周刚), LI Rong (李荣)
中国图象图形学报 (Journal of Image and Graphics), 2009
Abstract: In conventional video coding systems, distortion is always measured by mean-square error (MSE). However, MSE-based distortion fails to measure the subjective difference between videos. As a result, the properties and interests of the human visual system (HVS) should be considered by the video coder. This paper proposes a new perceptual coding method that compresses the luma coefficients according to the eye's sensitivity. By using a pre-quantization strategy, it discards imperceptible information and improves the compression performance of the video codec without degrading the subjective quality of the video. The experimental results show that this perceptual coding method can reduce the output bit rate of the AVS reference software by 8% to 40% while preserving the quality of the decoded video.
Stereoscopic Visual Attention-Based Regional Bit Allocation Optimization for Multiview Video Coding
Yun Zhang, Gangyi Jiang, Mei Yu, Ken Chen
EURASIP Journal on Advances in Signal Processing, 2010, DOI: 10.1155/2010/848713
Abstract: We propose a Stereoscopic Visual Attention- (SVA-) based regional bit allocation optimization for Multiview Video Coding (MVC) by exploiting visual redundancies in human perception. We propose a novel SVA model, in which multiple perceptual stimuli including depth, motion, intensity, color, and orientation contrast are used to simulate the visual attention mechanisms of the human visual system with stereoscopic perception. Then, a semantic region-of-interest (ROI) is extracted based on the saliency maps of SVA. Both objective and subjective evaluations of the extracted ROIs indicate that the proposed SVA-model-based ROI extraction scheme outperforms schemes using only spatial and/or temporal visual attention cues. Finally, using the extracted SVA-based ROIs, a regional bit allocation optimization scheme is presented that allocates more bits to SVA-based ROIs for high image quality and fewer bits to background regions for efficient compression. Experimental results on MVC show that the proposed regional bit allocation algorithm can achieve a bit-rate saving of 20% to 30% while maintaining the subjective image quality. Meanwhile, the image quality of ROIs is improved by 0.46 to 0.61 dB at the cost of imperceptible image quality degradation in the background.