oalib
All listed articles are free for downloading (OA Articles)
Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria
Andreas Spanias, Ted Painter
EURASIP Journal on Advances in Signal Processing, 2003, DOI: 10.1155/s168761720321009x
Abstract: This paper presents a new method for the selection of sinusoidal components for use in compact representations of narrowband audio. The method consists of ranking and selecting the most perceptually relevant sinusoids. The idea behind the method is to maximize the matching between the auditory excitation pattern associated with the original signal and the corresponding auditory excitation pattern associated with the modeled signal that is being represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.
Perceptual Coding of Audio Signals Using Adaptive Time-Frequency Transform
Karthikeyan Umapathy, Sridhar Krishnan
EURASIP Journal on Audio, Speech, and Music Processing, 2007, DOI: 10.1155/2007/51563
Abstract: Wideband digital audio signals have a very high data rate associated with them due to their complex nature and demand for high-quality reproduction. Although recent technological advancements have significantly reduced the cost of bandwidth and miniaturized storage facilities, the rapid increase in the volume of digital audio content constantly compels the need for better compression algorithms. Over the years various perceptually lossless compression techniques have been introduced, and transform-based compression techniques have made a significant impact in recent years. In this paper, we propose one such transform-based compression technique, where the joint time-frequency (TF) properties of the nonstationary nature of the audio signals were exploited in creating a compact energy representation of the signal in fewer coefficients. The decomposition coefficients were processed and perceptually filtered to retain only the relevant coefficients. Perceptual filtering (psychoacoustics) was applied in a novel way by analyzing and performing TF-specific psychoacoustics experiments. An added advantage of the proposed technique is that, due to its signal-adaptive nature, it does not need predetermined segmentation of audio signals for processing. Eight stereo audio signal samples of different varieties were used in the study. Subjective (mean opinion score, MOS) listening tests were performed and the subjective difference grades (SDG) were used to compare the performance of the proposed coder with MP3, AAC, and HE-AAC encoders. Compression ratios in the range of 8 to 40 were achieved by the proposed technique with subjective difference grades (SDG) ranging from -0.53 to -2.27.
An MP-based Harmonic and Individual Lines Sinusoidal Modeling in Parametric Audio Coding
Wang Jing, Zhao Sheng-hui, Kuang Jing-ming
Journal of Electronics & Information Technology (电子与信息学报), 2006
Abstract: This paper presents an MP-based harmonic and individual lines sinusoidal model for use in parametric audio coding. The input audio signal is represented as harmonics plus individual lines, and the model parameters (amplitudes, frequencies and phases) are extracted by the matching pursuit (MP) algorithm, which lowers computational complexity. The fundamental frequency is obtained by subharmonic summation. The harmonic frequency points are determined exactly by MP and are curve-fitted with a quadratic polynomial. The corresponding harmonic amplitudes are modeled by the LPC spectral envelope. In the decoder, the audio signal is reconstructed with IFFT and overlap-add (OLA). The experimental results show that this sinusoidal model represents the stationary parts of the audio signal very well.
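The parameter extraction step mentioned in this abstract can be illustrated with a minimal sketch of greedy matching pursuit over a dictionary of sinusoids. This is not the authors' implementation: the function name `extract_sinusoids`, the integer-bin frequency grid, and the single-frame setup are assumptions made for illustration, and the harmonic grouping, subharmonic summation, and LPC envelope steps are not covered.

```python
import math

def extract_sinusoids(signal, num_components, candidate_bins):
    """Greedy matching pursuit with a sinusoidal dictionary (illustrative).

    candidate_bins is a hypothetical frequency grid in cycles per frame;
    integer bins strictly between 0 and len(signal) / 2 are assumed.
    Returns a list of (bin, amplitude, phase) and the final residual.
    """
    n = len(signal)
    residual = list(signal)
    components = []
    for _ in range(num_components):
        best = None
        for k in candidate_bins:
            w = 2.0 * math.pi * k / n
            # Correlate the residual with a complex exponential at bin k.
            re = sum(r * math.cos(w * t) for t, r in enumerate(residual))
            im = -sum(r * math.sin(w * t) for t, r in enumerate(residual))
            energy = re * re + im * im
            if best is None or energy > best[0]:
                best = (energy, k, re, im)
        _, k, re, im = best
        # Amplitude and phase of the best-matching real sinusoid.
        amplitude = 2.0 * math.hypot(re, im) / n
        phase = math.atan2(im, re)
        w = 2.0 * math.pi * k / n
        # Subtract the selected component and continue on the residual.
        residual = [r - amplitude * math.cos(w * t + phase)
                    for t, r in enumerate(residual)]
        components.append((k, amplitude, phase))
    return components, residual

if __name__ == "__main__":
    n = 64
    frame = [math.cos(2 * math.pi * 5 * t / n + 0.3)
             + 0.5 * math.cos(2 * math.pi * 12 * t / n - 1.0)
             for t in range(n)]
    found, rest = extract_sinusoids(frame, 2, range(1, n // 2))
    for k, a, p in found:
        print(f"bin={k} amplitude={a:.3f} phase={p:.3f}")
```

Each iteration picks the dictionary element with the largest correlation against the current residual, which is the defining step of matching pursuit; a practical coder would likely use an FFT for the correlations and a finer frequency grid.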
Advance Source Coding Techniques for Audio/Speech Signal: A Survey
Sheetal D. Gunjal, Dr. Rajeshree D. Raut
International Journal of Computer Technology and Applications, 2012
Abstract: Speech and audio coding is widely used in applications such as digital broadcasting, Internet audio, and music databases to reduce the bit rate of high-quality audio signals without compromising perceptual quality. Techniques have also been emerging in recent years that offer better quality for a given bit rate than traditional methods. Wideband audio compression is generally aimed at a quality that is nearly indistinguishable from consumer compact-disc audio. Subband and transform coding methods combined with sophisticated perceptual coding techniques dominate in this area, with good quality for their bit rates.
Perceptual Audio Hashing Functions
Özer Hamza, Sankur Bülent, Memon Nasir, Anarım Emin
EURASIP Journal on Advances in Signal Processing, 2005
Abstract: Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.
A Multimedia Application: Spatial Perceptual Entropy of Multichannel Audio Signals
Shuixian Chen, Ruimin Hu, Naixue Xiong
EURASIP Journal on Wireless Communications and Networking, 2010, DOI: 10.1155/2010/182627
Abstract: Multimedia data usually have to be compressed before transmission. A higher compression rate, or equivalently a lower bitrate, relieves the load on communication channels but degrades quality. We investigate the bitrate lower bound for perceptually lossless compression of a major type of multimedia: multichannel audio signals. This bound equals the perceptible information rate of the signals. Traditionally, Perceptual Entropy (PE), based primarily on monaural hearing, measures the perceptual information rate of individual channels. But PE cannot measure the spatial information captured by binaural hearing, and is thus not suitable for estimating the Spatial Audio Coding (SAC) bitrate bound. To measure this spatial information, we build a Binaural Cue Physiological Perception Model (BCPPM) on the basis of binaural hearing, which represents spatial information in the physical and physiological layers. This model enables computing Spatial Perceptual Entropy (SPE), the lower bitrate bound for SAC. For real-world stereo audio signals of various types, our experiments indicate that SPE reliably estimates their spatial information rate. Therefore, "SPE plus PE" gives lower bitrate bounds for communicating multichannel audio signals with transparent quality.
A Sinusoidal Modeling Method Based on Matching-Pursuits with Perceptual Gradient
ZHANG Wen-Yao, XU Gang, WANG Yu-Guo
Journal of Software (软件学报), 2003
Abstract: As an adaptive algorithm for signal decomposition, matching pursuits provides a new framework for sinusoidal modeling of speech and audio signals. In this paper, the procedure of sinusoidal modeling using matching pursuits is analyzed, as well as the sinusoidal modeling algorithm using perceptually weighted matching pursuits, and a method of sinusoidal modeling with perceptual gradient is proposed. The proposed method, which adopts the adaptive feature of matching pursuits, dynamically computes a masking threshold from the currently synthesized signal using the psychoacoustic model. With the threshold, it extracts the most perceptually significant component from the residual signal. Therefore, the perceptual information contained in the synthesized signal increases as quickly as possible. The quality of the speech synthesized by this approach is rather high even if the model precision is low. Experiments show that the method makes better use of the features of the hearing system, and that the modeling is reasonable and efficient. Both the objective SNR comparison and the subjective listening test show the soundness and superiority of the new method.
Parametric Coding of Stereo Audio
Jeroen Breebaart, Steven van de Par, Armin Kohlrausch, Erik Schuijers
EURASIP Journal on Advances in Signal Processing, 2005, DOI: 10.1155/asp.2005.1305
Abstract: Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
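The idea of "a monaural signal plus a small amount of parametric overhead" can be sketched as follows. This is only an illustration under strong assumptions, not the coder described in the paper: it uses a single full-band inter-channel level difference per frame as the only stereo parameter, and the function names `encode_frame` and `decode_frame` are hypothetical. Practical parametric-stereo schemes typically estimate several parameters per frequency band.

```python
import math

def encode_frame(left, right):
    """Downmix one stereo frame to mono and compute one level parameter.

    Returns the mono frame and the inter-channel level difference in dB
    (a single full-band value, which is an assumption of this sketch).
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    energy_left = sum(l * l for l in left)
    energy_right = sum(r * r for r in right)
    eps = 1e-12  # guards against division by zero and log of zero
    level_diff_db = 10.0 * math.log10((energy_left + eps) / (energy_right + eps))
    return mono, level_diff_db

def decode_frame(mono, level_diff_db):
    """Rebuild left and right by applying level-dependent gains to the mono frame."""
    ratio = 10.0 ** (level_diff_db / 20.0)  # left/right amplitude ratio
    # Gains chosen so that gain_left / gain_right == ratio and
    # gain_left + gain_right == 2, which inverts the (l + r) / 2 downmix
    # when both channels are positively scaled copies of one source.
    gain_right = 2.0 / (1.0 + ratio)
    gain_left = ratio * gain_right
    return [gain_left * m for m in mono], [gain_right * m for m in mono]

if __name__ == "__main__":
    source = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
    left = [0.8 * s for s in source]
    right = [0.4 * s for s in source]
    mono, diff_db = encode_frame(left, right)
    out_left, out_right = decode_frame(mono, diff_db)
    print(f"level difference: {diff_db:.2f} dB")
```

The mono frame could then be passed to any conventional audio coder, as the abstract notes. The reconstruction here is exact only for the panned-source case used in the example; phase and correlation differences between channels are not captured by a level parameter alone.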
Copyright © 2008-2017 Open Access Library. All rights reserved.