This paper proposes an adaptive redundant speech transmission (ARST) approach to improve the perceived speech quality (PSQ) of speech streaming applications over wireless multimedia sensor networks (WMSNs). The proposed approach estimates the PSQ and the packet loss rate (PLR) from the received speech data. It then decides whether redundant speech data (RSD) must be transmitted to help the speech decoder reconstruct lost speech signals at high PLRs. Based on this decision, the ARST approach controls the RSD transmission and optimizes the speech coding bitrate used to encode the current speech data (CSD) and the RSD bitstream, so that speech quality is maintained under packet loss conditions. The effectiveness of the approach is demonstrated using the adaptive multirate-narrowband (AMR-NB) speech codec as the scalable speech codec and ITU-T Recommendation P.563 as the PSQ estimator. Experiments show that a speech streaming application employing the proposed ARST approach significantly improves speech quality under packet loss conditions in WMSNs.
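The control loop summarized above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The threshold values, the bitrate budget, the decision rule combining PLR and PSQ, and all function names are assumptions introduced here. The PSQ score is taken as an input and P.563 itself is not implemented. Only the eight AMR-NB mode bitrates are taken from the codec standard.

```python
from itertools import product

# The eight AMR-NB speech modes, in bit/s.
AMR_NB_MODES_BPS = (4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200)

# Hypothetical parameters, chosen only for illustration.
PLR_THRESHOLD = 0.05   # PLR above which redundancy is considered
MOS_THRESHOLD = 3.0    # estimated PSQ (MOS scale) below which quality is "poor"
BUDGET_BPS = 12200     # total bitrate available for CSD plus RSD


def estimate_plr(received_seq_numbers, expected_count):
    """Fraction of expected packets that never arrived."""
    if expected_count <= 0:
        return 0.0
    lost = expected_count - len(set(received_seq_numbers))
    return max(lost, 0) / expected_count


def rsd_required(plr, psq_mos):
    """Assumed rule: send RSD when loss is high and estimated quality is low."""
    return plr > PLR_THRESHOLD and psq_mos < MOS_THRESHOLD


def select_bitrates(use_rsd, budget_bps=BUDGET_BPS):
    """Return (csd_bps, rsd_bps); rsd_bps is None when no RSD is sent."""
    fitting = [m for m in AMR_NB_MODES_BPS if m <= budget_bps]
    if not fitting:
        raise ValueError("budget is below the lowest AMR-NB mode")
    if use_rsd:
        # Keep CSD at least as good as RSD and stay within the budget.
        pairs = [(c, r) for c, r in product(fitting, repeat=2)
                 if r <= c and c + r <= budget_bps]
        if pairs:
            # Use as much of the budget as possible, favoring the CSD rate.
            return max(pairs, key=lambda p: (p[0] + p[1], p[0]))
    # No redundancy (or no pair fits): spend the whole budget on CSD.
    return max(fitting), None


# Example: 8 of 10 packets arrived and the estimated MOS is 2.4.
plr = estimate_plr([0, 1, 2, 4, 5, 7, 8, 9], expected_count=10)
csd_bps, rsd_bps = select_bitrates(rsd_required(plr, psq_mos=2.4))
```

Under these assumed parameters, the example enables redundancy and splits the 12.2 kbit/s budget into a 7.40 kbit/s CSD stream and a 4.75 kbit/s RSD stream. When the decision is negative, the whole budget goes to the CSD at 12.2 kbit/s.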