A bandwidth extension (BWE) algorithm from wideband to superwideband (SWB) is proposed for a scalable speech/audio codec that uses modified discrete cosine transform (MDCT) coefficients as spectral parameters. In the proposed BWE algorithm, the superwideband is first split into several subbands that are represented by gain parameters and normalized MDCT coefficients. We then estimate which normalized MDCT coefficients of the wideband are to be fetched for the superwideband and quantize the fetch indices. After that, we quantize the gain parameters by using relative ratios between adjacent subbands. The proposed BWE algorithm is embedded into a standard superwideband codec, the SWB extension of G.729.1 Annex E, and its bitrate and quality are compared with those of the BWE algorithm already employed in that codec. The comparison shows that the proposed BWE algorithm reduces the bitrate by around 19% in relative terms while providing better quality than the BWE algorithm in the SWB extension of G.729.1 Annex E.

1. Introduction

In early speech communication services, narrowband codecs with a bandwidth of around 3.4 kHz were commonly used because the available network bandwidth was quite limited. These services provided sufficient quality for comprehension, but it was generally agreed that they did not satisfy users' increasing expectations for higher sound quality. Owing to advances in network technologies, however, the transmission bandwidth has recently been increased [1–3]. Thus, a great deal of research has focused on further extending the bandwidth of speech and/or audio signals from narrowband to wideband, superwideband, and audio band [4–6]. There are two kinds of approaches to extending the bandwidth, distinguished by whether or not side information is available, as shown in Figure 1. As depicted in Figure 1(a), bandwidth extension is usually realized by using side information that is transmitted from the encoder.
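As a rough illustration of the kind of side information such an encoder might compute, the sketch below follows the three steps outlined in the abstract: a gain/shape split per subband, a search for the wideband position to fetch, and gains expressed relative to the adjacent subband. It is a minimal sketch under stated assumptions, not the paper's method: the RMS gain definition, the correlation-based search, the log2 ratio, and all function names are hypothetical choices made only for illustration, and quantization is omitted.

```python
import math

EPS = 1e-12  # assumed floor to avoid division by zero and log of zero


def split_gain_shape(coeffs, band_size):
    """Split MDCT coefficients into per-subband RMS gains and normalized shapes."""
    gains, shapes = [], []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        rms = math.sqrt(sum(c * c for c in band) / len(band))
        gains.append(rms)
        shapes.append([c / max(rms, EPS) for c in band])
    return gains, shapes


def best_fetch_index(target_shape, wb_norm):
    """Return the wideband offset whose segment correlates best with the target.

    Assumption: plain inner-product correlation is used as the match criterion.
    """
    n = len(target_shape)
    best_idx, best_corr = 0, -math.inf
    for idx in range(len(wb_norm) - n + 1):
        segment = wb_norm[idx:idx + n]
        corr = sum(t * w for t, w in zip(target_shape, segment))
        if corr > best_corr:
            best_idx, best_corr = idx, corr
    return best_idx


def relative_gain_ratios(gains):
    """Express each gain after the first as a log2 ratio to its lower neighbour."""
    return [
        math.log2(max(cur, EPS) / max(prev, EPS))
        for prev, cur in zip(gains, gains[1:])
    ]
```

In this sketch the encoder would transmit one fetch index per subband plus the first gain and the subsequent ratios; a decoder could then copy the indicated wideband segment and rescale it by the reconstructed gain.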
On the other hand, it is also possible to extend the bandwidth at the decoder alone, without any side information, as shown in Figure 1(b). In other words, instead of using side information, artificial bandwidth extension estimates the higher band signal from the lower band signal by using a pattern recognition algorithm such as hidden Markov models (HMMs), Gaussian mixture models (GMMs), and others [10–15]. While artificial bandwidth extension algorithms do not require any additional bits for sending side information, their performance is somewhat restricted depending on the performance
C. Lamblin, “Recent audio/speech coding developments in ITU-T and future trends,” in Proceedings of the European Signal Processing Conference (EUSIPCO '08), Plenary Lecture, Lausanne, Switzerland, 2008.
J. A. Kang and H. K. Kim, “An adaptive packet loss recovery method based on real-time speech quality assessment and redundant speech transmission,” International Journal of Innovative Computing, Information and Control, vol. 7, no. 12, pp. 6773–6783, 2011.
J. A. Kang and H. K. Kim, “Adaptive redundant speech transmission over wireless multimedia sensor networks based on estimation of perceived speech quality,” Sensors, vol. 11, no. 9, pp. 8469–8484, 2011.
N. I. Park and H. K. Kim, “Artificial bandwidth extension of narrowband speech applied to CELP-type speech coding,” Information-International Interdisciplinary Journal, vol. 16, no. 3(B), pp. 3153–3164, 2013.
Y. R. Oh, Y. G. Kim, M. Kim, H. K. Kim, M. S. Lee, and H. J. Bae, “Phonetically balanced text corpus design using a similarity measure for a stereo super-wideband speech database,” IEICE Transactions on Information and Systems, vol. E94-D, no. 7, pp. 1459–1466, 2011.
H. Pulakka, L. Laaksonen, M. Vainio, J. Pohjalainen, and P. Alku, “Evaluation of an artificial speech bandwidth extension method in three languages,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 6, pp. 1124–1137, 2008.
J. H. Park, H. K. Kim, M. B. Kim, and S. R. Kim, “A user voice reduction algorithm based on binaural signal separation for portable digital imaging devices,” IEEE Transactions on Consumer Electronics, vol. 58, no. 2, pp. 679–684, 2012.
J. A. Kang, C. J. Chun, H. K. Kim, M. B. Kim, and S. R. Kim, “A smart background music mixing algorithm for portable digital imaging devices,” IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1258–1263, 2011.
Y. R. Oh, J. S. Yoon, H. K. Kim, M. B. Kim, and S. R. Kim, “A voice-driven scene-mode recommendation service for portable digital imaging devices,” IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 1739–1747, 2009.
M. Tammi, L. Laaksonen, A. Rämö, and H. Toukomaa, “Scalable superwideband extension for wideband coding,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '09), pp. 161–164, Taiwan, April 2009.
B. Geiser, P. Jax, P. Vary et al., “Bandwidth extension for hierarchical speech and audio coding in ITU-T Rec. G.729.1,” IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2496–2509, 2007.