Cochlear implants (CIs) require efficient speech processing to maximize information transmission to the brain, especially in noise. In our previous studies, we proposed a novel CI processing strategy in which sparsity-constrained non-negative matrix factorization (NMF) is applied to the envelope matrix to improve CI performance in noisy environments. Those studies showed that the algorithm needs to be adaptive, rather than fixed, in order to adjust to acoustical conditions and individual characteristics. Here, we explore the benefit of a system that allows users to adjust the signal processing in real time according to their individual listening needs and hearing capabilities. The system is based on MATLAB®, Simulink® and the xPC Target™ environment. Input/output (I/O) boards interface the Simulink blocks with the CI stimulation system, so that the output can be controlled in the manner of a hardware-in-the-loop (HIL) simulation. This offers a convenient way to implement a real-time signal processing module without requiring any low-level programming language. The sparsity constraint parameter of the algorithm was adapted online, subjectively, during an experiment with normal-hearing subjects listening to noise-vocoded speech simulations. Results show that subjects chose different parameter values according to their own intelligibility preferences, indicating that adaptive real-time algorithms are beneficial for fully exploring subjective preferences. We conclude that adaptive real-time systems are beneficial for experimental design and allow psychophysical experiments to be conducted with high ecological validity.
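To make the role of the adjustable sparsity parameter concrete, the following is a minimal offline sketch of a sparsity-penalized NMF applied to a non-negative envelope matrix (channels × frames). It is not the implementation used in this work, which runs as Simulink blocks on xPC Target. The function name `sparse_nmf`, the argument `sparsity`, the squared-error cost with an L1 penalty on the activations, and the multiplicative update rules with column normalization are all assumptions chosen for illustration.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity, n_iter=200, eps=1e-9, seed=0):
    """Approximate V by W @ H with non-negative W and H (illustrative sketch).

    Assumed cost: 0.5 * ||V - W H||_F^2 + sparsity * sum(H), with the
    columns of W rescaled to unit Euclidean norm after every update.
    V        : non-negative envelope matrix, shape (n_channels, n_frames)
    rank     : number of basis vectors (columns of W)
    sparsity : L1 penalty weight on H; 0 gives plain multiplicative NMF
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n_channels, n_frames = V.shape
    W = rng.random((n_channels, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative update of the basis vectors (keeps W non-negative).
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Heuristic: fix the scale of W so the penalty cannot be evaded
        # by shrinking H while growing W.
        norms = np.maximum(np.linalg.norm(W, axis=0), eps)
        W /= norms
        # Multiplicative update of the activations; the sparsity term in
        # the denominator pushes small activations towards zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
    return W, H
```

In an adaptive setting, `sparsity` would be the value the listener changes while the factorization is recomputed on each new block of envelope frames. A larger value is expected to leave fewer active components per frame at the cost of a less accurate reconstruction `W @ H`.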