Despite recent advances in the area of home telemonitoring, the challenge of automatically detecting the sound signatures of the activities of daily living of an elderly patient using nonintrusive and reliable methods remains. This paper investigates the classification of eight typical sounds of daily life from arbitrarily positioned two-microphone sensors under realistic noisy conditions. In particular, the role of several source separation and sound activity detection methods is considered. Evaluations on a new four-microphone database collected under four realistic noise conditions reveal that effective sound activity detection can produce significant gains in classification accuracy and that further gains can be made using source separation methods based on independent component analysis. Encouragingly, the results show that recognition accuracies in the range 70%–100% can be obtained consistently across different microphone-pair positions, under all but the most severe noise conditions.

1. Introduction

1.1. Home Telemonitoring

Devotion to one's parents has been expressed in many cultures throughout history and has typically included caring for them in old age. In colonial times, the care of frail aged persons was primarily the responsibility of the family. Today in the West, elderly people are cared for primarily in hospitals, in nursing homes, or by their families. The United Nations predicts that by 2100, 28.1% of the world population will be aged 65 years or older, compared with 10.0% in 2000 and 6.9% in 1990. The resulting increased demand on the health system, coupled with decreasing taxpayer support and fewer younger people to care for the elderly, will place significant pressure on aged care services. The principal needs of such services are the monitoring and support of elderly people.
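The abstract above names two processing stages, sound activity detection and ICA-based source separation, without detailing them. As an illustrative sketch only (the paper's actual algorithms and parameters are not specified in this excerpt), the code below implements a simple frame-energy sound activity detector and a symmetric FastICA separator for a two-microphone instantaneous mixture; the function names, frame length, and decision margin are assumptions, not the authors' method.

```python
import numpy as np

def energy_sad(sig, frame_len=400, margin_db=6.0):
    """Flag frames whose log energy exceeds an estimated noise floor.
    frame_len and margin_db are illustrative choices, not the paper's values."""
    n_frames = len(sig) // frame_len
    frames = sig[: n_frames * frame_len].reshape(n_frames, frame_len)
    log_e = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    noise_floor = np.percentile(log_e, 20)  # assumes >= 20% of frames are background
    return log_e > noise_floor + margin_db

def whiten(x):
    """Zero-mean and decorrelate a (channels, samples) array to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(x))
    return (E @ np.diag(1.0 / np.sqrt(d)) @ E.T) @ x

def fastica_2ch(x, n_iter=200, seed=0):
    """Symmetric FastICA (tanh nonlinearity) for a 2-channel instantaneous mixture."""
    z = whiten(x)
    W = np.random.default_rng(seed).standard_normal((2, 2))
    for _ in range(n_iter):
        g = np.tanh(W @ z)
        # fixed-point update: E[g(Wz) z^T] - diag(E[g'(Wz)]) W
        W = (g @ z.T) / z.shape[1] - np.diag((1.0 - g ** 2).mean(axis=1)) @ W
        u, _, vt = np.linalg.svd(W)   # symmetric decorrelation
        W = u @ vt
    return W @ z  # estimated sources, up to permutation and sign
```

A real telemonitoring pipeline would face reverberant, convolutive mixtures, for which frequency-domain or temporal ICA variants are more appropriate than this instantaneous model.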
For example, falls are the fifth highest cause of death for elderly people, and while the falls themselves may not be easily preventable, in many circumstances the deaths that follow them are, given appropriate monitoring and support. Consequently, significant research interest has been directed towards home-telecare solutions that allow elderly people to live safely and independently in their own homes. In recent years, it has been suggested that sound "signatures" are well suited to the automated telemonitoring of elderly people and are superior to video cameras from the perspective of privacy. Telemonitoring using sound signatures is a relatively unexplored area of the literature in comparison with other techniques such as gait parameters and
D. Istrate, E. Castelli, M. Vacher, L. Besacier, and J. F. Serignat, “Information extraction from sound for medical telemonitoring,” IEEE Transactions on Information Technology in Biomedicine, vol. 10, no. 2, pp. 264–274, 2006.
B. G. Celler, T. Hesketh, W. Earnshaw, and E. Ilsar, “Instrumentation system for the remote monitoring of changes in functional health status of the elderly at home,” in Proceedings of the 16th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 908–909, November 1994.
N. Noury, T. Herve, V. Rialle et al., “Monitoring behaviour in the home using a smart fall sensor and position sensors,” in Proceedings of the 1st Annual International Conference on Microtechnologies in Medicine and Biology, pp. 607–610, 2000.
A. Fleury, M. Vacher, and N. Noury, “SVM-based multimodal classification of activities of daily living in health smart homes: sensors, algorithms, and first experimental results,” IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 2, pp. 274–283, 2010.
M. Vacher, D. Istrate, L. Besacier, J. F. Serignat, and E. Castelli, “Sound detection and classification for medical telesurvey,” in Proceedings of the IASTED International Conference on Biomedical Engineering, pp. 395–399, ACTA Press, Innsbruck, Austria, February 2004.
G. Virone, D. Istrate, M. Vacher, N. Noury, J. F. Serignat, and J. Demongeot, “First steps in data fusion between a multichannel audio acquisition and an information system for home healthcare,” in Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 1364–1367, September 2003.
E. Castelli, M. Vacher, D. Istrate, L. Besacier, and J. F. Serignat, “Habitat telemonitoring system based on the sound surveillance,” in Proceedings of the International Conference on Information Communication Technologies in Health, 2003.
M. Vacher, D. Istrate, F. Portet et al., “The sweet-home project: audio technology in smart homes to improve well-being and reliance,” in Proceedings of the IEEE International Conference on Engineering in Medicine and Biology, pp. 5291–5294, August 2011.
H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, T. Nishikawa, and K. Shikano, “Blind source separation combining independent component analysis and beamforming,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 11, pp. 1135–1146, 2003.
F. Kraft, R. Malkin, T. Schaaf, and A. Waibel, “Temporal ICA for classification of acoustic events in kitchen environment,” in Proceedings of the International Conference on Speech and Language Processing, pp. 2689–2692, 2005.
Y. Wang, J. An, V. Sethu, and E. Ambikairajah, “Perceptually motivated pre-filter for speech enhancement using Kalman filtering,” in Proceedings of the 6th International Conference on Information, Communications and Signal Processing (ICICS '07), pp. 1–4, Singapore, December 2007.
B. S. Atal and L. R. Rabiner, “Pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, no. 3, pp. 201–212, 1976.
D. Maunder, E. Ambikairajah, J. Epps, and B. Celler, “Dual-microphone sounds of daily life classification for telemonitoring in a noisy environment,” in Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 4636–4639, 2008.
D. H. Johnson and S. R. DeGraaf, “Improving the resolution of bearing in passive sonar arrays by eigenvalue analysis,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 30, no. 4, pp. 638–647, 1982.
N. C. Laydrus, E. Ambikairajah, and B. Celler, “Automated sound analysis system for home telemonitoring using shifted delta cepstral features,” in Proceedings of the 15th International Conference on Digital Signal Processing (DSP '07), pp. 135–138, Cardiff, UK, July 2007.
P. A. Torres-Carrasquillo, E. Singer, M. A. Kohler, R. J. Greene, D. A. Reynolds, and J. R. Deller, “Approaches to language identification using Gaussian mixture models and shifted delta cepstral features,” in Proceedings of the International Conference on Spoken Language Processing, pp. 89–92, 2002.
F. Allen, E. Ambikairajah, and J. Epps, “Language identification using warping and the shifted delta cepstrum,” in Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing (MMSP '05), pp. 1–4, Shanghai, China, November 2005.