As a basic study into 3-D audio display systems, this paper reports the conditions of moving sound image velocity and time-step under which a discrete moving sound image is perceived as continuous motion. In this study, the discrete moving sound image was presented through headphones and ran along the ear-axis. The experiments tested the continuity of a discrete moving sound image using various conditions of velocity (0.25, 0.5, 0.75, 1, 2, 3, and 4 m/s) and time-step (0, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12, and 0.14 s). As a result, the following were required in order to present the discrete moving sound image as continuous movement. (1) The 3-D audio display system was required to complete the sound image presentation process, including head tracking and HRTF simulation, in a time shorter than 0.02 s, in order to present sound image movement at all velocities. (2) A processing time longer than 0.1 s was not acceptable. (3) If the 3-D audio display system only presented very slow movement (less than about 0.5 m/s), processing times ranging from 0.04 s to 0.06 s were still acceptable.

1. Introduction

3-D audio display technology is important for virtual reality technologies. Simulation of the head-related transfer function (HRTF) using digital signal processing is the key technology for a 3-D audio display. In principle, sound signals are processed digitally for HRTF simulation and are presented to the listener through headphones. “Head tracking” technologies that control the virtual sound field according to the listener’s head position and orientation are well-known techniques to enhance the sound localization of a virtual sound image [1, 2] and are frequently used with HRTF simulation. Head tracking is based on position and orientation sensing technologies, and various sensors, such as magnetic 6 degrees of freedom (6DOF) sensors [3, 4], a gyro [5], a global positioning system (GPS) [6], and a camera [7], have been used for head tracking with a 3-D audio display.
However, these sensors need adequate processing time to obtain the position and orientation. For example, the sampling frequency of GPS is about 5–20 Hz (i.e., the processing time for position and orientation sampling is 0.05–0.2 s). This long processing time makes the 3-D audio display “discrete,” meaning that the sound image cannot move continuously but instead alternates between remaining static and jumping. While the system is processing, the sound image must remain static; only after processing finishes can the sound image change its location. Even if the 3-D audio display processing is
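The reported velocity and time-step limits can be summarized as a rough decision rule. The sketch below is an illustration, not the authors' method: the function name is invented, and treating time-steps between 0.02 s and 0.06 s as acceptable only below about 0.5 m/s is an assumed interpolation of the reported findings.

```python
def is_perceived_continuous(velocity_m_s: float, time_step_s: float) -> bool:
    """Rough decision rule for motion along the ear-axis, based on the
    thresholds reported in this study (assumed simplification, not the
    authors' own formula)."""
    if time_step_s <= 0.02:
        # Finding (1): time-steps up to 0.02 s appeared continuous
        # at all tested velocities.
        return True
    if time_step_s > 0.10:
        # Finding (2): processing times longer than 0.1 s were never acceptable.
        return False
    # Finding (3): in between, only very slow movement (< ~0.5 m/s)
    # with time-steps up to about 0.06 s remained acceptable.
    return velocity_m_s < 0.5 and time_step_s <= 0.06

# Example: a 10 Hz head tracker (0.1 s per update) cannot present a
# 1 m/s moving image as continuous, but a 50 Hz tracker (0.02 s) can.
print(is_perceived_continuous(1.0, 0.10))   # False
print(is_perceived_continuous(1.0, 0.02))   # True
```

Under this reading, a sensor's update period maps directly onto the time-step of the discrete sound image, which is why the GPS example above (0.05–0.2 s per sample) fails for all but the slowest motion.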
References
[1] D. R. Begault, E. M. Wenzel, and M. R. Anderson, “Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions on the spatial perception of a virtual speech source,” Journal of the Audio Engineering Society, vol. 49, no. 10, pp. 904–916, 2001.
[2] G. Wersényi, “Effect of emulated head-tracking for reducing localization errors in virtual audio simulation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 2, pp. 247–252, 2009.
[3] K.-U. Doerr, H. Rademacher, S. Huesgen, and W. Kubbat, “Evaluation of a low-cost 3D sound system for immersive virtual reality training systems,” IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 2, pp. 204–212, 2007.
[4] Y. Seki and T. Sato, “Development of auditory orientation training system for the blind by using 3-D sound,” in Proceedings of the Conference and Workshop on Assistive Technologies for People with Vision and Hearing Impairments (CVHI '06), CD-ROM, Kufstein, Austria, 2006.
[5] M. Ohuchi, Y. Iwaya, Y. Suzuki, and T. Munekata, “Cognitive-map forming of the blind in a virtual sound environment,” in Proceedings of the 12th International Conference on Auditory Display, pp. 1–7, London, UK, 2006.
[6] J. M. Loomis, J. R. Marston, R. G. Golledge, and R. L. Klatzky, “Personal guidance system for people with visual impairment: a comparison of spatial displays for route guidance,” Journal of Visual Impairment and Blindness, vol. 99, no. 4, pp. 219–232, 2005.
[7] M. La Cascia, S. Sclaroff, and V. Athitsos, “Fast, reliable head tracking under varying illumination: an approach based on registration of texture-mapped 3D models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 4, pp. 322–336, 2000.
[8] T. Z. Strybel, A. M. Witty, and D. R. Perrott, “Auditory apparent motion in the free field: the effects of stimulus duration and separation,” Perception & Psychophysics, vol. 52, no. 2, pp. 139–143, 1992.
[9] S. Lakatos, “Temporal constraints on apparent motion in auditory space,” Perception & Psychophysics, vol. 54, no. 2, pp. 139–144, 1993.
[10] K. Mizushima, S. Nakanishi, and M. Morimoto, “Continuity of a moving sound image caused by successive signals from two discretely located loudspeakers,” Journal of the Acoustical Society of Japan, vol. 15, no. 3, pp. 179–187, 1994.
[11] T. Z. Strybel and M. L. Menges, “Auditory apparent motion between sine waves differing in frequency,” Perception, vol. 27, no. 4, pp. 483–495, 1998.
[12] D. R. Perrott and A. D. Musicant, “Minimum auditory movement angle: binaural localization of moving sound sources,” Journal of the Acoustical Society of America, vol. 62, no. 6, pp. 1463–1466, 1977.
[13] D. W. Grantham, “Adaptation to auditory motion in the horizontal plane: effect of prior exposure to motion on motion detectability,” Perception & Psychophysics, vol. 52, no. 2, pp. 144–150, 1992.
[14] D. R. Perrott and K. Marlborough, “Minimum audible movement angle: marking the end points of the path traveled by a moving sound source,” Journal of the Acoustical Society of America, vol. 85, no. 4, pp. 1773–1775, 1989.
[15] D. W. Chandler and D. W. Grantham, “Minimum audible movement angle in the horizontal plane as a function of stimulus frequency and bandwidth, source azimuth, and velocity,” Journal of the Acoustical Society of America, vol. 91, no. 3, pp. 1624–1636, 1992.
[16] D. R. Perrott and J. Tucker, “Minimum audible movement angle as a function of signal frequency and the velocity of the source,” Journal of the Acoustical Society of America, vol. 83, no. 4, pp. 1522–1527, 1988.
[17] T. Z. Strybel, C. L. Manligas, and D. R. Perrott, “Minimum audible movement angle as a function of the azimuth and elevation of the source,” Human Factors, vol. 34, no. 3, pp. 267–275, 1992.
[18] D. W. Grantham, B. W. Y. Hornsby, and E. A. Erpenbeck, “Auditory spatial resolution in horizontal, vertical, and diagonal planes,” Journal of the Acoustical Society of America, vol. 114, no. 2, pp. 1009–1022, 2003.
[19] S. Carlile and V. Best, “Discrimination of sound source velocity in human listeners,” Journal of the Acoustical Society of America, vol. 111, no. 2, pp. 1026–1035, 2002.
[20] M. Agaeva, “Velocity discrimination of auditory image moving in vertical plane,” Hearing Research, vol. 198, no. 1-2, pp. 1–9, 2004.
[21] S. Getzmann, “Effects of velocity and motion-onset delay on detection and discrimination of sound motion,” Hearing Research, vol. 246, no. 1-2, pp. 44–51, 2008.
[22] J. A. Altman and O. V. Viskov, “Discrimination of perceived movement velocity for fused auditory image in dichotic stimulation,” Journal of the Acoustical Society of America, vol. 61, no. 3, pp. 816–819, 1977.
[23] D. R. Perrott, B. Costantino, and J. Ball, “Discrimination of moving events which accelerate or decelerate over the listening interval,” Journal of the Acoustical Society of America, vol. 93, no. 2, pp. 1053–1057, 1993.
[24] H. Kietz, “Das räumliche Hören,” Acustica, vol. 3, no. 2, pp. 73–86, 1953.
[25] B. McA. Sayers and F. E. Toole, “Acoustic-image lateralization judgments with binaural transients,” Journal of the Acoustical Society of America, vol. 36, no. 6, pp. 1199–1205, 1964.