function of the space filter associated with the transformation that occurs as the acoustic waveform travels from the source to the eardrum. (In the time domain, the same transformation is achieved by convolving the transmitted time signal with the impulse response of the filter.) For binaural presentation, one such filter is applied for each of the two ears. Inasmuch as most of the work on virtual environments has focused on anechoic space, aside from the time delay corresponding to the distance between source and ear, the filter is determined solely by the reflection, refraction, and absorption associated with the body, head, and ears of the listener. Thus, the transfer functions have been referred to as head-related transfer functions (HRTFs). Of course, when realistic reverberant environments are considered, the transfer functions are influenced by the acoustic structure of the environment as well as that of the human body.
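The time-domain operation described above can be sketched in a few lines. This is a minimal illustration, assuming the filter for each ear is available as a short, finite impulse response; the function names (`convolve`, `render_binaural`) and the toy impulse-response values are hypothetical and are not measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Apply one filter per ear to a single-channel source signal."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (made-up values): the right ear receives the sound
# one sample later and at half amplitude, roughly as for a source on the left.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.5, 0.0]
mono = [1.0, 2.0, 3.0]

left, right = render_binaural(mono, hrir_left, hrir_right)
print(left)
print(right)
```

Direct convolution is used here only for clarity. With impulse responses of realistic length, an FFT-based method would normally be preferred for speed.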

Estimates of HRTFs for different source locations are obtained by direct measurements using probe microphones in the listener's ear canals, by roughly the same procedure using mannequins, or by the use of theoretical models (Wightman and Kistler, 1989a,b; Wenzel, 1992; Gierlich and Genuit, 1989). Once HRTFs are obtained, the simulation is achieved by monitoring head position and orientation and providing, on a more or less continuous basis, the appropriate HRTFs for the given source location and head position/orientation.
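The selection step can be illustrated with a simplified sketch restricted to azimuth. It assumes that source azimuth and head yaw are expressed in degrees in the same world frame and with the same sign convention, and that HRTFs are stored for a small, hypothetical set of measured azimuths; a real system would also handle elevation and distance and would typically interpolate between measurements.

```python
def relative_azimuth(source_az_deg, head_yaw_deg):
    """Source azimuth relative to the head, wrapped to [0, 360)."""
    return (source_az_deg - head_yaw_deg) % 360.0


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def nearest_measured(az_deg, measured_azimuths):
    """Pick the measured azimuth closest to the requested one."""
    return min(measured_azimuths, key=lambda m: angular_distance(az_deg, m))


# Hypothetical measurement grid: four azimuths only, for illustration.
MEASURED_AZIMUTHS = [0.0, 90.0, 180.0, 270.0]


def select_hrtf_azimuth(source_az_deg, head_yaw_deg):
    """Return the key of the stored HRTF pair to use for this head pose."""
    rel = relative_azimuth(source_az_deg, head_yaw_deg)
    return nearest_measured(rel, MEASURED_AZIMUTHS)
```

For example, a source fixed at 90 degrees is rendered with the 90-degree filters while the listener faces forward, and with the 0-degree filters once the listener has turned to face it. This is what keeps the virtual source stationary in the world as the head moves.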

The process of measuring HRTFs for a set of listeners is nontrivial in terms of time, skill, and equipment. Although restriction of HRTF measurements to the lower frequencies is adequate for localization in azimuth, it is not adequate for vertical localization, for externalization, or for elimination of front-back ambiguities (particularly if no head movement is involved). Thus, ideally, the measurements should cover the full audible range up to roughly 20 kHz, which, because the sampling frequency must be at least twice the highest frequency represented, requires a sampling frequency of roughly 40 kHz. In addition, if the HRTFs are measured in an acoustic environment with reverberation, the associated impulse responses can be very long (e.g., more than a second). These two facts, combined with the desire to measure HRTFs at many source locations, in many environments, and for many different listeners, can result in an extremely time-consuming measurement program.
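The scale of the problem can be made concrete with a short calculation. The 20 kHz upper limit and the one-second impulse response come from the text; the counts of source locations and listeners below are hypothetical values chosen only to show how the totals grow.

```python
def nyquist_rate_hz(max_freq_hz):
    """Minimum sampling frequency needed to represent max_freq_hz."""
    return 2 * max_freq_hz


def impulse_response_samples(sample_rate_hz, duration_s):
    """Number of samples in an impulse response of the given duration."""
    return int(round(sample_rate_hz * duration_s))


def total_samples(n_locations, n_listeners, samples_per_response, ears=2):
    """Samples stored for one environment: one response per ear, location, listener."""
    return n_locations * n_listeners * ears * samples_per_response


rate = nyquist_rate_hz(20_000)                  # 40,000 Hz
per_response = impulse_response_samples(rate, 1.0)

# Hypothetical program: 100 source locations, 10 listeners, one environment.
total = total_samples(100, 10, per_response)
print(rate, per_response, total)
```

Each additional environment multiplies the total again, which is why long reverberant responses and dense spatial sampling quickly become costly to measure and store.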

Preliminary investigations have demonstrated that, without training, nonindividualized HRTFs cause greater localization error, particularly in elevation and in front-back discrimination, than do HRTFs measured from each subject (Wenzel et al., 1993a). These results have been interpreted as showing that HRTFs must be measured for each individual subject in order to achieve maximum localization performance in an auditory SE. However, essentially all of the work done to date has been done without regard for the effects that could be achieved by means of sensorimotor adaptation (e.g., see Welch, 1978). We suspect that if subjects were given
