General comments on resolution and information transfer rates, most of which are not modality-specific, are presented in the Overview, in the discussion of the current state of the SE field. Here we supplement those general remarks with information that is specific to the auditory channel.
Most work on auditory resolving power has focused on artificially simple stimuli (in particular, tone pulses) or on speech sounds. Except for a small amount of work directed toward aiding individuals with hearing impairments, relatively little attention has been given to the resolution of environmental sounds. Nevertheless, knowledge of the normal auditory system's ability to resolve arbitrary sounds is quite advanced. For example, there is an extensive literature, both experimental and theoretical, on the ability to discriminate between two similar sounds and on the ability to perceive a target sound in the presence of a background masking sound (presented either simultaneously or temporally separated). On the whole, knowledge in this area appears to be adequate for most SE design purposes. Useful information on auditory resolution can be found in the general texts on audition cited above, in the Handbook of Perception and Human Performance (Boff et al., 1986), in the Engineering Data Compendium (Boff and Lincoln, 1988), and, most importantly, in the many articles published each year in the Journal of the Acoustical Society of America.
Issues related to information transfer rates are much more complex because such rates depend not only on basic resolving power but also on factors related to learning, memory, and perceptual and cognitive organization. The upper limits on information transfer rates for speech and Morse code, two methods of encoding information acoustically for which there exist subjects with extensive training in deciphering the code, appear to be roughly 60 bits/s and 20 bits/s, respectively. Unfortunately, we are unaware of any estimates of the information transfer rate for the perception of music. We would guess, however, that the rate lies between these two values and is closer to that of speech than to that of Morse code (because the stimulus set in music has a much higher dimensionality than that in Morse code). Although we cannot prove it, we suspect that the rate achieved with speech is close to the maximum that can be achieved through the auditory channel. We say this not because we believe that speech is special, in the sense argued by various speech experts (e.g., Liberman et al., 1968), but rather because of the high perceptual dimensionality of speech sounds and because of the enormous amount of learning associated with the development of speech communication. Furthermore, except possibly for the case of music, we believe that it would be extremely difficult to achieve a comparatively