Acoustics / xie_bosun_spatial_sound_principles_and_applications.pdf

38 Spatial Sound

Figure 1.20 Calculated ILD as a function of azimuth at different ka with the spherical head model.

ILD varies dramatically with azimuth at high frequencies, such as ka of 4.0 and 8.0 (Figure 1.20). Additionally, the maximum ILD for a sinusoidal sound stimulus (with a single frequency component) does not appear at an azimuth of 90°, where the contralateral ear is exactly opposite the sound source. This is because the sound pressure at the contralateral ear is enhanced by the in-phase interference of multipath sounds diffracted around the spherical head. For a complex sound with multiple frequency components, such as octave noise, ILD varies relatively smoothly with azimuth. However, an actual human head is not a perfect sphere, and it includes the pinnae and other fine structures. Therefore, the relationship between ILD, sound source direction, and frequency is more complicated than that for a spherical head. Nevertheless, the results from the spherical head model are adequate for qualitatively interpreting some localization phenomena.
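To relate the dimensionless ka values above to audible frequencies, the short sketch below converts ka to frequency using k = 2πf/c. The head radius of 0.0875 m and the sound speed of 343 m/s are assumed illustrative values, not figures stated in this passage, and the function name `ka_to_frequency` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air; assumed value
HEAD_RADIUS = 0.0875    # m; assumed radius of the spherical head model


def ka_to_frequency(ka, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Return the frequency in Hz at which wavenumber times radius equals ka."""
    # k = 2*pi*f / c, so f = ka * c / (2*pi*a)
    return ka * c / (2.0 * math.pi * radius)


for ka in (4.0, 8.0):
    print(f"ka = {ka}: f = {ka_to_frequency(ka):.0f} Hz")
```

Under these assumptions, ka of 4.0 and 8.0 correspond to roughly 2.5 kHz and 5 kHz, respectively.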

1.6.3 Cone of confusion and head movement

ITD and ILD are regarded as the two dominant localization cues at low and high frequencies, respectively, as first stated in the classic “duplex theory” proposed by Lord Rayleigh in 1907. However, a single pair of ITD and ILD values is inadequate for determining the unique position of a sound source. In fact, an infinite number of spatial positions possess identical differences in path lengths to the two ears (i.e., identical ITD). When the curved surface of the spherical head is disregarded and the two ears are approximated by two separated points in free space, the points with identical ITD form a cone around the interaural axis in three-dimensional space, which is called the “cone of confusion” (Figure 1.21). On the cone of confusion, ITD alone is insufficient for determining a unique sound source position. Similarly, for a spherical head model and at a far-field distance much larger than the head radius, there exists an infinite set of points in space for which the ILDs are identical. For an actual human head, even when its nonspherical form and curved surface are considered, the corresponding ITD and ILD are still insufficient for identifying the unique position of a sound source because they do not vary monotonically with the source position. In this case, the cone of confusion persists, but it is no longer a strict cone.
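The cone geometry can be checked numerically with a minimal sketch. It assumes two point ears in free space, a far-field (plane-wave) source, an ear spacing of 0.175 m, and a sound speed of 343 m/s. The axis layout (x front, y left, z up), the sign convention (positive ITD when the sound reaches the left ear first), and the function names are illustrative choices, not taken from the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value
EAR_SPACING = 0.175     # m; assumed distance between the two ear points


def far_field_itd(direction):
    """ITD in seconds for a plane wave arriving from `direction`.

    Assumed axes: x front, y left, z up; the ears lie on the y axis.
    Positive ITD means the sound reaches the left ear first.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    # The path-length difference is the ear spacing projected onto the direction.
    return EAR_SPACING * (y / norm) / SPEED_OF_SOUND


def cone_direction(half_angle_deg, roll_deg):
    """Unit vector at `half_angle_deg` from the interaural (+y) axis,
    rotated by `roll_deg` around that axis."""
    g = math.radians(half_angle_deg)
    r = math.radians(roll_deg)
    return (math.sin(g) * math.cos(r), math.cos(g), math.sin(g) * math.sin(r))


# Eight directions on the same cone (60 degrees from the interaural axis).
itds = [far_field_itd(cone_direction(60.0, roll)) for roll in range(0, 360, 45)]
for roll, itd in zip(range(0, 360, 45), itds):
    print(f"roll {roll:3d} deg: ITD = {itd * 1e6:.1f} microseconds")
```

Every direction on the cone, including the front–back mirror pair at roll angles of 0° and 180°, yields the same ITD in this simplified model, so ITD alone cannot separate them.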

An extreme case of the cone of confusion is the median plane in which the sound pressures received by the two ears are nearly identical; thus, ITD and ILD are zero. In another case, two sound sources are located at the front–back mirror positions at the azimuths of 45° and 135° in the horizontal plane. The resultant ITD and ILD for the two sound source positions are identical as far as a symmetrical spherical head is concerned. ITD and ILD can determine only the cone of confusion in which the sound source is located but not the unique spatial position of the sound source. Therefore, Rayleigh’s duplex theory is only effective for lateral localization and ineffective for the front–back and vertical localization.

Sound field, spatial hearing, and sound reproduction 39

Figure 1.21 Cone of confusion in sound source localization.

To address this problem, Wallach (1940) hypothesized that the changes in ITD and ILD introduced by head-turning may serve as another localization cue (i.e., a dynamic cue). For example, when the head in Figure 1.22 is fixed, the ITDs and ILDs for sources at the front (0°) and rear (180°) in the horizontal plane are both zero because of the symmetry of the head. Hence, the two source positions are indistinguishable in terms of ITD and ILD cues. However, head rotation introduces a change in ITD. If the head is rotated anticlockwise (to the left) around the vertical axis, the right ear comes closer to the front sound source, and the left ear comes closer to the rear sound source. That is, for the same head rotation, the ITD for the front sound source changes from zero to negative; by contrast, the ITD for the rear sound source changes from zero to positive. If the head is rotated clockwise (to the right) around the vertical axis, the opposite occurs. Head rotation changes not only the ITD but also the ILD and the sound pressure spectra at the ears, although ILD is not a monotonic function of the source azimuth. Therefore, dynamic information aids localization. Early experiments provided preliminary confirmation that head rotation around a vertical axis is necessary to resolve the front–back ambiguity in horizontal localization. This conclusion has been further verified by more recent experiments (Wightman and Kistler, 1999) and applied to virtual auditory displays (Section 11.10.2). In addition, experimental evidence has indicated that the change in ITD provides the major dynamic information for front–back localization (Macpherson, 2011).

Figure 1.22 Changes in ITD caused by head rotation: sound sources in (a) the front and (b) the rear.
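The opposite sign changes for front and rear sources can be illustrated with a minimal sketch. It assumes a far-field source in the horizontal plane and two point ears (ear spacing of 0.175 m and sound speed of 343 m/s are assumed values). Angles are measured anticlockwise (0° front, 90° left), and ITD is taken as positive when the sound reaches the left ear first; this convention is chosen here so that the signs agree with the description above. The function name `itd_after_yaw` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value
EAR_SPACING = 0.175     # m; assumed distance between the two ear points


def itd_after_yaw(source_azimuth_deg, head_yaw_deg):
    """Far-field ITD in seconds after the head turns by `head_yaw_deg`.

    Angles are anticlockwise: 0 is front, 90 is left. A positive yaw is a
    turn to the left. Positive ITD means the left ear leads (assumed).
    """
    # Azimuth of the source relative to the rotated head.
    relative = math.radians(source_azimuth_deg - head_yaw_deg)
    return EAR_SPACING * math.sin(relative) / SPEED_OF_SOUND


for yaw in (0.0, 10.0, -10.0):
    front = itd_after_yaw(0.0, yaw) * 1e6
    rear = itd_after_yaw(180.0, yaw) * 1e6
    print(f"yaw {yaw:+5.1f} deg: front {front:+7.1f} us, rear {rear:+7.1f} us")
```

With the head fixed, both ITDs are effectively zero. A turn to the left drives the front-source ITD negative and the rear-source ITD positive, and a turn to the right reverses both signs, which is the dynamic cue that separates the two positions.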

Wallach also hypothesized that head-turning provides information for vertical localization. Follow-up studies attempted to verify this hypothesis experimentally. However, completely excluding the contributions of other vertical localization cues (such as spectral cues; Section 1.6.4) in experiments was difficult, and Wallach’s hypothesis was not widely explored because of the lack of sufficient experimental support. Since the 1990s, however, the problem of vertical localization has attracted renewed attention, motivated by the development of virtual auditory displays. Perrett and Noble (1997) first experimentally verified Wallach’s hypothesis. Our own work (Rao and Xie, 2005) further demonstrated that the change in ITD introduced by head movement in two degrees of freedom (turning around the vertical and front–back axes, i.e., rotating and pivoting, or yawing and rolling, respectively) provides information for localization in the median plane at low frequencies, and thereby gave a quantitative verification of Wallach’s hypothesis (Chapter 6). Some recent experiments have also confirmed the contributions of head-turning to vertical localization (Ashby et al., 2013, 2014). In addition, other experiments have investigated the range and pattern of head movement made by listeners (Kim et al., 2013).

1.6.4 Spectral cues

Many studies have suggested that the spectral features caused by reflection and diffraction at the pinna and around the head and torso provide helpful information for vertical localization and front–back disambiguation. In contrast to the binaural cues (ITD and ILD), the spectral cue is a monaural cue (Wightman and Kistler, 1997).

Batteau (1967) proposed a simplified model to explain the pinna effect. Figure 1.23 shows that direct and reflected sounds arrive at the entrance to the ear canal. The relative delay between the direct and reflected sounds is direction dependent because the incident sounds from different spatial directions are likely to be reflected by the different parts of the pinna. Therefore, peaks and notches in the sound pressure spectra caused by the interference between the direct and reflected sounds are also direction dependent, thereby providing information for directional localization. In Batteau’s model, the pinna effect is described as a combination of two reflections with different magnitudes, i.e., A1 and A2, and different time delays, i.e., τ1 and τ2. Hence, the transfer function of the pinna, including one direct and two reflected sounds, is expressed as

\[
H(f) = 1 + A_1 \exp(-j 2\pi f \tau_1) + A_2 \exp(-j 2\pi f \tau_2). \tag{1.6.8}
\]
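As a rough illustration of how such a two-reflection model shapes the spectrum, the sketch below evaluates a transfer function of the form H(f) = 1 + A1 exp(−j2πfτ1) + A2 exp(−j2πfτ2). The negative exponent sign (a pure delay under an exp(j2πft) time convention) is an assumption, and the magnitudes and delays used here are hypothetical values chosen only for demonstration, not Batteau's measured data.

```python
import cmath
import math


def batteau_response(f, a1, tau1, a2, tau2):
    """Complex response of one direct sound plus two delayed reflections."""
    w = 2.0 * math.pi * f
    return 1.0 + a1 * cmath.exp(-1j * w * tau1) + a2 * cmath.exp(-1j * w * tau2)


# Hypothetical reflection magnitudes and delays (seconds), for illustration only.
A1, TAU1 = 0.5, 50e-6
A2, TAU2 = 0.3, 200e-6


def magnitude_db(f):
    """Magnitude of the response in dB at frequency f (Hz)."""
    return 20.0 * math.log10(abs(batteau_response(f, A1, TAU1, A2, TAU2)))


for f in range(2000, 16001, 2000):
    print(f"{f:6d} Hz  {magnitude_db(f):+6.2f} dB")
```

With a single reflection (A2 = 0), the first notch falls at f = 1/(2τ1), where the reflected sound arrives in antiphase with the direct sound. Because the delays depend on the direction of incidence, the notch and peak frequencies shift with source direction.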


Figure 1.23 Pinna interacting with incident sounds from two typical directions.

Batteau’s model achieved limited success because of its considerable simplification. The dimension of the pinna is about 65 mm, so it functions effectively only if the frequency is above 2–3 kHz. At these frequencies, the sound wavelength is comparable with the dimension of the pinna. Moreover, the effect of the pinna is prominent at frequencies above 5–6 kHz. The pinna also has a complex and irregular surface, so it cannot be regarded as a reflective plane from the perspective of geometrical acoustics within the entire audible frequency range. This is the inherent drawback of Batteau’s model. Further studies have pointed out that the pinna reflects and diffracts the incident sound in a complex manner (Lopez-Poveda and Meddis, 1996). The interference among the direct and multipath reflected/diffracted sounds acts as a filter that imposes direction-dependent notches and peaks on the incident sound spectrum. In addition, this interference is highly sensitive to the shape and dimension of the pinna, which differ among individuals. Therefore, the spectral information provided by the pinna is an extremely individualized localization cue.
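The frequency ranges quoted above can be related to the pinna size with a short calculation of wavelength, λ = c/f. This is a minimal sketch; the sound speed of 343 m/s is an assumed value, and 65 mm is the approximate pinna dimension given in the text.

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed value
PINNA_SIZE = 0.065      # m; approximate pinna dimension quoted in the text


def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Return the wavelength in metres of a sound at `frequency_hz`."""
    return c / frequency_hz


for f in (2000.0, 3000.0, 5000.0, 6000.0):
    ratio = wavelength(f) / PINNA_SIZE
    print(f"{f:6.0f} Hz: wavelength {wavelength(f) * 1000:6.1f} mm "
          f"({ratio:.2f} x pinna size)")
```

Under these assumptions the wavelength is about two to three times the pinna size at 2–3 kHz and becomes equal to it a little above 5 kHz, which is consistent with the pinna effect becoming prominent in the 5–6 kHz range.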

Shaw and Teranishi (1968) and Shaw (1974) investigated the effect of the pinna in terms of wave acoustics and proposed a resonance model, which demonstrates that resonances within pinna cavities and the ear canal form a series of resonance modes at mid and high frequencies of 3, 5, 9, 11, and 13 kHz. This model successfully interprets the peaks in the pressure spectra; among them, the peak at 3 kHz for the first salient resonance is derived from the quarter wavelength resonance of the ear canal, though the existence of the pinna extends the effective length of the ear canal. Hearing is also most sensitive around this frequency (Section 1.3.2). Moreover, the magnitudes of high-order resonance modes vary with the direction of