Binaural reproduction and virtual auditory display

HRTF-based filters are constantly updated according to the listener’s head position and orientation information detected by the head tracker. Spherical microphone array recording retains dynamic information and overcomes the shortcomings of conventional artificial head recordings. In practical implementation, turning the head in a specified direction is equivalent to turning the sound field in the opposite direction; therefore, dynamic information can be incorporated by applying a rotation transformation to the independent signals of virtual Ambisonics (Section 9.4.2; Enzner et al., 2013). In this case, HRTF-based filters corresponding to M virtual loudspeaker directions are sufficient, and neither directionally continuous HRTFs nor HRTF updating is required.
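The rotation transformation mentioned above can be illustrated for the simplest case. The following is a minimal sketch, assuming horizontal first-order Ambisonics with independent signals W (omnidirectional), X = cos θ, and Y = sin θ for a unit plane wave from azimuth θ measured counterclockwise; the function name and the sign convention are illustrative assumptions, not taken from the text.

```python
import numpy as np

def rotate_foa_horizontal(w, x, y, head_yaw_rad):
    """Counter-rotate horizontal first-order Ambisonics signals.

    Assumed convention: azimuth is counterclockwise, X ~ cos(azimuth),
    Y ~ sin(azimuth). A head yaw of +a is treated as a sound field
    rotation of -a, so a source at azimuth t ends up at t - a.
    """
    c, s = np.cos(head_yaw_rad), np.sin(head_yaw_rad)
    x_rot = c * x + s * y    # cos(t - a) = cos t cos a + sin t sin a
    y_rot = -s * x + c * y   # sin(t - a) = sin t cos a - cos t sin a
    return w, x_rot, y_rot   # W is omnidirectional and stays unchanged
```

After the rotation, the signals can be decoded to the same M fixed virtual loudspeakers and filtered with the same M pairs of HRTF-based filters, so only the two coefficients c and s change with head movement.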

11.11 SUMMARY

Binaural reproduction and VAD aim to reconstruct binaural sound pressures. The related binaural signals can be obtained by binaural recording from an artificial head or human subject, and they can also be obtained by binaural synthesis.

HRTFs, which are essential for binaural synthesis, can be acquired using three methods: measurement, calculation, and customization. The technique for far-field HRTF measurement is mature, and many far-field HRTF databases have been established. Near-field HRTF measurements are relatively difficult, and only a few near-field HRTF databases for artificial heads and human subjects are available. Analytical solutions of HRTFs can be calculated only for a few simplified head/torso models. Based on the scanning of the anatomical surfaces of subjects, numerical calculations, such as the BEM, yield HRTFs within the full audible frequency range with an appropriate accuracy. Various methods exist for HRTF customization. Customization is simpler than measurement or calculation and usually yields moderate perceptual performance; however, its accuracy is inferior to that of measurement or calculation, and it should be improved in many aspects. HRTFs and HRIRs exhibit various features in the time and frequency domains. These features are closely related to auditory localization.

Binaural synthesis is often implemented using digital filters. HRTF-based filters can be implemented using different filter models and designed using different methods. The minimum-phase approximation and spectral smoothing of HRTFs simplify filters.
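As an illustration of the minimum-phase approximation, the sketch below reconstructs a minimum-phase impulse response from the magnitude spectrum of an HRIR by the standard cepstral (homomorphic) method. The function name, the FFT length, and the `eps` floor are assumptions made for the example; the pure delay that is discarded would have to be reinserted separately to preserve the ITD.

```python
import numpy as np

def minimum_phase_hrir(hrir, n_fft=1024, eps=1e-12):
    """Minimum-phase impulse response with the same magnitude spectrum.

    Cepstral method; n_fft is assumed to be even and much longer than
    the HRIR so that time aliasing of the cepstrum is negligible.
    """
    mag = np.abs(np.fft.fft(hrir, n_fft))
    # Real cepstrum of the magnitude response (eps avoids log(0)).
    cepstrum = np.fft.ifft(np.log(np.maximum(mag, eps))).real
    # Fold the anticausal part of the cepstrum onto the causal part.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    h_min = np.fft.ifft(np.exp(np.fft.fft(cepstrum * fold))).real
    return h_min
```

The magnitude response is preserved while the energy is concentrated at the start of the response, which is what allows shorter filters to be used.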

Measurement yields HRTFs at a finite number of discrete directions. HRTFs in unmeasured directions can be reconstructed or estimated from the measured data by using various interpolation schemes.
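One of the simplest interpolation schemes is linear interpolation between the two adjacent measured azimuths. The sketch below assumes HRTFs measured in the horizontal plane on a sorted azimuth grid in degrees; the function and argument names are hypothetical. In practice, such interpolation is usually better behaved when the onset delays have been removed first (for example, on minimum-phase HRTFs), which is treated here as an assumption about the input.

```python
import numpy as np

def interpolate_hrtf(az_grid_deg, hrtfs, az_deg):
    """Adjacent linear interpolation of HRTFs over azimuth.

    az_grid_deg: sorted measured azimuths in [0, 360), at least two.
    hrtfs:       array of shape (n_directions, n_freqs).
    az_deg:      target azimuth in degrees (wraps around 360).
    """
    n = len(az_grid_deg)
    az = az_deg % 360.0
    # Index of the measured direction at or just below the target.
    i0 = (np.searchsorted(az_grid_deg, az, side="right") - 1) % n
    i1 = (i0 + 1) % n
    span = (az_grid_deg[i1] - az_grid_deg[i0]) % 360.0
    w1 = ((az - az_grid_deg[i0]) % 360.0) / span
    return (1.0 - w1) * hrtfs[i0] + w1 * hrtfs[i1]
```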

A directionally continuous HRTF can be decomposed via spatial basis functions. Spatial harmonics are a common type of spatial basis function. The spatial harmonic decomposition of HRTFs is closely related to the directional interpolation of HRTFs, from which the Shannon–Nyquist spatial sampling theorem of HRTFs can be derived. The spatial interpolation of HRTFs and the signal mixing of multichannel surround sounds are closely related to each other. Some signal mixing methods in multichannel sound reproduction are analogous to certain interpolation and recovery schemes for HRTFs. Under the theoretical framework of spatial function sampling, interpolation, and reconstruction, various spatial sound techniques are unified. The analogy between multichannel sound reproduction and HRTF interpolation allows some of the methods used in the two fields to be interchanged.
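The link between spatial harmonic decomposition and the sampling theorem can be sketched for the horizontal plane, where the spatial harmonics reduce to azimuthal Fourier harmonics. The example assumes an HRTF (at one frequency) sampled at M uniformly spaced azimuths and band-limited to azimuthal order Q with M ≥ 2Q + 1; the function names are illustrative.

```python
import numpy as np

def azimuthal_coefficients(samples):
    """Azimuthal Fourier coefficients from M uniform azimuth samples."""
    return np.fft.fft(samples) / len(samples)

def evaluate_at_azimuth(coeffs, theta_rad):
    """Reconstruct the function at an arbitrary azimuth (radians)."""
    m = len(coeffs)
    # Signed integer orders in FFT layout: 0, 1, ..., -2, -1.
    orders = np.fft.fftfreq(m, d=1.0 / m)
    return np.sum(coeffs * np.exp(1j * orders * theta_rad))
```

If the directional function contains orders above what the M samples can represent, the higher orders alias into the lower ones and the reconstruction at unmeasured azimuths is no longer exact.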

HRTFs can also be decomposed using spectral shape basis functions. PCA is an effective statistical algorithm for deriving basis functions. It eliminates the correlations among HRTFs so that HRTFs can be simply represented by the weighted sum of a small set of spectral shape basis functions.
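A minimal sketch of such a decomposition is given below, assuming the HRTF data are arranged as a matrix of log-magnitude spectra with one row per direction (or per subject and direction) and one column per frequency bin. The function names and the use of the singular value decomposition to obtain the principal components are choices made for the example.

```python
import numpy as np

def pca_basis(log_mag, n_basis):
    """PCA of HRTF log-magnitude spectra.

    log_mag: array (n_directions, n_freqs).
    Returns the mean spectrum, n_basis orthonormal spectral shape basis
    functions (n_basis, n_freqs), and weights (n_directions, n_basis).
    """
    mean = log_mag.mean(axis=0)
    centered = log_mag - mean
    # Rows of vt are the principal directions in frequency space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_basis]
    weights = centered @ basis.T
    return mean, basis, weights

def pca_reconstruct(mean, basis, weights):
    """Weighted sum of basis functions plus the mean spectrum."""
    return mean + weights @ basis
```

Only the weights depend on direction, so a small number of basis functions shared by all directions can stand in for the full set of spectra.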

Both the analogy between the signal mixing of multichannel sound and the spatial interpolation of HRTFs and the basis function decomposition of HRTFs are applied to simplify the signal processing of binaural synthesis for multiple and moving virtual sources, resulting in virtual loudspeaker-based algorithms and basis function decomposition-based algorithms, respectively.
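The saving offered by a virtual loudspeaker-based algorithm can be sketched as follows. The example assumes S source signals, a gain matrix produced by some signal mixing (panning) rule, and fixed HRIR pairs for M virtual loudspeakers; all names and array layouts are hypothetical.

```python
import numpy as np

def render_virtual_loudspeakers(sources, gains, hrirs):
    """Binaural synthesis through M fixed virtual loudspeakers.

    sources: (S, T) source signals.
    gains:   (S, M) mixing gains of each source to each loudspeaker.
    hrirs:   (M, 2, L) HRIR pairs for the loudspeaker directions.
    Returns an array (2, T + L - 1) with left and right ear signals.
    """
    feeds = gains.T @ sources                  # (M, T) loudspeaker signals
    n_out = sources.shape[1] + hrirs.shape[2] - 1
    out = np.zeros((2, n_out))
    for m in range(hrirs.shape[0]):
        for ear in range(2):
            out[ear] += np.convolve(feeds[m], hrirs[m, ear])
    return out
```

The number of convolutions is 2M regardless of how many sources are rendered, and a moving source changes only its row of gains, not the filters.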

The equalization of the headphone-to-ear-canal transfer characteristics is needed in binaural reproduction and is realized by inverse HpTF filters. Some types of headphones exhibit poor repeatability in HpTF measurements, which is related to the compression deformation of pinnae by headphones. Ideally, individualized HpTFs should be used for headphone equalization. Reversal errors, elevation errors, and lateralization often occur in the headphone presentation of binaural signals. These defects are caused by the absence of dynamic cues and by errors in the high-frequency spectral cues in static binaural reproduction. Reflections are vital to the externalization of a virtual source. Once externalization is achieved, auditory distance perception in binaural reproduction can be controlled by reflections and by binaural synthesis with near-field HRTFs.
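An inverse HpTF filter can be sketched in the frequency domain as below. Regularized inversion is used here as one common way of limiting the gain at deep notches of the measured response; it is an assumption of this example rather than a method stated in the text, and the parameter `beta` and the function name are illustrative.

```python
import numpy as np

def inverse_hptf_filter(hptf_ir, n_fft=512, beta=1e-3):
    """Regularized inverse filter for a measured HpTF impulse response.

    H_inv = conj(H) / (|H|^2 + beta); beta > 0 limits the boost at
    spectral notches. Returns an impulse response of length n_fft.
    """
    H = np.fft.rfft(hptf_ir, n_fft)
    H_inv = np.conj(H) / (np.abs(H) ** 2 + beta)
    return np.fft.irfft(H_inv, n_fft)
```

For a response that is not minimum phase, the result is noncausal in the circular sense, and a modeling delay would normally be added before the filter is used.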

Crosstalk cancellation is required when binaural signals are reproduced through loudspeakers. Binaural synthesis and crosstalk cancellation can be merged as transaural processing. Transaural reproduction with two frontal loudspeakers can recreate stable virtual sources in frontal-horizontal directions. A given crosstalk cancellation or transaural processing scheme is effective only for a specified listening position and head orientation. Therefore, the listening region for transaural reproduction is narrow, and perceived timbre coloration often occurs in binaural reproduction through loudspeakers. Nevertheless, timbre coloration can be reduced by the constant-power equalization algorithm.
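For two loudspeakers, crosstalk cancellation amounts to inverting a 2 × 2 acoustic transfer matrix at each frequency. The sketch below assumes a left-right symmetric arrangement described by an ipsilateral and a contralateral loudspeaker-to-ear transfer function; the function names are hypothetical, and no regularization or constant-power equalization is included.

```python
import numpy as np

def transfer_matrix(h_ipsi, h_contra):
    """Stack of 2x2 loudspeaker-to-ear matrices, shape (n_freqs, 2, 2).

    Rows index ears (left, right); columns index loudspeakers.
    A left-right symmetric arrangement is assumed.
    """
    H = np.empty((len(h_ipsi), 2, 2), dtype=complex)
    H[:, 0, 0] = h_ipsi
    H[:, 1, 1] = h_ipsi
    H[:, 0, 1] = h_contra
    H[:, 1, 0] = h_contra
    return H

def crosstalk_canceller(H):
    """Per-frequency inverse of the transfer matrices."""
    return np.linalg.inv(H)
```

Because the matrices are those of one listening position and head orientation, the cancellation degrades as soon as the listener moves, which is the source of the narrow listening region noted above.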

Lateralization occurs when stereophonic sound and multichannel sound signals are directly presented by a pair of headphones. Binaural synthesis is applied to convert stereophonic sound and multichannel sound signals for headphone presentation. The transaural method is used for the stereophonic expansion and virtual reproduction of multichannel sound.

A complete VAD should include a reflection-modeling component; such a system is hereafter called a virtual auditory environment (VAE). Various methods exist for binaurally simulating reflections. A dynamic and real-time VAE system simulates the dynamic auditory information caused by head movement and thus accurately recreates various auditory events.