
Microphone and signal simulation techniques  269

where θS is the target source azimuth in the original sound field. The normalized magnitudes of the C, L, and R microphone outputs reach their maximum of unity at θS = 0° and ±74°, respectively. When a target source is located midway between the main-axis directions of two adjacent microphones, the normalized magnitudes of these two microphone outputs are 3 dB below the maximal on-axis output of unity. A direct realization of microphones with second-order directivity may be difficult, but the signals given in Equation (7.2.1) can be derived from the outputs of an appropriate microphone array.
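The 3 dB figure quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it does not evaluate Equation (7.2.1), and the helper name `db_to_amplitude` is introduced here for convenience. It converts −3 dB to a linear amplitude and notes that two outputs at this level together carry roughly the power of one on-axis output.

```python
import math

def db_to_amplitude(level_db):
    """Convert a level in dB (relative to unity) to a linear amplitude ratio."""
    return 10.0 ** (level_db / 20.0)

# Main-axis directions of the C microphone and one side microphone, as quoted in the text.
axis_c_deg, axis_side_deg = 0.0, 74.0
midway_deg = 0.5 * (axis_c_deg + axis_side_deg)  # source direction midway between the axes

amp = db_to_amplitude(-3.0)        # amplitude of each of the two outputs
combined_power = 2.0 * amp ** 2    # summed power of the two outputs, relative to unity

print(f"midway azimuth: {midway_deg:.0f} deg")
print(f"amplitude at -3 dB: {amp:.3f}")
print(f"combined power of the two outputs: {combined_power:.3f}")
```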

The virtual source localization performance in the reproduction of signals captured by the aforementioned coincident microphone array can be analyzed on the basis of the theorems presented in Section 3.2. As in two-channel stereophonic sound, virtual source positions in the reproduction of three frontal channels may not be exactly consistent with those of the actual sources on the original stage, but recreating the relative position distribution of virtual sources in reproduction is sufficient for live recording.

7.2.4  Microphone techniques for ambience recording and combination with frontal localization information recording

As stated in Section 7.2.3, two separate microphone arrays can be used to capture the frontal localization and ambient information in 5.1-channel recording. For live recording in a concert hall, the ambience consists mainly of reflections. In this case, ambient information is usually recorded with a wide-spaced microphone array arranged relatively far from the sources. The resultant decorrelated reflected signals recreate subjective sensations similar to those in the concert hall by using the direct method stated in Section 3.1. For 5.1-channel recording, the outputs of the ambient microphone array may be fed to the two surround channels only; in that case, the three frontal channel microphones should also contribute to the recording of ambient information. Alternatively, the outputs of the ambient microphone array may be fed to both frontal and surround channels to recreate the sensation of envelopment in reproduction. Many techniques for ambience recording have been developed, but some of them are based on experience rather than strict acoustic theory. The combinations of ambient microphone arrays and the frontal channel microphone arrays in Section 7.2.3 result in various practical 5.1-channel microphone techniques. Furthermore, 5.1-channel recording with two separate microphone arrays is flexible. The recording of frontal localization information and the recording of ambience can be optimized separately, with fewer mutual constraints, and the direct-to-reverberation ratio in the recording is easily controlled. Various combinations of frontal channel and ambient microphone arrays are available as practical choices. An appropriate electronic delay may be applied to the ambient signals to reduce their influence on frontal localization according to the precedence effect.
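The delay and level adjustment of the ambient signals mentioned above can be sketched as follows. This is a minimal illustration, not a recommended setting: the function name `delay_and_scale`, the 10 ms delay, the −6 dB gain, and the 48 kHz sample rate are all assumptions chosen for the example, since the text specifies no values.

```python
def delay_and_scale(signal, delay_ms, gain_db, sample_rate=48000):
    """Delay a list of samples by delay_ms and scale it by gain_db.

    The delay pushes the ambient signal later than the frontal signals;
    the gain adjusts the direct-to-reverberation ratio of the mix.
    """
    delay_samples = int(round(delay_ms * 1e-3 * sample_rate))
    gain = 10.0 ** (gain_db / 20.0)
    return [0.0] * delay_samples + [gain * x for x in signal]

# Hypothetical ambient-microphone samples and illustrative settings.
ambient = [1.0, 0.5, 0.25]
delayed = delay_and_scale(ambient, delay_ms=10.0, gain_db=-6.0)

print(len(delayed))           # original length plus the inserted delay
print(round(delayed[480], 4)) # first non-zero sample after the delay
```

Because the two arrays are separate, such a delay and gain act on the ambient signals alone and leave the frontal localization signals untouched.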

In the direct method of ambience recording, a pair of wide-spaced microphones is used to capture the decorrelated reflected signals. The theoretical basis of this method is expressed in Equations (1.2.29) and (1.2.30). An example of the combination of frontal channel and ambient microphone arrays is the Fukada tree shown in Figure 7.9 (Fukada et al., 1997; Fukada, 2001). The configuration of the three frontal microphones, comprising left (L), center (C), and right (R) microphones, is similar to that of the Decca tree. The Decca tree for two-channel stereophonic recording involves three omnidirectional microphones. The captured signals include frontal localization information and rear reflected information, and they are then reproduced by a pair of frontal stereophonic loudspeakers. In 5.1-channel reproduction, the frontal and rear information is reproduced by frontal and surround loudspeakers, respectively. Accordingly, three cardioid microphones with their main axes pointing to ±55° to ±65° in the LF and RF directions and to 0° in the C direction are used in the Fukada tree to capture the frontal source and frontal reflected signals. The directivity of the three frontal microphones reduces the captured power of rear reflections. Two omnidirectional outrigger microphones, namely, LL and RR, are sometimes added outside the left and right microphones. The outputs of the outrigger microphones are usually panned between the left and left surround (or right and right surround) channels to increase the recording width of the frontal stage. A pair of left-back (surround) and right-back (surround) cardioid microphones, denoted by LS and RS, is used to record the surround channels. They are located at the reverberation radius of the hall and spaced at a distance not less than the reverberation radius. Their main axes point to ±135° to ±150°. The outputs of the LS and RS microphones are dominated by decorrelated reverberation from the rear. As stated in Section 7.2.3, the three frontal channel signals captured with wide-spaced microphone arrays such as the Fukada tree yield virtual sources of degraded quality. However, the three frontal microphones capture the ambience from the front at the same time. Wide-spaced microphone arrays reduce the cross-correlation among the outputs and thereby improve the auditory spatial impression in reproduction.

Figure 7.9 Fukada tree.
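How the directivity of the frontal cardioid microphones reduces the pickup from the rear can be illustrated with the ideal cardioid pattern, 0.5(1 + cos α), where α is the angle between the source direction and the main axis. The sketch below assumes ideal, frequency-independent cardioids and a main-axis direction of ±60° (a value inside the ±55° to ±65° range quoted above); the function names are hypothetical.

```python
import math

def cardioid_gain(source_deg, axis_deg):
    """Ideal cardioid: 0.5 * (1 + cos(angle between source and main axis))."""
    off_axis = math.radians(source_deg - axis_deg)
    return 0.5 * (1.0 + math.cos(off_axis))

def to_db(gain, floor=1e-6):
    """Amplitude gain in dB, clamped so that a perfect null stays finite."""
    return 20.0 * math.log10(max(gain, floor))

# Assumed main axes of the two side microphones and the center microphone.
for axis in (60.0, 0.0, -60.0):
    front = to_db(cardioid_gain(0.0, axis))    # sound arriving from straight ahead
    rear = to_db(cardioid_gain(180.0, axis))   # sound arriving from directly behind
    print(f"axis {axis:+5.0f} deg: front {front:7.1f} dB, rear {rear:7.1f} dB")
```

Under these assumptions, a side microphone aimed at 60° picks up sound from directly behind at a quarter of its on-axis amplitude, and the center microphone has a null there.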

In addition to the wide-spaced microphone array, a near-coincident pair whose main axes point to the left-back and right-back directions, or even an XY coincident pair (or its equivalent MS pair), can be used to capture rear reflections. The combination of this pair and three appropriate frontal channel microphones results in a complete 5.1-channel microphone technique. In contrast to the main microphone array in Section 7.2.2, the two back (surround) microphones in this technique are located far from the three frontal microphones (e.g., at a distance of 2–3 m or more). Accordingly, the outputs of the two back microphones are mainly rear reflections, and they have a low correlation with the three frontal channel outputs, so summing localization between the frontal and rear channels can be ignored. A pair of coincident or near-coincident rear microphones is not sufficient to record decorrelated reverberation signals. However, when the outputs of these two microphones are fed to a pair of (rear) surround loudspeakers, a subjective sensation similar to that caused by reflections in a hall may be recreated by the indirect method in Section 3.1. For example, the 5.1-channel recording technique suggested by DPA involves an array similar to the Decca tree to capture the frontal channel signals and a near-coincident ORTF pair to capture the surround channel signals (Nymand, 2003). The distance between adjacent frontal microphones varies from 0.6 m to 1.2 m. The rear ORTF pair is located 8–10 m from the frontal array. Berg and Rumsey (2002) also used a near-coincident cardioid pair to capture rear reflections, but they utilized three coincident microphones to record the three frontal channels.

Figure 7.10 IRT cross.

Ambience can also be captured by four cardioid or omnidirectional microphones in a square arrangement. The four outputs are fed to the L, R, LS, and RS channels (Theile, 2001). The configuration of microphones is shown in Figure 7.10 and is called the IRT cross. When four cardioid microphones are used, their main axes point to the LF, RF, LB, and RB directions. The distance between adjacent microphones varies from 0.25 m to 0.4 m. The spacing between omnidirectional microphones is usually larger than that between cardioid microphones because the directivity of cardioid microphones also contributes to the decorrelation of the recorded signals in a reflected sound field. Theile combined the OCT frontal microphone array in Figure 7.7 with the IRT cross in Figure 7.10 to construct a complete 5.1-channel microphone technique. The IRT cross is located some distance behind the array of three frontal channel microphones.
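The dependence of decorrelation on microphone spacing can be illustrated with the standard textbook result for two omnidirectional microphones in an ideal diffuse field, whose correlation is sin(kd)/(kd), with k the wavenumber and d the spacing. The sketch below assumes this idealized model and a sound speed of 343 m/s; it does not cover directional microphones, whose directivity contributes additional decorrelation as noted above, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def diffuse_field_correlation(freq_hz, spacing_m, c=SPEED_OF_SOUND):
    """sin(kd)/(kd) for two omnidirectional microphones in an ideal diffuse field."""
    kd = 2.0 * math.pi * freq_hz * spacing_m / c
    return 1.0 if kd == 0.0 else math.sin(kd) / kd

def first_zero_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Lowest frequency at which sin(kd)/(kd) crosses zero (kd = pi)."""
    return c / (2.0 * spacing_m)

for d in (0.25, 0.4, 1.0, 2.0):
    f0 = first_zero_frequency(d)
    rho = diffuse_field_correlation(200.0, d)
    print(f"d = {d:4.2f} m: first zero at {f0:6.1f} Hz, correlation at 200 Hz = {rho:+.2f}")
```

In this model, a larger spacing moves the first zero of the correlation to a lower frequency, so the outputs are decorrelated over a wider part of the audible range.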

Hamasaki and Hiyama (2003) of NHK also proposed using an array of four directional microphones in a square arrangement to capture the reflections in halls. This array is termed the Hamasaki square. The outputs of the four microphones are fed to the L, R, LS, and RS channels. The Hamasaki square exists in several configurations that differ in microphone directivity. Configuration 1 in Figure 7.11(a) involves four bidirectional microphones with their main axes pointing to the lateral directions. It aims to capture lateral reflections and suppress the frontal direct sound and rear reflections to reduce their influence on frontal localization in reproduction. Configuration 2 in Figure 7.11(b) involves two bidirectional microphones and two cardioid microphones. The main axes of the cardioid microphones point to the rear directions. This configuration aims to capture rear reflections. Configuration 3 in Figure 7.11(c) involves four bidirectional microphones with their main axes pointing to the lateral directions and two cardioid microphones with their main axes pointing to the rear. This configuration aims to capture lateral and rear reflections. The spacings between the microphones in Figure 7.11 are chosen according to the correlation of the microphone outputs in the reverberation field and are usually within the range of 2–3 m.
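The suppression of frontal direct sound in configuration 1 follows from the bidirectional (figure-of-eight) pattern, which has nulls perpendicular to its main axis. A minimal sketch, assuming an ideal pattern cos α with the main axis at 90° (lateral); the function name is hypothetical:

```python
import math

def figure_eight_gain(source_deg, axis_deg):
    """Ideal bidirectional pattern: cos(angle between source and main axis)."""
    return math.cos(math.radians(source_deg - axis_deg))

lateral_axis = 90.0  # main axis pointing to one side, as in configuration 1
for label, source in (("front", 0.0), ("lateral", 90.0), ("rear", 180.0)):
    magnitude = abs(figure_eight_gain(source, lateral_axis))
    print(f"{label:8s} magnitude = {magnitude:.3f}")
```

With the main axis pointing sideways, sound from straight ahead and from directly behind falls in the nulls, while lateral sound is captured at full sensitivity.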

The Hamasaki square can be combined with other arrays to construct a complete 5.1-channel microphone technique. Hamasaki’s original scheme was a combination of the frontal array in Figure 7.8 with a wide-spaced cardioid pair to capture the rear/surround channel signals. The cardioid pair is located 2–3 m behind the frontal array and spaced apart at a distance of about 3 m. The main axes of the cardioid pair point to the left-back and right-back directions to capture the rear reflections. A Hamasaki square as shown in Figure 7.11(a) can be added to the above array to capture the ambient signals, and its outputs are mixed into the L, R, LS, and RS channels. The microphones in this Hamasaki square are spaced apart at a distance of 1 m (smaller than the later choice of 2–3 m). In another scheme, Hamasaki also suggested using a five-microphone array similar to Figure 7.4 to capture the three frontal channel signals, but with supercardioid microphones instead of cardioid microphones. The microphones are spaced apart at a distance of 1.5 m. An omnidirectional pair spaced apart by 4 m is also added. Their outputs are low-pass filtered with a crossover frequency of 250 Hz and mixed into the left and right channels to enhance the low-frequency recording. A Hamasaki square is placed 2–10 m behind the frontal array; the distance is determined on the basis of the required ratio of direct to reflected sound in the captured signals.

Figure 7.11 The Hamasaki square: (a) configuration 1; (b) configuration 2; (c) configuration 3.
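The low-frequency enhancement with the omnidirectional pair described above can be sketched as a low-pass filter followed by a mix. The text gives only the 250 Hz crossover frequency, so the first-order (one-pole) filter, the 48 kHz sample rate, the unit mixing gain, and the function names below are all assumptions made for illustration; a practical crossover would likely use a steeper filter.

```python
import math

def one_pole_lowpass(samples, cutoff_hz=250.0, sample_rate=48000):
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def mix(main, low, low_gain=1.0):
    """Add the low-passed omnidirectional signal to a main channel signal."""
    return [m + low_gain * l for m, l in zip(main, low)]

# Hypothetical signals: a constant (0 Hz) input and an input alternating at Nyquist.
dc = one_pole_lowpass([1.0] * 48000)
nyquist = one_pole_lowpass([(-1.0) ** n for n in range(2000)])
print(f"steady-state output for a constant input: {dc[-1]:.4f}")
print(f"peak output for a Nyquist-rate input:     {max(abs(v) for v in nyquist[1000:]):.4f}")
```

The constant input passes essentially unchanged while the highest-frequency input is strongly attenuated, which is the behavior needed before mixing the omnidirectional pair into the left and right channels.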

Klepko (1997) proposed using an omnidirectional pair placed in the two ears of an artificial head to capture ambient signals and combining them with a line array of three microphones (Section 7.2.3) for 5.1-channel recording. The artificial head is placed 1.24 m behind the line array. As stated in Section 1.4, an artificial head simulates the anatomical structures of a real human from the perspective of acoustics. Binaural signals from artificial head recording are originally intended for headphone presentation. As stated in Section 11.8, crosstalk cancellation processing should be added when binaural signals are reproduced through loudspeakers. Because the surround loudspeakers in 5.1-channel reproduction are arranged at azimuths of ±110°, the head shadow naturally provides part of the crosstalk cancellation. Therefore, crosstalk cancellation processing is omitted in Klepko’s scheme. However, the final binaural signals in reproduction undergo the scattering and diffraction of the head/pinna twice (once during recording with the artificial head and once during reproduction to a listener), which alters the spectra of the final binaural signals (pressures). Thus, timbre coloration occurs. Consequently, binaural signals from artificial head recording should be equalized.

The original purpose of artificial head recording is to make up for the deficiencies of other methods, e.g., to recreate a virtual source within the rear region of ±90° with a pair of surround loudspeakers in 5.1-channel reproduction and to recreate sensations similar to those in a hall by the indirect method in Section 3.1. However, as stated in Section 11.8, even if crosstalk cancellation is included, the listening region of binaural signal reproduction via loudspeakers is narrow. For a pair of surround loudspeakers with a wide span angle of 140°, a slight lateral translation of the head position spoils virtual source localization. On the other hand, the binaural signals captured by an artificial head in a nearly diffuse reverberation field possess approximately equal power spectra and random phases. The scattering and diffraction by the artificial head enhance the randomness of the binaural signals so that they are decorrelated. When reproduced by a pair of surround loudspeakers, these decorrelated signals lead to the sensation of envelopment, and the perceived effect is less sensitive to the listening position. Therefore, using the artificial head for ambience recording is effective.
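Whether two captured channels are decorrelated can be checked numerically with a normalized cross-correlation coefficient. The sketch below is a simplified illustration: it evaluates the coefficient at zero lag only, and it uses synthetic Gaussian noise as a stand-in for recorded signals, since no measured data are given in the text. The function and variable names are hypothetical.

```python
import math
import random

def correlation_coefficient(x, y):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

# Synthetic stand-ins for two channel signals (not measured data).
rng = random.Random(0)
left = [rng.gauss(0.0, 1.0) for _ in range(10000)]
right = [rng.gauss(0.0, 1.0) for _ in range(10000)]

identical = correlation_coefficient(left, left)     # fully correlated pair
independent = correlation_coefficient(left, right)  # decorrelated pair
print(f"identical signals:   {identical:+.3f}")
print(f"independent signals: {independent:+.3f}")
```

A coefficient near zero corresponds to the decorrelated case described above, while a coefficient near one indicates that the two channels carry essentially the same signal.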