- •Preface
- •Acknowledgments
- •Contents
- •1 Introduction
- •1.1 Auditory Temporal and Spatial Factors
- •1.2 Auditory System Model for Temporal and Spatial Information Processing
- •2.1 Analysis of Source Signals
- •2.1.1 Power Spectrum
- •2.1.2 Autocorrelation Function (ACF)
- •2.1.3 Running Autocorrelation
- •2.2 Physical Factors of Sound Fields
- •2.2.1 Sound Transmission from a Point Source through a Room to the Listener
- •2.2.2 Temporal-Monaural Factors
- •2.2.3 Spatial-Binaural Factors
- •2.3 Simulation of a Sound Field in an Anechoic Enclosure
- •3 Subjective Preferences for Sound Fields
- •3.2.1 Optimal Listening Level (LL)
- •3.2.4 Optimal Magnitude of Interaural Crosscorrelation (IACC)
- •3.3 Theory of Subjective Preferences for Sound Fields
- •3.4 Evaluation of Boston Symphony Hall Based on Temporal and Spatial Factors
- •4.1.1 Brainstem Response Correlates of Sound Direction in the Horizontal Plane
- •4.1.2 Brainstem Response Correlates of Listening Level (LL) and Interaural Crosscorrelation Magnitude (IACC)
- •4.1.3 Remarks
- •4.2.2 Hemispheric Lateralization Related to Spatial Aspects of Sound
- •4.2.3 Response Latency Correlates of Subjective Preference
- •4.3 Electroencephalographic (EEG) Correlates of Subjective Preference
- •4.3.3 EEG Correlates of Interaural Correlation Magnitude (IACC) Changes
- •4.4.1 Preferences and the Persistence of Alpha Rhythms
- •4.4.2 Preferences and the Spatial Extent of Alpha Rhythms
- •4.4.3 Alpha Rhythm Correlates of Annoyance
- •5.1 Signal Processing Model of the Human Auditory System
- •5.1.1 Summary of Neural Evidence
- •5.1.1.1 Physical Characteristics of the Ear
- •5.1.1.2 Left and Right Auditory Brainstem Responses (ABRs)
- •5.1.1.3 Left and Right Hemisphere Slow Vertex Responses (SVRs)
- •5.1.1.4 Left and Right Hemisphere EEG Responses
- •5.1.1.5 Left and Right Hemisphere MEG Responses
- •5.1.2 Auditory Signal Processing Model
- •5.2 Temporal Factors Extracted from Autocorrelations of Sound Signals
- •5.3 Auditory Temporal Window for Autocorrelation Processing
- •5.5 Auditory Temporal Window for Binaural Processing
- •5.6 Hemispheric Specialization for Spatial Attributes of Sound Fields
- •6 Temporal Sensations of the Sound Signal
- •6.1 Combinations of Temporal and Spatial Sensations
- •6.2 Pitch of Complex Tones and Multiband Noise
- •6.2.1 Perception of the Low Pitch of Complex Tones
- •6.2.3 Frequency Limits of Missing Fundamentals
- •6.3 Beats Induced by Dual Missing Fundamentals
- •6.4 Loudness
- •6.4.1 Loudness of Sharply Filtered Noise
- •6.4.2 Loudness of Complex Noise
- •6.6 Timbre of an Electric Guitar Sound with Distortion
- •6.6.3 Concluding Remarks
- •7 Spatial Sensations of Binaural Signals
- •7.1 Sound Localization
- •7.1.1 Cues of Localization in the Horizontal Plane
- •7.1.2 Cues of Localization in the Median Plane
- •7.2 Apparent Source Width (ASW)
- •7.2.1 Apparent Width of Bandpass Noise
- •7.2.2 Apparent Width of Multiband Noise
- •7.3 Subjective Diffuseness
- •8.1 Pitches of Piano Notes
- •8.2 Design Studies of Concert Halls as Public Spaces
- •8.2.1 Genetic Algorithms (GAs) for Shape Optimization
- •8.2.2 Two Actual Designs: Kirishima and Tsuyama
- •8.3 Individualized Seat Selection Systems for Enhancing Aural Experience
- •8.3.1 A Seat Selection System
- •8.3.2 Individual Subjective Preference
- •8.3.3 Distributions of Listener Preferences
- •8.5 Concert Hall as Musical Instrument
- •8.5.1 Composing with the Hall in Mind: Matching Music and Reverberation
- •8.5.2 Expanding the Musical Image: Spatial Expression and Apparent Source Width
- •8.5.3 Enveloping Music: Spatial Expression and Musical Dynamics
- •8.6 Performing in a Hall: Blending Musical Performances with Sound Fields
- •8.6.1 Choosing a Performing Position on the Stage
- •8.6.2 Performance Adjustments that Optimize Temporal Factors
- •8.6.3 Towards Future Integration of Composition, Performance and Hall Acoustics
- •9.1 Effects of Temporal Factors on Speech Reception
- •9.2 Effects of Spatial Factors on Speech Reception
- •9.3 Effects of Sound Fields on Perceptual Dissimilarity
- •9.3.1 Perceptual Distance due to Temporal Factors
- •9.3.2 Perceptual Distance due to Spatial Factors
- •10.1 Method of Noise Measurement
- •10.2 Aircraft Noise
- •10.3 Flushing Toilet Noise
- •11.1 Noise Annoyance in Relation to Temporal Factors
- •11.1.1 Annoyance of Band-Pass Noise
- •11.2.1 Experiment 1: Effects of SPL and IACC Fluctuations
- •11.2.2 Experiment 2: Effects of Sound Movement
- •11.3 Effects of Noise and Music on Children
- •12 Introduction to Visual Sensations
- •13 Temporal and Spatial Sensations in Vision
- •13.1 Temporal Sensations of Flickering Light
- •13.1.1 Conclusions
- •13.2 Spatial Sensations
- •14 Subjective Preferences in Vision
- •14.1 Subjective Preferences for Flickering Lights
- •14.2 Subjective Preferences for Oscillatory Movements
- •14.3 Subjective Preferences for Texture
- •14.3.1 Preferred Regularity of Texture
- •15.1 EEG Correlates of Preferences for Flickering Lights
- •15.1.1 Persistence of Alpha Rhythms
- •15.1.2 Spatial Extent of Alpha Rhythms
- •15.2 MEG Correlates of Preferences for Flickering Lights
- •15.2.1 MEG Correlates of Sinusoidal Flicker
- •15.2.2 MEG Correlates of Fluctuating Flicker Rates
- •15.3 EEG Correlates of Preferences for Oscillatory Movements
- •15.4 Hemispheric Specializations in Vision
- •16 Summary of Auditory and Visual Sensations
- •16.1 Auditory Sensations
- •16.1.1 Auditory Temporal Sensations
- •16.1.2 Auditory Spatial Sensations
- •16.1.3 Auditory Subjective Preferences
- •16.1.4 Effects of Noise on Tasks and Annoyance
- •16.2.1 Temporal and Spatial Sensations in Vision
- •16.2.2 Visual Subjective Preferences
- •References
- •Glossary of Symbols
- •Abbreviations
- •Author Index
- •Subject Index
18 |
2 Temporal and Spatial Aspects of Sounds and Sound Fields |
music motifs. In general, a recommended integration interval of the signal shall be discussed in Section 5.3 as a temporal window.
2.2 Physical Factors of Sound Fields
2.2.1Sound Transmission from a Point Source through a Room to the Listener
Let us consider the sound transmission from a point source in a free field to the two ear canal entrances. Let p(t) be the source signal as a function of time, t, at its source point, and gl(t), and gr(t) be the impulse responses between the source point r0 and the two ear entrances. Then the sound signals arriving at the entrances are expressed by,
f1(t) = p(t) g1(t)
(2.12)
fr(t) = p(t) gr(t)
where the asterisk denotes convolution.
The impulse responses gl,r(t) include the direct sound and reflections wn(t – tn) in the room as well as the head-related impulse responses hnl,r(t), so that,
∞ |
|
g1,r(t) = Anwn(t− tn) hnl,r(t) |
(2.13) |
n=0 |
|
where n denotes the number of reflections with a horizontal angle, ξn and elevation ηn; n = 0 signifies the direct sound (ξ0 = 0, y0 = 0),
A0W0(t − t0) = δ(t), t0 = 0, A0 = 1,
where δ(t) is the Dirac delta function; An is the pressure amplitude of the n-th reflection n > 0; wn(t) is the impulse response of the walls for each path of reflection arriving at the listener, tn being the delay time of reflection relative to that of the direct sound; and hnl,r(t) are impulse responses for diffraction of the head and pinnae for the single sound direction of n. Therefore, Equation (2.12) becomes
∞ |
|
f1,r(t) = p(t) Anwn(t− tn) hnl,r(t) |
(2.14) |
n=0 |
|
When the source has a certain directivity, the p(t) is replaced by pn(t).
2.2 Physical Factors of Sound Fields |
19 |
2.2.2 Temporal-Monaural Factors
As far as the auditory system is concerned, all factors influencing any subjective attribute must be included in the sound pressures at the ear entrances; these are expressed by Equation (2.14). The first important item, which depends on the source program, is the sound signal p(t). This is represented by the ACF defined by Equation (2.4). The ACF is factored into the energy of the sound signal p(0) and the normalized ACF as expressed by Equations (2.4)–(2.6).
The second term is the set of impulse responses of the reflecting walls, Anwn(t –tn). The amplitudes of reflection relative to that of the direct sound, A1, A2, . . ., are determined by the pressure decay due to the paths dn, such that
An = d0/dn |
(2.15) |
where d0 is the distance between the source point and the center of the listener’s head. The impulse responses of reflections to the listener are wn(t – tn) with the delay times of t1, t2, . . . relative to that of the direct sound, which are given by
tn = (dn − d0)/c |
(2.16) |
where c is the velocity of sound (m/s). These parameters are not physically independent, in fact, the values of An are directly related to tn in the manner of
tn = d0(1/An − 1)/c |
(2.17) |
In addition, the initial time delay gap between the direct sound and the first reflectiont1 is statistically related to t2, t3. . . and depends on the dimensions of the room. In fact the echo density is proportional to the square of the time delay (Kuttruff, 1991). Thus, the initial time delay gap t1 is regarded as a representation of both sets of tn and An (n = 1, 2. . .).
Another item is the set of the impulse responses of the n-th reflection, wn(t) being
expressed by |
|
wn(t) = wn(t)(1) wn(t)(2) ... wn(t)(i) |
(2.18) |
where wn(t)(i) is the impulse response of the i-th wall in the path of the n-th reflection from the source to the listener. Such a set of impulse responses may be represented by a statistical decay rate, namely the subsequent reverberation time, Tsub, because wn(t)(i) includes the absorption coefficient as a function of frequency. This coefficient is given by
αn(ω)(i) = 1 − |Wn(ω)(i)|2 |
(2.19) |
According to Sabine’s formula (1900), the subsequent reverberation time is approximately calculated by
20 |
2 Temporal and Spatial Aspects of Sounds and Sound Fields |
||||
|
Tsub ≈ |
KV |
(2.20) |
||
|
|
|
|
||
|
|
|
S |
||
|
|
α |
|||
where K is a constant (about 0.162), V is the volume of the room (m3), S is the total surface (m2), and α is the average absorption coefficient of the walls. The denominator of Equation (2.22) can be calculated more precisely as a function of the frequency by taking into account specific values of absorption coefficient as a function of frequency α(ω)i and surface area Si for each room surface i:
|
S(ω) = α(ω)iSi |
(2.21) |
α |
i
where ω = 2π f, f is the frequency.
2.2.3 Spatial-Binaural Factors
Two sets of head-related impulse responses for two ears hnl,r(t) constitute the remaining objective item. These two responses hnl(t) and hnr(t) play an important role in sound localization and spatial impression but are not mutually independent objective factors. For example, hnl(t) hnr(t) for the sound signals in the median plane, and there are certain relations between them for any other directions to a listener. In addition, the interaural time deference (IATD) and the interaural level difference (IALD) are not mutually independent factors of sound fields. In fact, a certain relationship between the IATD and the IALD can be expressed for a single directional sound arriving at a listener for a given source signal and thus for any sound field with multiple reflections. A particular example is that, when the IATD is zero, then the IALD is nearly zero as well.
Therefore, to represent the interdependence between two impulse responses, a single factor may be introduced, that is, the interaural crosscorrelation function (IACF) between the sound signals at both ears fl(t) and fr(t), which is defined by
|
(τ) |
= |
lim |
1 |
+T f |
(t)f |
(t |
+ |
τ)dt, |
τ |
1 ms |
(2.22) |
|
||||||||||||
lr |
|
T → ∞ 2T −T 1 |
r |
|
|
| | ≤ |
|
|
||||
where f’l(t) and f’r(t) are obtained by signals fl,r(t) after passing through the A- weighted network, which corresponds to the ear’s sensitivity, s(t). It has been shown that ear sensitivity may be characterized by the physical ear system including the external and the middle ear (Ando, 1985, 1998).
The normalized interaural crosscorrelation function is defined by
1r(τ) |
|
φ1r(τ) = √ 11(0) rr(0) |
(2.23) |
2.2 Physical Factors of Sound Fields |
21 |
where ll(0) and rr(0) are the ACFs at τ = 0 for the left and right ears, respectively, or the sound energies arriving at both ears, and τ the interaural time delay possibly within plus and minus 1 ms. Also, from the denominator of Equation (2.23), we obtain the binaural listening level (LL) such that,
LL = 10log[ (0)/(0)reference] |
(2.24) |
where (0) = [ ll(0) rr(0)]1/2, which is the geometrical mean of the sound energies arriving at the two ears, and (0)reference is the reference sound energy.
If discrete reflections arrive after the direct sound, then the normalized interaural crosscorrelation is expressed by,
|
|
|
N |
|
|
|
|
|
|
A2 (n) |
(τ ) |
||
1r(N)(τ ) = |
|
|
n=0 |
1r |
|
|
|
|
|
|
(2.25) |
||
|
N |
|
N |
|
||
|
n=0 |
A2 11(n) |
(0) |
A2 rr(n)(0) |
||
|
|
n=0 |
||||
where we put wn(t) = δ(t) for the sake of convenience, and (lrn)(τ ) is the interaural crosscorrelation of the n-th reflection, ll(τ)(n) and rr(τ)(n) are the respective sound energies arriving at the two ears from the n-th reflection. The denominator of Equation (2.25) corresponds to the geometric mean of the sound energies at the two ears.
The magnitude of the interaural crosscorrelation is defined by
IACC = |φ1r(τ)|max |
(2.26) |
for the possible maximum interaural time delay, say,
|τ| ≤ 1 ms
For several music motifs, the long-time IACF (2T = 35 s) was measured for each single reflected sound direction arriving at a dummy head (Table D.1, Ando, 1985). These data may be used for the calculation of the IACF by Equation (2.25).
For example, measured values of the IACF using music motifs A and B are shown in Fig. 2.7.
The interaural delay time, at which the IACC is defined as shown in Fig. 2.8, is the τIACC. Thus, both the IACC and τIACC may be obtained at the maximum value of IACF.
For a single source signal arriving from the horizontal angle ξ defined by τξ, the interaural time delay corresponds to τIACC. When it is observed τIACC = 0 in a room, then usually a frontal sound image and a well-balanced sound field are perceived (the preferred condition).
22 |
2 Temporal and Spatial Aspects of Sounds and Sound Fields |
Fig. 2.7 Interaural crosscorrelation function IACF for a single sound as a function of angle of incidence measured at the ear entrances of a dummy head. Left: music motif A (Gibbons). Right: music motif B (Arnold)
The width of the IACF, defined by the interval of delay time at a value of δ below the IACC, corresponding to the just-noticeable-difference (JND) of the IACC, is given by the WIACC (Fig. 2.8). Thus, the apparent source width (ASW) may be perceived as a directional range corresponding mainly with the WIACC. A welldefined directional impression corresponding to the interaural time delay τIACC is perceived when listening to a sound with a sharp peak in the IACF with a small value of WIACC. On the other hand, when listening to a sound field with a low value for the IACC <0.15, then a subjectively diffuse sound is perceived (Damaske and Ando, 1972).
