120 |
6 Temporal Sensations of the Sound Signal |
Fig. 6.22 Scale values of DS obtained by the PCT for three stimuli: a complex tone (F0 = 500 Hz) with 3000-Hz and 3500-Hz pure-tone components; a 500-Hz pure tone; and a 3000-Hz pure tone (•)
Here τ1 is extracted from the stimulus ACF. Figure 6.23 shows the normalized ACF (NACF) of the stimulus, in which τ1 corresponds to the period of the missing fundamental, i.e., the pitch that is heard for fundamental frequencies below roughly 1,200 Hz (Section 6.2.3).
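As a minimal sketch of how τ1 can be read from the NACF, the following computes the normalized ACF of the two-component complex tone of Fig. 6.23b (3000 Hz and 3500 Hz, F0 = 500 Hz) and picks the largest peak away from the origin. The sampling rate, segment length, lag search range, and the function name normalized_acf are assumptions chosen for illustration, not values taken from the study.

```python
import numpy as np

def normalized_acf(x, max_lag):
    """Normalized autocorrelation phi(tau) for lags 0..max_lag (in samples)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - max_lag              # same number of products at every lag
    acf = np.array([np.dot(x[:n], x[k:k + n]) for k in range(max_lag + 1)])
    return acf / acf[0]

fs = 48_000                           # assumed sampling rate (Hz)
t = np.arange(int(0.5 * fs)) / fs     # assumed 0.5-s analysis segment
x = np.cos(2 * np.pi * 3000 * t) + np.cos(2 * np.pi * 3500 * t)

max_lag = int(0.003 * fs)             # assumed search range: lags up to 3 ms
phi = normalized_acf(x, max_lag)

# tau_1 is taken as the delay of the largest NACF peak away from the origin.
min_lag = int(0.0005 * fs)            # assumed 0.5-ms guard to skip the zero-lag peak
k1 = min_lag + int(np.argmax(phi[min_lag:]))
tau_1_ms = 1000 * k1 / fs             # delay of the peak in milliseconds
phi_1 = phi[k1]                       # height of the peak

print(f"tau_1 = {tau_1_ms:.2f} ms, phi_1 = {phi_1:.3f}")
```

Neither component lies at 500 Hz, yet the peak falls at the 2-ms period of the missing fundamental, which is the periodicity the text associates with the perceived low pitch.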
Scale values of individual listeners were also compiled (see Section 9.1 in Ando, 1998). Goodness-of-fit results for the two-factor model of duration perception are listed in Table 6.2 for 10 subjects. These individual data confirmed the abovementioned results within the range of 1 standard deviation, except for subjects M.K. and K.A., whose values of d indicated poorer fits, reaching 22.2% and 19.4% (K ≥ 7), respectively.
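The d values in Table 6.2 follow from the formula in its footnote, d = 2K/[F(F − 1)] with F = 9. The short check below recomputes them from the tabulated K values. The function name poor_fit_percentage is only illustrative.

```python
def poor_fit_percentage(k_poor, n_stimuli):
    """d (%) = 100 * 2K / [F(F - 1)], following the footnote to Table 6.2."""
    return 100.0 * 2 * k_poor / (n_stimuli * (n_stimuli - 1))

F = 9  # number of stimuli used for the judgment (from the table footnote)

# Number of poor responses K per subject, as listed in Table 6.2.
poor_responses = {
    "M.K.": 8, "D.G.": 6, "S.K.": 6, "M.N.": 4, "K.A.": 7,
    "N.K.": 6, "D.B.": 4, "N.A.": 5, "M.A.": 5, "S.S.": 6,
}

d_values = {s: poor_fit_percentage(k, F) for s, k in poor_responses.items()}
for subject, d in d_values.items():
    print(f"{subject}: K = {poor_responses[subject]}, d = {d:.1f}%")
```

For K = 8 the formula gives 16/72, i.e., 22.2%, which is the value tabulated for subject M.K.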
The significant results of this study are summarized below.
1. Apparent stimulus duration DS depends primarily on the duration of the signal and secondarily on the signal periodicity τ1 (the period of the pure-tone frequency or of the complex-tone fundamental frequency).
2. The effect of τ1 extracted from the ACF on the scale value of DS is almost the same for the pure-tone (τ1 = 2 ms) and complex-tone (τ1 = 2 ms) stimuli. The apparent duration DS of the pure-tone stimulus with the higher pitch (τ1 = 0.33 ms = 1/3000 Hz) is significantly shorter than that of the pure-tone and complex-tone stimuli with the lower pitch (τ1 = 2 ms = 1/500 Hz).
3. Apparent duration DS can be readily expressed as a function of D and τ1 for both pure and complex tones.
Fig. 6.23 Demonstrations of the NACF analyzed for complex tones. (a) Complex tone with components of 3000 Hz and 3250 Hz (F0 = 250 Hz). (b) Complex tone with components of 3000 Hz and 3500 Hz (F0 = 500 Hz). (c) Complex tone with components of 3000 Hz and 4000 Hz (F0 = 1000 Hz)
Table 6.2 Results of tests of goodness of fit for 10 subjects. (For the method of goodness of fit, see Ando and Singh, 1996; Ando, 1998)
Subject | K¹ | d (%)²
M.K. | 8 | 22.2
D.G. | 6 | 16.7
S.K. | 6 | 16.7
M.N. | 4 | 11.1
K.A. | 7 | 19.4
N.K. | 6 | 16.7
D.B. | 4 | 11.1
N.A. | 5 | 13.9
M.A. | 5 | 13.9
S.S. | 6 | 16.7
¹ K is the number of poor responses.
² d = 2K/[F(F − 1)], where F is the number of stimuli used for the judgment. In this investigation, F = 9. Thus, if K = 8, then d = 22.2%.
6.6 Timbre of an Electric Guitar Sound with Distortion
Timbre is defined as an aspect of sound quality that is independent of loudness, pitch, and duration. It encompasses those perceived qualities of sound that distinguish two notes of equal pitch, loudness, and duration that are played on different musical instruments. Timbre is often described in terms of sound texture or coloration.
In this experiment, we investigated the differences in timbre among electric guitar notes that were processed using different distortion effects. We discuss the relationship between these timbral differences and a temporal factor extracted from the ACF, Wφ(0). As shown in Fig. 2.1, the factor Wφ(0) reflects the relative width of the ACF peak at its zero-lag origin. It is defined by the delay time τ at which the normalized ACF φ(τ) first declines to half its maximal value (i.e., to 0.5). It is worth noting that the factor Wφ(0) of the monaural autocorrelation function (ACF) is analogous to the factor WIACC of the interaural crosscorrelation function (IACF).
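The definition of Wφ(0) can be sketched numerically as follows. This is an illustration under stated assumptions: it follows the wording above (the first delay at which the normalized ACF reaches 0.5), uses linear interpolation between lag samples, and applies it to pure tones, for which the NACF is cos(2πfτ) and the crossing lies at 1/(6f). The function name w_phi0, the sampling rate, and the test frequencies are hypothetical choices.

```python
import numpy as np

def w_phi0(x, fs, max_lag_s=0.005):
    """Delay (s) at which the normalized ACF first declines to 0.5.

    Assumption: the crossing is located by linear interpolation between
    the last lag sample above 0.5 and the first lag sample at or below 0.5.
    """
    x = np.asarray(x, dtype=float)
    max_lag = int(max_lag_s * fs)
    n = len(x) - max_lag                      # equal summation length at every lag
    acf = np.array([np.dot(x[:n], x[k:k + n]) for k in range(max_lag + 1)])
    phi = acf / acf[0]
    below = np.nonzero(phi <= 0.5)[0]
    if below.size == 0:
        raise ValueError("NACF stays above 0.5 within the searched lags")
    k = int(below[0])                         # k >= 1 because phi[0] == 1
    frac = (phi[k - 1] - 0.5) / (phi[k - 1] - phi[k])
    return (k - 1 + frac) / fs

fs = 48_000                                   # assumed sampling rate (Hz)
t = np.arange(fs // 2) / fs                   # assumed 0.5-s segment
w_low = w_phi0(np.cos(2 * np.pi * 220 * t), fs)
w_high = w_phi0(np.cos(2 * np.pi * 880 * t), fs)

print(f"220 Hz: {1e3 * w_low:.3f} ms, 880 Hz: {1e3 * w_high:.3f} ms")
```

A signal with more high-frequency energy has a narrower zero-lag peak and hence a smaller Wφ(0), which is why this factor is a plausible candidate for describing the effect of distortion.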
An electric guitar with “distortion” is a primary instrument of pop and rock music. Previously, Marui and Martens (2005) investigated timbral differences using three types of nonlinear distortion processors with differing levels of Zwicker Sharpness (Zwicker and Fastl, 1999). In this study, we examined whether timbre can be described in terms of temporal factors extracted from the running ACF of the source signal. We wanted to determine whether listeners can distinguish notes that are played with different degrees of distortion despite their identical pitch, loudness, and duration.
6.6.1 Experiment 1 – Peak Clipping
The purpose of this experiment was to find the ACF correlate of distortion. The strength of distortion was varied by computer: a program peak clipped the music signal p(t) to keep it within a given cutoff amplitude range (±C), that is, below a corresponding cutoff sound pressure level (CL). The signal was hard-limited in amplitude such that, for |p(t)| ≤ C,

$$p(t) = p(t) \tag{6.13a}$$

and for |p(t)| > C,

$$p(t) = \begin{cases} +C, & p(t) \ge C \\ -C, & p(t) \le -C \end{cases} \tag{6.13b}$$

where C is the cutoff pressure amplitude, and its cutoff level CL is defined by

$$CL = 20 \log_{10} \frac{C}{|p(t)|_{\max}} \tag{6.14}$$

with |p(t)|max being the maximum amplitude of the signal.
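The peak clipping of Equations (6.13) and (6.14) can be sketched as below. This is a minimal illustration, not the program used in the study: the decaying 220-Hz sinusoid is only a stand-in for a recorded guitar note, and the function name peak_clip and the sampling rate are assumptions. The cutoff levels are the eight values used in the experiment (0 to −49 dB in 7-dB steps).

```python
import numpy as np

def peak_clip(p, cl_db):
    """Hard-limit p(t) to +/-C, where C follows from the cutoff level CL (dB).

    Inverting Eq. (6.14): C = |p(t)|max * 10**(CL / 20).
    Returns the clipped signal and the cutoff amplitude C.
    """
    p = np.asarray(p, dtype=float)
    c = np.max(np.abs(p)) * 10.0 ** (cl_db / 20.0)
    return np.clip(p, -c, c), c

fs = 48_000                                   # assumed sampling rate (Hz)
t = np.arange(fs) / fs
# Stand-in source signal (assumption): a decaying 220-Hz sinusoid.
p = np.sin(2 * np.pi * 220 * t) * np.exp(-3.0 * t)

cutoff_levels = np.arange(0, -50, -7)         # 0, -7, ..., -49 dB: eight levels
stimuli = {int(cl): peak_clip(p, cl)[0] for cl in cutoff_levels}

for cl, s in stimuli.items():
    print(f"CL = {cl:4d} dB -> peak amplitude {np.max(np.abs(s)):.4f}")
```

At CL = 0 dB the signal is unchanged. As CL decreases, a larger fraction of the waveform is flattened at ±C and the note approaches a square-like wave with stronger high harmonics.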
Table 6.3 Conditions of Experiments 1 and 2
Condition | Experiment 1 | Experiment 2
(1) Conditions fixed | |
Note (pitch) | A4 (220 Hz), by use of third string and second fret | A4 (220 Hz), by use of third string and second fret
Listening level in LAE (dB) | 80 | 70
Signal duration (s) | 4.0 | 1.5
(2) Conditions varied | |
CL (dB) by Equation (6.14) | Eight signals tested, changing the cutoff level from 0 to −49 dB (7-dB steps) | –
Distortion type | – | Three different types: VINT, CRUNCH, and HARD
Drive level | – | Three levels of distortion strength: 50, 70, and 90, set on the effector Type ME-30 (Boss, Roland, Hamamatsu, Japan)
The value of the cutoff level CL relative to the unclipped sound pressure level was varied from 0 to −49 dB in 7-dB steps, yielding a set of eight test stimuli. As indicated in Table 6.3, pitch, signal duration, and listening level were fixed. The subjects were 19 students (male and female, all 20 years of age). In each trial, subjects listened to three stimuli and judged timbral dissimilarity. The number of stimulus combinations in this experiment was 8C3 = 56 triads. The dissimilarity matrix was constructed according to the dissimilarity judgments. The value 2 was assigned to the most different pair, 1 to the neutral pair, and 0 to the most similar pair. After multidimensional scaling analysis, we obtained the scale value (SV). This value is different from the scale value obtained by the method of comparative judgment (PCT).
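The triad count and the scoring rule can be illustrated with a short sketch. It enumerates the 8C3 triads and accumulates the 2/1/0 scores into a mean dissimilarity matrix. The listener here is hypothetical: it ranks the three pairs of each triad by the distance between stimulus indices, which merely stands in for real judgments. The multidimensional scaling step is not shown.

```python
from itertools import combinations
from math import comb

import numpy as np

n_stimuli = 8
triads = list(combinations(range(n_stimuli), 3))
print(len(triads), comb(n_stimuli, 3))        # number of triads, 8C3

def judge(triad, position):
    """Hypothetical listener: scores the three pairs of a triad.

    Pairs are ranked by |position difference|; the most similar pair gets 0,
    the neutral pair 1, and the most different pair 2.
    """
    pairs = list(combinations(triad, 2))
    ranked = sorted(pairs, key=lambda ij: abs(position[ij[0]] - position[ij[1]]))
    return {ranked[0]: 0, ranked[1]: 1, ranked[2]: 2}

# Assumed 1-D perceptual positions: simply the stimulus index.
position = np.arange(n_stimuli, dtype=float)

total = np.zeros((n_stimuli, n_stimuli))
count = np.zeros((n_stimuli, n_stimuli), dtype=int)
for triad in triads:
    for (i, j), score in judge(triad, position).items():
        total[i, j] += score
        total[j, i] += score
        count[i, j] += 1
        count[j, i] += 1

pair_count = count[0, 1]                      # each pair occurs in n_stimuli - 2 triads
np.fill_diagonal(count, 1)                    # avoid 0/0 on the diagonal
dissimilarity = total / count                 # mean score per pair, between 0 and 2
```

Each pair of stimuli appears in six of the 56 triads, so every entry of the matrix is an average over six judgments per subject.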
We analyzed the contributions of several factors to the scale value SV, for example, the mean value of Wφ(0), the decay rate of SPL (dBA/s), and the mean value of φ1 (pitch strength). The most significant factor contributing to the SV was the mean value of Wφ(0). Because the mean value of Wφ(0) was correlated with the other factors, it is taken as the representative factor. The scale value of perceived timbral dissimilarity as a function of the mean value of Wφ(0) is shown in Fig. 6.24. The correlation between the SV and the mean value of Wφ(0) is 0.98 (p < 0.01).
Fig. 6.24 Results of regression analysis for SV and the mean value of Wφ(0) (Experiment 1)
