6.2 Pitch of Complex Tones and Multiband Noise |
93 |
is represented as a linear combination of hemisphere-specific factors, which can explain these differences. Temporal sensations SL and spatial sensations SR can thus be modeled in terms of the contributions of the factors that dominate in each hemisphere's neural response:
SL = fL(x1l) + fL(x2l) + ... + fL(xIl), l = 1, 2, ..., L
SR = fR(x1r) + fR(x2r) + ... + fR(xIr), r = 1, 2, ..., R    (6.5)

where L + R = J.
Individual differences in the weighting of these factors can also produce differences in sensation and preference. Even for such temporal and spatial sensations, there are substantial individual differences arising from the multiple physical factors expressed by Equations (6.1) and (6.2). Individual differences can be caused by differing sensitivities to the various factors, by unique responses to them, or by both. These differences of sensation and preference can be seen as characteristics of individual listeners who have distinct auditory and visual “personalities.”
Subjective responses that are related to the overall intensity of the evoked perceptual experience (e.g., preference or annoyance) can be expressed by both temporal and spatial factors, SL and SR, so that
S = SL + SR    (6.6)
Returning to the theory of subjective preference for sound fields described in Section 3.3, each of the scale values in Equation (6.6) may be given by SL = S2 + S3 and SR = S1 + S4. It is worth noting that in subjective preference judgments for sound fields, factors such as τ1 and φ1 extracted from the ACF, and WIACC extracted from the IACF, exert only a minor influence. In the following sections, we discuss temporal sensations according to the guideline given by Equation (6.6), restricted to the significant temporal factors.
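To make Equation (6.6) concrete, the following sketch combines hemispheric scale values in Python. It assumes the 3/2-power form of the scale values from the preference theory of Section 3.3; the weighting coefficients and factor deviations used here are illustrative placeholders rather than measured values.

```python
# Sketch of Eq. (6.6): overall subjective response S as the sum of
# left-hemisphere (temporal) and right-hemisphere (spatial) scale values.
# The 3/2-power form follows the preference theory of Section 3.3; the
# alpha weights and factor deviations below are illustrative placeholders.

def scale_value(alpha, x):
    """Scale value of one orthogonal factor; negative values mean a
    loss of preference relative to the optimal condition."""
    return -alpha * abs(x) ** 1.5

# hypothetical normalized deviations of the four factors from their
# preferred values (not measured data)
S1 = scale_value(0.07, 0.2)   # listening level     -> spatial (right)
S2 = scale_value(1.42, 0.5)   # initial delay Dt1   -> temporal (left)
S3 = scale_value(0.45, 0.3)   # reverberation Tsub  -> temporal (left)
S4 = scale_value(1.45, 0.4)   # IACC                -> spatial (right)

SL = S2 + S3   # temporal contribution
SR = S1 + S4   # spatial contribution
S = SL + SR    # Eq. (6.6)
print(S)
```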
6.2 Pitch of Complex Tones and Multiband Noise
Pitches heard at the fundamental frequencies of harmonic complex tones have relatively straightforward correlates in the patterns of major peaks in their autocorrelation functions (ACFs). The pitch period corresponds to the time delay (τ1) of the first major peak. The pitch of multiband “complex noise” is likewise described by the value of τ1, and its strength is related to the value of φ1. The autocorrelation model for pitch sensation holds for fundamental frequencies below 4-5 kHz and for missing fundamentals below 1200 Hz.
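The peak-picking rule above is simple to state computationally. The following sketch, assuming an idealized stimulus (equal-amplitude harmonics 3-7 of a 200-Hz fundamental sampled at 48 kHz), computes the normalized ACF and reads off τ1 and φ1; the first major peak falls at τ1 = 5 ms, predicting a 200-Hz pitch even though no 200-Hz component is present.

```python
import numpy as np

# NACF of a missing-fundamental complex: harmonics 3-7 of 200 Hz.
fs = 48000                        # 5 ms is exactly 240 samples at 48 kHz
t = np.arange(fs // 2) / fs       # 0.5 s of signal
x = sum(np.cos(2 * np.pi * 200 * n * t) for n in range(3, 8))

# one-sided autocorrelation via FFT, zero-padded to avoid circular wrap
X = np.fft.rfft(x, n=2 * len(x))
acf = np.fft.irfft(np.abs(X) ** 2)[: len(x)]
nacf = acf / acf[0]               # normalize so that phi(0) = 1

# first major peak: maximum of the NACF in the 0.5-20 ms delay range
lo, hi = int(0.0005 * fs), int(0.020 * fs)
k = lo + int(np.argmax(nacf[lo:hi]))
tau1 = k / fs                     # pitch period (expected: 5 ms)
phi1 = float(nacf[k])             # pitch strength
print(tau1, 1.0 / tau1, phi1)     # pitch predicted as 1/tau1, ~200 Hz
```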
6.2.1 Perception of the Low Pitch of Complex Tones
Most of the sounds in tonal music that constitute the notes of melodies and harmonies are harmonic complex tones rather than pure tones. A harmonic complex tone consists of a series of partials whose frequencies (f1, f2, f3, ..., fm) are integer multiples (n = 1, 2, 3, ..., m) of its fundamental frequency (F0). Such harmonic complexes produce the strongest pitches at their fundamentals, so long as these periodicities lie in the existence region of musical tonality (roughly 30–5000 Hz). Other, weaker pitches can also be heard that correspond to individual partials, especially the first five (harmonic number n < 6). What is interesting is that harmonic complexes having no energy at the fundamental frequency in their power spectra (i.e., complexes with only “upper” partials) can still produce a strong “low” pitch at the fundamental itself. Thus, for complex tones with a “missing fundamental,” strong pitches are heard that correspond to no individual frequency component, and this raises deep questions about whether patterns of pitch perception are consistent with frequency-domain representations. In order to save the notion of the auditory system as a general Fourier processor, it becomes necessary to postulate a complicated central harmonic analyzer.
As a result of these difficulties, some auditory theorists (Seebeck, 1844; Wever, 1949; Licklider, 1951; Rose, 1980) have instead sought temporal explanations for pitch, pointing to the elegance with which time-domain representations cope with the phenomenon of the missing fundamental. In the ACF, the positions of the major peaks, which reflect the fundamental, are unchanged when the fundamental component is removed. Temporal theories have the advantage of explaining the pitch perception of both low-frequency pure tones and complex tones in terms of the same central representations and mechanisms. They account for the pitch phenomena most important for music and speech (i.e., for periodicities between 30 and 4000 Hz). These explanations notwithstanding, it is also clear that temporal representations cannot account for high-frequency hearing and the (atonal) pitches evoked by pure tones with frequencies above 5000 Hz. Moreover, most auditory centers throughout the pathway have spatially ordered frequency maps that mimic the rough tonotopic organization of the cochlea.
For these reasons, many auditory theorists have postulated that hearing is based on dual frequency- and time-domain auditory representations. Maps based on cochlear “place” have been thought to cover the frequency range of pure-tone hearing and cochlear resonances, whereas the temporal representation has been thought to cover the range of periodicities available in neuronal firing patterns (roughly up to 4–5 kHz).
The first autocorrelation model developed to account for the pitch of the missing fundamental phenomenon was therefore originally formulated as a “duplex” model (Licklider, 1951). Licklider’s time-delay neural network architecture was similar in many respects to the Jeffress (1948) model of binaural crosscorrelation that had been proposed 3 years earlier. Licklider used a network of delay lines and coincidence counters arranged along the axes of frequency and delay to compute both a central spectrum and a central global temporal autocorrelation representation. Licklider’s later “triplex” model (1959) added a binaural crosscorrelation stage to the duplex model. In a similar vein, Cherry and Sayers (1956) combined autocorrelation and crosscorrelation operations to deal with issues related to aural fusion, sound separation, and directional hearing.
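Licklider's scheme, frequency analysis followed by per-channel temporal analysis, can be caricatured numerically. The toy model below is an illustration, not Licklider's neural implementation: it splits a missing-fundamental complex into brick-wall frequency channels, half-wave rectifies each channel as a crude stand-in for hair-cell transduction, autocorrelates each channel, and sums across channels. The summary function peaks at the 5-ms period of the absent 200-Hz fundamental.

```python
import numpy as np

# Toy duplex sketch: frequency channels, half-wave rectification, and
# per-channel autocorrelation summed into one "summary" function.
fs = 48000
t = np.arange(3 * fs // 10) / fs   # 0.3 s of signal
x = sum(np.cos(2 * np.pi * 200 * n * t) for n in range(3, 8))  # no 200 Hz

def bandpass(sig, f_lo, f_hi):
    """Brick-wall band-pass via FFT, a crude stand-in for cochlear filtering."""
    spec = np.fft.rfft(sig)
    freqs = np.fft.rfftfreq(len(sig), 1.0 / fs)
    spec[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spec, n=len(sig))

def acf(sig):
    spec = np.fft.rfft(sig, n=2 * len(sig))
    return np.fft.irfft(np.abs(spec) ** 2)[: len(sig)]

summary = np.zeros(len(x))
for fc in (600, 800, 1000, 1200, 1400):
    chan = np.maximum(bandpass(x, fc - 100, fc + 100), 0.0)  # rectify
    summary += acf(chan)
summary /= summary[0]

lo, hi = int(0.0005 * fs), int(0.020 * fs)
k = lo + int(np.argmax(summary[lo:hi]))
print(k / fs)   # summary peak near the 5 ms period of the absent fundamental
```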
After a series of turns in the evolution of pitch theory (for a historical review, see de Boer, 1976), temporal models were neglected in favor of spectral pattern approaches. In the wake of the difficulties with Schouten’s temporal theory, spectral pattern recognition models were proposed to explain the strong low pitches produced by low, perceptually resolved harmonics (Goldstein, 1973; Wightman, 1973a,b; Terhardt, 1974). Two mechanisms were assumed: a spectral pattern mechanism for the strong pitches of perceptually resolved low harmonics, and a temporal mechanism for the weak pitches of perceptually unresolved high harmonics. Because the best models for low-frequency pure-tone pitch discrimination use interspike interval information, some theorists (Goldstein, 1973) left open the possibility that central representations of frequency might be based on interspike interval information in early auditory stations. Explicit temporal representations were thus marginalized to pitches produced by unresolved harmonics, phenomena that are largely irrelevant for pitch in music and speech.
Beginning in the 1980s, temporal models for pitch that were based on first-order interspike intervals (times between successive spikes produced by a given neuron) in the auditory nerve were proposed (Moore, 2003; van Noorden, 1982). In these models, interspike interval information was pooled together from all regions of the auditory nerve to form a temporal population code for frequency and periodicity. By the end of the decade, temporal autocorrelation models for pitch were revived and tested using computer simulations of the cochlea and auditory nerve (Meddis and Hewitt, 1991a,b). These autocorrelation models are based on all-order interspike intervals (times between all spikes produced by a neuron, consecutive and nonconsecutive) rather than first-order intervals. Soon after, neurophysiological studies of temporal discharge patterns in the cat auditory nerve (Cariani and Delgutte, 1996a,b; Cariani, 1999, 2001) were conducted to test the temporal models. Taken together, the computer simulations and neurophysiological studies showed that the temporal autocorrelation models based on interspike interval distributions could predict a very wide range of pitch phenomena: pitch of the missing fundamental, pitch equivalence between pure and complex tones, level and phase invariance, pitch shift of inharmonic complex tones, pitch dominance, octave similarity, and the nonspectral pitch of amplitude-modulated noise.
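The distinction between first-order and all-order intervals is easy to state in code: all-order intervals are the time differences between every pair of spikes, so their histogram is in effect an autocorrelation of the spike train. The sketch below uses a synthetic spike train, phase-locked to a 5-ms period with jitter and random deletions, purely as an illustration, not physiological data.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy spike train phase-locked to a 5 ms pitch period: spikes near integer
# multiples of the period, with 0.2 ms jitter and random spike deletions
period = 0.005
times = period * np.arange(200) + rng.normal(0.0, 0.0002, 200)
spikes = np.sort(times[rng.random(200) < 0.6])  # keep each spike with p = 0.6

# all-order intervals: positive differences between every pair of spikes
diffs = spikes[None, :] - spikes[:, None]
all_order = diffs[diffs > 0]

# interval histogram (0.1 ms bins, 0-8 ms): the mode sits at the pitch period
hist, edges = np.histogram(all_order, bins=80, range=(0.0, 0.008))
peak_interval = edges[np.argmax(hist)] + 0.00005  # bin center
print(peak_interval)   # close to 0.005 s
```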
Analogous phenomena have also been observed for nonperiodic, inharmonic complex tones as well as for nonstationary sounds (noises). It is important to note that more advanced temporal models go well beyond autocorrelation operations on the stimulus itself to include cochlear filtering and neuronal dynamics. Another line of research on temporal models for pitch has focused on the role of cochlear filtering in shaping the temporal structure of the resulting signals. These studies (Yost et al., 1978; Yost, 1996a,b) used rippled noise stimuli to probe pitch strength, peripheral weighting, and the effects of the dominance region for pitch (Ritsma, 1967).
Time-domain cancellation models involving an array of delay lines and inhibitory gating neurons have also been proposed; these generally behave in a manner similar to models based on autocorrelation (de Cheveigné, 1998, 2004).
Here we propose a model for pitch that is based on a central autocorrelation representation (the ACF). The ACF model predicts the pitches not only of complex tones and rippled noise, but also of multiband complex noise with missing fundamentals. Pitch can be calculated from the delay τ1 associated with the first major ACF peak, and pitch strength corresponds to the amplitude φ1 of this peak. The main purpose of the experiments described below is to apply the ACF model to predict the pitch of a harmonic complex with a missing fundamental.
First, a pitch-matching test comparing the pitches of pure and complex tones was performed to reconfirm previous results (Sumioka and Ando, 1996). The test signals were complex tones consisting of harmonics 3–7 of a 200-Hz fundamental. All tone components had the same amplitude, as shown in Fig. 6.1. Two waveform conditions, (a) in-phase and (b) random-phase, were applied as shown in Fig. 6.2. The starting phases of all components of the in-phase stimuli were set at zero. The phases of the components of the random-phase stimuli were set randomly to avoid any periodic peaks in the real waveforms. As shown in Fig. 6.3, the normalized ACFs (NACFs) of these stimuli were calculated over the integration interval 2T = 0.8 s. Though the waveforms differ greatly from each other, as shown in Fig. 6.3, their NACFs are identical. The time delay at the first maximum peak of the NACF, τ1, equals 5 ms (200 Hz), corresponding to the fundamental frequency. Five 20- to 26-year-old musicians participated as subjects in the experiment. Test signals were produced from a loudspeaker in front of each subject in a semi-anechoic chamber. The SPL of each complex tone at the center position of the listener’s head was fixed at 74 dB, as determined from the ACF at zero delay, Φ(0). The distance between a subject and the loudspeaker was 0.8 m ± 1 cm.
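The phase manipulation used in this experiment can be reproduced numerically. The sketch below, with an arbitrary random seed standing in for the experiment's random phases, builds in-phase and random-phase versions of harmonics 3-7 of 200 Hz and confirms that their NACFs nearly coincide, both peaking at τ1 = 5 ms.

```python
import numpy as np

fs = 48000
t = np.arange(4 * fs // 5) / fs   # 2T = 0.8 s, as in the experiment
rng = np.random.default_rng(1)

def complex_tone(phases):
    """Harmonics 3-7 of 200 Hz with the given starting phases."""
    return sum(np.cos(2 * np.pi * 200 * n * t + p)
               for n, p in zip(range(3, 8), phases))

def nacf(sig):
    spec = np.fft.rfft(sig, n=2 * len(sig))
    r = np.fft.irfft(np.abs(spec) ** 2)[: len(sig)]
    return r / r[0]

in_phase = nacf(complex_tone([0.0] * 5))
random_phase = nacf(complex_tone(rng.uniform(0, 2 * np.pi, 5)))

# both NACFs peak at the same delay, despite very different waveforms
lo, hi = int(0.0005 * fs), int(0.020 * fs)
tau1_in = (lo + int(np.argmax(in_phase[lo:hi]))) / fs
tau1_rand = (lo + int(np.argmax(random_phase[lo:hi]))) / fs
print(tau1_in, tau1_rand)                        # both near 5 ms
print(np.max(np.abs(in_phase - random_phase)))   # small residual difference
```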
Pitch-matching results for the five subjects are shown in Fig. 6.4. The histograms show matching frequencies within each semitone (1/12-octave) band for in-phase and random-phase stimuli. The dominant pitch match was 200 Hz, a frequency absent from the spectra of both stimuli; this periodicity is not at all apparent in the waveform of the random-phase signal. However, it is readily apparent in the autocorrelation functions of the two stimuli, which are identical to each other (Fig. 6.3). For both in-phase and random-phase conditions, about 60% of the responses clustered within a semitone of the fundamental. There are no major differences in the distributions of pitch-matching data between the two conditions.
In more detail, the averages and standard deviations (SD) of the data obtained from each subject at frequencies near 200 Hz are listed in Table 6.1. Results obtained under the two conditions are very similar; in fact, the pitch strength remains invariant across both conditions. Thus, the pitch of complex tones can be predicted from the time delay τ1 at the first major peak of the NACF. This conclusion is in agreement with the findings of Yost (1996a), who demonstrated that the pitch of iterated rippled noise is determined by the first major ACF peak of the stimulus signal.
From Equation (6.6), pitch, as one of the temporal sensations, may be expressed by

S = SL = fL(τ1) ≈ 1/τ1 (Hz),    (6.7)

when φ1 = 1.
Fig. 6.1 Complex tone presented with pure-tone components of 600, 800, 1000, 1200, and 1400 Hz without the fundamental frequency of 200 Hz
Fig. 6.2 Waveforms of 200 Hz missing-fundamental complex tones consisting of in-phase components (top) and random-phase components (bottom)
Fig. 6.3 Normalized autocorrelation function (NACF) of the two complex tones with different phase components, τ1 = 5 ms (200 Hz)
Individual differences in pitch perception were also found. The results for each subject are shown in Fig. 6.5. Subjects B and D matched only around the fundamental frequency (200 Hz). About 20% of all responses clustered around 400 Hz, even though the NACF has a distinct dip at τ = 2.5 ms (Fig. 6.3). However, an octave shift with phase change (Lundeen and Small, 1984) was not observed in the results obtained from these subjects. Subjects A and E matched both at the fundamental frequency and at the frequency an octave higher. This octave confusion might be caused by the perceptual similarity of octave-related pitches. The time delay of the ACF for this pitch is
Fig. 6.4 Results of pitch-matching tests for the two complex tones, τ1 = 5 ms (five subjects)
Table 6.1 Mean and standard deviation (SD) of the pitch-matching test for each subject

| Subject | In-phase mean (Hz) | Random-phase mean (Hz) | In-phase SD (Hz) | Random-phase SD (Hz) |
|---------|--------------------|------------------------|------------------|----------------------|
| A       | 202.6              | 201.0                  | 1.89             | 2.44                 |
| B       | 199.1              | 198.3                  | 1.70             | 1.42                 |
| C       | 202.5              | 202.1                  | 1.18             | 1.76                 |
| D       | 203.7              | 201.7                  | 2.29             | 1.65                 |
| E       | 202.2              | 202.2                  | 1.87             | 2.07                 |
| Total   | 201.9              | 201.0                  | 2.43             | 2.38                 |
Fig. 6.5 Results of the pitch-matching tests for each of five subjects. (a–e)
2.5 ms, so this pitch cannot be predicted from the ACF because it falls at a dip in the ACF structure. None of the subjects matched at τ1 = 10 ms (100 Hz), an octave below the fundamental frequency, even though there is a peak at τ1 = 10 ms (Fig. 6.3). Subject C matched in three categories of center frequencies (200.0, 224.5, and 317.5 Hz). Subject C may have sought a harmonic relation because he is a musician who plays in the key of E-flat: two notes of the E-flat major triad, E-flat and G, correspond to the semitone bins with center frequencies of 317.5 and 200 Hz, respectively. Despite these categorical errors, Subject C’s pitch matches in the vicinity of 200 Hz (Table 6.1) were comparable in accuracy to those of the other subjects.
6.2.2 Pitch of Multiband “Complex Noise”
The purpose of this experiment using complex noise was to determine whether the amplitude φ1 of the first major autocorrelation peak governs the perceived strength of the pitch.
The experimental method was the same as in the experiment described in the previous section. Each component was a band-pass white noise with a cutoff slope of 1080 dB/octave, and the bandwidth of the bands was varied across conditions. The center frequencies of the band-noise components were 600, 800, 1000, 1200, and 1400 Hz. The complex signal consisting of band-pass noises with different center frequencies is called here “complex noise.” The bandwidths (Δf) of the four conditions were 40, 80, 120, and 160 Hz (Fig. 6.6). The waveforms (Fig. 6.7, left plots) have no obvious envelope periodicities. Measured results of the NACF for the four conditions are shown on the right of Fig. 6.7. The amplitude of the maximum peak (indicated by arrows in the figures) in the NACF increases with decreasing Δf. Four musicians from the first test and one new musician, 20 to 25 years old, participated as subjects in this experiment.
Fig. 6.6 Multiband complex noise containing five passbands with center frequencies: 600, 800, 1,000, 1,200, and 1,400 Hz. The fundamental frequency is centered on 200 Hz
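The bandwidth dependence of pitch strength can be previewed from the signal NACF. In the sketch below, idealized brick-wall bands stand in for the steep 1080 dB/octave filters, and the NACF amplitude at the 5-ms delay serves as the estimate of φ1; the estimate grows as Δf shrinks, in line with the measured NACFs of Fig. 6.7.

```python
import numpy as np

fs = 48000
n = fs // 2                       # 0.5 s of signal
freqs = np.fft.rfftfreq(n, 1.0 / fs)
rng = np.random.default_rng(2)

def complex_noise(bw):
    """Five random-phase noise bands (width bw, in Hz) centered on harmonics
    3-7 of 200 Hz; brick-wall bands stand in for the steep analog filters."""
    spec = np.zeros(len(freqs), dtype=complex)
    for fc in (600, 800, 1000, 1200, 1400):
        band = (freqs >= fc - bw / 2) & (freqs <= fc + bw / 2)
        spec[band] = np.exp(1j * rng.uniform(0, 2 * np.pi, band.sum()))
    return np.fft.irfft(spec, n=n)

def phi_at_5ms(sig):
    """NACF amplitude at the 5 ms delay, an estimate of pitch strength phi1."""
    spec = np.fft.rfft(sig, n=2 * len(sig))
    r = np.fft.irfft(np.abs(spec) ** 2)[: len(sig)]
    return r[int(0.005 * fs)] / r[0]

phi1 = [phi_at_5ms(complex_noise(bw)) for bw in (40, 80, 120, 160)]
print(phi1)   # decreases as bandwidth increases
```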
The probabilities of matching responses, counted in each 1/12-octave band, are shown in Fig. 6.8. All histograms show a strong tendency to perceive a pitch of 200 Hz for each stimulus. This agrees with the prediction based on the value of τ1. These results indicate that a stimulus with a narrow bandwidth gives a stronger pitch corresponding to 200 Hz than does a stimulus with a wide bandwidth. The standard deviation (SD) for the perceived pitches increased because the value