
Precedence-Effect with Cochlear Implant Simulation


ProDePo-method. Subjects were instructed to place the bar at a position corresponding to the perceived position of the sound on the interaural axis.

2.3 Extension of the Noise-Band Vocoder to Binaural Presentation

A 16-channel noise-band vocoder was used to simulate CI-processing. The precedence sounds were first filtered with HRTFs. In the vocoder, the HRTF-processed sound was band-pass filtered into 16 logarithmically spaced channels between 300 Hz and 8 kHz. The channel envelopes were computed by rectification and low-pass filtering at 200 Hz. The envelopes were computed independently for both ears and applied to noise bands (the carrier, or synthesis noise) of varying interaural correlation. The interaural correlation of the carrier noise is an important additional parameter of the binaural CI-simulation since it determines the compactness of the auditory image independently of the applied envelope. In a way, it represents the interaural correlation of the carrier pulses in CIs.
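As a rough illustration of two ingredients of the vocoder described above, the following Python sketch computes the band edges of 16 logarithmically spaced channels between 300 Hz and 8 kHz and generates a pair of carrier noises with a chosen interaural correlation. The function names, the mixing rule for the correlated noises, and the use of broadband Gaussian samples are assumptions made for illustration; the chapter does not state how its carriers were produced.

```python
import math
import random


def log_spaced_edges(f_low=300.0, f_high=8000.0, n_channels=16):
    """Return n_channels + 1 band edges spaced equally on a log-frequency axis."""
    ratio = (f_high / f_low) ** (1.0 / n_channels)
    return [f_low * ratio ** k for k in range(n_channels + 1)]


def correlated_noise_pair(n_samples, rho, seed=0):
    """Return (left, right) Gaussian noises whose expected correlation is rho.

    Assumed mixing rule: right = rho * left + sqrt(1 - rho^2) * independent.
    """
    rng = random.Random(seed)
    left = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    indep = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    w = math.sqrt(1.0 - rho * rho)
    right = [rho * a + w * b for a, b in zip(left, indep)]
    return left, right


def correlation(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


edges = log_spaced_edges()
left, right = correlated_noise_pair(20000, rho=0.5)
print([round(f) for f in edges[:3]], round(correlation(left, right), 2))
```

Setting `rho` to 0 gives the interaurally uncorrelated carrier and setting it to 1 gives identical carriers at both ears; in a full vocoder each noise would additionally be band-limited to its channel before the envelope is applied.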

Noise was chosen as the carrier since it evokes pitch corresponding to the region of excitation on the tonotopic axis with weak pitch strength. Temporal modulation of noise-bands can also evoke a temporal pitch sensation, similar to modulating pulse trains in electric hearing. Other carriers, like sinusoids, produce a clear pitch which does not represent perception of the pulse-train carriers used with current CIs. In addition, the harmonicity present in log-spaced sinusoids in the CI-simulation might interfere with perceptual across-channel grouping processes.

2.4 Experimental Paradigms

The precedence effect was investigated in a localization dominance experiment. The lead and lag sounds were played from ±30° with a probability of 0.5 that the lead would be on the left. The lead-lag delay was varied within 0–48 ms, with the upper limit depending on the stimulus. In one experimental session subjects were instructed to localize the leftmost of the sound images if they perceived more than a single image. Randomizing the side of lead and lag sounds on every trial thus meant that subjects responded to the lead on one half of the trials and the lag on the other half without having to determine which of the sounds came first. Pointing biases were reduced by localizing the rightmost sound in separate sessions. In other conditions subjects were asked to localize the most dominant or weakest image. Four experimental paradigms were studied:

1. Precedence was studied in the free field with two loudspeakers of the Simulated Open Field Environment placed at ±30°. The ProDePo-method was used.


B.U. Seeber and E. Hafter

2. Using identical methods, precedence was studied with virtual acoustics based on subjectively selected non-individual HRTFs (Sect. 2.1).

3. The precedence-stimuli of experiment 2 were processed through a binaural noise-band vocoder to simulate CI-processing (Sect. 2.3). Lateralization was measured with a line-dissection method.

4. A CI-simulation similar to experiment 3 was used, but channel envelopes were quantized in 1.5-ms steps before being applied to the carrier noise. This was done to reduce the impact of ITDs in the envelope.
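The temporal quantization in experiment 4 can be pictured with a small Python sketch. The chapter says only that channel envelopes were quantized in 1.5-ms steps; holding the mean of each 1.5-ms block, as done here, is an assumption, as are the function name and the toy sampling rate.

```python
def quantize_envelope(envelope, fs, step_s=0.0015):
    """Hold the mean envelope value over consecutive blocks of step_s seconds."""
    block = max(1, round(step_s * fs))  # block length in samples
    out = []
    for start in range(0, len(envelope), block):
        chunk = envelope[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out


# Toy example: at fs = 2000 Hz a 1.5-ms step spans 3 samples.
env = [0.0, 3.0, 6.0, 1.0, 1.0, 1.0, 5.0]
print(quantize_envelope(env, fs=2000))
```

Because every sample within a block takes the same value, envelope timing differences between the ears that are finer than the block length are no longer carried by the envelope shape.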

2.5 Subjects and Stimuli

Five normal-hearing subjects (<20 dB HL from 300 Hz to 10 kHz) participated in the study, but results are shown for only one subject (female, age 29 years). Three stimuli were used: (1) a burst of white noise (10 ms duration, 300 Hz to 10 kHz), (2) a low-pass noise (10 ms, cut-off at 770 Hz, but playback/vocoder high-pass at 300 Hz), and (3) the spoken CVC-word “shape” (female speaker). The level was roved in 2-dB steps within ±6 dB from a base level of 60 dB(A) (55 dB(A) for the CVC). For each sound, 10 trials each were collected for leads at −30° and +30°.
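A trial list with these properties could be assembled as in the following Python sketch. The delay values, the balanced-then-shuffled ordering, and all names are hypothetical; the chapter specifies only the ±30° positions, the 10 trials per lead side, and the level roving.

```python
import random


def build_trials(delays_ms, n_per_side=10, base_db=60.0, rove_step_db=2.0,
                 rove_range_db=6.0, seed=1):
    """Return a shuffled list of trials with roved level and balanced lead side."""
    rng = random.Random(seed)
    n_steps = int(rove_range_db / rove_step_db)
    offsets = [k * rove_step_db for k in range(-n_steps, n_steps + 1)]
    trials = []
    for delay in delays_ms:
        for lead_deg in (-30, 30):
            for _ in range(n_per_side):
                trials.append({
                    "delay_ms": delay,
                    "lead_deg": lead_deg,
                    "lag_deg": -lead_deg,
                    "level_db": base_db + rng.choice(offsets),
                })
    rng.shuffle(trials)  # the lead side then varies unpredictably from trial to trial
    return trials


# Hypothetical delay grid within the 0-48 ms range given in the text.
trials = build_trials([0, 0.5, 1, 2, 4, 8, 16, 32, 48])
print(len(trials), trials[0])
```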

3 Results and Discussion

Figures 1 and 2 compare precedence results for experiment 1 in the free-field and for the identical experiment 2 with virtual acoustics. Data are displayed for one subject for the CVC-word “shape”. Both experiments show similar results: summing localization for zero delay between lead and lag, the well known localization dominance of the lead for short delays, and a split into two perceived images at the lead and lag locations for larger delays (Blauert 1997).

When subjects were instructed to point to the dominant image, they always pointed to the lead location for all delays, whereas the instruction to point to the weakest image corresponded to the lag image for delays larger than the echo-threshold. Although not observable in Figs. 1 and 2, slightly higher variance and some localization bias are visible for most subjects with non-individual HRTFs in experiment 2 compared to the free-field presentation in experiment 1. Another difference seems to be a slight decrease in echo-threshold with virtual acoustics, i.e. two auditory objects were perceived instead of one for slightly shorter delay times. The reasons for the reduction of echo-thresholds are not known. One speculative cause might be a misrepresentation of localization cues caused by the use of non-individual HRTFs relative to learned individual cues. Another cause might be that the visual scale is used in a different way for pointing to sounds that are not well externalized with HRTFs. The results for the two other stimuli (wide-band and low-pass noise bursts) show a similar pattern of summing localization and precedence as that found for the word, but echo-thresholds were shorter.

[Figure 1 plot: localized direction in degrees (−40 to 40) versus delay in ms (0 to 50); legend: right / lead, left / lag, dominant]

Fig. 1 Results of experiment 1 on precedence in the free-field. Data from one subject are presented for the CVC-word “shape”. Scattered localization response data and medians are shown as a function of the lead-lag delay time in the precedence experiment. In different sessions the subject was instructed to respond to either the rightmost (+, here: leading), the leftmost (◊, lagging), or the dominant (*) sound image if two images were heard. In the experiment the lead was played randomly from ±30°, but for clarity in the picture the lead is plotted at +30° and the lag at −30°. Data plotted at the lead (+) were combined from data for pointing to the rightmost image if the lead was on the right and from side-inverted data for pointing to the leftmost image if the lead was on the left. Data for the lag (◊) were combined in a similar way.

[Figure 2 plot: localized direction in degrees (−40 to 40) versus delay in ms (0 to 50); legend: right / lead, left / lag, dominant, weakest]

Fig. 2 Results of experiment 2 on precedence with virtual acoustics based on selected non-individual HRTFs. Data are presented from one subject for the word “shape”. Scattered localization response data and medians are shown as a function of the lead-lag delay time in the precedence experiment. In different sessions the subject was instructed to respond to either the rightmost (+, here: leading), the leftmost (◊, lagging), the dominant (*), or the weakest (■) sound image if two images were heard. Results from pointing to either the leftmost or rightmost image were combined and plotted as in Fig. 1.
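The pooling of responses described in the caption of Fig. 1 can be sketched in Python as follows. The data layout (pairs of lead position and response), the function name, and the example responses are hypothetical and serve only to make the side inversion explicit.

```python
from statistics import median


def combine_lead_responses(rightmost_trials, leftmost_trials):
    """Pool responses to the lead image, plotted as if the lead were at +30 deg.

    Each trial is a (lead_deg, response_deg) pair.
    """
    # Pointing to the rightmost image addresses the lead when the lead is on the right.
    pooled = [resp for lead, resp in rightmost_trials if lead > 0]
    # Pointing to the leftmost image addresses the lead when the lead is on the left;
    # these responses are side-inverted before pooling.
    pooled += [-resp for lead, resp in leftmost_trials if lead < 0]
    return pooled


# Hypothetical responses in degrees, for illustration only.
rightmost = [(30, 28.0), (-30, -5.0), (30, 31.0)]
leftmost = [(-30, -27.0), (30, 4.0), (-30, -33.0)]
lead_data = combine_lead_responses(rightmost, leftmost)
print(sorted(lead_data), median(lead_data))
```

Responses to the lag would be pooled in the mirrored way, taking the remaining trials of each session and inverting those in which the lag was on the right.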

Experiment 3 investigated effects of CI-simulation on precedence stimuli. Figure 3 shows lateralization results for a short wide-band noise burst. For short delays, summing localization provides a single image lateralized between the ears. Unlike the unprocessed case, this image is still mostly centered at 0.5 ms delay and not lateralized towards the lead. For larger delays an image is heard at the lead and described as the dominant one. At 2 ms delay, two images start to appear, and lead and lag images are clearly separated at 4 ms delay. The lag image appears partly suppressed for delays between 0.5 and 4 ms. Despite the processing it is described as the weakest image. The suppression of the lag as well as the dominance of the lead suggest that limited precedence occurs. Since the simulation does not transmit ITDs in the noise carrier, precedence must be based solely on ILDs at high frequencies and ITDs in the envelope. These cues also seem to be sufficient to evoke summing localization at short delays. The fact that the single image in summing localization at short delays moves so slowly towards the lead with increasing delay time can be attributed to the low-pass filtering of the envelope in the CI-simulation. The low-pass filtering temporally smears the information at both ears, and interaural cues point to the center for an extended period of time.

[Figure 3 plot: lateralized position (−1 = left ear, 1 = right ear) versus delay in ms (0 to 15)]

Fig. 3 Results of experiment 3 for precedence with CI-simulation with interaurally uncorrelated carrier noise. The stimulus was a wide-band noise burst of 10 ms duration. Lateralization was measured with the ears depicted by ±1. Results from pointing to either the leftmost or rightmost sound image were combined and plotted as in Fig. 1, with the leading sound at the right. Figure 2 shows the legend.
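To get a feeling for the time scale of this smearing, the following Python sketch computes the time constant of a first-order low-pass filter with a 200-Hz cutoff. Treating the envelope filter as first-order is an assumption made only for this estimate; the chapter does not give the filter order.

```python
import math


def one_pole_time_constant(cutoff_hz):
    """Time constant (s) of a first-order low-pass with the given -3 dB cutoff."""
    return 1.0 / (2.0 * math.pi * cutoff_hz)


tau = one_pole_time_constant(200.0)
rise_10_90 = tau * math.log(9.0)  # 10%-90% rise time of a first-order step response
print(f"tau = {tau * 1e3:.2f} ms, 10-90% rise = {rise_10_90 * 1e3:.2f} ms")
```

Under this assumption the time constant is roughly 0.8 ms, which is on the order of the 0.5-ms lead-lag delay at which the image was still mostly centered.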

Figure 4 shows results of experiment 3 with a processed low-pass noise burst. The results are in general similar to those from the wide-band noise burst with two main differences: (1) Summing localization seems not to occur at zero delay between lead and lag; instead two images are reported. (2) For a delay of 0.5 ms “anomalous localization” is seen in a combined image lateralized towards the lag (Gaskell 1983). For the low-pass noise, the simulation confines binaural information to ITDs in the envelope. The envelope-ITDs seem to be sufficient to evoke two perceived images at longer delays, with a slightly more dominant image at the lead. The breakdown of summing localization at zero delay might be due to an inability of the auditory system to integrate two independent low-pass noises at both ears into a single image, despite the common envelope modulation of the signal. It seems that the envelope modulation inherent in the narrow-band noise carriers suggests two images more strongly than the additional modulation caused by the signal suggests one.

“Anomalous localization” towards the lag has been reported to be based on interaural level or phase cues (Gaskell 1983; Tollin and Henning 1999). Because the carrier noise is interaurally uncorrelated, we assume that the effect is here based on ILDs. The strong occurrence of “anomalous localization” is surprising, since the noise covers several frequency bands (300–770 Hz) and correct ITDs should still be present in the envelope.

 

[Figure 4 plot: lateralized position (−1 = left ear, 1 = right ear) versus delay in ms (0 to 30)]

Fig. 4 Results of experiment 3 for precedence with CI-simulation with interaurally uncorrelated carrier noise, but for a 770 Hz low-pass noise burst of 10 ms duration. Lateralization was measured with the ears depicted by ±1. Results from pointing to either the left or rightmost image were combined and plotted similar to Fig. 1 with the leading sound at the right/top. The legend is given in Fig. 2


Precedence for the CVC “shape” cannot be seen for the selected subject at any delay or for any correlation of the carrier noise (Figs. 5 and 6). It appears that with an uncorrelated carrier (Fig. 5) the lag image is always audible, even at very short delay times. Even though there is no apparent precedence, the lead is reported to be the dominant and the lag the weaker image. Other subjects show more responses at the lead, but again many responses are also present at the lag. The correlated carrier in Fig. 6 centralizes and fuses lead and lag images for short delays. However, two images are already heard at 5 ms delay, which is a far smaller echo-threshold than in the unprocessed case (cf. Fig. 1). Thus it seems that the decorrelation from the envelope modulation by the signal is sufficient to suggest two images despite the fusion effects of the carrier.

The reasons for this breakdown of precedence are not clear at present. Apparently, the breakdown occurs mostly for ongoing sounds which suggests a change in auditory scene analysis. Two hypotheses can be stated:

1. The incorrect ITDs from the carrier noise at low frequencies and the natural ILDs at high frequencies point to different locations and thus suggest two images. However, for isolated sound sources, across-channel grouping is functioning and CI-listeners hear a single image (Seeber et al. 2004).

2. The missing pitch information prevents across-channel grouping, which leads to a split into two images. Pitch and harmonicity information serve as the strongest cues to combine auditory objects, but they are not well represented either in CI-listeners or in the simulation (Culling and Darwin 1993).

 

[Figure 5 plot: lateralized position (−1 = left ear, 1 = right ear) versus delay in ms (0 to 50)]

Fig. 5 Results for precedence with CI-simulation with interaurally uncorrelated carrier noise (experiment 3), but for the word “shape”. Presentation as in Fig. 3

[Figure 6 plot: lateralized position (−1 = left ear, 1 = right ear) versus delay in ms (0 to 50); legend: right / lead, left / lag, dominant, weakest]

Fig. 6 Results of experiment 3 for precedence with CI-simulation for the word “shape”, but with correlated carrier noise. Presentation as in Fig. 3

In Experiment 4 the impact of envelope-ITDs on precedence was assessed by temporally quantizing the envelope. No changes in localization occurred compared to experiment 3 which suggests a restricted influence of envelope ITDs. Thus, precedence seems to be predominantly based on ILD cues or coarse onset cues.

4 Conclusions

The extension of CI-simulation to binaural stimuli provides an interesting new way to study the relative importance of monaural and binaural cues in precedence. Despite obvious limitations of the simple CI-simulation for the prediction of CI-patient performance, the simulation results are congruent with results on precedence from some CI-listeners. Because the simulation results show precedence for short sounds, but strongly reduced precedence for longer sounds, we assume that the test reveals limitations of the auditory system to combine multiple cues to form auditory objects from the view of auditory scene analysis rather than limitations purely in the precedence mechanism. If this assumption proves correct the study of object separation in a precedence setting might be meaningful for concurrent speech segregation.

Acknowledgements. This work was supported by NIH RO1 DCD 00087 and by NOHR grant 018750 (patient studies).


References

Blauert J (1997) Spatial hearing. MIT Press, Cambridge, USA

Culling JF, Darwin CJ (1993) Perceptual separation of simultaneous vowels: within and across-formant grouping. J Acoust Soc Am 93:3454–3467

Gaskell H (1983) The precedence effect. Hear Res 12:277–303

Seeber B (2002) A new method for localization studies. Acta Acust Acust 88:446–450

Seeber BU, Fastl H (2003) Subjective selection of non-individual head-related transfer functions. In: Brazil E, Shinn-Cunningham B (eds) Proc 9th Int Conf on Aud Display. Boston University Publications Prod Dept, Boston, USA, pp 259–262

Seeber B, Fastl H (2004) Localization cues with bilateral cochlear implants investigated in virtual space – a case study. Proc Joint Congress CFA/DAGA’04, Strasbourg, France, 22.-25.03.2004, vol I. Dt Ges f Akustik, Oldenburg, pp 213–214

Seeber B, Hafter E (2006) Precedence effect with cochlear implants – simulation and results. In: Santi PA (ed) Abstracts of the 29th Annual Midwinter Meeting. Assoc Res Otolaryngol, p 150

Seeber B, Baumann U, Fastl H (2004) Localization ability with bimodal hearing aids and bilateral cochlear implants. J Acoust Soc Am 116:1698–1709

Tollin DJ, Henning GB (1999) Some aspects of the lateralization of echoed sound in man. II. The role of the stimulus spectrum. J Acoust Soc Am 105:838–849

52 Enhanced Processing of Interaural Temporal Disparities at High-Frequencies: Beyond Transposed Stimuli

LESLIE R. BERNSTEIN AND CONSTANTINE TRAHIOTIS

1 Introduction

At the previous two ISH meetings and in subsequent publications we reported that the processing of interaural temporal disparities (ITDs) within high-frequency auditory channels is enhanced by the use of “transposed” stimuli (Bernstein and Trahiotis 2002, 2003, 2004, 2005). Transposed stimuli are designed to provide envelope-based ITD-information within high-frequency channels similar to the ITD-information provided by the waveform itself within low-frequency channels. The enhancement occurred both in terms of better resolution of ITDs and larger extents of ITD-based laterality, as compared to those measured with conventional high-frequency stimuli. Transposed stimuli also exhibited resistance to the types of binaural interference found with conventional stimuli when remote, low-frequency stimulation occurred simultaneously with the high-frequency stimulation conveying the ITD.

This presentation reports initial attempts to discern which particular aspects of the envelopes of high-frequency waveforms are sufficient to yield enhanced ITD-processing. We report data collected using what we refer to as “raised-sine” stimuli, which were recently described and employed by John et al. (2002) in research concerning auditory evoked potentials. Their method permits one to control the temporal characteristics of the envelopes of high-frequency waveforms by independently varying modulation frequency, modulation depth, and “dead-time/relative peakedness,” while also suitably restricting the spectral content of the stimulus. The patterning of the behavioral data collected with such stimuli reveals graded amounts of enhancement of ITD-processing as the characteristics of the envelope are changed, in a graded manner, from those characteristic of conventional stimuli toward those characteristic of transposed stimuli. This general outcome was observed in ITD-discrimination and ITD-based lateralization experiments.

Departments of Neuroscience and Surgery (Otolaryngology), University of Connecticut Health Center, Farmington, Connecticut, USA, Les@neuron.uchc.edu, Tino@neuron.uchc.edu

Hearing – From Sensory Processing to Perception

B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey (Eds.) © Springer-Verlag Berlin Heidelberg 2007


2 “Raised-Sine” Stimuli

The method of John et al. (2002) entails raising a DC-shifted modulator to an exponent greater than or equal to 1.0 prior to multiplication with a carrier. For the case of sinusoidal modulation, the equation used to generate such stimuli is

$$y(t) = \sin(2\pi f_c t)\,\Bigl(2m\Bigl(\Bigl(\frac{1 + \sin(2\pi f_m t)}{2}\Bigr)^{n} - 0.5\Bigr) + 1\Bigr)$$

where $f_c$ is the frequency of the carrier, $f_m$ is the frequency of the modulator, $m$ is the modulation index, and $n$ is the exponent to which the DC-shifted modulator is raised.
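The equation above translates directly into code. The following Python sketch generates a raised-sine stimulus sample by sample; the sampling rate of 48 kHz and the function names are assumptions, while the default carrier frequency, modulation frequency, and duration are the values used in this chapter.

```python
import math


def raised_sine_envelope(t, fm, m=1.0, n=1.0):
    """Envelope 2m(((1 + sin(2*pi*fm*t)) / 2)**n - 0.5) + 1 from the equation above."""
    base = (1.0 + math.sin(2.0 * math.pi * fm * t)) / 2.0
    return 2.0 * m * (base ** n - 0.5) + 1.0


def raised_sine(fc=4000.0, fm=128.0, m=1.0, n=1.0, dur=0.3, fs=48000):
    """Return samples of y(t) = sin(2*pi*fc*t) * envelope(t)."""
    n_samples = int(round(dur * fs))
    return [math.sin(2.0 * math.pi * fc * k / fs)
            * raised_sine_envelope(k / fs, fm, m, n)
            for k in range(n_samples)]


y = raised_sine(n=8.0)
print(len(y), round(max(abs(v) for v in y), 3))
```

With `n=1.0` the envelope reduces to 1 + m·sin(2π·fm·t), that is, a conventional sinusoidally amplitude-modulated (SAM) tone.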

The left side of Fig. 1 depicts time-waveforms for instances when a 128-Hz modulating tone was raised to the power 1, 2, 4, or 8 prior to multiplication with a 4-kHz carrier. As shown in the top row, an exponent of 1.0 yields a conventional SAM waveform. The bottom row of the figure depicts the time-waveform of a 128-Hz tone transposed to 4 kHz. Examination of the figure reveals that the dead-time/relative peakedness of the envelope increases directly with the value of the exponent to which the modulator is raised. The right side of the figure displays the long-term spectrum of each stimulus. Note that, for the raised-sine stimuli, the number of sidebands and their spectral extent increase directly with the value of the exponent. Still, for all of the stimuli depicted, the vast majority of the power (greater than 95%) would fall within the approximately 500-Hz-wide auditory filter centered at 4 kHz (Moore 1997).
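The growth of dead-time with the exponent can be checked numerically. The Python sketch below estimates, for full modulation depth, the fraction of one modulation cycle during which the raised-sine envelope stays below 10% of its peak. The 10% criterion is an arbitrary choice for illustration and is not a definition taken from the chapter.

```python
import math


def dead_time_fraction(n, threshold=0.1, steps=100000):
    """Fraction of a modulation cycle in which the envelope is below
    threshold times its peak, for full modulation depth (m = 1)."""
    below = 0
    for k in range(steps):
        base = (1.0 + math.sin(2.0 * math.pi * k / steps)) / 2.0
        envelope = 2.0 * base ** n      # equals 2m(base**n - 0.5) + 1 with m = 1
        if envelope < threshold * 2.0:  # peak envelope value is 2 when m = 1
            below += 1
    return below / steps


for n in (1, 2, 4, 8):
    print(n, round(dead_time_fraction(n), 3))
```

The fraction increases monotonically with the exponent, in line with the description of the time-waveforms above.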

3 Threshold ITDs

Threshold-ITDs were measured with a two-cue, two-alternative, forced choice adaptive task targeting 71% correct. The stimuli employed were all of those depicted in Fig. 1. Transposed stimuli were generated by multiplying rectified, low-pass filtered (2-kHz cutoff) low-frequency tones by high-frequency carriers (Bernstein and Trahiotis 2002). All stimuli were 300 ms long and were presented at an overall level of 70 dB SPL.
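An adaptive track that targets about 71% correct is commonly realized with a two-down, one-up rule, which converges where the probability of two consecutive correct responses equals 0.5. The chapter does not name the rule it used, so the following Python sketch is an assumption; the starting ITD, the fixed additive step size, and the class name are likewise hypothetical.

```python
import math


class TwoDownOneUp:
    """Minimal 2-down, 1-up track; converges where p(correct)**2 = 0.5."""

    def __init__(self, start_itd_us, step_us):
        self.itd_us = start_itd_us
        self.step_us = step_us
        self.streak = 0

    def update(self, correct):
        """Apply one trial outcome and return the ITD for the next trial."""
        if correct:
            self.streak += 1
            if self.streak == 2:      # two correct in a row: make the task harder
                self.itd_us = max(self.itd_us - self.step_us, 0.0)
                self.streak = 0
        else:                          # any error: make the task easier
            self.itd_us += self.step_us
            self.streak = 0
        return self.itd_us


target = math.sqrt(0.5)  # about 0.707, i.e. roughly 71% correct
track = TwoDownOneUp(start_itd_us=100.0, step_us=10.0)
history = [track.update(c) for c in (True, True, False, True, True, True)]
print(round(target, 3), history)
```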

Figure 2 displays threshold-ITDs obtained from one well-practiced listener whose results were typical of those measured with other listeners. Note that the threshold-ITDs obtained with the raised-sine stimuli were all smaller than those measured with the SAM tone, decreased with increases in the exponent, and, with an exponent of eight, approximated the threshold obtained with the 128-Hz tone transposed to 4 kHz. Because all of the stimuli had the same rates and depths of modulation, the differing thresholds might be attributable to the differences in the dead-time/relative peakedness of the envelopes, which increased with increases of the exponent.

It is also logically possible that the differences in threshold-ITDs stem from differences among the spectra of the stimuli, per se. In order to test this notion,