Auditory and Visual Sensations (Ando and Cariani, 2009)


In this investigation, another factor, Wφ(0), was not considered because only more recently, in 2007, was it identified as a factor related to timbre (Section 6.6). The factor Wφ(0) is related to WIACC, because both of them are determined by a signal’s frequency composition. Once the factor Wφ(0) is also taken into consideration, the present results may be explained more precisely.

9.2 Effects of Spatial Factors on Speech Reception

We are interested in the effect of sound fields on the interactions of sounds, and in particular in how reverberant environments degrade speech sounds. In these experiments, a loudspeaker located in front of the listener presented single syllables, while continuous white noise, acting as a disturbance, was produced from another loudspeaker located at different horizontal angles. Three temporal factors and the sound energy were extracted from the ACF of the speech signal, and three spatial factors were extracted from the IACF. The results show that two factors had significant effects on syllable identification: the effective duration, (τe)min, among the temporal factors extracted from the running ACF, and WIACC among the spatial factors extracted from the IACF.

In the previous section, we discussed how temporal factors extracted from the running ACF are related to speech intelligibility in sound fields with single echoes. Here, the auditory model was used to attempt to account for the identification of single syllables in noise disturbances from different directions (Ando and Yamasaki, unpublished). It is assumed that the specialization of the human cerebral hemispheres may relate to the highly independent contributions of spatial and temporal factors to speech identification. “Cocktail party effects” might well be explained by such specialization of the human brain, because speech is mainly processed in the left hemisphere, while spatial information is independently processed in the right hemisphere at the same time. Based on such a model, we have described temporal and spatial sensations in Chapters 6 and 7, respectively. According to the model shown in Fig. 5.1, three temporal factors associated with the left hemisphere, together with the sound energy, were extracted from the ACF of the sound signal arriving at one of the ear entrances. In addition, three spatial factors associated with the right hemisphere were extracted from the IACF of the sound signals arriving at the two ear entrances. The running ACF and the running IACF were analyzed with an integration interval of 2T = 30 ms and a running step of 10 ms.
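The running analysis described above can be illustrated with a minimal sketch. It computes a normalized running ACF with a 30-ms integration interval and a 10-ms running step. The function name `running_acf`, the sampling rate, the maximum lag, and the normalization by the geometric mean of the two segment energies are assumptions made for illustration; they are not taken from the text.

```python
import math

def running_acf(signal, fs, window_s=0.030, step_s=0.010, max_lag_s=0.010):
    """Return one normalized ACF (list of values per lag) for each running step.

    window_s is the integration interval 2T and step_s the running step.
    The normalization (geometric mean of segment energies) is an assumption.
    """
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    max_lag = int(round(max_lag_s * fs))
    frames = []
    start = 0
    # Each frame needs win samples plus max_lag samples for the delayed copy.
    while start + win + max_lag <= len(signal):
        seg = signal[start:start + win]
        energy0 = sum(x * x for x in seg)
        acf = []
        for lag in range(max_lag + 1):
            shifted = signal[start + lag:start + lag + win]
            cross = sum(a * b for a, b in zip(seg, shifted))
            energy_lag = sum(x * x for x in shifted)
            denom = math.sqrt(energy0 * energy_lag)
            acf.append(cross / denom if denom > 0 else 0.0)
        frames.append(acf)
        start += step
    return frames

# Illustrative input only: a 200-Hz tone, 0.1 s long, sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
frames = running_acf(tone, fs)
```

For a periodic input such as this tone, each frame peaks again at a lag of one period (40 samples here). Factors such as τ1 and φ1 would be read from the first such peak of each frame.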

For identification of the speech signals, psychological distances between characteristics of single syllables are calculated by Equation (9.2). The distance is a function of four factors extracted from the ACF, and these are mainly associated with neuronal responses from the left cerebral hemisphere. In addition, to find the effects of off-direction noise, three spatial factors are extracted from the IACF, which are associated with the right cerebral hemisphere (Fig. 5.1). The distances due to the spatial factors, DIACC, DτIACC, and DWIACC, are given by


9 Applications (II) – Speech Reception in Sound Fields

 

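$$
\begin{aligned}
D_{\mathrm{IACC}}(X,K) &= \sum_{i=1}^{I} \left| \mathrm{IACC}_{i,X}^{SF} - \mathrm{IACC}_{i,K}^{T} \right| / I \\
D_{\tau_{\mathrm{IACC}}}(X,K) &= \sum_{i=1}^{I} \left| (\tau_{\mathrm{IACC}})_{i,X}^{SF} - (\tau_{\mathrm{IACC}})_{i,K}^{T} \right| / I \\
D_{W_{\mathrm{IACC}}}(X,K) &= \sum_{i=1}^{I} \left| (W_{\mathrm{IACC}})_{i,X}^{SF} - (W_{\mathrm{IACC}})_{i,K}^{T} \right| / I
\end{aligned}
\tag{9.6}
$$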

In general, shorter distances between the template syllable and the syllables accompanied by noise signify higher intelligibility. According to multiple regression analysis, the non-identification (NI) rate, that is, the rate of syllables that were not matched with the template, has been directly calculated, so that

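$$
\begin{aligned}
NI(S_0, S_X) = S_L + S_R &= \left[ aD_{\tau_e} + bD_{\tau_1} + cD_{\phi_1} \right]_L \\
&\quad + \left[ dD_{\Phi(0)} + eD_{\mathrm{IACC}} + fD_{\tau_{\mathrm{IACC}}} + gD_{W_{\mathrm{IACC}}} \right]_R
\end{aligned}
\tag{9.7}
$$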

where SL = [aDτe + bDτ1 + cDφ1]L, SR = [dDΦ(0) + eDIACC + fDτIACC + gDWIACC]R, and Φ(0) is measured in dBA. The seven factors are classified into the left and right hemispheres by the model (Fig. 5.1). Note that the listening level, or Φ(0), is associated with the right hemisphere (Table 5.1). Weighting coefficients a through g in Equation (9.7) were determined by maximizing NI with experimental data.
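The frame-averaged distances that enter Equation (9.7), such as the spatial distances of Equation (9.6), can be sketched as follows. The sketch assumes that each distance is the mean absolute difference, over I analysis frames, between a factor measured for the test syllable in the sound field and the same factor for the template syllable. The function name `mean_abs_distance` and the sample IACC values are hypothetical and serve only as an illustration.

```python
def mean_abs_distance(test_frames, template_frames):
    """Mean absolute difference over I frames between test and template values.

    Assumes both sequences hold one factor value (e.g. IACC) per running
    frame and cover the same I frames.
    """
    if len(test_frames) != len(template_frames) or not test_frames:
        raise ValueError("sequences must be non-empty and of equal length")
    n_frames = len(test_frames)
    total = sum(abs(x - k) for x, k in zip(test_frames, template_frames))
    return total / n_frames

# Hypothetical per-frame IACC values (not measured data):
iacc_test = [0.62, 0.58, 0.55, 0.60]      # syllable with noise in the sound field
iacc_template = [0.90, 0.88, 0.91, 0.89]  # template syllable without noise
d_iacc = mean_abs_distance(iacc_test, iacc_template)
```

A test syllable whose per-frame values coincide with the template yields a distance of zero, and increasing mismatch yields a larger distance.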

Fourteen single syllables, /pa/, /pu/, /te/, /zo/, /bo/, /yo/, /mi/, /ne/, /kya/, /kyo/, /pya/, /gya/, /nya/, and /zya/, with 4-s intervals between syllables, were presented to each subject by the frontal loudspeaker (ξ = 0°; the distance to the center of the subject’s head was d = 70 cm ± 1 cm) in an anechoic chamber. The white noise used as a disturbance was continuously produced by one of the loudspeakers located at different horizontal angles: ξ = 30°, 60°, 90°, 120°, or 180° (d = 70 cm). The sound-pressure level, measured in terms of Φ(0), of both the speech signals and the continuous white noise was fixed at 65.0 dBA at the peak level. Ten subjects participated in the experiment and were asked to identify which syllable they heard.

For example, values of τe extracted from the running ACF for the signal /mi/ with and without the noise (ξ = 90°) are shown as a function of time in Fig. 9.6. The important initial halves of the speech signals, indicating Φ(0) < 0.5 as shown in Fig. 9.7, of both the template and the test syllables with the noise were used in the computation by Equations (9.6) and (9.7).

The non-identification (NI) rates for some single syllables as a function of the horizontal angle ξ of the noise disturbance are shown in Fig. 9.8. The NI rates of these syllables showed similar tendencies. When the noise arrived from 30°, the NI was at its maximum within the range of horizontal angles tested, and when the noise was presented from 120°, it was at its minimum. The same was true for the averaged NI rate, as shown in Fig. 9.9.


Fig. 9.6 Values of effective duration τe extracted from the running ACF for the frontal signal /mi/ only, and the /mi/ with the white noise from ξ = 90°

Fig. 9.7 Initial parts of a frontal single syllable analyzed for comparison, with and without the white noise from ξ = 90°

Fig. 9.8 Examples of the percentage of non-identification (NI) for single syllables as a function of the horizontal angle ξ of the white noise. At the horizontal angle ξ = 120°, the percentage of NI was at its minimum for the single syllables


Fig. 9.9 Percentage of non-identified syllables, averaged over all single syllables tested, obtained by the listening test for different angles of incidence ξ of the white noise used as a disturbance

Because the direct speech sound arrived at the listener from the frontal direction, the value of τIACC is invariant and always close to zero. Thus, this factor was eliminated from the analysis by Equation (9.7) (Table 9.3). The minima of the psychological distance were always found for the noise disturbance from 120°, so that the NIs were at their minima. On the other hand, when the noise disturbance arrived from 30°, the distance due to τe, among the six factors, was commonly at its maximum for all of the syllables.

Table 9.3 Psychological distance calculated due to each of six factors

Horizontal angle of noise ξ (°)   DΦ(0)   Dτe     Dτ1     Dφ1     DIACC   DWIACC
30                                0.064   0.420   0.164   0.442   0.248   0.052
60                                0.056   0.351   0.247   0.355   0.266   0.049
90                                0.063   0.348   0.162   0.401   0.292   0.049
120                               0.058   0.279   0.157   0.376   0.270   0.043
180                               0.074   0.383   0.171   0.494   0.247   0.071

The weighting coefficients in Equation (9.7) for the six factors are listed in Table 9.4. According to the weighting coefficients obtained here, the factors τe and WIACC contributed significantly to the NI. For each single syllable, the relationship between the values calculated by Equation (9.7) and the measured values is shown in Fig. 9.10. A clear linear relationship was obtained (r = 0.86, p < 0.01).

Table 9.4 Weighting coefficients determined

              Φ(0)    τe      τ1      φ1      IACC    WIACC
Coefficient   0.053   0.335   0.028   0.136   0.086   0.384
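As a rough illustration of how the weights combine with the distances, the sketch below applies the coefficients of Table 9.4 to the distances of Table 9.3 in the form of Equation (9.7), with the τIACC term omitted. It assumes that the tabulated distances can be inserted directly into the weighted sum. The resulting numbers are not claimed to reproduce the measured NI percentages, and the names `WEIGHTS`, `DISTANCES`, and `weighted_distance` are illustrative.

```python
# Factor order shared by both tables: Phi(0), tau_e, tau_1, phi_1, IACC, W_IACC
FACTORS = ("Phi0", "tau_e", "tau_1", "phi_1", "IACC", "W_IACC")

# Weighting coefficients from Table 9.4
WEIGHTS = {"Phi0": 0.053, "tau_e": 0.335, "tau_1": 0.028,
           "phi_1": 0.136, "IACC": 0.086, "W_IACC": 0.384}

# Distances from Table 9.3, keyed by horizontal angle of the noise (degrees)
DISTANCES = {
    30:  (0.064, 0.420, 0.164, 0.442, 0.248, 0.052),
    60:  (0.056, 0.351, 0.247, 0.355, 0.266, 0.049),
    90:  (0.063, 0.348, 0.162, 0.401, 0.292, 0.049),
    120: (0.058, 0.279, 0.157, 0.376, 0.270, 0.043),
    180: (0.074, 0.383, 0.171, 0.494, 0.247, 0.071),
}

def weighted_distance(angle):
    """Weighted sum of the six tabulated distances for one noise angle."""
    return sum(WEIGHTS[name] * d for name, d in zip(FACTORS, DISTANCES[angle]))

scores = {angle: weighted_distance(angle) for angle in DISTANCES}
best_angle = min(scores, key=scores.get)  # angle with the smallest weighted sum
```

Under these assumptions, the smallest weighted sum occurs for noise from 120°, which agrees with the angle at which the text reports the minimum NI.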