M. Larrosa-Navarro, D. de la Prida and A. Pedrero

Most of the research conducted on C80 in the last thirty years has focused on determining its just noticeable difference (JND) [8–12]. These studies use a wide variety of musical motifs, different methods of generating the sound fields, diverse participants and multiple methods of evaluation.

One of the first studies focusing on the determination of the JND for clarity was carried out by Cox et al. [8]. Their experiment was based on synthetic sound fields designed to have acoustic parameter values typically found in concert halls. The JND was measured for early lateral energy fraction (ELEF), inter-aural cross correlation (IACC), clarity index (C80) and centre time (Ts). The number of subjects ranged from 7 to 10 across the tests. Two musical motifs were used, chosen to differ in style and tempo. Ts was the parameter used in the experiment to assess changes in perceived clarity, and the results for C80 were derived from those obtained for Ts. The experiment showed that the JND was motif dependent, with the slow tempo motif having a difference limen twice as large as that of the other motif.

Bradley et al. [9] later conducted another study on the JND for clarity, this time focusing on speech clarity, C50. They also used synthetic sound fields, recreated in an anechoic chamber using eight loudspeakers. The C50 differences used for the evaluation of clarity in that article have served as the base case for subsequent research. A total of 10 subjects between 20 and 60 years of age participated in the listening test. The stimuli were speech recordings of a male speaker, and the results were translated to musical clarity, C80.

Ten years later, Ahearn et al. [10] performed a preliminary study on the JND of the C80, which led to a three-part investigation [11]. The test carried out in [10], which is also the first test considered in [11], studied two different synthetic sound fields with initial C80 values of 3 dB and 1 dB. These values were modified to obtain a maximum positive difference of 3 dB from the base case scenarios. The listening test assessed the perception of clarity using three musical motifs: two played by an orchestra and one solo passage. All three had fairly quick-moving notes, as the aim was to look for differences in musical clarity. The test was carried out by 51 participants, and the results proved to be significant only when the 17 most reliable participants were considered. The results indicated that musical motifs could have an influence on the value of the JND. They also showed that the test might be overly difficult due to the small changes in C80 and the number of musical motifs. In [11], an additional test was developed to address these difficulties. The number of C80 differences studied was reduced from nine to four, and the size of the differences was increased to reduce the difficulty of the test. The number of musical motifs was also reduced to two, one of which had not been used in the previous test. This test was carried out by 11 participants, and the results were different for each motif under evaluation. A third and final test was designed taking into account the difficulties and possible errors of the previous two tests. This test was taken by 28 subjects and only one musical motif was used.

Martellotta [12] also investigated the JND for the C80 but focused on spaces with larger reverberation times than typically found in concert halls. He wanted to investigate the possible dependence between the JND of C80 and the reverberation time. Three different sound fields with increasing reverberation times were taken into consideration. To improve the realism of the test he used impulse responses measured in three Italian churches, instead of synthetic sound fields. The musical motifs used in the listening test were a liturgical chant and a fast tempo cello solo passage. Forty participants between the ages of 20 and 55 were tested. As in [10,11], the number of participants was reduced, in this case to 13, due to unreliable responses. A difference was observed between the results for the liturgical chant and the cello passage, but a t-test showed that this difference was not statistically significant. The JND was therefore not considered to be motif dependent.

Applied Acoustics 208 (2023) 109370

It is remarkable that there is no agreement among the aforementioned studies on whether the JND for C80 is motif dependent. Although some studies report differences between the results for different motifs, no systematic research has been conducted to assess the influence of this factor on the results. Most investigations only mention that the choice of the musical piece can be a sensitive aspect. It is also worth mentioning that this question has been raised not only in clarity studies, but also in the evaluation of spatial impression [13,14] and in the study of listening preferences in concert halls [15]. In the field of spatial impression, Wang [14] has approached this issue, but her conclusion neither supports nor disproves the hypothesis that the perception of spatial parameters such as apparent source width (ASW) and listener envelopment (LEV) depends on the musical motif. Still, there seems to be some agreement that the musical motif does have an effect. In the case of listening preference, the investigation carried out in [15] shows that the preference for the acoustic characteristics of a hall is highly individual and is affected by both musical motif and listening position.

The lack of agreement on the effect that musical motif has on the perception of clarity has prompted the present study. A listening test has been carried out to investigate whether musical stimuli affect the perception of clarity. The test considers three base case scenarios, which were obtained by simulating three different rooms.

The range of musical clarity values used in this study was intended to be broad enough to include those obtained in music performance spaces such as concert halls, chamber music halls and churches. To this end, the virtual models of the rooms were modified to achieve clarity changes of between 0.5 and 4.0 dB from the original C80 value of the room. Five musical motifs were used to quantify how high participants rated the clarity of the room. A total of 90 stimuli were employed.

2. Experimental set-up

2.1. Impulse response simulation

In order to study the effect that the musical motif has on the perception of the musical clarity of a room, it is necessary to evaluate a wide range of C80 levels. These levels will be assessed by means of a listening test using multiple musical pieces. The stimuli used in the listening test were obtained from the convolution of the selected musical motifs with impulse responses. The impulse responses were created using the virtual models of three venues. The absorption characteristics of the model materials were modified to simulate a total of 18 sound fields with different C80 values.

Fig. 1. Room models used to obtain the required impulse responses. From left to right: Room A (Elmia Congress and Concert Hall), Room B (Campus Sur auditorium) and Room C (San Cebrián de Mazote).

Three rooms were selected to have C80 values typically found in a concert hall, a chamber music hall and a church. The model of the room with values similar to those of a concert hall, Room A, was taken from the database of the acoustic modelling software Odeon. The name of the room in the software is Elmia RoundRobin2 and it was modelled after the auditorium of the Elmia Congress and Concert Hall in the city of Jönköping (Sweden). It has a volume of 12,520 m³, a T20 of 2.21 s and a C80 of -1.75 dB. The virtual models of the other two rooms were created using CAD software and calibrated with measured impulse responses using Odeon. The second venue, Room B, was modelled after the auditorium of the Campus Sur of Universidad Politécnica de Madrid (UPM). It has a volume of 1,840 m³, a T20 of 1.16 s and a C80 of 2.75 dB. The C80 values obtained from this room are similar to those found in chamber music halls. The last room, Room C, was used to obtain C80 values close to those found in a church. The model was obtained from the research carried out in [16], where the aim was to virtually restore the Hispanic Rite in five pre-Romanesque churches in Spain. The church used for our investigation was San Cebrián de Mazote, which has a volume of 2,430 m³, a T20 of 2.88 s and a C80 of -5.90 dB. A summary of the relevant acoustic parameter values for the three rooms is shown in Table 1.

Table 1

Summary of acoustical descriptors of the three virtual models.

| Venue | Room A (Elmia Concert Hall) | Room B (Campus Sur Auditorium) | Room C (San Cebrián de Mazote) |
| --- | --- | --- | --- |
| T20 (s) | 2.21 | 1.16 | 2.88 |
| EDT (s) | 2.30 | 1.03 | 3.33 |
| C80 (dB) | -1.75 | 2.75 | -5.90 |

The method used to obtain the different levels of C80 is the same as in [9–11]. It was first applied in [9], where the aim was to determine the JND for C50. A listening test was conducted in which participants had to compare a base case sound field with other sound fields with different C50 levels. The differences in C50 level with respect to the base case were 0.0, 0.5, 1.0, 1.5, 2.5 and 4.0 dB. The clarity range proposed in [9] was considered acceptable for our research.
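As a small illustration of how the nominal target levels follow from a base case, the sketch below adds the six differences listed above to the Room B base value of 2.75 dB. The function name `target_levels` is hypothetical and not part of the original study, and the values actually achieved by modifying the room models may deviate slightly from these nominal targets.

```python
# Nominal clarity differences (dB) relative to the base case, as used in [9].
DIFFERENCES_DB = [0.0, 0.5, 1.0, 1.5, 2.5, 4.0]


def target_levels(base_c80_db, differences_db=DIFFERENCES_DB):
    """Return the nominal C80 targets (dB) for one base-case sound field."""
    return [round(base_c80_db + d, 2) for d in differences_db]


# Room B base case (2.75 dB) taken from the text; other rooms work the same way.
room_b_targets = target_levels(2.75)
print(room_b_targets)
```

Each room therefore contributes six sound fields (the base case plus five modified versions), which gives the 18 impulse responses used in the study.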

The three acoustic models of the selected rooms were considered as the base case sound fields (see Fig. 1). The models were modified to achieve the desired C80 differences. The values of C80 and T20 obtained for the simulated impulse responses after the modifications are presented in Table 2, and the energy decay curves for all impulse responses are shown in Fig. 2. The changes to the models consisted of increasing the absorption coefficients of the rooms' materials. The absorption coefficients were modified proportionally in all materials, with the intention of maintaining the spatial impression of the room. The first row of Table 2 gives the C80 values for the base case sound fields. The following rows correspond to the values obtained for the impulse responses of the modified models. The C80 level of the models was calculated using Odeon, which follows the guidelines of the ISO 3382 standard [7], by averaging the C80 values for the 500 Hz and 1000 Hz octave bands [9]. A total of 18 impulse responses were calculated, six for each of the rooms. The total range of C80 values in this study spans from -5.90 dB to 6.75 dB, as shown in Fig. 3.
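The C80 figure itself is an energy ratio: the energy arriving in the first 80 ms of the impulse response divided by the energy arriving afterwards, expressed in decibels. The following is a minimal sketch of that definition and of the 500 Hz/1000 Hz averaging, not a reproduction of Odeon's implementation. It assumes that the impulse response starts at the arrival of the direct sound and has already been octave-band filtered; the function names and the synthetic exponential decays used as input are illustrative assumptions.

```python
import math


def c80_db(impulse_response, sample_rate_hz):
    """Clarity index: 10*log10(early energy / late energy), split at 80 ms."""
    split = int(round(0.080 * sample_rate_hz))
    early = sum(p * p for p in impulse_response[:split])
    late = sum(p * p for p in impulse_response[split:])
    return 10.0 * math.log10(early / late)


def mid_frequency_c80(c80_500_hz, c80_1000_hz):
    """Arithmetic mean of the 500 Hz and 1000 Hz octave-band C80 values."""
    return 0.5 * (c80_500_hz + c80_1000_hz)


def synthetic_decay(rt_s, sample_rate_hz, duration_s):
    """Ideal exponential decay whose energy drops by 60 dB in rt_s seconds."""
    k = 6.0 * math.log(10.0) / rt_s  # energy decay constant (1/s)
    n = int(duration_s * sample_rate_hz)
    return [math.exp(-0.5 * k * i / sample_rate_hz) for i in range(n)]


fs = 8000  # Hz, chosen only to keep the example fast
# Hypothetical band-filtered responses with slightly different decay times.
ir_500 = synthetic_decay(rt_s=2.0, sample_rate_hz=fs, duration_s=6.0)
ir_1000 = synthetic_decay(rt_s=1.8, sample_rate_hz=fs, duration_s=6.0)
c80 = mid_frequency_c80(c80_db(ir_500, fs), c80_db(ir_1000, fs))
print(f"C80 = {c80:.2f} dB")
```

For an ideal exponential decay the early-to-late ratio has the closed form exp(0.08 k) - 1, which gives a convenient check on the numerical result. Measured or simulated room responses will generally differ from this idealisation.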

2.2. Musical motifs

Five musical motifs with different styles, tempos and characteristics were used. The musical motifs chosen had to be short enough not to extend the duration of the listening test considerably and not to bore the subject, but long enough to give the participants time to fully evaluate the acoustic environment [13]. The duration of the selected fragments varies between 10 and 24 s. The fragments differ in length because all the musical passages were intended to end in a conclusive cadence. Cutting off a musical fragment mid-phrase can cause involuntary tension in the participants due to the lack of musical resolution. In contrast, a conclusive cadence represents the end of a musical phrase, creating a sense of resolution. Ending a stimulus mid-phrase would allow the duration of all stimuli to be the same but could affect the participants' concentration and result in poorer performance in the listening test.

Table 2

C80 and T20 values for each simulated impulse response.

| Room A C80 (dB) | Room A T20 (s) | Room B C80 (dB) | Room B T20 (s) | Room C C80 (dB) | Room C T20 (s) |
| --- | --- | --- | --- | --- | --- |
| -1.75 | 1.99 | 2.75 | 1.11 | -5.90 | 2.88 |
| -1.25 | 1.98 | 3.30 | 1.01 | -5.40 | 2.64 |
| -0.70 | 1.74 | 3.80 | 1.02 | -4.90 | 2.44 |
| -0.20 | 1.64 | 4.25 | 0.95 | -4.40 | 2.27 |
| 0.80 | 1.36 | 5.25 | 0.86 | -3.40 | 1.96 |
| 2.30 | 1.14 | 6.75 | 0.73 | -1.90 | 1.61 |

Fig. 2. Energy decay curves of simulated impulse responses. From left to right: Room A (Elmia Congress and Concert Hall), Room B (Campus Sur auditorium) and Room C (San Cebrián de Mazote).

Fig. 3. Range of C80 values assessed in each of the rooms.

Another condition for the selection of musical motifs was to have a wide variety of musical characteristics. Differences in instrumentation, tempo and style were desirable. A previous study was carried out by the authors to assess the musical modulation of a set of 69 anechoic-recorded pieces from [17] and the Odeon repertoire. The musical modulation of a piece represents its rhythmic pattern, so it was expected that the most prominent modulation frequencies would vary according to the tempo and figuration of the piece.

Following the study of modulation, it was decided to use five pieces of different tempo and instrumentation covering a wide range of musical styles. Firstly, it was important that the instrumentation of the pieces was varied, so it was decided to select at least one vocal piece, one solo piece and one piece performed by a large ensemble. Secondly, it was decided that fast and slow tempo pieces should be used for the same instrumental ensemble. In the case of the solo pieces, it was of interest that the instruments playing them had different frequency ranges. All these conditions were chosen in order to evaluate how the clarity of a room can affect the musical pieces differently.

Finally, it was decided to use five musical fragments: a vocal piece performed by a 5-person choir, a slow tempo solo piece, a fast tempo solo piece, a slow tempo orchestral piece and a fast tempo orchestral piece (see Table 3). The five motifs used were:

1. MM1: Flute Concerto No. 1 in G Major KV 313 (1778) by Wolfgang Amadeus Mozart. Recording of a fast musical passage played by a flute at a tempo of 115 bpm. It is a classical style piece and was selected because of its large amount of ornamentation and trills. The main note played is an eighth note and the duration of the fragment is 24 s.

2. MM2: Élégie for cello and orchestra op. 24 (1883) by Gabriel Fauré. Solo performance of a cello passage at a slow tempo (60 bpm). The recording covers four bars where the main notes played are the eighth note and the dotted eighth note. It is an impressionistic piece and was selected because of the difference in musical texture from the previous motif and because the cello has a lower frequency range than the flute. It has a duration of 18 s.

3. MM3: Prelude from La Traviata (1853) by Giuseppe Verdi. This 20 s fragment consists of an orchestral passage played at a slow tempo, 66 bpm. It corresponds to a five-bar passage in which the strings play a simple, unornamented melody. The wind section carries the accompaniment, consisting of eighth notes with a marked accentuation of the tempo.

4. MM4: Credo from the Office for the Dead from the ancient Hispanic Rite [18]. This fragment was recorded in [16] and is a passage approximately 14 s long performed by a 5-person choir. It follows the style of ancient Christian liturgical rites and has no instrumental accompaniment. First, a singer performs the motif and then the choir responds to it. It was selected in order to study a motif typical of the style of music usually performed in churches.

5. MM5: 4th movement of the Ninth Symphony in D minor op. 125 (1824) by Ludwig van Beethoven. Romantic-style orchestral passage consisting of 7 bars played at a presto tempo of 170 bpm. The instrumental group consists of a string section, a woodwind and brass section and percussion. All the sections present fast and complex melodic lines, with a prominent drum roll performed by the timpani. Although it belongs to the same artistic movement as MM3, it was considered sufficiently different due to the musical texture and the use made of the instrumentation. The duration of the fragment is 10 s.

The spectral content of the musical motifs is presented in Fig. 4. It can be seen that the spectral content for all pieces is consistent in the mid-frequency range, but considerably different at low frequencies. The two orchestral pieces have relatively similar spectral shapes, with high levels through the low and middle frequencies and the expected decay at high frequencies. The higher values at low frequencies are due to the presence of the brass section and the double basses of the orchestra. In the case of the solo pieces, the sound level in MM2 is concentrated in the lower frequencies, with the highest value in the 250 Hz octave band, whereas for MM1 the highest level is at 1000 Hz owing to the higher pitch of the flute. In the case of MM4, the highest values are found between the 250 and 1000 Hz bands, with the highest value at 500 Hz. This is consistent with the male vocal range in singing, which is considered to go from E2 (82 Hz) as the lowest note for a bass singer to E5 (659.25 Hz) as the highest note for a countertenor [19].

Fig. 4. SPL of the five motifs in octave bands.

Table 3

Musical characteristics of the motifs used.

| Motif | Piece | Composer | BPM | Fast/slow | Tempo | Duration |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Flute Concerto No. 1 in G Major KV 313 | W. A. Mozart | 115 | Fast | Allegro maestoso | 24 s |
| 2 | Élégie for cello and orchestra op. 24 | Gabriel Fauré | 60 | Slow | Molto adagio | 18 s |
| 3 | Prelude from La Traviata | Giuseppe Verdi | 66 | Slow | Adagio | 20 s |
| 4 | Liturgical chant | Anonymous | - | Slow | - | 14 s |
| 5 | 4th movement, Ninth Symphony op. 125 | L. v. Beethoven | 170 | Fast | Presto | 10 s |

It is considered that the observed differences in spectral content between the pieces will influence the perceived clarity of the room.

2.3. Criteria for the selection of subjects

A total of 36 subjects participated in the listening test, of which 23 were women and 13 were men. Their ages ranged from 18 to 30 years. It was desirable to have a wide range of typical listeners as a sample. Therefore, the musical education and listening background of the participants varied widely (from inexperienced listeners to professional musicians). Of the entire group of participants, only one indicated that he had previous experience with psychoacoustical experiments.

Participants’ hearing threshold was not tested due to time constraints, but all participants indicated that they had no known hearing problems. All the subjects performed a training session prior to the test.

2.4. Listening test specifications

The purpose of the listening test conducted was to gather information on perceived levels of clarity for different C80 values. These levels were assessed using the five selected musical motifs.

The stimuli used in the listening test were obtained by convolving the simulated impulse responses of the three models with the five musical motifs. This convolution was carried out using the simulation software Odeon. Its auralization tool was used, and binaural signals were obtained using one of the head-related transfer functions (HRTF) available in the software. The HRTF was used together with the headphone filter corresponding to the model used during the test. The headphones were the Sennheiser HD-650, which are open-back circumaural headphones. The fixed auralization conditions, together with the method used to modify the room models, ensure that the spatial impression of the rooms remains constant. This is important since changes in the spatial impression of reverberation can have an effect on the perception of clarity [20].
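The convolution step can be illustrated with a direct-form sketch in plain Python. This is only a conceptual example: Odeon's auralization tool performs the operation internally, and the function names, the toy signals and the two-channel layout below are assumptions made for illustration. Real stimuli would be produced with fast (FFT-based) convolution on full-length recordings.

```python
def convolve(signal, impulse_response):
    """Direct linear convolution; output length is len(x) + len(h) - 1."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def auralize(anechoic, brir_left, brir_right):
    """Convolve one anechoic motif with a binaural pair of impulse responses."""
    return convolve(anechoic, brir_left), convolve(anechoic, brir_right)


# Toy data: a three-sample "motif" and two short, hypothetical responses.
motif = [1.0, 0.5, 0.25]
left, right = auralize(motif, [1.0, 0.0, 0.5], [0.5, 0.25])
print(left)
print(right)
```

Applying the same operation to each of the five motifs and each of the 18 simulated responses yields the 90 stimuli of the test.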

Previous research shows that loudness variability between musical passages can influence the perceived clarity of a room. The investigation carried out in [21] found a negative correlation between loudness and clarity, meaning that passages played piano can be perceived more clearly than those played forte. To ensure comparability between fragments, the anechoic recordings of the five musical motifs were aligned by normalizing their RMS level before being convolved with the simulated impulse responses. The volume setting remained fixed during the whole test. The average playback level of the convolved stimuli was 76.7 dB, with a maximum of 78.9 dB and a minimum of 72.0 dB across stimuli. The playback level remained constant for all subjects.
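RMS alignment of this kind amounts to scaling each recording by a single gain so that all recordings share the same root-mean-square amplitude. The sketch below shows the idea on two toy signals; the helper names, the test signals and the target value are hypothetical and are not taken from the study.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def normalize_rms(signal, target_rms):
    """Scale a signal so that its RMS amplitude equals target_rms."""
    gain = target_rms / rms(signal)
    return [gain * s for s in signal]


def level_db(signal):
    """RMS level in dB relative to full scale (amplitude 1.0)."""
    return 20.0 * math.log10(rms(signal))


# Two toy "recordings" with very different loudness (hypothetical data).
quiet = [0.01 * math.sin(0.1 * n) for n in range(1000)]
loud = [0.5 * math.sin(0.37 * n) for n in range(1000)]
TARGET_RMS = 0.1  # arbitrary common target, not a value from the study

aligned = [normalize_rms(x, TARGET_RMS) for x in (quiet, loud)]
for x in aligned:
    print(f"{level_db(x):.2f} dB")
```

Because the normalization is applied before convolution, the playback levels of the final stimuli can still differ slightly from one another, in line with the spread of levels reported above.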

The listening test was divided into three parts, one for each room. Each part was further divided into five sub-parts, one for each musical motif. The three parts were separated by a five-to-ten-minute break to reduce the effect that fatigue and auditory memory could have on the following evaluations. Once the test began, participants had to listen to the six stimuli of each sub-part, i.e., a musical motif convolved with the six impulse responses of the corresponding room. This was done to ensure that participants had a reference for the room and thus to avoid the bias due to non-selection of the extreme positions of the scale [22]. Once all the stimuli had been heard, subjects could begin to respond and had the option of listening to the stimuli again. The order in which the rooms, musical motifs and stimuli were presented was randomised for each participant, thus ensuring statistical independence.
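The nested structure of the test (rooms, then motifs, then stimuli) and its per-participant randomisation can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the source does not describe how the randomisation was implemented, so the seeding scheme, the function name and the labels below are hypothetical.

```python
import random

ROOMS = ["A", "B", "C"]
MOTIFS = ["MM1", "MM2", "MM3", "MM4", "MM5"]
LEVELS = list(range(6))  # index of the six C80 levels available per room


def presentation_order(participant_seed):
    """Return a shuffled list of (room, motif, level) triples for one participant."""
    rng = random.Random(participant_seed)  # assumed: one seed per participant
    order = []
    for room in rng.sample(ROOMS, k=len(ROOMS)):             # parts
        for motif in rng.sample(MOTIFS, k=len(MOTIFS)):      # sub-parts
            for level in rng.sample(LEVELS, k=len(LEVELS)):  # stimuli
                order.append((room, motif, level))
    return order


order = presentation_order(participant_seed=1)
print(len(order))  # 3 rooms x 5 motifs x 6 levels = 90 stimuli
```

Shuffling at each level separately keeps all stimuli of one room together, which matches the division of the test into three parts separated by breaks.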

It was considered that the best way to assess the clarity of each stimulus was through individual evaluations. The listening test protocol was based on a Likert scale [23]. It was decided to use a category scale of five fixed positions with labels defining the response for each one. The number of positions was chosen based on the research done by Simms et al. [24] and Farina [5]. In [24] it was found that using more than six positions in a Likert scale does not increase the accuracy of the test; on the contrary, it makes the test more exhausting and participants hesitate more when answering the questions. After completing a test with a six-position scale, Farina [5] observed that many of the participants tried to mark a position in the centre of the scale. He concluded that it was preferable to have an odd number of positions so as to have a central position.

As for the presentation of the labels, research carried out by Weng and Cheng [25] and Subedi [26] indicated that the order of presentation does not affect the results. The labels were presented in the order indicated in the UNE-ISO 4121 standard [27]. Participants were asked to indicate how high they considered the level of clarity of each stimulus. The labels on the five-point Likert scale used to answer the question were, from left to right: “not at all high”, “slightly high”, “moderately high”, “very high” and “extremely high”.
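For analysis, responses on such a scale are commonly coded as integers. The sketch below maps the five labels to the scores 1 to 5 and averages a handful of made-up responses for one stimulus. The numeric coding, the function name and the example responses are assumptions for illustration; this section of the source does not state how the ratings were coded or aggregated.

```python
from statistics import mean

# Labels in the left-to-right order given in the text.
LABELS = [
    "not at all high",
    "slightly high",
    "moderately high",
    "very high",
    "extremely high",
]
# Assumed coding: 1 for the lowest category up to 5 for the highest.
SCORE = {label: i + 1 for i, label in enumerate(LABELS)}


def mean_rating(responses):
    """Average coded score of the responses given to one stimulus."""
    return mean(SCORE[r] for r in responses)


# Hypothetical responses from four participants to the same stimulus.
responses = ["slightly high", "moderately high", "moderately high", "very high"]
print(mean_rating(responses))
```

Averaging treats the ordinal categories as equally spaced, which is a simplification; rank-based summaries such as the median are an alternative when that assumption is in doubt.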

2.5. Listening test procedure

The listening test was carried out in two different locations. The first was a dry and silent rehearsal room at the Conservatorio Superior de Música Manuel Massotti Littel of Murcia (Spain). The second was the anechoic chamber of the Universidad Politécnica de Madrid, UPM (Spain). Thirteen participants took the test at the first location and the rest took it at the UPM. All signals were presented through Sennheiser HD-650 headphones. The volume level used in the listening test was fixed and participants were told that it could not be changed at any time.

After being welcomed, the participants were asked to fill in a demographic survey covering their age, gender, level of musical education, etc. The purpose of the survey was twofold: first, to obtain additional information that could be used in future analyses and, second, to allow the participants to adjust to the environment in which the test was to be conducted. Once the survey had been completed, they were given an instruction sheet explaining the listening test procedure, what clarity means [28] and the question they were to answer. Written instructions were chosen to reduce the bias due to the interaction between participant and researcher.

As stated in [29] and as noted by Farina in [5], the vocabulary used to describe the subjective quality being assessed is incredibly important. The same term can be understood differently by acousticians and musicians. It may even be unfamiliar to people with no musical education. This could lead to errors in transferring information about the performance of the listening test. In order to avoid this problem, it was decided to describe clarity using the glossary written by Lindau in [28]. The definition given in this glossary for clarity is: ”Clarity/clearness with respect to any character-
