- 1 Introduction
- 2 Experimental set-up
- 2.1 Impulse response simulation
- 2.2 Musical motifs
- 2.3 Criteria for the selection of subjects
- 2.5 Listening test procedure
- 2.6 Statistical analysis
- 3 Results
- 3.1 Linear regression
- 3.2 Cochran-Mantel-Haenszel test
- 4 Discussion
- 5 Conclusions
- CRediT authorship contribution statement
- Declaration of Competing Interest
- References
M. Larrosa-Navarro, D. de la Prida and A. Pedrero
istic of elements of a sound scene. Impression of how clearly different elements in a scene can be distinguished from each other, how well various properties of individual scene elements can be detected. The term is thus to be understood much broader in the realm of room acoustics, where Clarity is used to predict the impression of declining transparency with increasing reverberation.” This definition was translated into Spanish and included in the instruction sheet.
Before starting the listening test, all participants had a short training session. This was intended to familiarise them with the test interface, the question asked and the answering method. The stimuli used during the training session were a clapping signal convolved with impulse responses with different reverberation times and clarities. The test signal was obtained from the software Odeon and consists of an anechoic recording of hands being clapped. None of these impulse responses were used during the listening test. Once the training had been completed and it was certain that the participants understood the task, the listening test began. As indicated above, the listening test consisted of three parts with a 5-to-10-min break in between. The time each participant took to complete the test was measured. The average duration was 40 min (breaks are not included in this calculation).
2.6. Statistical analysis
Most works that use a Likert-type scale as the response method in a listening test study the results only through regression [5,30]. Regression analysis estimates the relationship between a dependent variable and one or more independent variables. The result of the regression analysis is a coefficient of determination R², which indicates the proportion of variance in the dependent variable that can be explained by the independent variable. Additionally, the p-value of the regression was calculated, which indicates whether changes in the value of the independent variable are related to changes in the dependent one. P-values lower than 0.05 indicate that the changes can be related. The C80 level of the room under study was taken as the independent variable and perceived clarity as the dependent variable. A regression analysis was performed for each musical motif in each room.
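As an illustration of the coefficient of determination described above, the following minimal sketch fits a least-squares line and computes R² using only the Python standard library. The function name `linear_fit_r2` and the data are hypothetical and are not taken from the study; the p-value is omitted because it requires the t-distribution, which the standard library does not provide.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 is the squared correlation.
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Hypothetical data: six C80 levels (dB) and the modal rating at each level.
c80_levels = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
modal_rating = [1, 2, 2, 3, 4, 4]

slope, intercept, r2 = linear_fit_r2(c80_levels, modal_rating)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={100 * r2:.2f}%")
```

A positive slope with a high R² would correspond to ratings that rise consistently with C80; a negative slope would correspond to the inverse behaviour.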
When analysing data obtained using a Likert scale, it is important to consider whether participants were allowed to answer at any point of the scale or only at fixed positions. In the first case we would be working with a continuous variable, while in the second case the variable is categorical. In some studies using Likert scales, such as [5,30], where participants were only allowed to answer at fixed positions on the scale, it is common to use the mean, which usually gives better regressions because decimal values are introduced. However, we consider that introducing decimals might bias the results, given that participants were asked to answer on a discrete scale: the mean adds information that they did not provide. In our research, the statistic used for the regression is the mode of the answers for each level of C80. The mode was used because it represents the most frequently occurring response and is therefore the most appropriate statistic for discrete samples.
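The difference between the two statistics can be seen in a short sketch. The ratings below are hypothetical; the handling of ties is also an assumption, since the text does not state how tied modes were resolved.

```python
from statistics import mean, multimode

# Hypothetical ratings (1-5) given by ten participants for one C80 level.
ratings = [3, 4, 4, 2, 4, 3, 5, 4, 3, 4]

# The mean can take a value that no participant was able to select.
average = mean(ratings)

# multimode returns every most-frequent value, so ties stay visible
# instead of being resolved silently.
modes = multimode(ratings)

print("mean:", average)
print("mode(s):", modes)
```

Here the mean falls between two scale positions, whereas the mode is always one of the positions offered to the participants.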
Apart from the regression analysis, which shows whether the participants perceive the changes in clarity, we wanted to carry out an analysis that would allow us to study the interrelationship between C80 levels, musical motifs and perceived clarities. The most appropriate analysis for this evaluation was considered to be a Cochran-Mantel-Haenszel (CMH) analysis [31,32]. This type of analysis is rarely used in acoustics but is widely used in other areas of research. It has recently started to be used to analyse listening test results [33]. In its most basic approach, the analysis assesses the effect that a categorical variable K has on the relationship between
Applied Acoustics 208 (2023) 109370
two dichotomous variables, X and Y. To carry out the CMH analysis, it is necessary to build K contingency tables (2 × 2 × K), i.e., tables that present categorical data as frequency counts, which hold the results of the experiments for each combination of the X and Y variables. CMH analysis can also be generalised to study experiments where X and Y are not dichotomous. In this case, the size of the contingency table is I × J × K, where I and J can be larger than two and can differ from each other [33,34]. A more detailed description of the analysis and an example of a contingency table are given in Appendix 2.
The CMH analysis returns an individual χ² and p-value for each sub-table. These values represent the statistical significance of the common-odds ratio. Both parameters are associated with the differences between the results compared. The value of χ² increases from 0 as the differences increase, while the p-value becomes smaller as the differences grow. The results will be considered significant when the p-value is less than 0.05. Additionally, a global p-value will be calculated for each room and for each musical motif evaluated in the two analyses. This global p-value is only an estimate, because the exact p-value can only be calculated for 2 × 2 tables.
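For the basic 2 × 2 × K case, the χ² statistic and its p-value can be sketched as follows. This is a minimal illustration of the textbook CMH statistic with one degree of freedom and an optional continuity correction, not the generalised I × J × K procedure used in this study; the function name `cmh_2x2xk` and the counts are hypothetical.

```python
import math


def cmh_2x2xk(tables, continuity=True):
    """CMH chi-square (1 df) for a list of 2x2 tables [[a, b], [c, d]]."""
    diff_sum = 0.0  # sum over strata of (observed a - expected a)
    var_sum = 0.0   # sum over strata of Var(a) under the null hypothesis
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # such a stratum carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff_sum += a - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var_sum == 0:
        raise ValueError("undefined: a marginal sum is zero in every stratum")
    correction = 0.5 if continuity else 0.0
    chi2 = max(abs(diff_sum) - correction, 0.0) ** 2 / var_sum
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: two strata (K = 2), each a 2x2 table.
strata = [[[10, 5], [4, 11]], [[8, 7], [3, 12]]]
chi2, p_value = cmh_2x2xk(strata)
print(f"chi2={chi2:.3f}, p={p_value:.4f}")
```

The variance term vanishes whenever a row or column total of a stratum is zero, which is why the statistic cannot be computed when all marginal sums of that kind are empty.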
Once the listening test data were collected, two contingency tables were created for each room. In the first one, C80 levels were considered to be the independent variable and perceived clarity the dependent one. The effect of the five musical motifs on this relationship was studied and the size of the table was 5 × 5 × 6. The second contingency table assessed the effect that C80 levels have on the relationship between musical motifs and perceived clarity. The size of this table was 5 × 6 × 5. The data in these tables are counts of the number of participants who chose a particular position on the Likert scale for a given musical motif and a given C80 level.
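The counting step can be sketched as follows, assuming each answer is stored as a (motif, C80 level, rating) record. The record layout, the function name `build_contingency` and the sample data are hypothetical.

```python
from collections import Counter


def build_contingency(responses, ratings, c80_levels, motifs):
    """Return table[motif][rating][level] = number of participants."""
    counts = Counter(responses)  # missing combinations count as zero
    return {
        m: {r: {lvl: counts[(m, lvl, r)] for lvl in c80_levels}
            for r in ratings}
        for m in motifs
    }


# Hypothetical records: (motif, C80 level index, Likert rating).
responses = [("MM1", 1, 2), ("MM1", 1, 2), ("MM1", 2, 4), ("MM2", 1, 3)]

table = build_contingency(
    responses,
    ratings=range(1, 6),     # five Likert positions
    c80_levels=range(1, 7),  # six C80 levels
    motifs=["MM1", "MM2"],
)
print(table["MM1"][2][1])  # participants who rated MM1 as 2 at level 1
```

Swapping the roles of motif and C80 level in the nesting gives the layout of the second table.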
The statistical results of the CMH analysis are only computable if the marginal sums are non-zero for each sub-table evaluated. This means that, if we evaluate the tables with the five positions individually, some tables will not return a value, because the marginal sum is zero. This is the case, for example, for perceived clarity level 1 in Room B, where C80 levels are very high. In order to carry out the analysis, it was decided to apply one of the methods used in [35]. This consists of reducing the Likert scale categories by clustering the results into three groups: “low clarity” for answers 1 and 2, “medium clarity” for answer 3 and “high clarity” for answers 4 and 5. Both tables were reduced and their final sizes were 3 × 5 × 6 and 3 × 6 × 5, respectively. Yates’s correction for continuity, which aims to prevent the overestimation of statistical significance for small data values, was applied [36].
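The reduction from five to three categories can be sketched as follows. The mapping is the one stated above; the function name `collapse_ratings` and the counts are hypothetical.

```python
# Mapping stated in the text: 1-2 -> low, 3 -> medium, 4-5 -> high.
LIKERT_TO_GROUP = {1: "low", 2: "low", 3: "medium", 4: "high", 5: "high"}


def collapse_ratings(counts_by_rating):
    """Merge five Likert counts {rating: count} into three clarity groups."""
    grouped = {"low": 0, "medium": 0, "high": 0}
    for rating, count in counts_by_rating.items():
        grouped[LIKERT_TO_GROUP[rating]] += count
    return grouped


# Hypothetical counts for one motif and one C80 level: nobody chose rating 1.
counts = {1: 0, 2: 3, 3: 5, 4: 9, 5: 4}
collapsed = collapse_ratings(counts)
print(collapsed)
```

In this example the empty category for rating 1 is absorbed into the “low clarity” group, so the corresponding marginal sum is no longer zero.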
3. Results
This section presents the results of the analyses carried out, consisting of a regression analysis and two CMH analyses. In addition to these analyses, Fig. 5 shows a box-and-whisker plot of the raw results, so that the variability of response given by participants for different rooms, motifs and levels of clarity can be observed. In this plot, the dots signify the median, the lower and upper ends of the thick line mark the 25th and 75th percentiles, the thin lines extend to the maximum and minimum values, and the triangles show the variability of the median between samples.
The results for each room will be analysed individually. This is because the assessment of musical clarity for each room was done in a different part of the test. Furthermore, the Likert scales used had values between 1 and 5 for all rooms, regardless of the reference C80 value. A joint analysis of the rooms would imply that similar perceived clarity values would be given to very different C80 values that have not been assessed simultaneously.
3.1. Linear regression
In the regression, C80 levels are taken as the independent variable X and perceived clarity as the dependent variable Y. As the aim is to study the relationship between perceived clarity and musical motif, a regression analysis is performed for each combination of room and motif. The regression analysis studies the ability of participants to perceive an increase in clarity with increasing C80 levels. It is likely that the higher the C80 level, the higher the position chosen by participants on the Likert scale. Nevertheless, very high R² values are not expected, because some of the C80 differences studied are less than one JND. The results of the analysis make it possible to determine whether, for the same room, there are differences between the R² values of each musical motif. The results of the regressions performed for each room and each motif are shown in Table 4.
It can be seen that the calculated R² values vary considerably between musical motifs. This variation also exists for the same musical motif in different rooms. The motif with the best fitted regression is MM1. The highest value of R² for this motif is found in Room C (89.96%) and the lowest in Room B (75.86%). The same behaviour is observed in MM4, where the maximum value of R² is 84.05%, about 6 percentage points lower than that obtained for MM1. Almost all p-values obtained for these two motifs are less than 0.05. The only one that exceeds it is MM4 in Room B, where the p-value obtained is 0.0535. Due to the closeness of this value to 0.05, the regressions of these two motifs could be considered significant. Therefore, it can be stated that changes in the clarity of the pieces, which are due to the different acoustic conditions of the enclosures with different levels of clarity, correspond to significant changes in the perception of clarity. (See Table 5).
The motif with the next highest R² values is MM2. The two rooms with good coefficients of determination (approx. 60%) are Room A and Room C. In both cases the results can be considered significant, although the p-value of MM2 for Room A is 0.0539, marginally above 0.05. The lowest R² value for this motif is found in Room B, where the result is not considered significant.
MM3 and MM5 are the two motifs with the worst R² values. In addition, almost all the results of the combinations studied are considered non-significant. The only combination of room and musical motif where a high R² value with a significant p-value is obtained is Room C and MM3 (77.82%). The mode of the responses for these two musical motifs was checked and it was observed that, despite the changes in C80 level, the values given for perceived clarity were very similar. In some cases, it was even observed that increasing the C80 level caused a decrease in perceived clarity, as can be seen in Fig. 6 for Room A and MM3 and for Room C and MM5. This figure represents the regression lines as a function of the C80 level and the mode of the results for perceived clarity.
Among the three rooms evaluated, it can be observed that the highest R² values are found in Room C. MM5 is an exception within this room, as it has a very low R² value. It is the lowest value found for this musical motif among the three rooms and its regression line has a negative slope. This means that participants considered the level of clarity of the room to be lower as the C80 values increased. In Room A it can be seen that for MM1 and MM4 there
Fig. 5. Box and whiskers for the six levels of C80 evaluated in each room, considering the five musical motifs.
Table 4
Coefficients of determination R² in percentage for each combination of motif and venue. () p > 0.15; (*) 0.15 > p > 0.05; (**) 0.05 > p > 0.005; (***) p < 0.005.

| | MM1 | MM2 | MM3 | MM4 | MM5 |
|---|---|---|---|---|---|
| Room A | 84.15** | 64.62* | 3.19 | 78.59** | 29.51 |
| Room B | 75.87** | 29.13 | 32.88 | 64.73* | 29.13 |
| Room C | 89.96*** | 66.59** | 77.82** | 84.05** | 19.07 |

Fig. 6. Linear regression lines for each combination of room and stimuli.
