
ESTIMATING THE DIFFERENCE BETWEEN TWO MATCHED POPULATION MEANS


smoked one to nine cigarettes a day, and 473 had mothers who had smoked 10 or more cigarettes a day. The figure shows the 95 per cent confidence intervals for differences in mean weight according to sex of baby and smoking habits of mothers: at birth, and at three and six months.

The results show, for example, that at birth, the difference between the sample mean weight of female babies born to non-smoking mothers and those born to mothers smoking 10 or more cigarettes a day, was (3220 − 3052) = 168 g. That is, the infants of smoking mothers are on average lighter by 168 g. Is this difference statistically significant in the population, or due simply to chance? The 95 per cent confidence interval of (234 to 102) g does not include zero, so you can be 95 per cent confident that the difference is real, i.e. is statistically significant.

Exercise 10.1 Interpret the sample mean and confidence intervals shown in Table 10.2 for all four differences in weights at six months.

Estimating the difference between two matched population means – using a method based on the matched-pairs t test

If the data within each of the two groups whose means you are comparing is widely spread compared to the difference in the spreads between the groups,5 this can make it more difficult to detect any difference in their means. When data is matched (see Chapter 7 for an explanation of matching), this reduces much of the within-group variation, and, for a given sample size, makes it easier to detect any differences between groups. As a consequence, you can achieve better precision (narrower confidence intervals), without having to increase sample size. The disadvantage of matching is that it is sometimes difficult to find a sufficiently large number of matches (as you saw in the case-control discussion earlier).

In the independent groups case, the mean of each group is computed separately, and then a confidence interval for the difference in these means is calculated. In the matched groups case, we use a method based on the matched-pairs t test, in which the difference between each pair of values is computed first and then a confidence interval for the mean of these differences is calculated.
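The matched-pairs calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight pairs of bone mineral densities are made-up values (not the data from any study quoted in this chapter), the function name `paired_mean_difference_ci` is hypothetical, and the critical value 2.365 is the standard two-sided 95 per cent value of t with 7 degrees of freedom, typed in by hand rather than looked up by the program.

```python
from math import sqrt
from statistics import mean, stdev


def paired_mean_difference_ci(first, second, t_crit):
    """Confidence interval for the mean of the paired differences.

    first, second: measurements for the two matched groups, in pair order.
    t_crit: critical value of t for (number of pairs - 1) degrees of freedom.
    """
    if len(first) != len(second):
        raise ValueError("matched data need the same number of values in each group")
    # Step 1: the difference within each matched pair.
    diffs = [a - b for a, b in zip(first, second)]
    # Step 2: mean and standard error of those differences.
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))
    # Step 3: interval for the population mean difference.
    return d_bar, (d_bar - t_crit * se, d_bar + t_crit * se)


# Hypothetical bone mineral densities (g/cm2) for 8 matched pairs of women.
normal = [1.05, 0.98, 1.10, 1.02, 0.95, 1.08, 1.00, 1.04]
depressed = [0.99, 0.97, 1.01, 0.98, 0.96, 1.00, 0.93, 0.98]

T_CRIT_DF7 = 2.365  # assumed: two-sided 95 per cent critical value, 7 df

d_bar, (lower, upper) = paired_mean_difference_ci(normal, depressed, T_CRIT_DF7)
print(f"mean difference = {d_bar:.3f}, 95% CI = ({lower:.3f} to {upper:.3f})")
```

With these made-up values the interval lies wholly above zero, so you would conclude that the difference in population means is statistically significant. Notice that the two group means are never compared directly: everything is done on the column of within-pair differences.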

An example from practice

Table 10.3 shows the 95 per cent confidence intervals for the difference in bone mineral density in two matched groups of women, one group depressed and one ‘normal’ (Michelson et al. 1995). (Ignore the ‘SD from expected peak’ rows.) Only one of the confidence intervals contains zero, indicating that there is no difference in population mean bone mineral density at the radius, but there is at all of the other five sites.

5 Called ‘between-group’ variation.


CH 10 ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION PARAMETERS

Table 10.3 Confidence intervals for the differences between the population mean bone mineral densities in two individually matched groups of women, one group depressed, the other ‘normal’, using a method based on the matched-pairs t test. Reproduced from NEJM, 335, 1176–81, by permission of Massachusetts Medical Society

| Bone measured | Depressed women | Normal women | Mean difference (95% CI) | P value |
|---|---|---|---|---|
| Lumbar spine (anteroposterior) | | | | |
| Density (g/cm²) | 1.00 ± 0.15 | 1.07 ± 0.09 | 0.08 (0.02 to 0.14) | 0.02 |
| SD from expected peak | 0.42 ± 1.28 | 0.26 ± 0.82 | 0.68 (0.13 to 1.33) | |
| Lumbar spine (lateral) | | | | |
| Density (g/cm²) | 0.74 ± 0.09 | 0.79 ± 0.07 | 0.05 (0.00 to 0.09) | 0.03 |
| SD from expected peak | 0.88 ± 1.07 | 0.36 ± 0.80 | 0.50 (0.04 to 1.03) | |
| Femoral neck | | | | |
| Density (g/cm²) | 0.76 ± 0.11 | 0.88 ± 0.11 | 0.11 (0.06 to 0.17) | <0.00 |
| SD from expected peak | 1.30 ± 1.07 | 0.22 ± 0.99 | 1.08 (0.55 to 1.61) | |
| Ward’s triangle | | | | |
| Density (g/cm²) | 0.70 ± 0.14 | 0.81 ± 0.13 | 0.11 (0.06 to 0.17) | <0.00 |
| SD from expected peak | 0.93 ± 1.24 | 0.18 ± 1.22 | 1.11 (0.60 to 1.62) | |
| Trochanter | | | | |
| Density (g/cm²) | 0.66 ± 0.11 | 0.74 ± 0.08 | 0.08 (0.04 to 0.13) | <0.001 |
| SD from expected peak | 0.70 ± 1.22 | 0.26 ± 0.91 | 0.97 (0.46 to 1.47) | |
| Radius | | | | |
| Density (g/cm²) | 0.68 ± 0.04 | 0.70 ± 0.04 | 0.01 (−0.01 to 0.04) | 0.25 |
| SD from expected peak | 0.19 ± 0.67 | 0.03 ± 0.67 | 0.21 (−0.21 to 0.64) | |

*Plus-minus values are means ± SD. CI denotes confidence interval.

Values for “SD from expected peak” are the numbers of standard deviations from the expected peak density derived from a population-based study of normal white women.3

This measurement was made in 23 depressed women and 23 normal women.

Exercise 10.2 In Table 10.3, which population difference in bone mineral density is estimated with the greatest precision?

You can also calculate a confidence interval for the difference in two population percentages provided they derive from two metric variables. For the difference between two population proportions, however, a different approach is needed. This is an extension of the single proportion case discussed in Chapter 9, as you will now see.

Estimating the difference between two independent population proportions

Suppose you want to calculate a 95 per cent confidence interval for the difference between the population proportion of women having maternity unit births who smoked during pregnancy and the proportion having home births who smoked. The sample data on smoking status for the sample of 60 mothers is shown in Table 10.1.


There are 10 mothers who smoked among the 30 giving birth in the maternity unit and six among the 30 giving birth at home. This gives sample proportions of 10/30 = 0.3333, and 6/30 = 0.2000, respectively. You can check whether this difference is statistically significant or likely to be due to chance alone, by calculating a 95 per cent confidence interval for the difference in the corresponding population proportions.6 To do this by hand is a bit long-winded and you would want to use a computer program to do the calculation for you.
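As a rough idea of what such a program does, here is a minimal Python sketch. It assumes the simple large-sample (Normal approximation) interval, in which the standard error of the difference is the square root of p1(1 − p1)/n1 + p2(1 − p2)/n2. The text does not say which formula a given package uses, so treat that choice, and the function name `two_proportion_ci`, as assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample confidence interval for the difference p1 - p2.

    x1, x2: number of subjects with the characteristic in each sample.
    n1, n2: sample sizes of the two independent groups.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Normal multiplier, about 1.96 for a 95 per cent interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, (diff - z * se, diff + z * se)


# Smokers: 10 of 30 maternity unit mothers, 6 of 30 home birth mothers.
diff, (lower, upper) = two_proportion_ci(10, 30, 6, 30)
print(f"difference = {diff:.4f}, 95% CI = ({lower:.3f} to {upper:.3f})")
# prints: difference = 0.1333, 95% CI = (-0.088 to 0.355)
```

The interval runs from below zero to above zero, so with these samples the difference in the proportions who smoked is not statistically significant. With samples as small as 30 per group, other methods may give slightly different limits.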

An example from practice

If you look back at Table 9.1, the randomised trial of integrated versus conventional care for asthma patients, the last column shows the 95 per cent confidence intervals for the difference in population percentages between the two groups, for a number of patient perceptions of the scheme. As you can see, none of the confidence intervals include zero, so you can be 95 per cent confident that the difference in population percentages between the groups of patients is statistically significant in each case.

Estimating the difference between two independent population medians – the Mann–Whitney rank-sums method

As you know from Chapter 5, the mean may not be the most representative measure of location if the data is skewed, and is not appropriate anyway if the data is ordinal. In these circumstances, you can compare the population medians rather than the means, and in place of the two-sample t test (a parametric procedure), use a method based on the Mann–Whitney test (a non-parametric procedure).

Parametric versus non-parametric methods

A parametric procedure can be applied to data which is metric, and also has some particular distribution, most commonly the Normal distribution. A non-parametric procedure does not make these distributional requirements. So if you are analysing data that is either metric but not Normal, or is ordinal, then you need to use a non-parametric approach. The Mann–Whitney procedure only requires that the two population distributions have the same approximate shape, but does not require either to be Normal. It is the non-parametric equivalent of the two-sample t test.

Briefly, the Mann–Whitney method starts by combining the data from both groups, which are then ranked. The rank values for each group are then separated and summed. If the medians of the two groups are the same, then the sums of the ranks of the two groups should be similar. However, if the rank sums are different, you need to know whether this difference could simply be due to chance, or is because there really is a statistically significant difference in the population medians. A Mann–Whitney confidence interval for the difference will help you decide between these alternatives.

6 The 95 per cent confidence interval is (−0.088 to 0.355). Since this interval includes 0, we conclude that there is no statistically significant difference between the proportions of mothers who smoked among those giving birth at home and those giving birth in the maternity unit.

Mann-Whitney Test and CI: Apgar matn, Apgar home

Apgar ma   N = 30   Median = 7.000
Apgar ho   N = 30   Median = 8.000

Point estimate for ETA1-ETA2 is -1.000
95.2 Percent CI for ETA1-ETA2 is (-2.000,0.000)   [Confidence interval for the difference in the two medians.]
W = 790.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0668
The test is significant at 0.0616 (adjusted for ties)

Cannot reject at alpha = 0.05

Figure 10.3 Minitab’s Mann–Whitney output for a 95 per cent confidence interval for the difference between two independent median Apgar scores – for infants born in maternity units and at home (raw data in Table 10.1). Note that Minitab uses Greek ‘ETA’ to denote the population median
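The ranking steps of the Mann–Whitney method described above (combine, rank, separate, sum) can be sketched in Python as follows. The scores are made-up Apgar-like values, not the data in Table 10.1, and the function names are hypothetical. Tied values are given the average of the ranks they share, which is the usual convention. The last function computes the median of all pairwise differences between the groups, one commonly used point estimate of the shift between two distributions; it is included as an assumption about how such an estimate might be obtained, not as a description of any particular package.

```python
from statistics import median


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sums(group_a, group_b):
    """Combine both groups, rank them together, then sum ranks per group."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]), sum(ranks[n_a:])


def median_pairwise_difference(group_a, group_b):
    """Median of every (a - b) difference between the two groups."""
    return median(a - b for a in group_a for b in group_b)


# Hypothetical Apgar-like scores for two independent groups of five infants.
unit = [6, 7, 7, 8, 5]
home = [8, 9, 7, 8, 10]

w_unit, w_home = rank_sums(unit, home)
print("rank sums:", w_unit, w_home)  # prints: rank sums: 18.0 37.0
print("median pairwise difference:", median_pairwise_difference(unit, home))
```

The two rank sums always add up to n(n + 1)/2 for the combined sample (here 10 × 11/2 = 55), which is a handy check on the ranking. Whether a gap such as 18 against 37 is larger than chance would produce is the question the confidence interval answers; this sketch does not calculate that interval.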

As an illustration, let’s compare the difference in the population median Apgar scores for the maternity unit and home birth infants, using the sample data in Table 10.1. These are independent groups, but since this data is ordinal, we cannot use the two-sample t test, but we can use the Mann–Whitney test of medians. The output from Minitab is shown in Figure 10.3, with the 95 per cent confidence interval in the fourth row.7 Since the confidence interval of (−2 to 0) contains zero, you must conclude that the difference in the population median Apgar scores is not statistically significant. Notice that the confidence level is given as 95.2 per cent, not 95 per cent. Confidence intervals for medians cannot always achieve the precise confidence level you asked for, because of the way in which a median is calculated.

An example from practice

Table 10.4 is from a randomised controlled double-blind trial to compare the cost effectiveness of two treatments in relieving pain after blunt instrument injury in an A&E department (Rainer et al. 2000). It shows the median times spent by two groups of patients in various clinical situations. One group received ketorolac, the other group morphine. The penultimate column contains the 95 per cent confidence intervals for the difference in various median treatment times (minutes), between the groups (ignore the last column). As the footnote to the table indicates, these results were obtained using the Mann–Whitney method.

The only confidence interval not containing zero is that for the difference in median ‘time between receiving analgesia and leaving A&E’, for which the difference in the sample medians is 20.0 minutes. So this is the only treatment time for which the difference in population median

7 As far as I am aware, SPSS does not appear to calculate a confidence interval for two independent medians.

Table 10.4 Mann–Whitney confidence intervals for the difference between two independent groups of patients in their median times spent in several clinical situations. One group received ketorolac, the other morphine. Values are median number (interquartile range) of minutes relating to participants’ treatment. Reproduced from BMJ, 321, 1247–51, courtesy of BMJ Publishing Group

| Variable | Ketorolac group (n = 75) | Morphine group (n = 73) | Median difference (95% confidence interval) | P value* |
|---|---|---|---|---|
| Interval between arrival in emergency department and doctor prescribing analgesia | 38.0 (30.0 to 54.0) | 39.0 (29.0 to 53.0) | 1.0 (−5.0 to 7.0) | 0.72 |
| Preparation for analgesia | 5.0 (5.0 to 10.0) | 10.0 (5.5 to 12.5) | 2.0 (0 to 5.0) | 0.0002 |
| Undergoing radiography | 5.0 (5.0 to 10.0) | 5.0 (4.0 to 10.0) | 0 (−1.0 to 0) | 0.75 |
| Total time spent in emergency department | 155.0 (112.0 to 198.0) | 171.0 (126.0 to 208.5) | 15.0 (−4.0 to 33.0) | 0.11 |
| Interval between receiving analgesia and leaving emergency department | 115.0 (75.0 to 149.0) | 130.0 (95.0 to 170.0) | 20.0 (4.0 to 39.0) | 0.02 |

*Mann–Whitney U test.

Table 10.5 Confidence interval estimates from the Wilcoxon signed-ranks method for the difference in population food intakes per day, for a number of substances, from a study of the dietary habits of schizophrenics. Values are median (range). Reproduced from BMJ, 317, 784–5, courtesy of BMJ Publishing Group

| Intake/day | Men: patients (n = 17) | Men: controls (n = 17) | Women: patients (n = 13) | Women: controls (n = 13) | All: patients (n = 30) | All: controls (n = 30) | Wilcoxon signed ranks test: median difference (95% CI) | P |
|---|---|---|---|---|---|---|---|---|
| Energy (MJ) | 11.84 (7.67–17.93) | 14.19 (6.94–23.22) | 8.87 (5.07–13.02) | 9.99 (5.25–16.25) | 9.71 (5.07–17.94) | 11.98 (5.25–23.22) | 2.06 (0.26–4.23) | 0.04 |
| Protein (g) | 92.5 (65.1–157.4) | 114.2 (74–633) | 68.7 (38.4–104.2) | 82.5 (40.5–142.7) | 84.5 (38.4–157.4) | 96.0 (40.5 to 633.0) | 15.9 (1.1 to 32.8) | 0.07 |
| Total fibre (g) | 13.0 (8.5–20.8) | 22.0 (8.7–86.2) | 10.7 (7.3–18.0) | 15.5 (10.7–22.9) | 12.6 (7.3–20.8) | 18.9 (8.7–86.2) | 7.0 (3.6 to 10.6) | 0.0001 |
| Retinol (μg) | 647 (294–1498) | 817 (134–12341) | 533 (288–7556) | 817 (201–11585) | 590 (288–7556) | 817 (134–12341) | 310 (93 to 1269) | 0.02 |
| Carotene (μg) | 783 (219–3638) | 2510 (523–11313) | 2048 (550–4657) | 3079 (956–6188) | 1443 (219–4657) | 2798 (523–11313) | 1376 (549 to 2452) | 0.004 |
| Vitamin C (mg) | 41.0 (4.0–204) | 81.0 (14.0–262) | 40.0 (3–165) | 61.0 (27.0–291.0) | 40.5 (3.0–204) | 80.5 (14.0–219) | 33.5 (2.0 to 64.0) | 0.03 |
| Vitamin E (mg) | 4.8 (3.4–18.0) | 10.26 (2.23–32.0) | 4.5 (2.3–6.0) | 5.38 (3.6–14.7) | 4.7 (2.3–18.0) | 7.8 (2.2–32.0) | 2.9 (1.45 to 5.35) | 0.0002 |
| Alcohol (g) | 3.8 (0–19.4) | 11.7 (0–80) | 0 (0–5.6) | 1.8 (0–12) | 0 (0–19.4) | 5.7 (0–80) | 5.4 (1.2 to 9.9) | 0.009 |