Добавил:
kiopkiopkiop18@yandex.ru t.me/Prokururor I Вовсе не секретарь, но почту проверяю Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Скачиваний:
0
Добавлен:
28.03.2026
Размер:
4.18 Mб
Скачать

10

Estimating the difference between two population parameters

Learning objectives

When you have finished this chapter you should be able to:

Give some examples of situations where there is a need to estimate the difference between two population parameters.

Very briefly outline the basis of estimation of the difference between two population means using methods based on the two-sample t test1 (for independent populations) and the matched-pairs t test (for matched populations).

Very briefly outline the basis of estimation of the difference between two population medians using methods based on the Mann-Whitney test (for independent populations) and the Wilcoxon test (for matched populations).

Interpret results from studies that estimate the difference between two population means, two percentages or two medians.

Demonstrate an awareness of any assumptions that must be satisfied when estimating the difference between two population parameters.

1Throughout this chapter we will be looking at methods of estimation based on various hypothesis tests. I will begin to discuss hypothesis tests properly in Chapter 12.

Medical Statistics from Scratch, Second Edition David Bowers

C 2008 John Wiley & Sons, Ltd

120

CH 10 ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION PARAMETERS

What’s the difference?

As you have just seen, it’s possible to determine a confidence interval for any single population parameter – a population mean, a median, a percentage and so on. However, by far the most common application of confidence intervals is the comparison of two population parameters, for example between the means of two populations, such as the mean age of a population of women and the mean age of a population of men; I’ll start with this.

Estimating the difference between the means of two independent populations – using a method based on the two-sample t test

The procedure here, like that for the single mean (see Chapter 9), is based on the t distribution (see the footnote on p. 114). However, with two populations, you need to know if they are independent or matched (see p. 81 to review matching). I’ll start with estimating the difference in the means of two independent populations, since this is by far the most common in practice. For this we use a method based on the two-sample t test. First, there are a number of prerequisites that need to be met:

Data for both groups must be metric. As you know from Chapter 5 the mean is only appropriate with metric data anyway.

The distribution of the relevant variable in each population must be reasonably Normal. You can check this assumption from the sample data using a histogram, although with small sample sizes this can be difficult.

The population standard deviations of the two variables concerned should be approximately the same, but this requirement becomes less important as sample sizes get larger. You can check this by examining the two sample standard deviations.2

An example using birthweights

Suppose you want to compare (by estimating the difference between them), the population mean birthweights of infants born in a maternity unit with that of infants born at home (sample data in Table 10.1). The two samples were selected independently with no attempt at matching.

Both SPSS and Minitab compute the sample mean birthweight of the home-born infants to be 3726.5 g, with a standard deviation of 385.7 g. Recall that for the infants born in the maternity units, sample mean birthweight was 3644.4 g with a standard deviation of 376.8 g (see p. 112). So there is a difference in the sample mean birthweights of 82.1 g, (3726.5 g

– 3644.4 g), but this does not mean that there is a difference in the population mean birthweights.

2This condition is usually stated in terms of the two variances being approximately the same. Variance is standard deviation squared.

ESTIMATING THE DIFFERENCE BETWEEN THE MEANS OF TWO INDEPENDENT POPULATIONS

121

Table 10.1 Sample data for birthweight (g), Apgar scores and whether mother smoked during pregnancy for 30 infants born in a maternity unit and 30 born at home

 

Birthweight (g)

 

Mother smoked

Apgar score

 

 

 

 

 

 

 

 

 

Infant

Hospital birtha

Home birth

 

Hospital birth

Home birth

Hospital birth

Home birth

 

 

 

 

 

 

 

 

1

3710

3810

0

0

 

8

10

2

3650

3865

0

0

 

7

8

3

4490

4578

0

0

 

8

9

4

3421

3522

1

0

 

6

6

5

3399

3400

0

1

 

6

7

6

4094

4156

0

0

 

9

10

7

4006

4200

0

0

 

8

9

8

3287

3265

1

0

 

5

6

9

3594

3599

0

1

 

7

8

10

4206

4215

0

0

 

9

10

11

3508

3697

0

0

 

7

8

12

4010

4209

0

0

 

8

9

13

3896

3911

0

0

 

8

8

14

3800

3943

0

0

 

8

9

15

2860

3000

0

1

 

4

3

16

3798

3802

0

0

 

8

9

17

3666

3654

0

0

 

7

8

18

4200

4295

1

0

 

9

10

19

3615

3732

0

0

 

7

8

20

3193

3098

1

1

 

4

5

21

2994

3105

1

1

 

5

5

22

3266

3455

1

0

 

5

6

23

3400

3507

0

0

 

6

7

24

4090

4103

0

0

 

8

9

25

3303

3456

1

0

 

6

7

26

3447

3538

1

0

 

6

7

27

3388

3400

1

1

 

6

7

28

3613

3715

0

0

 

7

7

29

3541

3566

0

0

 

7

8

30

3886

4000

1

0

 

8

6

a This is the data from Table 2.5.

It is important to remember that a difference between two sample values does not necessarily mean that there is a difference in the two population values. Any difference in these sample birthweight means might simply be due to chance. Now we come to an important point:

If the 95 per cent confidence interval for the difference between two population parameters includes zero, then you can be 95 per cent confident that there is no difference in the two parameter values. If the interval doesn’t contain zero, then you can be 95 per cent confident that there is a statistically significant difference in the means.

122

CH 10 ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION PARAMETERS

The p value = 0.407 (ignore for now).

Independent samples test

Levene's test for equality of variances

F Sig.

 

 

 

The 95 % CI for the

 

The difference

difference in the

 

between the two

two population

 

sample means =

means.

 

82.1667.

 

 

 

t test for

 

 

 

equality of

 

 

 

means

 

 

95 % Confidence

 

Sig.

 

 

 

Mean

Std. error

interval of the

t

(2-

difference

Difference

difference

 

tailed)

 

 

 

 

 

Lower

Upper

Equal

 

 

 

 

 

 

variances

.037 .847 .835

.407

–82.1667

98.4359

–114.8742

279.2076

assumed

 

 

 

 

 

 

Equal

 

 

 

 

 

 

variances

.835

.407

82.1667

98.4359

–114.8765

279.2099

not

 

 

 

 

 

 

assumed

 

 

 

 

 

 

Figure 10.1 SPSS output (abridged) for 95 per cent confidence interval (last two columns) for the difference between two independent population mean birthweights, using samples of 30 infants born in maternity units and 30 at home (data in Table 10.1)

In other words, if you want to know if there is a statistically significant difference between two population means, calculate the 95 per cent confidence interval for the difference and see if it contains zero.

It is possible to calculate these confidence intervals by hand, but the process is timeconsuming and tedious. Fortunately, most statistics programs will do it for you. Since difference between independent population means is one of the most commonly used approaches in clinical research, you might find it helpful to see some of the output from SPSS and Minitab for this procedure.

With SPSS

Using the birthweight data in Table 10.1, SPSS produces the results (abridged3) shown in Figure 10.1. These tell us that the difference in the two sample mean birthweights is 82.17 g. The sign in front of this value depends on which variable you select first in the SPSS dialogue box. SPSS subtracts the second variable selected (home births in this case) from the first (maternity unit births). This result means that the sample mean birthweight was 82.17 g higher in the home birth infants.

SPSS calculates two confidence intervals, one with standard deviations4 assumed to be equal, and one with them not equal. The 95 per cent confidence interval shown in the last two columns

3I’ve removed material that is not relevant.

4Both Minitab and SPSS refer to equality of variances.

ESTIMATING THE DIFFERENCE BETWEEN THE MEANS OF TWO INDEPENDENT POPULATIONS

123

is (114.9 to 279.2)g, the same in both cases. SPSS tests for equality of the standard deviations (or variances), using Levene’s test. The assumption is that they are the same. We will discuss tests in Chapter 12.

Since this confidence interval includes zero, you can conclude that there is no statistically significant difference in population mean birthweights of infants born in a maternity unit and infants born at home.

With Minitab The Minitab output, which confirms that from SPSS, is shown in Figure 10.2. The 95 per cent confidence interval is in the second row up.

An example from practice

Table 10.2 is from a cohort study of maternal smoking during pregnancy and infant growth after birth (Conter et al 1995). The subjects were 12 987 babies who were followed up for three years after birth. Of these, 10 238 had non-smoking mothers, 2276 had mothers who had

Two-Sample T-Test and CI: Weight hosp (g), Weight home The difference

 

 

 

 

 

between the

Two-sample T for Weight hosp (g) vs Weight home

two sample

means = 82.2.

 

 

 

 

 

 

N

Mean

StDev

SE Mean

 

Weight hosp

30

3644

377

69

 

Weight home

30

3727

386

70

 

Difference = mu Weight hosp (g) –mu Weight home Estimate for difference: –82.2

95% CI for difference: (–279.3, 114.9)

T-Test of difference = 0 (vs not =): T-Value = –0.83 P-Value = 0.407 DF = 57

The 95% CI for the difference in the two population means.

The p-value = 0.407 (ignore for now).

Figure 10.2 Minitab output for 95 per cent confidence interval for the difference between two independent population mean birthweights, using samples of 30 infants born in maternity units and 30 at home. Note that Minitab uses the word ‘mu’ to denote the population mean, normally designated as Greek μ

Table 10.2 95 per cent confidence intervals for difference in weights according to sex and smoking habits of mothers between independent groups of babies. Reproduced from BMJ, 310, 768–71, courtesy of BMJ Publishing Group

 

 

 

At birth

 

 

 

 

At 3 months

 

 

At 6 monts

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Mother’s

 

No.

 

95%CI for

 

No.

 

95%CI for

 

No.

 

95%CI for

smoking habit

children

Weight (g)

difference

children

Weight (g)

difference

children

Weight (g)

difference

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Girls

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Non-smokers

4904

3220

(121 to 55)

4904

5584

(77 to 9)

4895

7462

(47 to 65)

1–9 cigs per day

1072

3132

1071

5550

1072

7471

10 per day

228

3052

(234 to –102)

228

5519

(152 to 22)

227

7434

(141 to 85)

Boys

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Non-smokers

5334

3373

(139 to 75)

5332

6026

(113 to 23)

5330

8038

(118 to 10)

1–9 cigs per day

1204

3266

1204

5958

1204

7974

10 cigs per day

245

3126

(312 to 181)

245

5907

(212 to 26)

245

8014

(136 to 88)