Source file: Medical Statistics from Scratch_Bowers_2008.pdf


146 CH 12 TESTING HYPOTHESES ABOUT THE DIFFERENCE BETWEEN TWO POPULATION PARAMETERS

Where μ_M = the population mean birthweight of maternity-unit-born infants, and μ_H = the population mean birthweight of home-born infants.⁴

With SPSS

Look back at Figure 10.1, which shows the output from SPSS. In addition to the 95 per cent confidence interval, it gives the result of the two-sample t test of the equality of the two population mean birthweights. The test results are given in columns five, six and seven. The column headed ‘Sig. (2-tailed)’ gives the p-value of 0.407. Since this is not less than 0.05, you cannot reject the null hypothesis. You thus conclude that there is no evidence of a difference between the two population mean birthweights.

With Minitab

The Minitab output in Figure 10.2 gives the same p-value as SPSS (0.407), confirming that the two population means are not significantly different.

Some examples of hypothesis tests from practice

Two independent means – the two-sample t test

Table 12.2 shows the baseline characteristics of two independent groups in a randomised controlled trial to compare conventional blood pressure measurement (CBP) and ambulatory blood pressure measurement (ABP) in the treatment of hypertension (Staessen et al. 1997). p-values for the differences in the basic characteristics of the two groups are shown in the last column.

The authors used a variety of tests to assess the difference between several parameters for these independent groups (although these are referred to in the text, this information should have been available somewhere in the table itself). To assess the difference in population mean age, and mean body mass index, they used a two-sample t test. For age, the p-value is 0.03, so you can reject the null hypothesis of equal mean ages and conclude that the difference is statistically significant. The p-value for the difference in mean body mass index is 0.39, so you cannot reject the null hypothesis: there is no evidence that mean body mass index differs between the two populations.
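As a rough illustration of how such a p-value arises, the sketch below recomputes a pooled two-sample test from the rounded means, standard deviations and group sizes printed in Table 12.2. Two things here are assumptions rather than the authors' method: the function name is hypothetical, and the p-value is taken from the standard Normal distribution instead of Student's t, which is only a reasonable shortcut because both groups have more than 200 subjects.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_test_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled two-sample test computed from summary statistics.

    Returns (t statistic, approximate two-sided p-value). The p-value uses the
    standard Normal distribution in place of Student's t (large-sample
    approximation, an assumption of this sketch).
    """
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    # Standard error of the difference between the two sample means.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Rounded values as printed in Table 12.2 (CBP group first, ABP group second).
t_age, p_age = two_sample_test_from_summary(51.3, 11.9, 206, 53.8, 10.8, 213)
t_bmi, p_bmi = two_sample_test_from_summary(28.5, 4.8, 206, 28.2, 4.4, 213)

print(f"age: t = {t_age:.2f}, p = {p_age:.3f}")
print(f"BMI: t = {t_bmi:.2f}, p = {p_bmi:.3f}")
```

Because the inputs are rounded and the Normal approximation is used, the results need not match the published values of .03 and .39 exactly, but they fall on the same side of 0.05: below it for age and well above it for body mass index.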

Exercise 12.2 Comment on what the results in Table 12.2 indicate about the difference between the two populations in terms of their mean serum creatinine and serum total cholesterol levels.

Exercise 12.3 Refer back to Table 1.6, showing the basic characteristics of women in the breast cancer and stressful life events case-control study. Comment on what the p-values tell you about the equality or otherwise, between cases and controls, of the means of the seven metric variables (shown with an * – see table footnote).

4 Note that differences in independent percentages can also be tested with the two-sample t test.


Table 12.2 Baseline characteristics of two independent groups, from a randomised controlled trial to compare conventional blood pressure measurement (CBP) and ambulatory blood pressure measurement (ABP) in the treatment of hypertension. Reproduced from JAMA, 278, 1065–72, courtesy of the American Medical Association

Characteristics	CBP Group (n = 206)	ABP Group (n = 213)	P
Age, mean (SD), y	51.3 (11.9)	53.8 (10.8)	.03
Body mass index, mean (SD), kg/m²	28.5 (4.8)	28.2 (4.4)	.39
Women, No. (%)	102 (49.5)	124 (58.2)	.07
Receiving oral contraceptives, No. (%)	14 (13.7)	10 (8.1)	.17
Receiving hormonal substitution, No. (%)	19 (18.6)	19 (15.3)	.51
Previous antihypertensive treatment, No. (%)†	134 (65.0)	139 (65.3)	.95
Diuretics, No. (%)	47 (35.1)	59 (42.4)	.26
β-Blockers, No. (%)	65 (48.5)	80 (57.6)	.17
Calcium channel blockers, No. (%)	45 (33.6)	38 (27.3)	.32
Angiotensin-converting enzyme inhibitors, No. (%)	50 (37.3)	48 (34.5)	.72
Multiple-drug treatment, No. (%)	62 (46.3)	65 (46.8)	.97
Smokers, No. (%)	42 (20.5)	35 (16.4)	.29
Alcohol use, No. (%)	115 (55.8)	102 (47.9)	.10
Serum creatinine, mean (SD), μmol/L‡	85.75 (15.91)	88.4 (16.80)	.25
Serum total cholesterol, mean (SD), mmol/L‡	6.00 (1.03)	6.10 (1.19)	.32

Percentages and values of P computed considering only women receiving antihypertensive drug treatment before their enrollment.

†Defined as antihypertensive drug treatment within 6 months before the screening visit.

‡Divide creatinine by 88.4 and cholesterol by 0.02586 to convert to milligrams per deciliter.
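The percentage rows of Table 12.2 can be checked in a similar spirit. The sketch below applies a two-proportion z test with a pooled standard error to the ‘Women’ row. This is an assumed stand-in: the test the authors actually used for the percentages is not stated in this section, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test for the difference between two independent proportions.

    x1, x2 are the counts with the characteristic; n1, n2 are the group sizes.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Under the null hypothesis both groups share one proportion, so pool them.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 'Women' row of Table 12.2: 102 of 206 (CBP) versus 124 of 213 (ABP).
z_women, p_women = two_proportion_z_test(102, 206, 124, 213)
print(f"women: z = {z_women:.2f}, p = {p_women:.2f}")
```

For this row the sketch gives a p-value of about 0.07, in line with the value printed in the table, so the null hypothesis of equal proportions of women cannot be rejected at the 0.05 level.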

Two matched means – the matched-pairs t test

Table 10.3 provides an example from practice, and shows the p-values for the differences in population mean bone mineral densities between two individually matched groups of depressed and normal women (which we have already discussed in confidence interval terms). As you can see, only at the radius is there no statistically significant difference in population mean bone mineral density, indicated by a p-value of 0.25. All the other p-values are less than 0.05. Notice that this confirms the confidence interval results.⁵
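The matched-pairs t test works on the within-pair differences rather than on the two groups separately. A minimal sketch follows, using made-up bone mineral density values (they are not the data behind Table 10.3) and a hypothetical function name. It returns the t statistic and its degrees of freedom, which you would then compare with a t table or pass to statistical software to obtain the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Matched-pairs t statistic for two equally long lists of paired values.

    Returns (t statistic, degrees of freedom).
    """
    if len(first) != len(second):
        raise ValueError("matched samples must have the same length")
    # One difference per matched pair.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical bone mineral densities (g/cm2) for four matched pairs.
depressed = [1.00, 0.95, 1.10, 0.90]
normal = [1.05, 1.02, 1.12, 1.00]

t_stat, df = paired_t_statistic(depressed, normal)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```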

Two independent medians – the Mann-Whitney test

With two independent groups, and when the data is ordinal or skewed metric, the median is the preferred measure of location. In these circumstances, the Mann-Whitney test can be used to test the null hypothesis that the two population medians are the same.
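To make the procedure less of a black box, here is a minimal sketch of the Mann-Whitney U statistic computed by direct pairwise comparison, with ties counted as one half. The scores are hypothetical and the function name is an assumption. The sketch stops at the statistic itself and does not compute a p-value.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (a, b), with a from x and b from y, in which a exceeds
    b; tied pairs count one half. U_x + U_y always equals len(x) * len(y).
    """
    u_x = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical Apgar-like scores for two independent groups.
group_a = [7, 8, 8, 9, 10]
group_b = [5, 6, 7, 8, 9]

u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

The smaller of the two U values is the one usually reported. A U close to zero means that almost every value in one group lies below almost every value in the other.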

Recall that in Chapter 10, I introduced the Mann-Whitney procedure to calculate confidence intervals for the difference between two independent population median treatment times. These were from a study of the use of ketorolac versus morphine to treat limb injury pain. Table 10.4 contains both 95 per cent confidence intervals and p-values from this study. Only one confidence interval does not include zero, that for the time between receiving analgesia and leaving A&E (4.0 to 39.0). This outcome has a p-value of 0.02, less than 0.05, which confirms that the difference between the two population median times is statistically significant.

5 Note that differences in matched percentages can also be tested with the matched-pairs t test.

However, there is a problem with the time for preparation of the analgesia. Table 10.4 shows this has a 95 per cent confidence interval of (0 to 5.0), which includes zero, implying no significant difference in treatment times. But the p-value is given as 0.0002, which suggests a highly significant difference in population medians. In the accompanying text the authors indicate that this difference is significant and quote the low p-value, so I can only assume a typographical error in the confidence interval.
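This kind of cross-check between a confidence interval and a p-value is easy to mechanise. The sketch below uses hypothetical helper names and assumes a 95 per cent interval for a difference paired with a two-sided test at the 0.05 level. It flags any result where the two disagree about significance.

```python
def ci_excludes_zero(lower, upper):
    """True if a confidence interval for a difference lies wholly on one side of zero."""
    return lower > 0 or upper < 0


def ci_and_p_agree(lower, upper, p_value, alpha=0.05):
    """True if the interval and the p-value point to the same conclusion.

    Assumes the interval has confidence level 1 - alpha and the test is two-sided.
    """
    return ci_excludes_zero(lower, upper) == (p_value < alpha)


# The two results from Table 10.4 discussed in the text.
leaving_ae = ci_and_p_agree(4.0, 39.0, 0.02)     # interval excludes zero, p < 0.05
preparation = ci_and_p_agree(0.0, 5.0, 0.0002)   # interval includes zero, p < 0.05

print(leaving_ae, preparation)
```

The first result is consistent and the second is not, which is the discrepancy noted above. A flag from a check like this does not say which of the two figures is wrong, and intervals and p-values computed by different methods can legitimately disagree in borderline cases.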

Interpreting computer output for the Mann-Whitney test

In view of the widespread use of the Mann-Whitney test you might find it helpful to see the output for this procedure from both SPSS and Minitab.

With SPSS

With the Apgar scores in Table 10.1, you can use the Mann-Whitney test to check if the population median Apgar scores for infants born in a maternity unit and those born at home are the same and differ in the sample only by chance. The null hypothesis is that these medians are equal. The output from SPSS is shown in Figure 12.1. The p-value of 0.061 is labelled ‘Asymp. Sig. (2-tailed)’. Since this is not less than 0.05 you cannot reject the null hypothesis of no difference in population median Apgar scores between the two groups.

Test Statistics
	APGARALL
Mann-Whitney U	325.500
Wilcoxon W	790.500
Z	–1.876
Asymp. Sig. (2-tailed)	.061	(the p-value)

Figure 12.1 Output from SPSS for the Mann-Whitney test of the difference between population medians of the two independent Apgar scores (raw data in Table 10.1)
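The ‘Asymp. Sig. (2-tailed)’ entry in Figure 12.1 can be reproduced from the Z value on the line above it, on the assumption that ‘asymptotic’ here means the standard Normal approximation. The function name in this sketch is hypothetical.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard Normal test statistic z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Z value taken from the SPSS output in Figure 12.1.
p_apgar = two_sided_p_from_z(-1.876)
print(round(p_apgar, 3))
```

Rounded to three decimal places this gives 0.061, matching the SPSS output.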

With Minitab

If you refer back to Figure 10.3, you will see the results of Minitab’s Mann-Whitney test three rows from the bottom.⁶ The p-value is given in the second row up as 0.0616 and since this is not less than 0.05 you cannot reject the null hypothesis. This is confirmed in the bottom row of the table, and enables you to conclude that there is no evidence of a difference in the population median Apgar scores between the two groups of infants.

6 ‘ETA’ is Minitab’s word for the population median.

Two matched medians – the Wilcoxon test

In the same circumstances as for the Mann-Whitney test described above, but with matched populations, the Wilcoxon test is appropriate. Look back at Table 10.5, which was from a matched case-control study into the dietary intake of schizophrenic patients living in the community in Scotland. Here the authors have used the Wilcoxon matched-pairs test to test for differences in the population median daily intakes of a number of substances between ‘All Patients’ and ‘All Controls’. The p-values are in the column headed ‘P’. As you can see, the only p-value not less than 0.05 is that for protein (p-value = 0.07), so this is the only substance for which there is no statistically significant difference in median daily intake between the two populations. Once again this confirms the confidence interval results.

Confidence intervals versus hypothesis testing

I said at the beginning of this chapter that where possible, confidence intervals are preferred to hypothesis tests because the confidence intervals are more informative. How so? Have another look at Table 10.4, from the study comparing ketorolac and morphine for limb injury pain. The authors give both 95 per cent confidence intervals and p-values for differences in a number of different treatment times, between two groups of limb injury patients. Let’s take the last of these. For the ‘interval between receiving analgesia and leaving A&E’, the p-value of 0.02 enables you to reject the null hypothesis and conclude that the difference between the two population median treatment times is statistically significant.

The 95 per cent confidence interval of (4.0 to 39.0) minutes tells us not only that the difference between the population medians is statistically significant, because the confidence interval does not contain zero, but also that the value of this difference in population medians is likely to be somewhere between 4.0 minutes and 39.0 minutes. So the confidence interval does everything that the hypothesis test does (it tells us whether the difference between the medians is statistically significant), but it also gives us extra information on the likely range of values for this difference. Moreover, unlike a p-value, the confidence interval is in clinically meaningful units, which helps with the interpretation. So whenever possible, it is good practice to use confidence intervals in preference to p-values.

Nobody’s perfect – types of error

Suppose you are investigating a new drug for the treatment of hypertension. Your null hypothesis is that the drug has no effect. Let’s suppose that the drug does actually reduce mean systolic blood pressure, but, on average, by only 5 mmHg. However, the hypothesis test you use can only detect a change of 10 mmHg or more. As a consequence, you will not find strong enough evidence to reject the null hypothesis, and you’ll conclude, mistakenly, that the new drug is not effective. But the effect is there; it’s just that your test does not have enough power to detect it.

There are three questions here. First, what exactly is the power of a test and how can we measure it? Second, how can we increase the power of the test we are using? Third, is there a more powerful test that we can use instead? Before I address these questions, a few words on types of error.

Whenever you decide either to reject or not reject a null hypothesis, you could be making a mistake. After all, you are basing your decision on sample evidence. Even if you have done everything right, your sample could still, by chance, not be very representative of the population. Moreover, your test might not be powerful enough to detect an effect if there is one. There are two possible errors:

Type I error: Rejecting a null hypothesis when it is true. Also known as a false positive. In other words, concluding there is an effect when there isn’t. The probability of committing a type I error is denoted α (alpha), and is the same alpha as the significance level of a test.

Type II error: Not rejecting a null hypothesis when it is false. Also known as a false negative. That is, concluding there is no effect when there is. The probability of committing a type II error is denoted β (beta).

Ideally, you would like a test procedure which minimised the probability of a type I error, because in many clinical situations such an error is potentially serious: judging some procedure to be effective when it is not. When you set the significance level of a test to α = 0.05, it’s because you want the probability of a type I error to be no more than 0.05. Nonetheless, if there is a real effect you would certainly like to detect it, so you also want to minimise β, the probability of a type II error, or put another way, you want to make (1 − β) as large as possible.
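The blood pressure example can be given rough numbers. The sketch below computes the probability of rejecting the null hypothesis, which is (1 − β) when a real effect exists, for a two-sided, two-sample z test. Everything specific in it is an assumption made for illustration: a known common standard deviation of 15 mmHg, 36 patients per group, the use of a z test, and the function name.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(true_diff, sd, n_per_group, alpha=0.05):
    """Probability that a two-sided two-sample z test rejects the null hypothesis.

    true_diff is the real difference between the population means, sd the
    (assumed known) common standard deviation, n_per_group the size of each group.
    """
    norm = NormalDist()
    # Standard error of the difference between two sample means.
    se = sd * sqrt(2 / n_per_group)
    # Critical value for a two-sided test at significance level alpha.
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = true_diff / se
    # Chance that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


for diff in (0, 5, 10):
    prob = power_two_sample_z(diff, sd=15, n_per_group=36)
    print(f"true difference {diff} mmHg: P(reject H0) = {prob:.2f}")
```

With no true difference the rejection probability equals α, the type I error rate. Under these assumed figures a true reduction of 5 mmHg is detected only about three times in ten, so β is roughly 0.7, whereas a reduction of 10 mmHg is detected about eight times in ten.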

Exercise 12.4 Explain, with examples, what is meant in hypothesis testing by: (a) a false positive; (b) a false negative.
