have been promptly shown with the median test, again avoiding the more cumbersome Wilcoxon-Mann-Whitney U test.
References
1. Daniel, 1984; 2. Wilcoxon, 1945; 3. Clarke, 1964; 4. Aitken, 1969; 5. Huskisson, 1974; 6. Mann, 1947; 7. Bross, 1958; 8. Fleiss, 1981; 9. Wynder, 1960; 10. Kantor, 1966; 11. Berkman, 1971; 12. Brown, 1975; 13. Spitzer, 1965; 14. Mantel, 1979; 15. Moses, 1984; 16. Bradley, 1968; 17. Siegel, 1988; 18. Sprent, 1993; 19. Campbell, 1988; 20. Gonzalez, 1991; 21. Krieger, 1995; 22. Trentham, 1993; 23. Allman, 1995; 24. Leibovici, 1995; 25. Weiss, 1988; 26. Klinkenberg-Knol, 1994; 27. Mulvihill, 1987; 28. Avvisati, 1989; 29. Fisher, 1989; 30. Colquhoun, 1990; 31. Armitage, 1982; 32. Keelan, 1965.
Exercises
15.1. In the clinical trial discussed at the end of Section 15.10, the investigators 29 said they were eager to avoid “flaws in the design” of previous trials of homeopathic treatment. They therefore “designed a trial to clarify these results by overcoming the methodological criticisms while retaining a rigorous design.” For example, beyond the features described in Section 15.10, the investigators used the following methods: “After entry, there was no further contact between the homeopathic doctor and the patient until the treatment was finished. The clinical metrologist dispensed the treatment and performed the assessments and analyses blind.” What aspect of this trial would make you clinically believe that the trial was not well designed, and that a “treatment period interaction” was probably inevitable?
15.2. For 10 diabetic adults treated with a special diet, the fasting blood sugar values (in mg./dl.) before and after treatment were as follows:
Person | Before | After
A | 340 | 290
B | 335 | 315
C | 220 | 250
D | 285 | 280
E | 320 | 311
F | 230 | 213
G | 190 | 200
H | 210 | 208
I | 295 | 279
J | 270 | 258
To test whether a significant change occurred after treatment, perform a stochastic contrast in these data using a paired t-test and the signed-rank test. Are the stochastic results similar?
15.3. The data in Exercise 11.1 received a Pitman-Welch test in Section 12.9.5, and a t-test in Exercise 13.1. This chapter gives you the tools with which you can again demonstrate the occasional “inefficiency” of the t test by checking the same data set in three new ways:
15.3.1. With a Wilcoxon-Mann-Whitney U test;
15.3.2. With a sign test; and
15.3.3. With a median test.
Do those tests, showing your intermediate calculations and results. (Because the sign test requires a single group of data or paired data, you will need some moderate ingenuity to apply it here.)
© 2002 by Chapman & Hall/CRC
15.4. Random samples of students at two different medical schools were invited to rate their course in Epidemiology using four categories of rating: poor, fair, good, or excellent. At School A, the course is taught in the traditional manner, with a strong orientation toward classical public health concepts and strategies. At School B, the course has been revised, so that it has a strong emphasis on clinical epidemiology. The student ratings at the two schools were as follows:
School | Poor | Fair | Good | Excellent | Total
School A | 5 | 18 | 12 | 3 | 38
School B | 1 | 3 | 20 | 12 | 36
A medical educator wants to show that this distinction is quantitatively and stochastically significant.
15.4.1. How would you express the results for quantitative significance?
15.4.2. How would you test for stochastic significance?
15.4.3. If you did a non-parametric rank test in 15.4.2, what is another simpler way to do the contrast?
15.4.4. From what you noted in the quantitative comparison, can you think of a simple “in-the-head” test for demonstrating stochastic significance for these data?
15.5. The table below is taken from the first report 32 of the efficacy of propranolol in the treatment of angina pectoris. In a randomized, double-blind, crossover design, the 19 patients who completed the trial received either propranolol or placebo for four weeks, followed by the other agent for the next four weeks. Each of the three variables in the table below had values of P < .05 for the superiority of propranolol over placebo. Looking only at the data for Severity of Angina, what mental test can you do to persuade yourself that P is indeed < .05?
Case No. | No. of Attacks of Pain, Propranolol | No. of Attacks of Pain, Placebo | No. of Glyceryl Trinitrate Tablets Taken, Propranolol | No. of Glyceryl Trinitrate Tablets Taken, Placebo | Severity of Angina (Grades 1–6), Propranolol | Severity of Angina (Grades 1–6), Placebo
1 | 37 | 68 | 37 | 68 | 3 | 4
2 | 121 | 116 | 61 | 56 | 6 | 6
3 | 19 | 47 | 19 | 46 | 2 | 3
4 | 14 | 26 | 15 | 25 | 2 | 3
5 | 16 | 79 | 16 | 79 | 2 | 4
6 | 4 | 6 | 4 | 6 | 2 | 2
7 | 24 | 24 | 24 | 24 | 4 | 4
8 | 118 | 100 | 0 | 0 | 4 | 4
9 | 5 | 4 | 5 | 8 | 2 | 2
10 | 10 | 27 | 10 | 14 | 3 | 4
11 | 31 | 19 | 6 | 5 | 4 | 4
12 | 31 | 46 | 41 | 71 | 2 | 3
13 | 59 | 64 | 59 | 64 | 3 | 4
14 | 86 | 175 | 119 | 203 | 6 | 6
15 | 1 | 8 | 3 | 21 | 2 | 4
16 | 4 | 8 | 3 | 4 | 2 | 2
17 | 103 | 96 | 115 | 92 | 3 | 3
18 | 6 | 44 | 19 | 43 | 2 | 2
19 | 38 | 75 | 38 | 75 | 4 | 4
15.6. A renowned clinical investigator, who has just moved to your institution, consults you for a statistical problem. He says the Journal of Prestigious Research has refused to accept one of his most important studies of rats because he failed to include results of a t test for a crucial finding. “Then do the t test,” you tell him. He looks at you scornfully and says, “I did it before the paper was submitted.” “Oh,” you respond ruefully, “and I suppose it did not quite make ‘significance’.” “That is right,” he says, “I tried to get away without reporting the P value, but the reviewers have insisted on it.” “Why not add a few more rats?” you suggest. “I cannot,” he replies. “My lab at Almamammy U. was dismantled three months ago before I came here. The technician who did all the work is gone, and there is no way I can set things up to test any more rats now. I felt so sure the results were significant when I first saw them that I did not bother to check the t test at the time. Had I done so, I would have added more rats to the study. Do you have any ideas about what I might do now?” “Let me see your data,” you say.
He shows you the following results, expressed in appropriate units of measurement:
Group A: 11, 17, 21, 52; $\bar{X}_A = 25.25$; $s_A = 18.30$
Group B: 1, 7, 8, 9; $\bar{X}_B = 6.25$; $s_B = 3.59$
You look at the results for a moment, do a simple calculation at your desk with a paper and pencil, then say to him, “You need not do anything. Your results are significant now at a two-tailed P < .05.” After further discussion of what you did and how to report it, the investigator departs with a feeling of awe and wonder at your remarkable genius. The paper is accepted by the J. Prest. Res. and, as a grateful statistical “patient,” he sends you a bottle of an elegant champagne.
What did you do at your desk?
16
Interpretations and Displays for Two-Group Contrasts
CONTENTS
16.1 Visual Integration of Descriptive and Stochastic Decisions
16.1.1 Comparison of Observed Distributions
16.1.2 Stochastic Comparison of Box Plots
16.1.3 Quantile-Quantile Plots
16.2 Principles of Display for Dimensional Variables
16.2.1 General Methods
16.2.2 Conventional Displays of Individual Data Points
16.2.3 Bar Graphs
16.2.4 Undesirable Procedures
16.3 Displaying Binary Variables
16.3.1 Orientation of 2 × 2 Table
16.3.2 Individual Summaries
16.3.3 Graphic Displays
16.4 Displaying Ordinal Data
16.4.1 Overall Comparisons
16.4.2 Category-by-Category (Side-by-Side) Comparisons
16.5 Nominal Data
16.6 “Bonus” Displays
16.6.1 New Haven–New York Timetable
16.6.2 Sexist Pies
References
Exercises
A two-group contrast is probably the most common statistical comparison in medical research. As noted in Chapters 10 through 15, the comparison involves separate attention to description and inference. The descriptive focus is on quantitative distinctions in the data; the inferential focus is on stochastic appraisal of the quantitative distinctions.
This chapter is devoted to methods of interpreting and displaying the distinctions. The first section shows new graphic ways in which box plots and quantile-quantile plots offer a “visual interpretation” of the quantitative and stochastic decisions. The rest of the chapter discusses principles that can be used to produce effective (and ineffective) displays of the comparisons.
16.1 Visual Integration of Descriptive and Stochastic Decisions
If concepts of Z tests, t tests, P values, and confidence intervals had not been developed, decisions about significant differences might be made by examining the observed data directly, without any calculated stochastic boundaries.
16.1.1 Comparison of Observed Distributions
In one type of inspection, two groups of dimensional (or ordinal) data can be compared for the distributions shown in their histograms or frequency polygons.
This type of display, shown in Figure 16.1, was used to compare the distribution of urinary excretion of a fixed dose of lithium given to three groups of factory workers demarcated according to tertiles of serum uric acid.1 Perhaps the greatest advantage of such displays is their demonstration, discussed in Section 16.1.1.2, of the extraordinary amount of overlapping data that can achieve such stochastic accolades as “P < .001.”
[Figure 16.1 graphic: overlaid frequency distributions for the 1st, 2nd, and 3rd tertiles of serum uric acid. Vertical axis: Frequency, % (0 to 25). Horizontal axis: FE Lithium, % (8 to 40). Annotations on the plot: 22.6%, 23.8%, 25.3%, and P < .001.]
FIGURE 16.1
Frequency distributions of the fractional excretion (FE) of lithium by tertiles of serum uric acid in 568 male factory workers. Statistics by analysis of variance for differences between the means. [Figure and legend taken from Chapter Reference 1.]
16.1.1.1 Back-to-Back Stem-and-Leaf Diagrams — An approach that shows all data points in both groups is a back-to-back arrangement of stem-and-leaf diagrams, suggested by Chambers et al. 2 These diagrams, as shown here in Figure 16.2, offer an excellent way of reviewing the data yourself. They are easy to construct and compactly show all the items as well as the shape of the two distributions. They also avoid the problems of multiple points at the same location, overlapping frequency polygons, and the large vertical extent needed for a range of values that go from 18 to 82 in Figure 16.2.
The main disadvantage of either frequency polygon or stem-and-leaf direct displays, however, is the absence of summary indexes to help in the quantitative comparison of results. If the direct graphic displays show little or no overlap, the two groups are clearly different; but because such dramatic separations are rare, a more effective approach is needed for routine usage.
16.1.1.2 Permissiveness of Stochastic Criteria — If you begin examining the actual data, rather than P values, standard errors, and confidence intervals, you will discover the extraordinary amounts of overlapping spread, as shown in Figure 16.1, that can achieve stochastic “significance” in conventional criteria for a contrast in means.
FIGURE 16.2
Back-to-back stem-and-leaf diagrams of monthly average temperature for Lincoln (left) and Newark (right). [Figure and legend taken from Chapter Reference 2.]
To illustrate what happens, suppose we have two sets of Gaussian data, with different means, $\bar{X}_A$ and $\bar{X}_B$, but with similar group sizes, $n_A = n_B = n$, and equal variances, $s_A^2 = s_B^2$. The two groups will have a common mean, $\bar{X} = (\bar{X}_A + \bar{X}_B)/2$, and a common standard deviation, $s_p = s_A = s_B = s$. As shown in Figure 16.3, the Gaussian distributions in the two groups will overlap on both sides of $\bar{X}$.
The cross-hatching on the right side of $\bar{X}$ shows the data from Group B that overlap in Group A; and the cross-hatching on the left side of $\bar{X}$ shows the data from Group A that overlap in Group B. These two cross-hatched zones constitute the zone of central overlap. With some algebra not shown here, it can be demonstrated that if each curve has an area of 1, the magnitude of the central overlap is $1 - 2p_c$, where $p_c$ is the one-tailed probability corresponding to $Z_c = |\bar{X}_B - \bar{X}_A| / (2s_p)$, for the proportion of data lying between each mean and the location of $\bar{X}$.
For example, suppose $\bar{X}_A = 146.4$ and $\bar{X}_B = 140.4$, with $s_p = 9.85$. The value of $Z_c$ will be $(146.4 - 140.4)/[2(9.85)] = .305$, for which the one-tailed $p_c = .120$. The area of central overlap will be $1 - [2(.120)] = .760$.
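The overlap arithmetic can be reproduced with a few lines of Python. This is a minimal sketch, assuming two Gaussian groups with a common standard deviation as described above; the function name `central_overlap` is a label chosen here for illustration and does not come from the text.

```python
from statistics import NormalDist


def central_overlap(mean_a, mean_b, s_p):
    """Return (z_c, p_c, overlap) for two equal-variance Gaussian groups.

    z_c     : distance from either group mean to the common mean, in SD units
    p_c     : proportion of one curve lying between its mean and the common mean
    overlap : zone of central overlap, 1 - 2 * p_c
    """
    z_c = abs(mean_a - mean_b) / (2 * s_p)
    # The one-tailed area between the mean (Z = 0) and Z = z_c.
    p_c = NormalDist().cdf(z_c) - 0.5
    return z_c, p_c, 1 - 2 * p_c


z_c, p_c, overlap = central_overlap(146.4, 140.4, 9.85)
print(f"Zc = {z_c:.3f}, pc = {p_c:.3f}, central overlap = {overlap:.3f}")
```

Taking `p_c` as the area between the mean and the common mean (rather than a tail area) follows the definition in the text, so that `1 - 2 * p_c` is the shaded zone in Figure 16.3.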
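[Figure 16.3 graphic: two overlapping Gaussian curves centered at $\bar{X}_A$ and $\bar{X}_B$, with the common mean $\bar{X}$ between them. The shaded zones are labeled “Group A data that overlap Group B” and “Group B data that overlap Group A.” The horizontal axis is marked in units of $s_p$, from −2.0 to 2.0.]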
FIGURE 16.3
Patterns of central overlap at P < .05 for two Gaussian distributions with equal sample sizes and variances. Central overlap is the sum of the two shaded zones. For further details, see text.
The magnitude of central overlap can be determined from the value of $Z_c$ that produces stochastic significance. Thus, for two means, $\bar{X}_A$ and $\bar{X}_B$, with group sizes $n_A$ and $n_B$ and a pooled common standard deviation $s_p$, the stochastic criterion for “significance” at a two-tailed α = .05 is

$$|\bar{X}_A - \bar{X}_B| \ge 1.96\, s_p \sqrt{N/(n_A n_B)}$$

For equal sample sizes, with $n_A = n_B = N/2 = n$, the value of $\sqrt{N/(n_A n_B)}$ becomes $\sqrt{2/n}$, and the foregoing formula becomes $\bar{X}_A - \bar{X}_B \ge 1.96\sqrt{2/n}\, s_p$, which is $(\bar{X}_A - \bar{X}_B)/s_p \ge 1.96\sqrt{2/n}$. Because $Z_c = (\bar{X}_A - \bar{X}_B)/(2s_p)$, we can substitute appropriately and promptly determine that

$$Z_c \ge 0.98\sqrt{2/n}$$

for stochastic significance with α = .05 in the comparison of two means. Table 16.1 shows the effect of enlarging n on values of $\sqrt{2/n}$, on $Z_c = 0.98\sqrt{2/n}$, on the corresponding one-tailed $p_c$ for $Z_c$, and on the zone of central overlap, calculated as $1 - 2p_c$. As group sizes enlarge, the value of $\sqrt{2/n}$ becomes smaller, thus making the “significance” criterion easier to satisfy. The value of $Z_c$ also becomes smaller, thus reducing the proportion of the nonoverlapping data and increasing the proportion of the central overlap.
TABLE 16.1
Effect of Increasing Group Size (n) on Proportion of “Central Overlap” for a Stochastically Significant Contrast of Means at α = .05
n | $\sqrt{2/n}$ | $Z_c = 0.98\sqrt{2/n}$ | $p_c$ = Nonoverlapping Proportion in One Curve | $1 - 2p_c$ = Proportion of Central Overlap
4 | .707 | .693 | .256 | .488
5 | .632 | .620 | .232 | .536
10 | .447 | .438 | .169 | .662
15 | .365 | .358 | .140 | .720
20 | .316 | .309 | .121 | .758
50 | .200 | .196 | .078 | .844
100 | .141 | .138 | .055 | .890
200 | .100 | .098 | .039 | .922
For example, when n = 20, $0.98\sqrt{2/n} = .309$, for which $p_c = .121$ and $1 - 2p_c = .758$. Consequently, despite stochastic significance, the Group A and Group B distributions shown in Figure 16.3 will have a “central overlap” of about 76%. About 38% of the Group A data would overlap the distribution of Group B below the common mean $\bar{X}$, and about 38% of the Group B data would overlap the distribution of Group A above $\bar{X}$. The pattern for this result is further illustrated in Figure 16.4.
For n = 50, $0.98\sqrt{2/n} = .196$, and 84.4% of the data will be in a state of central overlap. For n = 200, the contrast of means will be stochastically significant at P = .05, despite a central overlap for more than 92% of the data.
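The columns of Table 16.1 follow mechanically from the criterion $Z_c \ge 0.98\sqrt{2/n}$, so they can be regenerated for any group size. The sketch below is one way to do so, assuming equal group sizes and the Gaussian multiplier of 1.96 used in the text; the helper name `overlap_row` is hypothetical. Values computed at full precision may differ in the third decimal from a table prepared with intermediate rounding.

```python
from math import sqrt
from statistics import NormalDist


def overlap_row(n, z_alpha=1.96):
    """Return (sqrt(2/n), z_c, p_c, central overlap) at the significance boundary."""
    root = sqrt(2 / n)
    z_c = (z_alpha / 2) * root          # 0.98 * sqrt(2/n) when z_alpha = 1.96
    p_c = NormalDist().cdf(z_c) - 0.5   # area between a group mean and the common mean
    return root, z_c, p_c, 1 - 2 * p_c


rows = {n: overlap_row(n) for n in (4, 5, 10, 15, 20, 50, 100, 200)}
for n, (root, z_c, p_c, overlap) in rows.items():
    print(f"{n:>4}  {root:.3f}  {z_c:.3f}  {p_c:.3f}  {overlap:.3f}")
```

Passing a different `z_alpha` (for example 2.576 for α = .01) shows how a stricter criterion reduces, but does not eliminate, the permitted overlap.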
16.1.1.3 Illustration of Permissiveness — For a published illustration of permissiveness in central overlap, Figure 16.5 here is modified (to show only the data points and means) from the original illustration3 used to compare blood pressure in “100 diabetics and 100 nondiabetics.” Do the two sets of data in Figure 16.5 look substantially different to you? Regardless of what your answer may be, Figure 16.6 shows the original illustration, in which a 95% confidence interval extended from 1.1 to 10.9 around the increment of 6.0 in the two means. The interval excludes 0 and therefore denotes stochastic significance at 2P < .05.
In a related illustration, shown here as Figure 16.7, the presented data had the same pattern and same means as in Figure 16.5, but with only 50 members in each group. The 95% confidence interval is now larger and includes the null value of 0, so that the difference is no longer stochastically significant. Figure 16.7 was originally intended to show the effect of sample size in achieving stochastic significance but might also be used to demonstrate the peculiarity of regarding these two groups as “significantly
[Figure 16.4 graphic: Gaussian curves for Group A and Group B with the common mean $\bar{X}$ marked. At $\bar{X}$, $Z_c = -.309$ for curve A and $Z_c = +.309$ for curve B. In each group, 50% of the data lie beyond the group's own mean on the side away from $\bar{X}$, 12% lie between the group's mean and $\bar{X}$, and 38% lie on the far side of $\bar{X}$.]
FIGURE 16.4
Overlap of data for two Gaussian groups of 20 members each, with $\bar{X}_A > \bar{X}_B$, similar standard deviations, and P = .05. Beginning at the common mean $\bar{X}$, 38% of the data in Group A overlap the distribution of Group B, and 38% of the data in Group B overlap the distribution of Group A. For further details, see text.
[Figure 16.5 graphic: individual data points of systolic blood pressure (mm Hg), vertical axis from 90 to 200, for Diabetics (mean marked at 146.4) and Non-diabetics (mean marked at 140.4).]
FIGURE 16.5
Systolic blood pressures in 100 diabetics and 100 non-diabetics with mean levels of 146.4 and 140.4 mm Hg, respectively. [Figure and legend taken from Chapter Reference 3.]
[Figure 16.6 graphic: data points for Diabetics and Non-diabetics, with a right-hand axis labeled “Difference in mean systolic blood pressure (mm Hg),” running from −30 to 30, on which the difference of 6.0 is shown with its 95% CI from 1.1 to 10.9.]
FIGURE 16.6
Systolic blood pressures in 100 diabetics and 100 non-diabetics with mean levels of 146.4 and 140.4 mm Hg, respectively. The difference between the sample means of 6.0 mm Hg is shown to the right together with the 95% confidence interval from 1.1 to 10.9 mm Hg. [Original figure and legend as presented in Chapter Reference 3.]
[Figure 16.7 graphic: systolic blood pressure (mm Hg), vertical axis from 90 to 200, for Diabetics (mean 146.4) and Non-diabetics (mean 140.4), with a right-hand axis labeled “Difference in mean systolic blood pressure (mm Hg),” running from −30 to 30, on which the difference of 6.0 is shown with its 95% CI from −1.0 to 13.0.]
FIGURE 16.7
Same as Figure 16.6, but showing results from two samples of half the size — that is, 50 subjects each. The means and standard deviations are as in Figure 16.6, but the 95% confidence interval is wider, from −1.0 to 13.0 mm Hg, owing to the smaller sample sizes. [Figure and most of legend taken from Chapter Reference 3.]
different” in Figure 16.6, but not in Figure 16.7. If the large overlap of data and small quantitative increment (6.0) in the means of Figure 16.7 did not impress you as “significant,” why were you impressed with essentially the same pattern of data and quantitative distinction in Figure 16.6? (Further discussion of Figures 16.6 and 16.7 is invited in Exercise 16.1.)
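The sample-size effect in Figures 16.6 and 16.7 can be checked with a short calculation. The sketch below is an illustration rather than the cited authors' method. It assumes a Gaussian multiplier of 1.96 and back-calculates a pooled standard deviation (about 17.7 mm Hg) from the published interval of 1.1 to 10.9 for 100 subjects per group; neither that value nor the function name `ci_for_difference` appears in the source.

```python
from math import sqrt


def ci_for_difference(diff, s_p, n, z=1.96):
    """Approximate 95% CI for a difference in means of two equal-sized groups."""
    half_width = z * s_p * sqrt(2 / n)
    return diff - half_width, diff + half_width


# ASSUMPTION: pooled SD inferred from the reported half-width of 4.9 at n = 100.
S_P_ASSUMED = 4.9 / (1.96 * sqrt(2 / 100))

ci_100 = ci_for_difference(6.0, S_P_ASSUMED, 100)
ci_50 = ci_for_difference(6.0, S_P_ASSUMED, 50)
print(f"assumed s_p = {S_P_ASSUMED:.1f}")
print(f"n = 100 per group: {ci_100[0]:.1f} to {ci_100[1]:.1f}")
print(f"n =  50 per group: {ci_50[0]:.1f} to {ci_50[1]:.1f}")
```

Under these assumptions, halving each group widens the interval by a factor of $\sqrt{2}$, which is enough to pull the lower limit just below 0 while the data pattern and the increment of 6.0 stay the same. The result is close to the published limits of −1.0 and 13.0; the small gap presumably reflects rounding or a t-based multiplier.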
16.1.2 Stochastic Comparison of Box Plots
After recognizing the large amount of overlapping data permitted by the conventional criteria for stochastic significance, you may want to consider overlap in bulk of the data, rather than confidence intervals for central indexes, as a stochastic mechanism for deciding that two groups are “significantly” different. A splendid way to inspect overlap is the comparison of “side-by-side” box plots, for which certain types of overlap can be readily “translated” into Gaussian stochastic conclusions.
16.1.2.1 No Overlap of Boxes — Bounded by the interquartile range or H-spread, each box contains 50% of the group’s data. If Group A has the higher median, the boxes will not overlap if the lower quartile value for Group A exceeds the upper quartile value for Group B. In symbols, this requirement is $Q_{L_A} > Q_{U_B}$. If the two boxes do not overlap, as shown in Figure 16.8, the two groups can promptly be deemed stochastically different.
[Figure 16.8 graphic: side-by-side box plots for Group A and Group B, with the quartiles labeled $Q_{U_A}$, $Q_{L_A}$, $Q_{U_B}$, and $Q_{L_B}$; the box for Group A lies entirely above the box for Group B.]
FIGURE 16.8
Non-overlapping box plots for Group A and B, with QLA > QUB. At least 50% of the data in each group have no overlapping values.
When converted into Gaussian principles for contrasting two means, the descriptive no-overlap-of-boxes criterion is much stricter than the usual stochastic demands. In a Gaussian distribution, $Z_{.5} = .674$ demarcates a zone of ±.674 standard deviations around the mean that contains the central 50% of the data. For the spread of Gaussian data, the foregoing relationship would become $(\bar{X}_A - .674s_A) > (\bar{X}_B + .674s_B)$, which requires that $(\bar{X}_A - \bar{X}_B)$ be $> .674(s_A + s_B)$. Assuming that the group sizes are equal and that $s_A$ and $s_B$ are approximately equal at a common value of $s_p$, this descriptive requirement is $\bar{X}_A - \bar{X}_B > 1.348s_p$.
As noted in Section 16.1.1.2, for a two-tailed P < .05 in a comparison of two means for equal-sized groups, the Gaussian demand for stochastic significance is that $(\bar{X}_A - \bar{X}_B)/s_p$ must exceed $1.96\sqrt{2/n}$. As group sizes enlarge, $\sqrt{2/n}$ will become progressively smaller, thus making this demand easier to meet for the same values of $\bar{X}_A - \bar{X}_B$ and $s_p$. For equal group sizes, $1.96\sqrt{2/n}$ is simply twice the value of $0.98\sqrt{2/n}$ shown in Table 16.1, and the stochastic demand is met by the no-overlap criterion whenever this doubled value is smaller than 1.348. The largest of the doubled values, 1.386, occurs at n = 4 and slightly exceeds 1.348; when n in each group is as small as 5, the doubled value is 1.240, which is below 1.348.
Consequently, a pair of box plots showing no overlapping boxes with group sizes of ≥ 5 (for essentially Gaussian data) will almost always achieve two-tailed stochastic significance at P < .05.
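The group size at which the no-overlap criterion begins to guarantee significance can be found by direct search. This is a minimal sketch under the same Gaussian approximation used in the text (multiplier 1.96, equal group sizes, common standard deviation); the function name `smallest_n` and its `n_max` cap are choices made here for illustration.

```python
from math import sqrt


def smallest_n(separation_in_sp, z_alpha=1.96, n_max=1000):
    """Smallest equal group size n for which a mean separation of
    `separation_in_sp` pooled SDs satisfies separation >= z_alpha * sqrt(2/n).
    Returns None if no n up to n_max qualifies."""
    for n in range(2, n_max + 1):
        if z_alpha * sqrt(2 / n) <= separation_in_sp:
            return n
    return None


# Non-overlapping boxes imply a separation of at least 2 * .674 = 1.348 SDs.
print(smallest_n(1.348))
```

With small groups a t multiplier would be somewhat larger than 1.96, which is one reason the text says “almost always” rather than “always.”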
16.1.2.2 Half-H-Spread Overlap Rule — A more lenient but still effective stochastic criterion can be called the “Half-H-spread overlap” rule. It is pertinent, as shown in Figure 16.9, if the median for Group A is higher than the upper quartile for B and if the lower quartile of A is higher than the median of B. In symbols, the principle can be expressed as $Q_{L_A} > \tilde{X}_B$ and $Q_{U_B} < \tilde{X}_A$. If the overlap does not extend beyond the cited quartiles, no more than 25% of the data from Group B can be contained in the basic H-spread of Group A, and no more than 25% of the data from Group A can be in the H-spread of Group B.
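Both box-plot rules are simple comparisons of quartiles and medians, so they can be applied directly to raw data. The sketch below is illustrative only: the function name `box_overlap_screen`, the returned labels, and the three small data sets are invented for the example, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which may differ slightly from the hinge definitions used for hand-drawn box plots.

```python
from statistics import quantiles


def box_overlap_screen(group_a, group_b):
    """Classify the overlap of two box plots by the rules in the text."""
    ql_a, med_a, qu_a = quantiles(group_a, n=4, method="inclusive")
    ql_b, med_b, qu_b = quantiles(group_b, n=4, method="inclusive")
    # Relabel if needed so that "A" is the group with the higher median.
    if med_a < med_b:
        ql_a, med_a, qu_a, ql_b, med_b, qu_b = ql_b, med_b, qu_b, ql_a, med_a, qu_a
    if ql_a > qu_b:
        return "no overlap of boxes"
    if ql_a > med_b and qu_b < med_a:
        return "less than half-H-spread overlap"
    return "more than half-H-spread overlap"


# Hypothetical data: quartiles of `high` are 34, 38, 42.
high = [30, 32, 34, 36, 38, 40, 42, 44, 46]
far = [10, 12, 14, 16, 18, 20, 22, 24, 26]    # quartiles 14, 18, 22
near = [24, 26, 28, 30, 32, 34, 36, 38, 40]   # quartiles 28, 32, 36
close = [28, 30, 32, 34, 36, 38, 40, 42, 44]  # quartiles 32, 36, 40

for other in (far, near, close):
    print(box_overlap_screen(high, other))
```

The function only screens; as the surrounding text notes, how the result maps onto P < .05 depends on the group sizes and on the data being roughly Gaussian.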
Although apparently a lenient criterion, the Half-H-spread-overlap rule is still more demanding than the usual Gaussian stochastic standards, because the rule refers to the spread of the actual data, not to the more restricted scope of confidence intervals for an increment of two means. Because the inner 50% of Gaussian data spans a two-tailed $Z_{.5} = .674$, the upper quartile for a Gaussian Group B would be at $\bar{X}_B + .674s_B$; and the corresponding lower quartile for Group A would be $\bar{X}_A - .674s_A$. The “Half-H-spread” rule, expressed in Gaussian terms, would demand that
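[Figure 16.9 graphic: side-by-side box plots for Group A and Group B, labeled $Q_{U_A}$, $\tilde{X}_A$, $Q_{L_A}$, $Q_{U_B}$, $\tilde{X}_B$, and $Q_{L_B}$; each half of each box is marked as containing 25% of the data.]
FIGURE 16.9
Less than Half-H-spread overlap in two box plots with $Q_{L_A} > \tilde{X}_B$ and $Q_{U_B} < \tilde{X}_A$.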
$$(\bar{X}_B + .674s_B) < \bar{X}_A \quad \text{and} \quad (\bar{X}_A - .674s_A) > \bar{X}_B$$
which becomes the simultaneous requirement that
$$(\bar{X}_A - \bar{X}_B) > .674s_A \quad \text{and} \quad (\bar{X}_A - \bar{X}_B) > .674s_B$$
Consequently, the requirement is that
$$(\bar{X}_A - \bar{X}_B) > .674\,(\text{larger value of } s_A \text{ or } s_B)$$
If we let s represent this larger value, $\bar{X}_A - \bar{X}_B$ exceeds .674s when the Half-H-spreads do not overlap. In Table 16.1, the values of $0.98\sqrt{2/n}$, when doubled to produce $1.96\sqrt{2/n}$, show that two-tailed stochastic significance at α = .05 will occur here for equal group sizes at all tabulated values of n from 20 upward, where the doubled value (about .62 at n = 20) falls below .674.
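The same threshold can be obtained in closed form by solving $1.96\sqrt{2/n} \le k$ for n, which gives $n \ge 2(1.96/k)^2$ for a separation of k standard deviations. The sketch below applies that rearrangement; it assumes the Gaussian multiplier of 1.96 and equal group sizes, and the name `min_group_size` is hypothetical.

```python
from math import ceil


def min_group_size(separation_in_sd, z_alpha=1.96):
    """Smallest equal group size n with z_alpha * sqrt(2/n) <= separation_in_sd."""
    return ceil(2 * (z_alpha / separation_in_sd) ** 2)


print(min_group_size(0.674))   # Half-H-spread rule
print(min_group_size(1.348))   # no-overlap-of-boxes rule
```

Under this approximation the Half-H-spread rule already suffices somewhat below n = 20; Table 16.1 tabulates only n = 15 and n = 20 in that range, and a t-based multiplier would raise the threshold slightly.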
Thus, the “physical examination” of overlap in box plots can be used as a quick screening test for stochastic significance of two compared groups. The mathematical values calculated in this section and
