in Section 16.1.2.1, however, rest on the assumptions that the data are Gaussian, with equal sizes and variances in both groups. Nevertheless, the results will not be too disparate from these calculations if those assumptions do not hold. In general, the contrast will almost always be stochastically significant at 2P < .05 if the two boxes have no overlap at all or if they have no more than a Half-H-spread overlap for group sizes ≥ 20.
16.1.2.3 Published Examples — Because the box plot is a relatively new form of display, its full potential as a visual aid has not yet been fully exploited for stochastic decisions. For scientists who want descriptive communication and who would prefer to decide about “significance” by directly examining quantitative distinctions in “bulks” of data, a contrast of box plots offers an excellent way to compare the spreads of data and the two bulks demarcated by the interquartile zones. The comparative display of two box plots can thus be used not only to summarize data for the two groups, but also to allow “stochastic significance” to be appraised from inspections of bulk, not merely from the usual P values and confidence intervals. Because box-plot displays have been so uncommon, however, few illustrations are available from published literature.
Figure 16.10 shows such a display, which was presented in a letter to the editor.4 The spread is completely shown for the data, which are not Gaussian, because the boxes and ranges are not symmetrical around the median (marked with an asterisk rather than a horizontal line). Stochastic significance can promptly be inferred for this contrast because the two boxes are almost completely separated. The only disadvantage of this display (and of the associated text of the letter) is that the group sizes are not mentioned.
Figure 16.11 shows another contrast of box plots, obtained during a previously mentioned (Chapter 10) study, by Burnand et al.,5 of investigators’ decisions regarding quantitative significance for a ratio of two means. The group sizes are indicated, and the medians are drawn with lines (rather than asterisks). The “whiskers” for each group cover an ipr95, rather than the “fence” and “outlier” tactic proposed by Tukey. With almost no overlap in the boxes, the obvious differences in the groups will also be stochastically significant.
In a study of cardiac size on chest films, shown in Figure 16.12, Nickol et al.6 presented box-plot contrasts for three variables in three ethnic groups. The investigators stated that “every comparison (in these data) differed significantly (p < 0.001).” Using the Half-H-spread exclusion principle, the box plots for the investigated groups show obviously significant differences between Africans and Asians in cardiac diameter, between Africans and Caucasians in cardiothoracic ratio, and between Africans and Caucasians in age.
16.1.2.4 Indications of Group Size — One disadvantage of box plots is that group sizes are not routinely indicated. The sizes can easily be inserted, however, at the top of each “whisker,” as shown in Figure 16.11. Some writers have recommended7 that the width of the boxes be adjusted to reflect the size of each group. Since unequal widths might impair the discernment of lengths in the boxes, the group sizes are probably best specified with a simple indication of N for each group.
© 2002 by Chapman & Hall/CRC
16.1.2.5 Notched Box Plots — To make the results more stochastic, the box plot can be marked with a central notched zone that covers the distance $\tilde{X} \pm [(1.57)(\mathrm{ipr}_{50})/\sqrt{n}]$. In these symbols, $\tilde{X}$ is the median, ipr50 is the interquartile range (between the top and bottom of the box), n is the group size, and 1.57 is analogous to a $Z_\alpha$ factor, chosen here to provide a suitably sized counterpart of the confidence interval for a median. The increment in two medians is stochastically significant if the notched interval of one box does not overlap the median of the other.
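The notch calculation can be sketched in a few lines of Python. This is only an illustration of the formula above, not a procedure taken from the text: the function names are hypothetical, and the quartiles are computed with the "inclusive" convention of the standard library, which may differ slightly from the hinges used in a particular box plot.

```python
import math
import statistics

def notch_interval(values):
    """Return (low, high) for median +/- 1.57 * ipr50 / sqrt(n)."""
    n = len(values)
    median = statistics.median(values)
    # Quartile convention is an assumption of this sketch; other
    # conventions give slightly different box edges.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median + half_width

def notch_excludes_median(notch_group, other_group):
    """True if the notch of one group does not cover the other group's median."""
    low, high = notch_interval(notch_group)
    other_median = statistics.median(other_group)
    return not (low <= other_median <= high)

# Hypothetical data: 21 values in each group.
group_a = list(range(1, 22))    # median 11, ipr50 = 10
group_b = list(range(11, 32))   # median 21
low_a, high_a = notch_interval(group_a)
```

For these made-up values the notch for group A runs roughly from 7.6 to 14.4, so the median of group B (21) lies outside it and the criterion of the text would be met.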
Figure 16.13 shows notched box plots for CD4 lymphocyte counts in two main groups of men, each divided into three subgroups.8 The notch-spread criterion for “significance” is analogous to the Half-H-spread-overlap rule, but is easier to fulfill because the potentially overlapping distance is smaller. The Half-H-spread rule is much easier to use, however, and avoids the need for extra measurement of notch length.
16.1.3 Quantile-Quantile Plots
Despite its use as a descriptive summary of data, the box-plot contrast discussed throughout Section 16.1.2 was employed for stochastic decisions. An elegant way of contrasting the quantitative distinctions of two groups, regardless of their sizes, has been devised by Wilk and Gnanadesikan.9 They originally proposed “quantile (Q-Q) plots, percent (P-P) plots, and hybrids of these” to compare an observed univariate group with a proposed theoretical distribution for the group, but the tactic can be used to contrast distributions for two different groups. The quantile technique, which relies on quintiles, quartiles, medians, etc., is probably easier to apply than the percent P-P plot, which relies on cumulative percentiles.
The quantile-quantile (Q-Q) graph creates a “pairing,” otherwise unattainable with nonpaired data, in which the respective X and Y values at each point are the corresponding quantiles of the compared Groups A and B. The two groups can then be compared descriptively according to the line connecting the Q-Q points. If points lie along the “identity” line X = Y, the two groups have similar results. If the line goes through the origin with slope 1.2, the values in one group are in general 20% higher than in the other.
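The pairing can be illustrated with a short Python sketch. It assumes the three-point version (lower quartile, median, upper quartile) and fits a least-squares line through the origin; the function names, the fitting choice, and the data are all hypothetical and are meant only to show how groups of different sizes yield paired points.

```python
import statistics

def qq_points(group_a, group_b, n_quantiles=4):
    """Pair corresponding quantiles of two groups, whatever their sizes."""
    qa = statistics.quantiles(group_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(group_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))

def slope_through_origin(points):
    """Least-squares slope b for the line y = b * x through the Q-Q points."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / sxx

# Hypothetical data: 5 values in Group A, 9 values in Group B.
group_a = [10, 20, 30, 40, 50]
group_b = [1.2 * v for v in (10, 15, 20, 25, 30, 35, 40, 45, 50)]

points = qq_points(group_a, group_b)   # quartile pairs of A and B
slope = slope_through_origin(points)   # about 1.2 for these values
```

Because the quartiles of Group B were constructed to be 1.2 times those of Group A, the fitted slope is about 1.2, corresponding to values that are in general 20% higher in Group B.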
The method has seldom been used in medical literature, so the illustration here has another source. In Figure 16.14, the ozone concentrations for Stamford vs. Yonkers are plotted with a large array of quantiles, such as the 2nd percentile, 5th percentile, 10th percentile, quartile, and median, for each group. A much simpler line, however, could be drawn for just three points: the median and the lower and upper quartiles in each group. Four additional points (the lower and upper quadragintile values and the 10- and 90-percentile values) might be added for more complete detail.
C.P. Jones10 has pointed out that the relationship of lines and crossings in the Q-Q plot can be used to demonstrate descriptive differences in location, spread, and shape for the two compared groups of data. Jones has also proposed a simple “projection plot” in which the difference in each pair of quantile values is plotted against their average value. With this arrangement, the identity line passes horizontally through the increment of 0.
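The projection plot amounts to a simple change of coordinates, sketched below in Python. The function name and the sample quantile pairs are hypothetical, and the direction of the difference (second group minus first) is an assumption, since the text does not specify it.

```python
def projection_points(qq_pairs):
    """Map each Q-Q pair (x, y) to (average of the pair, difference y - x)."""
    return [((x + y) / 2, y - x) for x, y in qq_pairs]

# Hypothetical quartile pairs for two groups.
pairs = [(20.0, 24.0), (30.0, 36.0), (40.0, 48.0)]
projected = projection_points(pairs)

# Pairs on the identity line X = Y all project to a difference of 0.
identical = projection_points([(20.0, 20.0), (30.0, 30.0)])
```

With these coordinates, pairs lying on the identity line fall on a horizontal line at 0, and a difference that grows with the average suggests a proportionate rather than constant distinction between the groups.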
[Figure 16.12 plot not reproduced: three box-and-whisker panels (radiographic cardiac diameter, mm; cardiothoracic ratio, %; age, years), each comparing the AFR, ASI, and CAU groups, with markers (S) and (K) on selected groups.]
FIGURE 16.12
Box-and-whisker plots of the distribution of the data for radiographic cardiac diameter, cardiothoracic ratio, and age, by ethnic groups; AFR = African, ASI = Asian, CAU = caucasian. The box encloses the interquartile range, and is transected by a heavy bar at the median. The whiskers encompass the 25% of data beyond the interquartile range in each direction (after Tukey, 1977). Skew was statistically significant in 5 groups of data marked (S), p < 0.05, but was of very moderate degree in all but the data for Age of caucasian subjects. Age in African subjects showed kurtosis marked (K), p < 0.05, of mild degree. [Figure and legend taken from Chapter Reference 6.]
[Figure 16.13 plot not reproduced: notched box plots of initial CD4+ lymphocytes/µL (vertical axis 0 to 1500) for homosexual men and injecting drug users, each cohort divided into three subgroups (No AIDS, No Thrush; Thrush, No AIDS; AIDS). The subgroup sizes printed in the figure are n = 68, 50, 165, 60, 436, and 71; their assignment to subgroups cannot be recovered from the extracted text.]
FIGURE 16.13
Distributions of CD4+ lymphocyte counts in injecting drug users and homosexual men, with each cohort divided into individuals who did or did not develop acquired immunodeficiency syndrome (AIDS) or thrush, as described in the text. Boxes indicate median and upper and lower hinges; “waists” around the medians indicate 95% confidence intervals of the medians. The lowermost and uppermost horizontal lines indicate fences, either 1.5 interquartile ranges away from the first and third quartiles or at the limit of the observations, whichever was smaller. Outliers (values beyond the fences) are marked by asterisks. [Figure and legend taken from Chapter Reference 8.]
16.2 Principles of Display for Dimensional Variables
Graphs and other visual displays of dimensional data have usually been prepared in an ad hoc arbitrary manner, according to the standards (or whims) of the artist to whom the investigator took the data. This laissez faire policy was sharply attacked several years ago in a landmark book by Edward Tufte, The Visual Display of Quantitative Information.11 He offered an invaluable set of principles for both intellectual communication and visual graphics in the display of data. Some of those principles have been used in the discussion here.
Results for a dimensional variable in two groups can usually be effectively compared from suitable summary statements that indicate the central index, spread, and size for each group. If this information is provided in the text or in a table, additional visual displays may be unnecessary. If presented, however, the displays should convey distinctions that are not readily apparent in the summary statements. The main such distinction is the distribution of individual data points. Their portrait can promptly let the reader distinguish spreads and overlaps that are not always clear in the verbal summaries.
16.2.1 General Methods
[Figure 16.14 plot not reproduced: Stamford ozone quantiles (ppb) on the vertical axis plotted against Yonkers ozone quantiles (ppb) on the horizontal axis, both scaled from 0 to 250.]
FIGURE 16.14
Empirical quantile-quantile plot for Yonkers and Stamford ozone concentrations. The slope is about 1.6, implying that Stamford levels are about 60% higher than Yonkers levels. [Figure taken from Chapter Reference 2.]
The main challenges in displaying individual points of dimensional data are choices of symbols, methods of portraying multiple points at the same location, and selection of summary indexes.
16.2.1.1 Symbols for Points — In most instances, dimensional values are shown in a vertical array within the arbitrary horizontal width assigned for columns of each compared group. Because each group usually occupies its own column, different symbols are seldom needed for the points of different groups.
A simple, circular, filled-in dot, enlarged just enough to be clearly visible, is preferable to open circles, squares, triangles, crosses, diamonds, and other shapes. The other shapes become useful, however, to distinguish one group from another when two or more groups are contrasted within the same vertical column.
16.2.1.2 Multiple Points at Same Location — Multiple points at the same vertical location can be spread out in a horizontal cluster as ……, and the arbitrary width of the columns can be adjusted as needed. If the spread becomes too large, the points can be split and placed close together in a double or triple tier. A remarkable “cannonball” spread of points is shown for multiple groups12 in Figure 16.15.
Instead of appearing individually in a cluster (or “cannonball”) arrangement, the multiple points are sometimes combined into a single expanded symbol, such as a large circle, whose area must appropriately represent the individual frequencies. The area is easy to identify if the symbol is a square, but is difficult to appraise if it must be calculated for a circle or other non-square symbol. Enlarged symbols can be particularly useful in a two-dimensional graph, whose spacing cannot be arbitrarily altered to fit the individual cluster. In comparisons of categorical groups, however, the enlarged symbols usually keep the viewer from discerning individual frequency counts.
[Figure 16.15 plot not reproduced: “cannonball” clusters of individual values of the serum-ascites albumin gradient (g/L, vertical axis 0 to 40) for the sterile cirrhotic, infected cirrhotic, CARD, Misc PHT, PCA, and Misc Non-PHT groups, arranged under PHT and Non-PHT headings, with p values or NS marked above the groups.]
FIGURE 16.15
Serum-ascites albumin gradient in ascites, classified by presence or absence of portal hypertension. Statistical comparisons are to the sterile cirrhotic group by unpaired t-test. CARD = cardiac ascites; Misc Non-PHT = miscellaneous non-portal-hypertension-related; Misc PHT = miscellaneous portal-hypertension-related; NS = not significant; PCA = peritoneal carcinomatosis. Mean ± SD bars are included as well as a horizontal line at 11 g/L, the threshold for portal hypertension. All groups differed significantly (P < 0.05) from the sterile cirrhotic samples by analysis of variance with the Dunnett test. [Figure and legend taken from Chapter Reference 12.]
16.2.1.3 Choice of Summary Symbols — The decisions about summary symbols for indexes of central tendency, spread, and size depend on whether the purpose of the display is descriptive communication or stochastic inference. Many displays seem to be aimed at stochastic decisions, showing bars and flanges for means and standard errors but not points of data. Because the summary indexes can readily be presented in verbal text or tables, the display seems wasted if it does not show the individual data points and if the summary does not indicate spread of the data rather than indexes for stability of the mean.
If the visual goal is really stochastic, a separate problem arises from the choice of a standard error. For a stochastic Z (or t) test, or for a confidence interval, the statistical formula uses a standard error calculated from the combined (or “pooled”) variance in the two groups. Thus, for purely stochastic displays, each group should have an identical standard error, derived from the combined variance. If the standard errors are shown individually for each group, they might be used for the type of crude mental evaluations described in Section 13.3.2, but the exact sizes of the standard errors are often difficult to discern from the graphical display.
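A brief Python sketch may make the pooled calculation concrete. It assumes the usual pooled-variance formula, with each group's variance weighted by its degrees of freedom; the function names and the data are hypothetical and are not taken from the text.

```python
import math
import statistics

def pooled_variance(group_a, group_b):
    """Combine the two sample variances, weighted by degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, divisor n - 1
    var_b = statistics.variance(group_b)
    return ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

def pooled_standard_errors(group_a, group_b):
    """Return (SE for A, SE for B, SE of the difference), all from the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    s_pooled = math.sqrt(pooled_variance(group_a, group_b))
    se_a = s_pooled / math.sqrt(n_a)
    se_b = s_pooled / math.sqrt(n_b)
    se_diff = s_pooled * math.sqrt(1 / n_a + 1 / n_b)
    return se_a, se_b, se_diff

# Hypothetical data: equal group sizes, unequal spreads.
group_a = [1, 2, 3, 4, 5]      # variance 2.5
group_b = [2, 4, 6, 8, 10]     # variance 10
se_a, se_b, se_diff = pooled_standard_errors(group_a, group_b)
```

In this example the pooled variance is 6.25, so both groups receive the same standard error (about 1.12) even though their individual standard deviations differ by a factor of two. With unequal group sizes, the pooled standard errors would share the same standard deviation but differ through the divisor.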
Probably the main value of showing standard errors is to “market” the results by disguising spread of the data. Because the standard deviation is divided by $\sqrt{n}$, the standard error is always smaller than the SD (for any group with more than one member) and makes things look much more compact, tidy, and impressive. The SE becomes progressively smaller with enlarging sample sizes, thus allowing two big groups of data with a dismayingly large amount of overlap to appear neatly separated.
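The shrinkage is easy to verify numerically. The Python sketch below uses a hypothetical standard deviation of 10 and a few arbitrary group sizes, chosen only to show how the same spread looks progressively tidier as n increases.

```python
import math

def standard_error(sd, n):
    """Standard error of a mean: the SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Hypothetical spread of 10 units, shown for increasing group sizes.
sd = 10.0
table = {n: standard_error(sd, n) for n in (4, 25, 100, 400)}
# table maps n to SE: 4 -> 5.0, 25 -> 2.0, 100 -> 1.0, 400 -> 0.5
```

The spread of the data is the same in every row; only the size of the group has changed. A flange of ±0.5 drawn for n = 400 therefore says nothing about how much the individual values in two such groups overlap.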
If aimed at the appropriate goal of descriptive communication, the display of individual points will often suffice to show the data, because the corresponding summary values will usually appear in text or tables. An appropriate choice of summaries, however, can be valuable for descriptive (and even stochastic) evaluations of “bulk,” as discussed earlier. The summaries are also necessary for large data sets having too many points to be shown individually. The comparison of two box plots is the best current method of showing these summaries. With suitable artistic ingenuity, the individual points might even be shown inside each box, and they could replace (or be shown alongside) the “whiskers” beyond the box.
Some good and bad examples of two-group dimensional displays are shown in the sections that follow. Because some of the displays will be adversely criticized, the sources of publication are not identified, and the names of some of the original variables have been changed.
16.2.2 Conventional Displays of Individual Data Points
This subsection contains displays of various conventional arrangements of the points of data for two comparisons. Figures 16.16 and 16.17 show the points of data directly, without any indexes of spread. In Figure 16.17, the location of the means would probably be easier to identify if they were shown with lines rather than asterisks, if the legend called them means rather than averages, and if the vertical scale on the far left had more “tick marks.”
[Figure 16.16 plot not reproduced: individual points for the ratio of two variables (vertical axis 0.2 to 2.0) in the NF and F groups.]
FIGURE 16.16
Graph shows ratio of two variables in nonfailing (NF) and failing (F) conditions.
In Figure 16.18, a solid line has been added to show the mean, and dashed lines to show SE in each direction. As noted earlier, the main value of displaying SE alone is that a difference in means cannot be stochastically significant if the SE values overlap. If they do not overlap, however, a t value or confidence interval for the difference must still be calculated to demonstrate stochastic significance. For a very wide gap between the ranges of the two SE zones, the mental calculation described earlier (Section 13.3.2) is easier to do from the stated summary indexes than from trying to measure the size of the gap in a graph.
In Figure 16.19, a desirable line has been drawn near the bottom of each group to separate all the data points having values of 0. The arrangement allows each of those points to be seen without the crowding that occurs in the points above the line in the left column.
Figure 16.20 makes a laudable effort to show all the data points, but communication is impeded by the overlapping circles and by the many diverse symbols, which are generally used only once or twice. The display probably would have been much easier to read and interpret if each person were shown with a dot, group means with lines, and special individuals identified with a label on the graph and an arrow pointing to the dot.
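The remark about overlapping SE zones in the discussion of Figure 16.18 can be checked with a small Python sketch. It assumes the unpooled standard error of the difference, the square root of the sum of the two squared SEs, which is not necessarily the pooled version discussed in Section 16.2.1.3; the function names and the numbers are hypothetical.

```python
import math

def se_zones_overlap(mean_a, se_a, mean_b, se_b):
    """True if the zones mean +/- SE of the two groups touch or overlap."""
    return abs(mean_a - mean_b) <= se_a + se_b

def z_unpooled(mean_a, se_a, mean_b, se_b):
    """Z value for the difference in means, using the unpooled SE."""
    return abs(mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Zones that just touch: Z cannot exceed sqrt(2), about 1.41.
z_touching = z_unpooled(10.0, 1.0, 12.0, 1.0)

# Zones separated by a small gap: Z is still below 1.96.
z_small_gap = z_unpooled(10.0, 1.0, 12.5, 1.0)
```

Because the sum of two SEs is never more than √2 times the square root of the sum of their squares, overlapping zones imply Z ≤ 1.41 under this assumption, well short of 1.96. The second example shows why separated zones are not sufficient: the zones do not overlap, yet Z is only about 1.77.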
In Figure 16.21, a logarithmic scale is used to cover the wide range of values for data points shown in a single vertical array for three groups, with different symbols. The three groups are so clearly separated that no summary expressions are needed to emphasize the distinctions. In Figure 16.22, data points are also shown for three groups, together with indications of mean and SEM for each group. In the original report, no stochastic calculations were presented for the textual claim that the special amine was “consistently raised in uraemia,” but the separation of the data points indicates that no such calculations were needed. The display of SEM values thus seems redundant, particularly as no claims were made for stochastic distinctions between the “blind-loop” and “control” groups.
[Figure 16.17 plot not reproduced: individual points of minimum arterial diameter (mm, vertical axis 0 to 2) for Treatment A and Treatment B, marked p < 0.05.]
FIGURE 16.17
Minimum arterial diameter for subjects receiving treatments A or B.
[Figure 16.18 plot not reproduced: concentration of drug (ng/g of tissue) for the Control and Diseased groups, on a vertical axis from 0 to 500.]
FIGURE 16.18
Means and standard errors for two groups.
[Figure 16.19 plot not reproduced: magnitude of variable (vertical axis 0.1 to 0.7) for Smokers and Nonsmokers.]
FIGURE 16.19
Segregation of clusters of data points at value of 0.
[Figure 16.20 plot not reproduced: magnitude of variable (vertical axis 0 to 400) for 39 adult controls and for 7 pediatric controls, the patients, and their parents, with Patient 1 and Patient 2 labeled.]
FIGURE 16.20
Magnitude of measured variable in (at left) 39 normal adults and (at right) the 7 children with other neurologic diseases who were used as controls (open circles), including 2 following a ketogenic diet (circles framed by squares); Patient 1; his parents (solid circles); Patient 2; and her mother (circle with slash). Half-filled circles to the right of each group represent means, and the vertical scales are marked in increments of 1 SD from the mean, with the results for Patients 1 and 2 excluded. The level of the variable for Patients 1 and 2 was more than 3 SD below the mean for either group of control subjects.
Figure 16.23 is remarkable for displaying the summary results with medians and an inner 80-percentile zone rather than with Gaussian indexes. The authors13 say they wanted to avoid Gaussian assumptions and that “this type of presentation is easily understood by most people.” The research was used, however, to establish a zone of normal values in 704 “healthy asymptomatic aircrewmen.” The 10th and 90th percentile boundary points were chosen “arbitrarily … [as] conservative reference values for the response of healthy men to treadmill exercise” and also “to exclude individuals having subclinical conditions.” Despite the excellent clarity of the visual display, many clinicians will have doubts about establishing a “range of normal” by arbitrarily excluding 20% of apparently healthy persons.
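A median with an inner 80-percentile zone can be obtained without any Gaussian assumption, as in the Python sketch below. The function name and the data are hypothetical, and the percentile convention (the "inclusive" method of the standard library) is an assumption; the cited authors' exact method is not stated in the text.

```python
import statistics

def median_and_inner_80(values):
    """Return (median, 10th percentile, 90th percentile) of the values."""
    # n=10 yields the nine decile cut points; the first and last
    # bound the inner 80-percentile zone.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return statistics.median(values), deciles[0], deciles[-1]

# Hypothetical measurements: the integers 0 through 100.
values = list(range(101))
median, p10, p90 = median_and_inner_80(values)

# Proportion of these "healthy" values falling outside the zone.
outside = sum(1 for v in values if v < p10 or v > p90) / len(values)
```

For these made-up values the zone runs from 10 to 90 around a median of 50, and about 20% of the observations fall outside it. That proportion follows from the choice of boundaries alone, whatever the shape of the data.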
[Figures 16.21 and 16.22 plots not reproduced. Recoverable axis text: a logarithmic vertical scale from 1 to 1000 for urinary excretion of the variable (Figure 16.21); concentration of special amines on a vertical scale from 0 to 1000, with mean and S.E.M. markers, for the control, blind-loop, and uraemic groups (Figure 16.22).]
FIGURE 16.21
Urinary excretion of cited variable during water-deprivation tests in 10 children with histiocytosis and diabetes insipidus (triangles) and 21 children with histiocytosis but no symptoms of diabetes insipidus (circles). The hatched area represents the range in healthy children.
FIGURE 16.22
Special amine concentrations in control, blind-loop, and uremic subjects.
16.2.3 Bar Graphs
For large groups, which may have too many individual data points to be easily displayed, the results are often shown with bar graphs. The lengths of the bars show the means, and flanges are added to show dispersion (usually SE). Figure 16.24 gives an example of this tactic for comparing results, in two groups of about 48 animals marked with cross-hatched or black bars, at different times after special administration of insulin. The symbols “N.S.” and “p < 0.05” are added to show the absence or presence of stochastic significance at each time point of comparison. This type of graph offers little that cannot be discerned from the data summaries for each contrast, but a visual display may be valuable for showing the trend of results over time.
A particularly complex bar chart, which probably tries to do too many things, is shown in Figure 16.25. The lengths of the bars indicate the means, but the quartile and median values are appended, together with a special symbol (∆) for demarcating the increment in means. This same information could probably have been communicated, with greater artistic ease and visual effectiveness, in two contrasted box plots. Note that the half-H-spreads do not overlap, consistent with a statement in the text that the 73-day difference in medians is “highly significant with non-parametric tests.”
[Figure 16.23 plot not reproduced: three panels (heart rate; systolic blood pressure, mmHg; diastolic blood pressure, mmHg) showing control values and submaximal responses at 5, 10, and 15 percent grade at 3.3 mph, with the corresponding stages of the Bruce, Ellestad, and McHenry protocols marked along the bottom.]
FIGURE 16.23
Control and submaximal treadmill exercise measurement data are shown as median ( ) with 10th and 90th percentiles. Submaximal data are presented for the 5, 10 and 15 percent grade within the Balke-Ware treadmill protocol; the relationship between these three grades and other treadmill protocol stages is depicted at the bottom of the figure. The numbers of subjects with complete data for each of the measurements are 698 for control; 699, 700, and 503, respectively, for three submaximal responses. [Figure and legend taken from Chapter Reference 13.]
[Figure 16.24 plot not reproduced: paired bars at successive time points on a vertical axis from 50 to 110, with “N.S.” or “p < 0.05” marked at each comparison.]
FIGURE 16.24
Blood-glucose in diabetic animals treated with special insulin.
[Figure 16.25 plot not reproduced: bars for groups A and B on a vertical axis of relapse-free period (days), 0 to 360, annotated with quartile and median values and with ∆ = 80 for the increment in means.]
FIGURE 16.25
Average relapse-free period after treatment in groups A and B, shown in the height of columns with corresponding median values, and 1st and 3rd quartiles in 2,045 patients.
16.2.4 Undesirable Procedures
The next set of illustrations shows some particularly unattractive or undesirable procedures. In Figure 16.26, information that could readily have been understood with simple, direct summaries has been displayed in bar graphs. Beyond the basic redundancy, the graphs do not indicate the group sizes or the identity of the top flanges as either SD or SEM. The artist has also “volumized” the bars with an unnecessary third dimension. As a final undesirable touch, the legend erroneously states that the bar graphs are histograms.
Figure 16.27 is another volumized bar graph, which also omits showing group sizes. The unidentified “SED” at the far right of the graph is presumably standard error of the difference, explained in the text only as having been obtained “from the error mean squared of the respective ANOVA (analysis of variance) table.” Aside from the flaws in display, however, the data themselves may not have been appropriately analyzed. According to the text of the report, all the results were obtained in two sets of crossover studies of 6 patients. In one crossover, the patients went from a supine to a tilted position, and in each position they were untreated or received a particular treatment. The research was thus aimed mainly at contrasting the two sets of changes in a single group, but the results are presented as though four groups were being compared.
Figure 16.28 shows two of several similar structures displaying blood pressure and heart rate in response to five different treatments given to 14 patients. The original legend listed only the abbreviations used for the treatments and had no other identifying information. What at first seems to be an incomplete box plot for the blood pressure values could eventually, by careful reading of the text, be discerned as bars having mean systolic pressure on the top of each “box” and diastolic pressure on the bottom. The flanges represent standard deviations. In the lower display for heart rate, the circles are presumably central indexes, whose actual value is effectively obscured by the circles, and whose identity — as means or medians — is not listed in either the original legend or the original text.
[Figure 16.26 plot not reproduced: three-dimensional bars for the 100 mg and 150 mg t-PA doses, before and after t-PA, on a vertical axis of magnitude of clotting substance from 0 to 500; asterisks mark p < 0.01.]
FIGURE 16.26
Histograms of plasma levels of clotting substance before and after (4.2 ± 0.3 hours) tissue-type plasminogen activator (t-PA) at the two doses. Levels were significantly lower after treatment with the two doses (p < 0.01) but similar for each.
[Figure 16.27 plot not reproduced: three-dimensional bars for the untreated and treated conditions on a vertical axis of arterial norepinephrine concentration (pg/ml) from 0 to 200, with an “SED” bar at the far right.]
FIGURE 16.27
Bar graphs showing changes in arterial plasma norepinephrine concentration during 30° head-up tilting before and after treatment. Data are shown for patients in the supine position (open bars) and after 30° head-up tilting (shaded bars).
