9
Communication and Display of Univariate Data
CONTENTS
9.1 Construction of Visual Arrangements
9.1.1 Tables
9.1.2 Graphs
9.1.3 Charts
9.1.4 Labels
9.1.5 Legends
9.2 Summary Indexes
9.2.1 Nonplussed Minus
9.2.2 Sizeless Group
9.2.3 Unpropped Proportions
9.3 Alternative Symbols for Spread
9.4 Displays of Binary Data
9.4.1 Citations of Both p and q
9.4.2 Comparisons of p vs. q
9.5 Displays of Nominal Data
9.5.1 Pie Graphs
9.5.2 Dot Charts
9.5.3 Bar Charts
9.5.4 Hyperpropped Proportions
9.6 Displays of Ordinal Data
9.7 Overlapping Multi-Binary Categories
9.8 Gaussian Verbal Transformations
References
Exercises
Before the statistical horizon expands from one group to two, the last topic to be discussed is the different ways in which a group’s data can be communicated. A good picture is said to be worth a thousand words, but a good statistical portrait may sometimes be even more valuable, because the summary for a large group may reflect more than a thousand items of data. On the other hand, a few well-chosen words (or numbers) of description can often be better than an unsatisfactory portrait.
Statistical information is communicated with tables, graphs, charts, and individual summary indexes. The rest of this chapter describes some of the “lesions” to be avoided and some useful things to consider when results are shown for univariate data of a single group. The catalog of visual challenges in statistical display will be augmented later after the subsequent discussion of data for two groups.
9.1 Construction of Visual Arrangements
The visual displays of tables, graphs, and charts can be made consistent and clear with some basic rules, cited in the next few sections.
© 2002 by Chapman & Hall/CRC
9.1.1 Tables
Arranging tables creates a tricky problem because they are oriented differently from graphs. In a 2-dimensional graph, the examination usually goes in an upward diagonal direction (↗) when the X and Y axes unfold rightward and upward (→ and ↑). A two-way table has the opposite direction (↘), however, as the columns and rows move rightward and down (→ and ↓).
The cells of tables show frequency counts for the corresponding values of the data. These counts are commonly displayed for the categories of binary, ordinal, and nominal data, but dimensional data are usually shown as points in a graph. The main decisions about a table are the arrangement of orientation and sequence of categories.
9.1.1.1 Orientation — The difference in orientation can produce many inconsistencies, not only for two-way tables (which will be discussed later), but also for the one-way tables that display data for a single group. In general, because many readers are accustomed to comparing magnitudes in a vertical tier, a one-way table usually goes in a vertical direction as
Category    Frequency
A           21
B           82
C           38
D           67
rather than horizontally as
Category    A     B     C     D
Frequency   21    82    38    67
Editors usually prefer the horizontal arrangement, however, because it saves space.
9.1.1.2 Sequence of Categories — In the sequence of categories for binary variables, should presence precede absence or vice versa? For ordinal variables, should the sequence be mild, moderate, severe or severe, moderate, mild? And what should be done with nominal variables, which cannot be ranked? Should they be listed alphabetically, enabling easy identification of each category, or placed according to the magnitude of the frequency counts, allowing easy visualization of results? If frequency counts determine the orientation, should they go downward or upward in the sequence of categories?
The sequence of binary or ordinal categories within an arrangement is usually “dealer’s choice.” Remembering the analogous codes of 0/1 for binary data and 1, 2, 3, … for ordinal data, many readers prefer to see the sequence go from lower to higher “values” as
Absent, Present   or   Mild, Moderate, Severe
rather than
Present, Absent   or   Severe, Moderate, Mild
Either sequence is acceptable, but once chosen, it should be used consistently thereafter for all binary and ordinal variables. If the first tabulated ordinal variable is sequenced as 3, 2, 1, all subsequent ordinal sequences should usually go the same way, not as 1, 2, 3 for some variables and 3, 2, 1 for others.
The sequencing of nominal categories depends on the purpose of the table. If it is an exhaustive presentation of details, such as the population of each of the 50 United States, the categories should be arranged alphabetically (or by region) so that individual states (or regions) can be easily identified. If the table will be used to compare frequencies among several but not a large number of categories, the sequence can be listed according to magnitude. The largest frequencies are usually shown first. Thus, the previous alphabetical arrangement would become
Category    Frequency
B           82
D           67
C           38
A           21
9.1.2 Graphs
Graphs are intended to show the individual points of the data. A one-way dimensional graph, as illustrated earlier in Figure 5.2 and here in section 9.1.2.1, is almost always oriented in a vertical direction. The main decision is how to display the scale. The conventional rules for this decision are discussed in the next few subsections.
9.1.2.1 Increments in Scale — The increments on the scale should be large enough to show the distinctions suitably. If the increments are too small, tiny distinctions will be excessively magnified into misleading comparisons. For example, the three numbers 103.1, 102.6, and 103.8 are all relatively close to one another. They might become quite distant, however, if the graph had increments of 0.1 unit, so that the separations become 5 units for the first and second numbers, and 7 units for the first and third. The left side of the vertical graph in Figure 9.1 shows the first distinction; the right side shows the second. If the graphic increments are too large, important distinctions may become trivialized. Thus, the numbers 4.3, 2.7, 1.8, and 3.1 might all seem to be in essentially the same place of the coarse graphic scale in Figure 9.2.
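The arithmetic behind these magnifications can be checked with a short sketch. The function name `separation_in_increments` is hypothetical, and the coarse increment of 50 units is an assumed value chosen only for contrast with the fine increment of 0.1 unit.

```python
def separation_in_increments(a, b, increment):
    """Return how many scale increments separate two plotted values."""
    return abs(a - b) / increment


first, second, third = 103.1, 102.6, 103.8

# Fine scale (increments of 0.1 unit): small differences are magnified.
fine_12 = round(separation_in_increments(first, second, 0.1), 6)
fine_13 = round(separation_in_increments(first, third, 0.1), 6)
print(fine_12, fine_13)

# Coarse scale (assumed increments of 50 units): the same difference
# occupies only a small fraction of one increment.
coarse_12 = round(separation_in_increments(first, second, 50), 6)
print(coarse_12)
```

Rounding to six decimals is used only to suppress floating-point noise in the subtraction.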
FIGURE 9.1
Excessively small (on left) and large (on right) scales for same three numbers. [Left scale runs from 0 to 150; right scale runs from 102 to 104.]
FIGURE 9.2
Four numbers appearing in essentially the same location of a coarse scale. [Scale runs from 0 to 150.]
9.1.2.2 Origin — The origin (0 point) of the scale should almost always be shown. This demand will avoid the misleading magnifications produced on the right side of Figure 9.1 when the origin was omitted and the scale ranged from 102 to 104. The 0 point can be avoided in situations where its display would be silly, as when one of the variables indicates calendar years, such as 1980, 1985, 1990, …, or when a 0 value does not appear in the data, such as height.
9.1.2.3 Changes in Scale — If the points have a wide spread but several “modal” zones, the results may be bunched in an unsatisfactory way for each zone. One way to avoid this problem is to change the scale at different places in the graph. The changes should always be shown, however, with a clearly demarcated break in the identifying axis. Thus, the six points 129, 135, 140, 2051, 2059, and 2048 might appear in two ways as shown in Figure 9.3.
This same break-in-scale approach can be used whenever the data cover a wide range. To change scale in mid-graph without a clear identification of the change, however, is one of the cardinal sins of data display.
FIGURE 9.3
Crowded display on left converted to clear display with changes in axis on right. [Left axis runs from 0 to 2500; right axis is broken to show 0, 120 to 140, and 2045 to 2055.]
9.1.2.4 Transformations of Scale — Rather than breaking the scale for a wide range of data, the investigator can convert the results into some other continuous scale, such as the popular logarithmic transformation. For example, the six numbers of the foregoing section could easily be transformed to the base-10 logarithms of 2.11, 2.13, 2.15, 3.312, 3.314, and 3.311. The results could then be readily shown in a scale whose demarcations extend from 0 to 4.
Alternatively, the data could be plotted directly on graph paper that is calibrated in logarithmic units and that will show the same distinctions, without the need for calculating logarithmic values. In this type of scale, the value of 0 will represent 1, and the other values will be 1 (for 10), 2 (for 100), 3 (for 1000) and 4 (for 10,000).
Two important details to remember about logarithmic transformations are that they may sometimes be used merely to display data, not necessarily to analyze it, and that the logarithms are sometimes constructed with a “base” other than 10. Logarithms with base 2 can be used for showing antibody values; and “natural” logarithms, abbreviated with ln and constructed with e = 2.7183 as the base, are commonly used in certain types of multivariable analysis. In a graph scaled with natural logarithms, the correspondence of ln and original values would be 0 for 1, 1 for 2.72, 2 for 7.39, 3 for 20.1, 4 for 54.6, 5 for 148.4, etc. (This type of scale seems to have been used for values of whole-body glucose disposal in Figure E.19.4 of the Exercises in Chapter 19.)
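The transformed values cited in this section can be reproduced with the standard library. The sketch below is illustrative only; the variable names `points`, `log10_values`, and `ln_ticks` are hypothetical.

```python
import math

# The six points from the break-in-scale example.
points = [129, 135, 140, 2051, 2059, 2048]

# Base-10 logarithms, which fit on a scale demarcated from 0 to 4.
log10_values = [math.log10(x) for x in points]
for x, lg in zip(points, log10_values):
    print(f"{x:>5}  log10 = {lg:.3f}")

# On a natural-log scale, the tick at ln = k corresponds to the
# original value e**k.
ln_ticks = {k: math.exp(k) for k in range(6)}
for k, original in ln_ticks.items():
    print(f"ln = {k}  original value = {original:.1f}")
```

The same loop works for base 2 by substituting `math.log2`, which may be convenient for the antibody values mentioned above.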
9.1.3 Charts
The word chart can be used for any pictorial display of statistical data that does not involve the cells of a table or points of a graph. The pictogram charts that can be used to illustrate (or distort) comparisons will be further discussed in Chapter 16. Most other charts appear as bar graphs.
9.1.3.1 Arrangement of Bars — The frequency counts or relative frequencies of categories in a univariate group of data are often displayed in bar charts (also called bar graphs).
The bars are usually oriented vertically, showing the height of each corresponding magnitude for the category listed at the base of the bar. A histogram is a bar chart in which the horizontal categories are the ordinalized intervals of dimensional data. For example, age might be shown in intervals of < 10, 10–19, 20–29, 30–39, etc.
In any bar chart, all the bars should have equal widths, but the basic width may vary to permit suitable labeling. For example, the bars may be narrow, as in the left of Figure 9.4, or wide as on the right.
FIGURE 9.4
Narrow bars on left, enlarged on right to permit wider labels. [Bar labels: A, B, C on left; AUSTRIA, BRAZIL, CANADA on right.]
For dimensional and ordinal data, the bars can touch contiguously to represent the ranked arrangement. Thus, the bars might be shown as in Figure 9.5, but many artists and investigators prefer to leave small spaces between adjacent bars. For unranked nominal data categories, the bars should be separated by spaces as in the foregoing A, B, C or Austria, Brazil, Canada illustrations in Figure 9.4. In a histogram, the bars always touch one another, but the bars at the two ends may be omitted if the extreme intervals are unbounded, as in the arrangement <10, 10–19, 20–29, 30–39, 40–49, …, 70–79, ≥ 80. If the extreme intervals are bounded, with widths that differ from the equal-sized intervals in the interior, the outside intervals can have unequal widths, but the height of each bar should then be adjusted to represent the average value for the entire interval.
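One possible reading of the height adjustment for unequal intervals is to rescale each count to the width of the interior intervals. The sketch below rests on that assumption; the function name `adjusted_height` and the counts are hypothetical and do not come from the text.

```python
def adjusted_height(count, interval_width, standard_width):
    """Average count per standard-width interval (assumed interpretation)."""
    return count * standard_width / interval_width


# Hypothetical example: interior intervals are 10 years wide, and the
# last bounded interval, 80-99, is 20 years wide.
interior_height = adjusted_height(30, interval_width=10, standard_width=10)
outer_height = adjusted_height(12, interval_width=20, standard_width=10)
print(interior_height, outer_height)
```

With this convention, the area of each bar rather than its height alone stays proportional to the frequency count.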
FIGURE 9.5
Contiguous arrangement of three ordinal categories in bar chart. [Bar labels: MILD, MODERATE, SEVERE.]
9.1.3.2 Artistic Abuses — Because each category has a single associated magnitude, the length of the bar is its only item of numerical communication. For this purpose, as discussed later, the bars could easily be replaced by simple lines. As an esthetic custom, however, the bars are usually made wide enough to allow suitable labels, but major distortions can be produced by esthetic manipulations done to make things look “interesting” or “pretty.” For example, in certain pictograms, the bars may be replaced by drawings of persons, places, or things that have the same heights as the bars; but the different widths of the associated objects convert the visual image from a length to an area, thus creating deceptive impressions by magnifying the actual differences in size.
A particularly lamentable custom is “volumizing,” a tactic that converts the two-dimensional bar to a three-dimensional post. Since the second dimension was needed only for labeling, the third dimension is completely unnecessary. Its use is an act of “marketing,” not scientific communication. (Examples of these abuses for contrasts of groups will be shown later in Chapter 16.)
9.1.4 Labels
Each axis of a table, graph, or bar chart must be clearly labeled or explained in each direction. The statements for the vertical magnitudes in graphs or bar charts are easier to read if presented horizontally as
PROPORTIONS
rather than in vertical contiguity, with the letters stacked one per line (P, R, O, P, O, R, T, I, O, N, S), or in vertical re-orientation, with the whole word rotated to run along the axis.
The vertical labels are conventional and are usually preferred because they allegedly save space, but a well-placed horizontal label will not increase space and may actually reduce the total width of the graph or chart.
9.1.5 Legends
Each visual display — table, graph, or chart — should have a legend that describes the contents in a “stand-alone” manner, i.e., the reader should be able to understand the basic results without having to refer to the text. The legend need not repeat the details of criteria for categories such as mild, moderate, and severe, but should always indicate the name of the variable whose categories are being displayed.
9.2 Summary Indexes
For the reader as well as the author, the statistical summary of a group should always indicate the size of the group and the spread of data, not just the central index. The absence of the additional information for spread and size leads to “orphan indexes,” whose provenance is unknown. The two main orphans in the statistical “family” are unidentified indexes of spread and groups of unknown size.
9.2.1 Nonplussed Minus
An orphan index of spread occurs when dimensional data are summarized in an expression such as 37.8 ± 1.9, without mentioning whether the 1.9 is a standard deviation or standard error. To avoid the “nonplussed minus” lesion, the entity that appears after the ± sign should always be identified as a standard error or standard deviation. A minor problem occurs when authors use the ± symbol to report “SE = ± 1.9” or “SD = ± 31.7,” thus erroneously implying that the standard error or deviation (which is always positive) might be a negative number.
Arguments have been offered¹ that the ± display be abandoned and replaced by expressions such as 37.8 (SE 1.9), but many editors resist the proposal because it seems to involve extra spaces of type when ± 1.9 is replaced by (SE 1.9). Trying to conserve space whenever possible, such as by eliminating periods after abbreviations (making mm. become mm and Dr. become Dr), editors are unhappy about abbreviations that require more space. On the other hand, no ambiguity is created when mm. becomes mm, but the absence of an identifying SD or SE makes the solo 1.9 become a malcommunicated menace. If the preferred (SE 1.9) or (SD 20.7) notation is still regarded as too long, one space could be saved by eliminating the parentheses and using ± SD 1.9. The statement could then be 37.8 ± SD 1.9.
9.2.2 Sizeless Group
Even if the ± entity is identified as SE or SD, however, the number of members in the group is often omitted. If SE is given, stability of the mean can readily be estimated from the coefficient of stability, as SE/X̄. Thus, if 1.9 in the foregoing example is SE, the value of 1.9/37.8 = .05 indicates that the mean itself is relatively stable. If SD is cited but not n, however, stability of the mean cannot be determined for the sizeless group.
On the other hand, when the mean seems stable from the X̄ ± SE values, a reader who does not know the size of the group cannot determine whether the “stable” mean is a suitable representative of the data. Thus, if n = 279 and SE = 1.9, the standard deviation in the foregoing data will be 1.9 × √279 = 31.7. The coefficient of variation will be 31.7/37.8 = .84, an excessively high value, indicating that the distribution is too eccentric (or diffuse) for the mean to be a satisfactory central index.
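The three quantities in this example can be recomputed from the mean, SE, and n. This is a minimal sketch that assumes the usual relation SE = SD/√n; the function names are hypothetical.

```python
import math


def coefficient_of_stability(mean, se):
    """SE divided by the mean."""
    return se / mean


def sd_from_se(se, n):
    """Recover SD from SE, assuming SE = SD / sqrt(n)."""
    return se * math.sqrt(n)


def coefficient_of_variation(mean, sd):
    """SD divided by the mean."""
    return sd / mean


mean, se, n = 37.8, 1.9, 279
stability = coefficient_of_stability(mean, se)
sd = sd_from_se(se, n)
cv = coefficient_of_variation(mean, sd)
print(f"SE/mean = {stability:.2f}, SD = {sd:.1f}, CV = {cv:.2f}")
```

Without n, the second step cannot be carried out, which is the point of the "sizeless group" complaint.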
If not immediately adjacent to the central index and index of spread, the size of the group should always be cited somewhere in the neighboring text. A particularly important aspect of size is the “effective” number of persons contained in the analysis, not the original number in the group. For example, in a study of 279 people, the indexes of central location and spread may actually have been calculated with n = 207, because the variable had 72 items with unknown values. Failure to cite the “effective” rather than merely the original size of the group is a violation of “truth in advertising.”
9.2.3 Unpropped Proportions
A different manifestation of the sizeless-group lesion occurs when proportions or percentages are recorded merely as a central index, such as .37 or 37%, without an indication of either numerator or denominator. Because the variance of binary data with the proportion p is p(1 − p), the reader can immediately determine variance, but is deprived of a crucial prop of information: the group size needed to discern the stability or standard error of the proportion.
Although the denominator can always be cited in a format such as .37 (82), the easiest-to-understand presentation would show both numerator and denominator as .37 (30/82). This arrangement, which puts the central index first, seems preferable to the alternative arrangement, 30/82 (.37).
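The preferred citation format, together with the quantities that the denominator makes available, might be produced as follows. The sketch assumes the standard formula √(p(1 − p)/n) for the standard error of a proportion; the function name `describe_proportion` is hypothetical.

```python
import math


def describe_proportion(numerator, denominator):
    """Return p, its variance p(1 - p), its standard error, and a citation."""
    p = numerator / denominator
    variance = p * (1 - p)                   # variance of the binary data
    se = math.sqrt(variance / denominator)   # needs the group size
    # Central index first, then numerator/denominator, e.g. ".37 (30/82)".
    label = f"{p:.2f}".lstrip("0") + f" ({numerator}/{denominator})"
    return p, variance, se, label


p, variance, se, label = describe_proportion(30, 82)
print(label)
print(f"variance = {variance:.2f}, standard error = {se:.3f}")
```

A reader given only .37 could compute the variance but not the standard error, because the last step requires the denominator.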
9.3 Alternative Symbols for Spread
In deciding whether to display SE or SD, the main question is: What does the reader want to know or what does the author want to show? If the goal is to determine the stability of the mean, SE is preferred, because it easily leads to the evaluation of SE/X̄.
On the other hand, because the more common scientific goal is to communicate evidence rather than inference, the most desirable information is the spread of the data. For this purpose, the citation of ± SD is unsatisfactory for the several reasons noted throughout Section 5.3.2. (A spread of one standard deviation around the mean demarcates a relatively useless Gaussian inner zone of about 68%; and Gaussian descriptive summaries will be unwarranted and possibly misleading for the many sets of medical measurements that do not have Gaussian distributions.)
For example, a Gaussian 95% zone for the data in Section 9.2.2 would be approximated as 37.8 ± (2)(31.7) = 37.8 ± 63.4. The zone would extend to an upper level of 101.2, but its lower level would be a negative value of −25.6, which might be impossible. This problem can be avoided by reporting medians and inner percentile ranges rather than means and standard deviations. The median and ipr95 zone for the foregoing data might be cited in symbols such as 32[2-96] or [2; 32; 96]. The citation may occupy a bit more space than X̄ ± SD, but decimal points may be saved if the original data are expressed in integer values.
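The contrast between the two styles of citation can be sketched in a few lines. The function names `gaussian_zone` and `ipr_citation` are hypothetical, and the multiplier of 2 is the approximation used in the text rather than the more exact 1.96.

```python
def gaussian_zone(mean, sd, multiplier=2):
    """Approximate Gaussian 95% zone as mean +/- multiplier * SD."""
    half_width = multiplier * sd
    return mean - half_width, mean + half_width


def ipr_citation(low, median, high):
    """Format a median with its inner percentile range as median[low-high]."""
    return f"{median}[{low}-{high}]"


lower, upper = gaussian_zone(37.8, 31.7)
print(f"Gaussian zone: {lower:.1f} to {upper:.1f}")

# The percentile values 2, 32, and 96 are those cited in the text.
print(ipr_citation(2, 32, 96))
```

A negative lower bound from `gaussian_zone` is a quick warning that the Gaussian summary does not suit data that cannot fall below zero.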
Another example can come from the 56 integers that constitute the data set of Table 3.1, where X̄ = 22.73 and s (calculated with n − 1) is 7.69. The median in this instance falls between the 28th and 29th ranks, both of which have values of 21. Calculated with the proportional method, the 2.5 percentile occurs at 11 and the 97.5 percentile occurs at 41. The data set could thus be summarized with X̄ ± s as 22.73 ± 7.69 or with the median and ipr95 as 21[11-41]. The second citation actually occupies less space than the first unless you round the decimals to 22.7 ± 7.7.
The preferred percentile method of descriptive citation will probably not come into general usage for many years, however, until the increased use of computer-intensive statistics begins to replace the parametric Gaussian paradigm.
9.4 Displays of Binary Data
A graph or chart is seldom necessary or even desirable for binary data. Nothing is gained by showing a large array of 1’s and 0’s (or whatever binary feature is being considered). Because the central index and spread of data are easily summarized with a single binary proportion, p = r/n [or q = (n − r)/n], a simple citation of this proportion and its two constituent numbers will suffice to show what is happening.
9.4.1 Citations of Both p and q
In some presentations, the values of p and q are separated for individual citations such as Success Rate = 25% (18/72) and Failure Rate = 75% (54/72). The two entities are redundant because the complement of any binary proportion is readily apparent. If the success rate is 25%, the failure rate must be 100% − 25% = 75%.
The only possible need for listing more than one proportion occurs when the binary central index represents a compressed summary for nominal or ordinal rather than binary data. For example, suppose we want a single central index to summarize the post-therapeutic state in 72 people whose results are: improved, 18; no change, 44; and worse, 10. The largest single value would be the proportion 61% (44/72) for the no-change group. If this result is regarded as too uninformative, the summary might be offered with two proportions for improved, 25% (= 18/72) and worse, 14% (= 10/72). To avoid the redundant citation of 72 in both denominators, the summary might be stated as follows: “Of 72 patients, 25% improved and 14% were worse. All others were unchanged.”
Methods of displaying ordinal and nominal data are further discussed in Sections 9.5 and 9.6.
9.4.2 Comparisons of p vs. q
In certain portraits, the redundant complementary proportions are placed next to one another on a bar graph, as in Figure 9.6, which shows the percentage of rapid and slow acetylator phenotypes in each of three HIV stages (marked A,B,C). This type of display makes the two results in each stage look as though they arose from contrasted groups, rather than from a single group. All the expense and space of the picture for each stage communicates less effectively than single statements, such as Rapid Acetylator Phenotype in Stage HIV C: 55% (11/20).
FIGURE 9.6
Redundant bar graph showing proportions for both complementary percentages in each group. [Vertical axis: percentage of patients, 0 to 100; horizontal axis: HIV A, HIV B, HIV C.] Original legend: “Distribution of rapid acetylator phenotypes (solid bars) and slow acetylator phenotypes (open bars) by HIV stages. Percentages are calculated for individual disease stages; absolute numbers are given on top of each bar.” (Figure and attached legend taken from Figure 2 of Chapter Reference 2.)
9.5 Displays of Nominal Data
Nominal data are particularly challenging to summarize and display. Because the data cannot be ranked, quantitative values cannot be calculated for a central index and index of spread. The only possible single central index is a modal or compressed binary proportion. For example, the illustrative set of nominal data in Section 9.1.1.1 had the following frequency counts for 208 persons: A, 21; B, 82; C, 38; D, 67. A single “plurality” central index could be obtained as the modal value = 82/208 = 39% for Category B. A “majority” central index could be produced by the compression of categories C and D to form the binary proportion 105/208 = 50.5%.
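Both central indexes follow directly from the frequency counts. The sketch below is only an illustration; the variable names are hypothetical, and the choice to compress categories C and D is the one made in the text.

```python
counts = {"A": 21, "B": 82, "C": 38, "D": 67}
total = sum(counts.values())

# "Plurality" index: the proportion for the modal category.
modal_category = max(counts, key=counts.get)
plurality = counts[modal_category] / total

# "Majority" index: categories C and D compressed into one binary proportion.
compressed = counts["C"] + counts["D"]
majority = compressed / total

print(f"{modal_category}: {counts[modal_category]}/{total} = {plurality:.0%}")
print(f"C + D: {compressed}/{total} = {majority:.1%}")
```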
Visually, the frequencies or relative frequencies of nominal categories are often shown in bar charts, but the horizontal sequence of bars implies (incorrectly) that the categories are ranked. Accordingly, nominal data can be displayed in two other formats: pie graphs and dot charts.
9.5.1 Pie Graphs
The circular outline of a pie-graph can be divided into wedges that correspond to proportions of the nominal categories. If each category has the relative frequency, pᵢ, its angular slice is pᵢ × 360°. For example, in the preceding data, the proportions are A, .10; B, .39; C, .18; and D, .32. The respective angles in the “pie” would be A, 36°; B, 140°; C, 65°, and D, 115°. The pie graph is usually constructed in a clockwise arrangement, starting at 12 o’clock, and the categories are usually sequenced with the largest coming first. Thus, the foregoing data would produce the pie graph shown in Figure 9.7.
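The angles cited above can be reproduced by rounding each proportion to two decimals before multiplying by 360. This is a sketch under that assumption; the function name `pie_angles` and its `decimals` parameter are hypothetical.

```python
def pie_angles(counts, decimals=2):
    """Angle in degrees for each category: rounded proportion times 360."""
    total = sum(counts.values())
    angles = {}
    for category, count in counts.items():
        p = round(count / total, decimals)  # proportion as it would be cited
        angles[category] = round(p * 360)
    return angles


counts = {"A": 21, "B": 82, "C": 38, "D": 67}
angles = pie_angles(counts)

# Sequence the wedges with the largest first, as for a clockwise pie
# that starts at 12 o'clock.
ordered = sorted(angles.items(), key=lambda item: item[1], reverse=True)
print(ordered)
```

Because the two-decimal proportions sum to .99, these angles sum to 356° rather than 360°; calling `pie_angles(counts, decimals=10)` keeps essentially unrounded proportions and closes the circle.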
An extraordinary pie graph, which may be among the worst ever constructed, is shown in Figure 9.8, for sources of income of a charity organization in the U.K.³ The basic idea of a pie graph is abandoned because each category is given about the same angular slice, and the circular shape has been converted to a quasi-ellipse for portraying a third dimension. The extra dimension is height of the slices of pie, which act “volumetrically” to show the distinctions in magnitude.
FIGURE 9.7
Pie graph for data in Section 9.5.1. [Wedges labeled A, B, C, D.]
FIGURE 9.8
Pie-graph (? Edam-cheese graph) needlessly cast into three dimensions. [Slice labels: Legacies; Investment income; Net shops income; Donations/covenants; Groups of friends/other regional activities; Affinity cards; Card company; Miscellaneous. Figure taken from Chapter Reference 3.]
A useless pie graph is shown in Figure 9.9. Because only two categories are included, the authors are displaying both p and q for the same single binary proportion. Instead of appearing as two magnitudes in a redundant bar graph (as in Figure 9.6), however, these results are shown in a redundant pie graph.
9.5.2 Dot Charts
Pie graphs are difficult to draw, requiring a protractor or special paper for the angles, and are difficult to interpret, because most persons are not accustomed to assessing and comparing angular wedges. Although pie graphs continue to appear in published literature, the dot chart⁵ is a preferred replacement that is easier to draw and interpret.
In a dot chart, the scale of values is placed horizontally and the categories are placed in rows that can be arranged in descending magnitudes or some other appropriate order (such as alphabetical). The dot chart for the data in Figure 9.7 is shown in Figure 9.10.
In addition to its other virtues, a dot chart has the advantage of easily showing magnitudes of 0, which cannot be readily displayed with bar charts or pie graphs.
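A dot chart is simple enough to be drawn in plain text. The following is a rough sketch rather than a prescribed layout; the function name `dot_chart`, the `width` parameter, and the characters used for the guide line and the dot are all assumptions.

```python
def dot_chart(counts, width=50):
    """Return text rows, one per category, in descending order of frequency.

    The horizontal scale runs from 0 (at the bar "|") to 1.0 (at column
    `width`); each "o" marks the relative frequency of its category.
    """
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    for category, count in ordered:
        position = round(count / total * width)
        rows.append(f"{category:>2} |" + "." * position + "o")
    return rows


rows = dot_chart({"A": 21, "B": 82, "C": 38, "D": 67})
for row in rows:
    print(row)
```

A category with a frequency of 0 still receives a row, with its dot placed directly on the axis, which illustrates the advantage noted above.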
FIGURE 9.9
Redundant pie graph for “Relationship between clinical and bone biopsy diagnoses of osteomyelitis.” [Pie titled “Bone Biopsy-Proven Osteomyelitis (n=28)” with two slices, 32% (n=9) and 68% (n=19); slice labels: Clinically Suspected Osteomyelitis, Clinically Unsuspected Osteomyelitis. Figure taken from Chapter Reference 4.]
FIGURE 9.10
Dot chart corresponding to pie graph of Figure 9.7. [Rows in descending order: B, D, C, A; horizontal scale from 0 to 1.0 in steps of .20.]
9.5.3 Bar Charts
Although a univariate distribution of unranked nominal categories should not be shown as a graph, the principle is regularly violated when the categories are ranked according to their relative frequencies and then displayed in the form of a bar graph or chart. Figure 9.11 shows an example of this approach, used by the American Statistical Association (ASA) to display⁶ percentages for the categories of its “continuing education expenditures.”
FIGURE 9.11
1992 Budgeted ASA continuing education expenditures. [Vertical axis: Percentage of Total CE Expenditures, 0 to 50; horizontal axis: Expenditure Categories; bar values in descending order: 45, 12, 12, 8, 7, 4, 4, 3, 2, 2. Legend and figure derived from Chapter Reference 6.]
