
Figure 16.29 is a counterpart, in two groups, of the useless display of complementary proportions shown earlier in Figure 9.1 for one group. The information for both groups could have been communicated more simply and directly with a straightforward comparison of the constituent numbers.

Figure 16.30 has been nominated for an “Oscar” in malcommunication. The only virtue of the figure, its display of data points, is negated by ineffective or inadequate marks and labels. The upper and lower bars for each box show only the range for each group. Two different measurement scales are distinguished by * and ** symbols, which are confusing because they are normally used to signify magnitude of P values rather than scales of measurement. The unidentified dashed horizontal lines must be discerned, from the text, as means. Sample sizes are not identified and must be either counted from the dots or searched in the text.

Figure 16.31 at first seems to have many merits. It shows box plots and cites actual values for pertinent boundaries on each plot. The legend, however, is confusing in describing an “NL” (presumably “normal”) hatched bar for “blood urea nitrogen,” although the graph refers to “serum creatinine.” Furthermore, the meaning of the hatched NL bar is not clear. Does the “NL” symbol indicate a range of normal, or does its hatched bar represent a box plot analogous to the others on that graph? A more cogent statistical problem, however, is that the data in this crossover study have been displayed as though two groups were involved. A more effective portrait would have shown the results of change for individual patients.

Figure 16.32 asks the reader to compare histograms for the distribution of atypical lobules per breast in two groups. Despite the merit of showing the actual distributions, the portrait has no value in helping summarize the data for interpretation. The associated table of data, copied here from the original publication, makes relatively little contribution beyond listing a relatively useless “average” (presumably mean) value for the eccentric distributions. The original discussion of the statistical analysis and your invitation to improve things are contained in Exercise 16.2.

FIGURE 16.28
Blood pressure and heart rate during each treatment regimen. [Figure not reproduced. Panels show blood pressure (mm Hg) and heart rate (bpm) for Treatment A and Treatment B; * p < 0.01.]

16.3 Displaying Binary Variables

The frequency counts for a binary variable in two groups form a 2 × 2 table, and each group’s results are easily summarized as a binary proportion. The main visual decision is how to orient the rows and columns of the table itself.

16.3.1 Orientation of 2 × 2 Table

If the research has the cause–effect architectural structure of a randomized trial, observational cohort, or etiologic case-control study, the most scientific arrangement places the alleged causal agents in the rows and the outcomes in the columns. This format is used because readers are accustomed to looking at horizontal scientific symbols for the sequence of cause → effect. Table 16.2 is organized in this “longitudinal” direction of observation and reasoning. In longitudinal studies, the rows indicate the denominator groups, n1 and n2, that have been exposed or nonexposed to the alleged “cause”; and the columns show frequencies for the outcome fates, f1 and f2, for the forward direction of observation.

© 2002 by Chapman & Hall/CRC

FIGURE 16.29
Percentages of black women and white women by type of menopause. [Figure not reproduced. Stacked percentage bars (0 to 100%) for Blacks and Whites, with segments for surgical and natural menopause; segment labels as extracted: 77 and 23 for Blacks, 52 and 48 for Whites.]

FIGURE 16.30
Absolute number and percentage of basophils in controls and in children with arthritic conditions. [Figure not reproduced. Groups: Still's disease, other types of arthritis, non-hospitalized controls, hospitalized controls. Axes: absolute basophils/cu mm (10 to 130) and percentage basophils (0.1 to 1.30). Legend: * Absolute Count, ** Percentage.]

FIGURE 16.31
Changes in renal function induced by treatment in patients receiving immunotherapy for advanced cancer. The arrow indicates that the maximum level for serum creatinine was greater than 7 mg/dL. Hatched bar shows blood urea nitrogen. [Figure not reproduced. Box plots of serum creatinine (mg/dl) at Baseline and Peak, N = 99, with an “NL” hatched bar; boundary values as extracted: Max 12.6, 4.4, 3.1, 1.9, 1.9, 1.2, 1.0, 0.9, 0.4.]

FIGURE 16.32
Reproduction of histograms and tabular data for a comparison of “atypical lobules per breast” in two groups. [Histograms not reproduced. Panels: number of atypical lesions per breast in (A) random routine autopsy and (B) cancer-associated breasts. Horizontal axis: total number of atypical lobules per breast; vertical axes: absolute number of breasts and percent of breasts.]

Comparison of 119 autopsy and cancer-associated breasts.

Item                                Autopsy    Cancer-associated
Number of breasts                   67         52
Average age (years)                 63.47      60.80
Age range (years)                   25-96      28-89
Average number of AL per breast     9.96       37.40
Range in number of AL per breast    0-92       0-225

In an etiologic case-control study, however, the groups are chosen according to presence or absence of the outcome event. They are then “followed backward” to determine previous exposure to the alleged “cause.” To keep a scientific format, the table can still have the same horizontal orientation with “causes” in the rows; but the n1 and n2 totals for the “sampling” are now listed in the columns for the selected outcome groups that are the appropriate denominators. The arrangement is shown in Table 16.3.

16.3.2 Individual Summaries

Because of structural differences in the research, the n1 and n2 denominators for longitudinal and case-control studies have different locations in the tables and must be expressed with different binary proportions.

For the longitudinal data of Table 16.2, n1 and n2 are marginal totals for the rows. The proportions of success, which would be p1 = a/n1 and p2 = c/n2, could then be compared as an increment p1 − p2 or as a “risk” ratio, p1/p2. (If “risk of success” seems like an inappropriate phrase, it can be replaced by the risk ratio for failure, expressed as q2/q1.)
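The arithmetic for a longitudinal table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`LongitudinalTable`, `longitudinal_summary`) are hypothetical, and the cell counts are borrowed from the example data of Figures 16.33 and 16.34, treated here as a cohort. The ratios are undefined when p2 or q1 is zero, which the sketch does not guard against.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongitudinalTable:
    """Cell counts of a longitudinal 2 x 2 table, laid out as in Table 16.2."""

    a: int  # alleged cause present, outcome present (success)
    b: int  # alleged cause present, outcome absent (failure)
    c: int  # alleged cause absent, outcome present (success)
    d: int  # alleged cause absent, outcome absent (failure)

    @property
    def n1(self) -> int:
        """Row total for the group with the alleged cause."""
        return self.a + self.b

    @property
    def n2(self) -> int:
        """Row total for the comparison group."""
        return self.c + self.d


def longitudinal_summary(t: LongitudinalTable) -> dict:
    """Return p1, p2, their increment, and the two risk ratios named in the text."""
    p1 = t.a / t.n1          # proportion of success in the first row
    p2 = t.c / t.n2          # proportion of success in the second row
    q1, q2 = 1 - p1, 1 - p2  # corresponding proportions of failure
    return {
        "p1": p1,
        "p2": p2,
        "increment": p1 - p2,
        "risk_ratio": p1 / p2,
        "failure_risk_ratio": q2 / q1,
    }


# Illustrative counts: a=20, b=10, c=15, d=46.
summary = longitudinal_summary(LongitudinalTable(20, 10, 15, 46))
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Keeping the row totals as derived properties, rather than stored fields, means the denominators can never disagree with the cells.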

For the case-control study shown in Table 16.3, the analogous calculation of a/f1 and c/f2 in the rows is forbidden, as noted earlier (in Section 10.7.3), because patients were not chosen according to their exposure or nonexposure. If two proportions in this table are to be compared directly, they would have to refer to antecedent exposure, as e1 = a/n1 and e2 = b/n2. Because these proportions have relatively little intuitive meaning and cannot be used to express risk, they are seldom given much attention. Instead, the odds ratio, ad/bc, is usually cited as a single index of contrast, with the hope that it will adequately approximate the risk ratio. (The topic, mentioned in Chapter 10, is further discussed in Chapter 17.)
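For the case-control layout, the same four cells are summarized against the column totals instead. The following Python sketch is an illustration only: the function name `case_control_summary` is hypothetical, and the counts are again the example values of Figures 16.33 and 16.34. The odds ratio is undefined if b or c is zero, a case the sketch leaves unhandled.

```python
def case_control_summary(a: int, b: int, c: int, d: int) -> dict:
    """Summarize a case-control 2 x 2 table laid out as in Table 16.3.

    a, b: exposed cases and exposed controls
    c, d: non-exposed cases and non-exposed controls
    """
    n1 = a + c  # column total: all cases
    n2 = b + d  # column total: all controls
    return {
        "e1": a / n1,                    # proportion of cases previously exposed
        "e2": b / n2,                    # proportion of controls previously exposed
        "odds_ratio": (a * d) / (b * c)  # cross-product ratio ad/bc
    }


result = case_control_summary(20, 10, 15, 46)
print(f"e1 = {result['e1']:.2f}, e2 = {result['e2']:.2f}, "
      f"odds ratio = {result['odds_ratio']:.1f}")
```

The row proportions a/f1 and c/f2 are deliberately not computed, because the text rules them out for this design.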


TABLE 16.2
Architectural Structure of a Longitudinal “Cause–Effect” 2 × 2 Table

                                   Alleged Outcome
                                   Present      Absent
Alleged Cause                      (Success)    (Failure)    Total
Present (Active Treatment)         a            b            n1
Absent (Comparative Treatment)     c            d            n2
Total                              f1           f2           N

TABLE 16.3
Architectural Structure of Etiologic Case-Control Study

                          Alleged Outcome
                          Present            Absent
Alleged Cause             (Diseased Case)    (Nondiseased Control)    Total
Present (Exposed)         a                  b                        f1
Absent (Non-exposed)      c                  d                        f2
Total                     n1                 n2                       N

16.3.3 Graphic Displays

If graphic displays have any merit at all for a 2 × 2 table, their main role would be to show the direction of observation and to compare magnitudes for the appropriate proportions.

Figures 16.33 and 16.34 show the “tabular box-graphs” that have been proposed¹⁴ for this purpose.

In both instances, the 2 × 2 table contains the data $\begin{bmatrix} 20 & 10 \\ 15 & 46 \end{bmatrix}$, and both box-graphs are drawn as “unitary squares.” The dividing lines, however, are arranged to show the different directions and interpretations of the research architecture. The proportions for the selected denominators of the two main groups are shown by the placement of thick horizontal or vertical lines, and the thinner perpendicular vertical or horizontal lines are then placed to show the appropriate proportions in each group of either success for the cohort study or exposure for the case-control study.

 

 

Data shown in both figures:

           F1    F2    TOTAL
E1         20    10    30
E2         15    46    61
TOTAL      35    56    91

FIGURE 16.33
Tabular box-graph for a cohort study. The horizontal line is drawn at a distance of 0.33 (=30/91) in the unitary square. The two vertical lines are drawn at distances of 0.67 (=20/30) and 0.25 (=15/61).

FIGURE 16.34
Tabular box-graph for a case-control study. The vertical line is drawn at a distance of 0.38 (=35/91). The two horizontal lines are drawn at distances of 0.57 (=20/35) and 0.18 (=10/56).
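The line positions quoted in the two legends can be checked with a short Python sketch. The function names and dictionary keys below are hypothetical labels chosen for illustration; only the arithmetic (each distance is a proportion of the unit square's side) comes from the legends.

```python
def cohort_box_graph(a: int, b: int, c: int, d: int) -> dict:
    """Line positions for a cohort study: the denominators are the row totals."""
    n1, n2 = a + b, c + d
    total = n1 + n2
    return {
        "thick_line": n1 / total,    # share of all subjects in the first row
        "thin_line_row1": a / n1,    # proportion of success in the first row
        "thin_line_row2": c / n2,    # proportion of success in the second row
    }


def case_control_box_graph(a: int, b: int, c: int, d: int) -> dict:
    """Line positions for a case-control study: the denominators are the column totals."""
    n1, n2 = a + c, b + d
    total = n1 + n2
    return {
        "thick_line": n1 / total,    # share of all subjects in the first column
        "thin_line_col1": a / n1,    # proportion exposed in the first column
        "thin_line_col2": b / n2,    # proportion exposed in the second column
    }


cells = (20, 10, 15, 46)
cohort = {k: round(v, 2) for k, v in cohort_box_graph(*cells).items()}
case_control = {k: round(v, 2) for k, v in case_control_box_graph(*cells).items()}
print(cohort)
print(case_control)
```

The same four cells yield different line positions only because the choice of denominator changes with the research architecture.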


To display each constituent of an odds ratio, Figure 16.35 has a “quadrant-hub” arrangement of four contiguous squares, whose magnitudes for each side are √a, √b, √c, and √d for the four cells of the 2 × 2 table, so that the area of each square equals its cell count. The substantial difference in magnitude of the ad vs. bc squares in Figure 16.35 is consistent with the odds ratio of 6.1 [= (20 × 46)/(10 × 15)].

FIGURE 16.35
Contiguous squares showing cell sizes in a 2 × 2 table. [Figure and legend taken from Chapter Reference 14.] [Figure not reproduced. Side lengths marked on the squares: 4.5 = √20, √10 = 3.2, 3.9 = √15, √46 = 6.8.]
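A minimal Python sketch of the geometry follows. It assumes that each square's side is the square root of its cell count, which is what the side labels in the figure (4.5 for 20, 3.2 for 10, 3.9 for 15, 6.8 for 46) suggest; the function name `quadrant_hub_sides` is hypothetical.

```python
import math


def quadrant_hub_sides(a: int, b: int, c: int, d: int) -> tuple:
    """Side length of each square, assuming area = cell count (side = sqrt(count))."""
    return tuple(math.sqrt(count) for count in (a, b, c, d))


cells = (20, 10, 15, 46)
sides = quadrant_hub_sides(*cells)
print([round(s, 1) for s in sides])

# Squaring each side recovers the cell count, so the products of diagonal
# areas reproduce the numerator and denominator of the odds ratio.
areas = [s * s for s in sides]
print(round((areas[0] * areas[3]) / (areas[1] * areas[2]), 1))
```

Because area grows with the square of the side, plotting the raw counts as side lengths would exaggerate the contrast; the square root keeps the visual areas proportional to the frequencies.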

The graphical displays in the squares of Figures 16.33 through 16.35 do not communicate much beyond what is readily discerned from direct inspection of the corresponding table, but they can be used if you want tables to receive an artistic emphasis that might help compensate for the enormous visual attention given to dimensional graphs in statistical illustrations.

A visually esthetic but intellectually unattractive arrangement shows the spinning-top shapes of Figure 16.36. The implications of the shapes themselves are difficult enough to understand; but the legend — which is reproduced here exactly as published — is inscrutable.

16.4 Displaying Ordinal Data

Because of the problems of getting adequate summary indexes (discussed throughout Section 15.7), there is no single ideal method for visual display of ordinal categories. The available methods of arrangement will inevitably differ according to the goal of the display. Is it intended to compare the overall distribution of categories in the two groups, or to do a category-by-category comparison? Bar charts can be used for either purpose, but the organization of the bars will differ.

FIGURE 16.36
Change with age of numbers of people at risk by virtue of having bone masses below an arbitrary threshold. Pendular shapes represent normal distribution of bone mass in two populations with different fracture risks (equivalent to cases and controls) at two different ages. For each population mean bone mass has been reduced between the two ages by 1.5 standard deviations of the distribution ( ), an amount fully in accord with that implied by cross-sectional studies as occurring between age 50 and 85. Risk threshold intersects distributions at (from left to right) 2.5, 1.0, 1.0, 0.5 standard deviations from means, resulting in stated ratios of numbers at risk in the two populations (shaded areas). [Figure not reproduced. Labels in the figure: bone mass, age, cases, controls, risk threshold, and the ratios 1 : 30 and 1 : 5.]

16.4.1 Overall Comparisons

The “divided bar-chart,” also called a “component bar-chart,” has been a standard method of showing the distribution of constituent categories for two groups. Figure 16.37 is a schematic drawing of the arrangement, with the ordinal categories arrayed in vertical tiers that obscure both the shape of the distribution and the comparison of individual categories.

A preferable way of displaying this same information is the divided dot chart proposed by Cleveland and McGill.¹⁵ Figure 16.38, which shows their dot chart for the same information that appears in Figure 16.37, allows magnitude to be clearly identified for each category, while also displaying shape of the overall distributions.

 

FIGURE 16.37
Divided (or “component”) bar chart for ordinal categories in two groups. [Figure and legend taken from Chapter Reference 15.] [Figure not reproduced. Two stacked bars, A and B, each divided into segments numbered 1 to 5; vertical axis: VALUES, 0 to 12.]

FIGURE 16.38
Divided dot chart for data of Figure 16.37. [Figure taken from Chapter Reference 15.] [Figure not reproduced. Rows: GROUP A TOTAL, A5, A4, A3, A2, A1, GROUP B TOTAL, B5, B4, B3, B2, B1; horizontal axis: VALUES, 0 to 12.]
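The idea of a divided dot chart, one row per category with every value read against a common baseline, can be approximated in plain text. The sketch below is a rough illustration, not Cleveland and McGill's method: the function name `dot_chart` is hypothetical, and the counts are invented for the example rather than taken from Figure 16.37.

```python
from typing import Dict, List, Optional


def dot_chart(values: Dict[str, float], width: int = 40,
              vmax: Optional[float] = None) -> List[str]:
    """Render one text row per category: a dotted guide line ending in a marker."""
    if not values:
        return []
    top = max(values.values()) if vmax is None else vmax
    if top <= 0:
        raise ValueError("the axis maximum must be positive")
    label_width = max(len(label) for label in values)
    rows = []
    for label, value in values.items():
        position = round(value / top * width)  # marker column on the common scale
        rows.append(f"{label.rjust(label_width)} |{'.' * position}o")
    return rows


# Hypothetical ordinal counts for two groups; not the data behind Figure 16.37.
group_a = {"A5": 2, "A4": 3, "A3": 4, "A2": 2, "A1": 1}
group_b = {"B5": 1, "B4": 2, "B3": 3, "B2": 3, "B1": 1}

for group in (group_a, group_b):
    # Both groups share vmax so that positions are comparable across groups.
    print("\n".join(dot_chart(group, width=24, vmax=4)))
    print()
```

Unlike stacked tiers, every marker here starts from the same zero, so both the size of each category and the shape of the distribution can be read directly.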


16.4.2 Category-by-Category (Side-by-Side) Comparisons

Perhaps the most effective way to show category-by-category comparisons, however, is to place the bars opposite one another on a single vertical stalk.¹⁶ If just a few categories are included, this type of arrangement resembles a traffic signpost, as in Figure 16.39. If the ordinal data have many categories that are shown without spaces, as in Figure 16.40, the portrait may look like the layers of the “Michelin man.” The latter tactic has often been used to construct a “population pyramid,” which compares demographic components for two regions. The tactic can also be used to construct a type of back-to-back histogram, shown in Figure 16.41 for the distributions of CD-4 counts in two groups of men.¹⁷
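A back-to-back arrangement on a single stalk can be mocked up in plain text, which makes the layout rule explicit: both sides share one scale, and the category labels form the central stalk. The function name `back_to_back` and the percentages below are hypothetical, chosen only to illustrate the layout.

```python
from typing import List, Sequence


def back_to_back(categories: Sequence[str], left: Sequence[float],
                 right: Sequence[float], width: int = 20) -> List[str]:
    """Draw left and right bars on opposite sides of a central column of labels."""
    if not (len(categories) == len(left) == len(right)):
        raise ValueError("categories, left, and right must have equal lengths")
    peak = max(list(left) + list(right), default=0)
    if peak <= 0:
        raise ValueError("at least one value must be positive")
    label_width = max(len(cat) for cat in categories)
    rows = []
    for cat, left_value, right_value in zip(categories, left, right):
        # One shared scale (peak) keeps the two sides directly comparable.
        left_bar = "#" * round(left_value / peak * width)
        right_bar = "#" * round(right_value / peak * width)
        rows.append(f"{left_bar.rjust(width)} {cat.center(label_width)} {right_bar}".rstrip())
    return rows


# Hypothetical percentages for five ordinal categories in two groups.
labels = ["A", "B", "C", "D", "E"]
group1 = [5, 15, 40, 30, 10]
group2 = [20, 30, 25, 15, 10]
print("\n".join(back_to_back(labels, group1, group2)))
```

Right-justifying the left bars is what makes them grow outward from the stalk, so the eye compares lengths on either side of each label.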

FIGURE 16.39
“Traffic signpost” arrangement. [Figure and legend taken from Chapter Reference 16.] [Figure not reproduced. Bars for Group 1 and Group 2 extend in opposite directions from a vertical stalk for categories A to E.]

FIGURE 16.40
“Michelin-man” arrangement, showing “population pyramid” often used in demographic comparisons. [Figure not reproduced. Panels labeled COSTA RICA and SWEDEN (years 1953 and 1956 as extracted); Male and Female bars by 5-year age band from 0 - 4 to 75 - 79; horizontal axis: Percentage.]

If the layered categorical-bar arrangements of Figure 16.39 are converted to dot charts, as shown in Figure 16.42, the results produce a “Christmas tree” effect.¹⁶

FIGURE 16.41
Distribution of CD4+ counts in HIV-negative heterosexual and homosexual men in San Francisco. [Figure and legend taken from Chapter Reference 17.] [Figure not reproduced. Back-to-back histogram for Heterosexual Men and Homosexual Men; vertical axis: CD4+ count (per mm³), 0 to 2400; horizontal axis: No. of Observations.]

FIGURE 16.42
“Christmas tree” dot chart for the 5-category, 2-group data of Figure 16.39. [Figure taken from Chapter Reference 16.] [Figure not reproduced. Dots for Group 1 and Group 2 in categories A to E.]

16.5 Nominal Data

For two groups of nominal data, no visual structure improves the comparison that can be obtained with a well-organized table showing the frequency counts (or relative frequencies) for each of the nominal categories in each group. If visual adornment is desired, a side-by-side category dot chart is probably the best approach.
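A well-organized table of this kind is easy to produce programmatically. The Python sketch below is one possible layout, not a prescribed format: the function names `nominal_table` and `format_table` and the sample observations are hypothetical, and the categories are simply sorted alphabetically.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Row = Tuple[str, int, float, int, float]


def nominal_table(group1: Iterable[str], group2: Iterable[str]) -> List[Row]:
    """Return (category, count1, relative1, count2, relative2) for every category."""
    g1, g2 = list(group1), list(group2)
    if not g1 or not g2:
        raise ValueError("both groups must contain at least one observation")
    c1, c2 = Counter(g1), Counter(g2)
    rows = []
    for category in sorted(set(c1) | set(c2)):
        # A Counter returns 0 for a category that is absent from a group.
        rows.append((category, c1[category], c1[category] / len(g1),
                     c2[category], c2[category] / len(g2)))
    return rows


def format_table(rows: List[Row], name1: str = "Group 1",
                 name2: str = "Group 2") -> str:
    """Lay the rows out as aligned text, showing each count with its percentage."""
    lines = [f"{'Category':<12}{name1:>16}{name2:>16}"]
    for category, k1, r1, k2, r2 in rows:
        cell1 = f"{k1} ({r1:.0%})"
        cell2 = f"{k2} ({r2:.0%})"
        lines.append(f"{category:<12}{cell1:>16}{cell2:>16}")
    return "\n".join(lines)


# Hypothetical nominal observations for two groups.
rows = nominal_table(["x", "x", "y"], ["y", "z"])
print(format_table(rows))
```

Showing the count and the relative frequency side by side lets the reader compare groups of unequal size without losing sight of the denominators.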

16.6 “Bonus” Displays

As a reward for your completing this chapter, here are two “bonuses” of interesting features in data display.

16.6.1 New Haven–New York Timetable

In Edward Tufte’s second book, Envisioning Information,¹⁸ he denounces the visual display used for the schedule of Metro North trains from New York to New Haven. His critique is replicated in Figure 16.43.

In Figure 16.44, however, Tufte shows a revised improved schedule, done as a student project at Yale University in 1983. Whatever be the merits of the criticism and the new design, travelers on this railroad route will recognize that the old format is still being used almost 20 years later. (Tufte does not mention whether the improved version was ever brought to the attention of railroad officials.)

FIGURE 16.43
Edward R. Tufte’s critique¹⁸ of the Metro North Railroad schedule of trains from New York to New Haven. [Timetable not reproduced. The original schedule, headed “NEW YORK TO NEW HAVEN,” lists Leave New York and Arrive New Haven times in three folded column pairs for “MONDAY TO FRIDAY, EXCEPT HOLIDAYS” and for “SATURDAY, SUNDAY & HOLIDAYS.” Its reference notes define the codes E (Express), X (does not stop at 125th Street), S (Saturdays and Washington's Birthday only), H (Sundays and holidays only), and T (snack and beverage service), and state that economy off-peak tickets are not valid on trains in shaded areas.]

Annotations placed around the schedule in the figure:

Bold sans serif capitals weak in distinguishing between two directions: NEW HAVEN TO NEW YORK, NEW YORK TO NEW HAVEN.

Column headings repeated 3 times and 24 AM's and PM's shown due to folded sequence of times. The eye must trace a serpentine path in tracking the day's schedule; and another serpentine for weekends.

Poor column break, leaving last peak-hour train as a widow in this column.

Too much separation between leave/arrive times for the same train.

Too little separation between these unrelated columns.

Most frequently used part of schedule (showing rush-hour trains) is the most cluttered part, with a murky screen tint and heavy-handed symbols.

Rules segregate what should be together; a total of 41 inches (104 cm) of rules are drawn for this small table.

Wasted space in headings cramps the times (over-tight leading, in particular). Well-designed schedules use a visually less-active dot between hours and minutes rather than a colon.

Ambiguity in coding; both X and E suggest an express train, or even E for Economy.

FIGURE 16.44
Proposed improvement for schedule shown in Figure 16.43. [Figure taken from Chapter Reference 18.] [Timetable not reproduced. The revised schedule gives one panel per direction between New York (Grand Central Station) and New Haven; each panel has side-by-side Leaves/Arrives columns for “Monday to Friday, except holidays” and “Saturday, Sunday, and holidays,” with times written with a dot between hours and minutes (for example, 12.35 am, 2.18). Notes in the figure: “X Express” and “Does not stop at 125th Street”; economy off-peak tickets are not valid on trains in boxed areas; one weekend train is marked “Saturdays only” and another “Sundays only”; the holidays listed are New Year's Day, Washington's Birthday, Memorial Day, Independence Day, Labor Day, Thanksgiving, and Christmas.]

16.6.2 Sexist Pies

Several years ago the New Yorker magazine¹⁹ published a cartoon, shown here as Figure 16.45, in which pie graphs depicted the alleged distribution of male and female thoughts. You can make your own decision about whether the data are worth converting to a dot chart.

FIGURE 16.45

Proposed distribution of gender-related cognitive foci. [Figure taken from Chapter Reference 19.]


References

1. Cappuccio, 1993; 2. Chambers, 1983, pg. 60; 3. Gardner, 1986; 4. Nelson, 1992; 5. Burnand, 1990; 6. Nickol, 1982; 7. McGill, 1978; 8. Margolick, 1994; 9. Wilk, 1968; 10. Jones, 1997; 11. Tufte, 1983; 12. Runyon, 1992; 13. Wolthius, 1977; 14. Feinstein, 1988b; 15. Cleveland, 1984; 16. Singer, 1993; 17. Sheppard, 1993; 18. Tufte, 1990; 19. New Yorker Magazine, 1991; 20. Baer, 1992.

Exercises

16.1. (Note: To answer the following questions, you should not have to do any counting or locating of individual points on the graphs.)

16.1.1. Using the data shown in Figure 16.6, determine a coefficient of potential variation, i.e., stability, for the mean difference in the two groups. Are you impressed with the stability? Why or why not?

16.1.2. Determine a coefficient of variation for the data in each group. Are you impressed that the means are suitable representatives of the data? Why or why not?

16.1.3. The authors state (see legend of Figure 16.7) that the groups in Figure 16.6 have the same standard deviations as in Figure 16.7. Verify this statement.

16.1.4. Although you might think that the two groups in Figure 16.7 were randomly chosen as half of the groups in Figure 16.6, the points on the graphs contain contradictory evidence. What is that evidence? How do you think the data were really obtained for the two figures?

16.2. The text that follows appeared in a leading journal of American research and contains the only quantitative information, together with what appeared in Figure 16.32, of a published report on preneoplastic lesions in the human breast:

The t-test was used to examine the hypothesis that the means of the populations of the two samples (routine autopsy and cancer-associated) are equal when the sigmas are equal but unknown. If the two samples are drawn from the same population they must necessarily have the same sigma and mean. If the t-test rejects the hypothesis that the means are equal while the sigmas are equal but unknown, then the populations are different within the confidence interval that is drawn from the t-test. In this instance the t-test indicates that the two samples are not drawn from the same population, with less than 1 percent chance for error. All P values were less than 0.1. These results show a positive correlation between AL and cancer-associated breasts in the human.

16.2.1. If this manuscript were submitted now and you were asked to review the statistical analysis, what comments would you make?

16.2.2. Using the available information cited here and in Figure 16.32, prepare and show an alternative (improved) graphic presentation for the results.

16.3. Figure E.16.3 appeared in the report of a study²⁰ that compared two new technologies, gradient echo magnetic resonance imaging (MRI) and 99mTc methoxyisobutyl-isonitrile single-photon emission computed tomography (MIBI-SPECT), for their ability to define myocardial scars. The abbreviation DWT in the vertical ordinate refers to diastolic wall thickness, whose magnitudes on MRI were compared with the MIBI-SPECT designation of scar or no scar.

By visual inspection and mental computation only, without using a calculator, how could you determine whether the ±1 and ±2 citations on the graph are standard deviations or standard errors?

16.4. Figure E.16.4 shows the data points and geometric means for two groups.

16.4.1. What alternative expression would you propose as an index of central tendency for these groups?

16.4.2. What problem would occur if you tried to do a t (or Z) test on the data shown in the graph?

16.4.3. What alternative simple strategy can you use to demonstrate stochastic significance for this contrast?
