Principles of Medical Statistics, Feinstein, 2002

10.1.4.Even though the U.K. subjects were “healthy,” they were probably older than the U.S. group.

10.1.5.In the U.S. trial, the NNT values were 110 for total MI and 521 for total stroke. Thus, for the one stroke created in about 500 patients in the U.S., about 5 MIs would have been prevented. In the U.K. trial, the NNT values were 463 for total MI and 270 for total stroke. Thus, in the U.K., while one MI was being prevented in about 500 patients, about 2 strokes were being created. Your decision about whether the aspirin is worth taking may depend on whether you live in the U.S. or the U.K., and whether you prefer a stroke-free or MI-free existence.
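The NNT trade-off arithmetic above is easy to check in a few lines; this sketch uses only the NNT values quoted in the text (the underlying trial rates are not reproduced here):

```python
# NNT = 1 / (absolute risk difference), so treating NNT patients yields
# one event prevented (benefit) or one event created (harm) on average.
def events_per(n_patients, nnt):
    return n_patients / nnt

# U.S. trial: NNT 110 for total MI (prevented), 521 for total stroke (created)
us_mi_prevented = events_per(521, 110)
print(f"U.S.: about {us_mi_prevented:.1f} MIs prevented per stroke created")

# U.K. trial: NNT 463 for total MI (prevented), 270 for total stroke (created)
uk_strokes_created = events_per(463, 270)
print(f"U.K.: about {uk_strokes_created:.1f} strokes created per MI prevented")
```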

10.1.6.Individual answers.

10.3.This was another trap question. If you fell in, please be enlightened as you emerge:

10.3.1.The actual risks cannot be calculated for users and nonusers of reserpine because this was not a forward-directed cohort study. If any “rates” are to be calculated, they would have to be antecedent rates of exposure to reserpine. These rates would be .073 (= 11/150) in the cases and .022 (= 26/1200) in the controls. A formation of increments or ratios for these two rates, however, would not be particularly informative. If the idea is to determine the relative risk of breast cancer in users and nonusers of reserpine, the best index would be the odds ratio, which is (11 × 1174)/(26 × 139) = 3.57.
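The odds-ratio arithmetic can be sketched directly from the four cell counts stated above:

```python
# Case-control table for reserpine exposure (counts from the text):
# cases: 11 exposed of 150; controls: 26 exposed of 1200.
exposed_cases, unexposed_cases = 11, 150 - 11          # 11, 139
exposed_controls, unexposed_controls = 26, 1200 - 26   # 26, 1174

# Cross-product (odds) ratio, the usual index for an etiologic case-control study
odds_ratio = (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)
print(round(odds_ratio, 2))  # 3.57
```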

10.3.2.Because the actual risks cannot be determined for users and nonusers, an incremental risk cannot be calculated. One of the prime disadvantages of the etiologic case-control study is that it provides only a ratio, i.e., the odds ratio (which approximates the relative risk ratio). This type of study, however, cannot produce an increment in risks.

10.5.Individual answers, but ARF's gut reactions are as follows:

10.5.1.Look for an increment of at least 10% and a ratio of at least 1.25. This is achieved if the success rate is 56% for active treatment.

10.5.2.Because mortality is already so low, much of this decision will depend on the risks and inconvenience of the active treatment. However, on a purely quantitative basis, the mortality rate should be lowered to at least 50% of its previous value. Hence, the active treatment should have a mortality of 4% or less.

10.5.3.The active treatment should be proportionately at least 50% better, so that its mean should be (1.5) × (1.3) = 1.95.

10.5.4.If the risk ratio is as high as 10, the absolute risk of getting endometrial cancer with estrogens is only .01. For a chance of only one in a hundred, the woman might be told the risk and allowed to make up her own mind. (If the risk ratio is lower than 10, the actual risk is even smaller than .01). Besides, the prognosis of estrogen-associated endometrial cancer is extraordinarily favorable compared with cancers that were not estrogen-associated.

10.7. Individual answers. Note that the investigators often fail to report (and editors fail to demand citation of) the values of δ or θ that were anticipated for the trial.

Chapter 11

11.1. In a “unit-fragility” type of procedure, do an “extreme” relocation by exchanging the lowest member of Group A and the highest member of Group B. Group A would become 12, 14, 16, 17, 17, 125. Group B would become 1, 19, 29, 31, 33, 34. For a purely “mental” approach, without any calculations, compare the two sets of medians. In medians, the original comparison was 15 vs. 32; after the exchange, the comparison is 16.5 vs. 30; and the latter comparison still seems highly impressive. [On the other hand, if you use a calculator, the “new” means become XA = 33.5 and XB = 24.5, so that the direction of the increment is reversed. Better ways to handle this problem are discussed in Chapters 12 and 16.]
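The exchange can be verified with a short script. The original group memberships are inferred here from the post-exchange lists given in the text (original A = {1, 12, 14, 16, 17, 17}, original B = {19, 29, 31, 33, 34, 125}):

```python
# "Unit-fragility" extreme relocation for Exercise 11.1: swap the lowest
# member of Group A with the highest member of Group B.
from statistics import mean, median

group_a = [1, 12, 14, 16, 17, 17]      # inferred original Group A (median 15)
group_b = [19, 29, 31, 33, 34, 125]    # inferred original Group B (median 32)

new_a = sorted(group_a[1:] + [max(group_b)])   # 12, 14, 16, 17, 17, 125
new_b = sorted(group_b[:-1] + [min(group_a)])  # 1, 19, 29, 31, 33, 34

print(median(new_a), median(new_b))  # 16.5 vs 30: still impressive
print(mean(new_a), mean(new_b))      # 33.5 vs 24.5: direction of means reversed
```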

11.3. If the observed “nonsignificant” difference is XA – XB = –5, a more extreme difference in the same direction will be more negative and, hence, at the lower end of the interval.

© 2002 by Chapman & Hall/CRC

11.5.

11.5.1.Answers to be supplied by readers. (ARF is not happy with it for reasons discussed in the text, but has nothing better to offer unless we shift completely to the idea of descriptive boundaries.)

11.5.2.(a) When sample sizes are unavoidably small, a too strict value of α will prevent any conclusions. Rather than discard everything as “nonsignificant,” the value of α might be made more lenient. The “significant” conclusions can then be regarded as tentative—to be confirmed in future research.

(b)In certain types of “data-dredging” procedures the information is being “screened” in search of anything that might be “significant.” The material caught in this type of “fishing expedition” would then have to be evaluated scientifically as being viable fish or decayed auto tires.

11.5.3.Before any conclusion is drawn for a difference that emerged from multiple comparisons in a data-dredged screening examination.

11.5.4.Raising α to more lenient levels would reduce the sample sizes needed to attain “significance.” The research costs would be reduced. Lowering α to stricter levels would have the opposite effects. Since the credibility of the research depends mainly on its scientific structure rather than the stochastic hypotheses, the scientific credibility of the research might be unaffected. The purely statistical effect depends on how you feel mathematically about the old and new α levels. As noted later, however, a stricter level of α will reduce the possibility of “false positive” conclusions, but will raise the possibility of “false negatives.” A more lenient level of α will have opposite effects: more false positives but fewer false negatives.

11.7. The chance of getting two consecutive 7’s requires a 7 on each toss. The probability of this event is the product of (1/6)(1/6) = .03. It would also be possible, however, to get a 7 on one toss, but not the other. This event could occur as yes/no with a probability of (1/6)(5/6) = .14, or as no/yes with a probability of (5/6)(1/6) = .14. The total probability for the yes/yes, yes/no, no/yes, and no/no events would be .03 + .14 + .14 + .69 = 1.00.
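The four outcome probabilities can be confirmed by exact enumeration; the sketch below works in fractions to avoid rounding:

```python
# Two tosses of a pair of dice: probability structure for "sum of 7" events.
from itertools import product
from fractions import Fraction

# P(sum of 7) on a single toss of two dice = 6/36 = 1/6
p7 = Fraction(sum(1 for d1, d2 in product(range(1, 7), repeat=2) if d1 + d2 == 7), 36)

yes_yes = p7 * p7               # 1/36  ~ .03
yes_no = p7 * (1 - p7)          # 5/36  ~ .14 (no/yes is the same)
no_no = (1 - p7) * (1 - p7)     # 25/36 ~ .69
total = yes_yes + 2 * yes_no + no_no
print(float(yes_yes), float(yes_no), float(no_no), float(total))
```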

Chapter 12

12.1. The remaining untested tables are

        2   3                 3   2
        3   3       and       2   4

   (40% vs 50%)          (60% vs 33%)

Using k = 1.870129870 × 10², the p value for the first table is k/(2!3!3!3!) = .433 and for the second table is k/(3!2!2!4!) = .325. The sum of these two p values is .758, which, when added to .242 (the two-tailed p value noted in Section 12.6.1), gives 1.
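The constant k and the two point probabilities can be checked with factorials. The marginal totals used below (rows 5 and 6, columns 5 and 6, N = 11) are inferred from the stated value of k, since k is the product of the marginal factorials divided by N!:

```python
# Fisher exact-test point probabilities for the two tables in 12.1.
from math import factorial as f

# Inferred marginals: row totals 5 and 6, column totals 5 and 6, N = 11
k = (f(5) * f(6) * f(5) * f(6)) / f(11)
print(k)  # 187.0129870...

p_first = k / (f(2) * f(3) * f(3) * f(3))    # cells 2,3 / 3,3
p_second = k / (f(3) * f(2) * f(2) * f(4))   # cells 3,2 / 2,4
print(round(p_first, 3), round(p_second, 3))  # 0.433 0.325
```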

12.3. The observed difference in means is 8.50 − 3.33 = 5.17, since 34/4 = 8.50 for Treatment X and 10/3 = 3.33 for Treatment Y. This difference is equaled or exceeded in absolute magnitude by only the following four other arrangements:

     X            Y         Mean X   Mean Y   Difference, X − Y
  3,8,11,13     1,2,6        8.75       3           5.75
  6,8,11,13     1,2,3        9.5        2           7.5
  1,2,3,6       8,11,13      3         10.6        −7.6
  1,2,3,8       6,11,13      3.5       10          −6.5

The seven items in the trial can be divided into one group of four and one group of three in 35 ways [= 7!/(4!)(3!)]. Of these arrangements, three yield mean differences that are as great or greater than 5.17; and two yield mean differences that are negatively as large or larger than −5.17. The two-tailed P value is thus 3/35 + 2/35 = 5/35 = .143.
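The full permutation distribution is small enough to enumerate by brute force; this sketch reconstructs the seven values from the table above (observed X = {2, 8, 11, 13}, Y = {1, 3, 6}):

```python
# Brute-force Pitman permutation test for Exercise 12.3.
from itertools import combinations
from statistics import mean

values = [1, 2, 3, 6, 8, 11, 13]
observed = 34 / 4 - 10 / 3          # 5.17, the observed X - Y mean difference

splits = list(combinations(values, 4))   # every possible Treatment X group
diffs = []
for x in splits:
    y = [v for v in values if v not in x]
    diffs.append(mean(x) - mean(y))

pos_tail = sum(1 for d in diffs if d >= observed - 1e-9)    # 3 arrangements
neg_tail = sum(1 for d in diffs if d <= -observed + 1e-9)   # 2 arrangements
print(len(splits), pos_tail, neg_tail, round((pos_tail + neg_tail) / 35, 3))
```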


Chapter 13

13.1. For Group A, X̄ = 12.833 and s = 6.113. For Group B, X̄ = 45.167 and s = 39.479. The value of sp becomes 28.25, and t = (32.334/28.25)√[(6)(6)/12] = 1.98. At 10 d.f., this result falls below the required t.05 = 2.23 and so 2P > .05. A major source of the problem is the high variability in both groups, particularly in Group B, where the coefficient of variation is 0.8. The standardized increment, which is (32.334/28.25) = 1.145, seems impressive despite the large value for sp, but the group size factor is not quite big enough to get t across the necessary threshold.
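The pooled t can be reproduced from the summary statistics alone (the raw data are not shown in this excerpt; group sizes of 6 each are taken from the d.f. and the (6)(6)/12 factor):

```python
# Pooled two-sample t for 13.1 from summary statistics.
from math import sqrt

mean_a, s_a, n_a = 12.833, 6.113, 6
mean_b, s_b, n_b = 45.167, 39.479, 6

# Pooled standard deviation
sp = sqrt(((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / (n_a + n_b - 2))
# t = (standardized increment) * (group-size factor)
t = ((mean_b - mean_a) / sp) * sqrt(n_a * n_b / (n_a + n_b))
print(round(sp, 2), round(t, 2))  # 28.25 1.98
```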

13.3

                        Group A       Group B
Σ X                        66            49
n                           9             6
X̄                     7.33 lbs      8.17 lbs
Σ X²                      504           411
(Σ X)²/n                  484           400.17
Σ X² − (Σ X)²/n            20            10.83
s²                       2.50          2.17
s                        1.58          1.47
s_X̄                     .527          .601

13.3.1. For Group A: t8,.05 = 2.306 and t8,.01 = 3.355

95% Conf. Interval: 7.33 ± 2.306(.527) = 7.33 ± 1.22 = 6.11 to 8.55 lbs
99% Conf. Interval: 7.33 ± 3.355(.527) = 7.33 ± 1.77 = 5.56 to 9.10 lbs

For Group B: t5,.05 = 2.571 and t5,.01 = 4.032

95% Conf. Interval: 8.17 ± 2.571(.601) = 8.17 ± 1.55 = 6.62 to 9.72 lbs
99% Conf. Interval: 8.17 ± 4.032(.601) = 8.17 ± 2.42 = 5.75 to 10.59 lbs

13.3.2. Using the formula t = (X̄ − µ)/(s/√n), we get

tA = (7.33 − 6.7)/.527 = 1.195. At 8 d.f., .20 < 2P < .40
tB = (8.17 − 6.7)/.601 = 1.47/.601 = 2.45. At 5 d.f., .05 < 2P < .10

13.3.3. For the t test, t = (X̄A − X̄B)/s.e.(X̄A − X̄B), and s.e.(X̄A − X̄B) = sp√(1/nA + 1/nB). Because

sp² = [(nA − 1)sA² + (nB − 1)sB²]/(nA + nB − 2) = 30.85/13 = 2.3731,

the standard error is s.e.(X̄A − X̄B) = √[(2.3731)(1/9 + 1/6)] = √.6595 = .812, and t = −.84/.812 = −1.034. At 13 d.f., the 2P value for this t is > 0.2.

For confidence interval calculations, a 97.5% confidence interval around X̄A − X̄B is X̄A − X̄B ± (t13,.025)(standard error) = −.84 ± (2.533)(.812) = −.84 ± 2.06, which spans from −2.90 to +1.22. Since this interval includes 0 (the assumed mean of the populational difference), we cannot conclude that a remarkable event has occurred.


13.3.4. If this is a single Gaussian population, its mean, µ, is 7.67 and its standard deviation, σ, is estimated as √(Sxx/n) = √(33.333/15) = 1.491. We would expect to find 95% of the cases within µ ± 1.96σ. This interval is 7.67 ± 1.96(1.491) = 4.75 to 10.59. Because this interval includes 5, there is nothing peculiar about the baby who gained only 5 lbs. (The baby becomes “peculiar” only if you erroneously determine a 95% confidence interval around the mean and find that the baby does not fit into that interval. But why should it?)
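All of the 13.3 computations follow from the sums and sums of squares in the table above (the raw data are not reproduced in this excerpt), so the whole exercise can be replayed in one short sketch:

```python
# Exercise 13.3 from summary figures: SX, SXX, and n for each group.
from math import sqrt

sum_a, sumsq_a, n_a = 66, 504, 9
sum_b, sumsq_b, n_b = 49, 411, 6

def summarize(s, ss, n):
    m = s / n
    var = (ss - s**2 / n) / (n - 1)        # group variance with n - 1
    return m, var, sqrt(var / n)           # mean, s^2, s.e. of the mean

mean_a, var_a, se_a = summarize(sum_a, sumsq_a, n_a)
mean_b, var_b, se_b = summarize(sum_b, sumsq_b, n_b)
print(round(mean_a, 2), round(var_a, 2), round(se_a, 3))  # 7.33 2.5 0.527
print(round(mean_b, 2), round(var_b, 2), round(se_b, 3))  # 8.17 2.17 0.601

# 13.3.3: pooled two-sample t
sp2 = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
se_diff = sqrt(sp2 * (1 / n_a + 1 / n_b))
t = (mean_a - mean_b) / se_diff
print(round(se_diff, 3), round(t, 2))  # 0.812 -1.03

# 13.3.4: all 15 babies as one group; note sigma uses Sxx/n, as in the text
n, s, ss = 15, sum_a + sum_b, sumsq_a + sumsq_b
sigma = sqrt((ss - s**2 / n) / n)
print(round(s / n, 2), round(sigma, 3))  # 7.67 1.491
```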

13.5.

13.5.1.Adding the two standard errors mentally, the commentator got 6 + 4 = 10. The doubled value, 20, was more than twice the incremental difference in means, which was 25 − 16 = 9. At this magnitude for the crude approximation, the commentator felt reasonably sure that the 95% confidence interval around 9 would include the null hypothesis value of 0.

13.5.2.To do a t test with the pooled variance Formula [13.13] requires converting the standard errors back to standard deviations. With a slightly less accurate but much simpler approach, you can use Formula [13.15], calculating the denominator as √(4² + 6²) = 7.21. The value of Z (or t) is then 9/7.21 = 1.25. With the large sample sizes here, you can refer to a Z distribution, and note (from the Geigy tables or the computer printout) that the corresponding 2P value is .21. If you used the pooled-variance method, the standard deviations are sA = 4√81 = 36 and sB = 6√64 = 48. The value of sp becomes

sp = √{[(80)(36)² + (63)(48)²]/(81 + 64 − 2)} = √1740 = 41.71

and

t = (9/41.71)√[(81)(64)/145] = 1.29

In the Geigy tables, for ν = N − 2 = 143, the two-tailed P value is 0.2 when t = 1.29 and 0.1 when t = 1.66. (The result is the same as what was obtained with Z.)
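Both routes (the quick standard-error shortcut and the pooled-variance method) can be sketched from the figures given in the text (means 16 and 25, standard errors 4 and 6, group sizes 81 and 64):

```python
# Exercise 13.5.2: shortcut Z versus pooled-variance t.
from math import sqrt

n_a, n_b = 81, 64
se_a, se_b = 4, 6
diff = 25 - 16

# Shortcut: Z = diff / sqrt(SE_A^2 + SE_B^2)
z = diff / sqrt(se_a**2 + se_b**2)
print(round(z, 2))  # 1.25

# Pooled-variance route: convert standard errors back to standard deviations
sd_a, sd_b = se_a * sqrt(n_a), se_b * sqrt(n_b)   # 36, 48
sp = sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))
t = (diff / sp) * sqrt(n_a * n_b / (n_a + n_b))
print(round(sp, 2), round(t, 2))  # 41.71 1.29
```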

13.5.3.The commentator easily calculated the standard deviations by first taking square roots of the group sizes as √nA = √81 = 9 and √nB = √64 = 8. Mentally multiplying these values by the cited standard errors produced the respective standard deviations of sA = 36 and sB = 48. These values are so much larger than the corresponding means of X̄A = 16 and X̄B = 25 that both distributions must be greatly eccentric and unsuitable for summarization with means.

13.5.4.The distributions should be examined for shape before any decisions are made. Without any further information, however, a good approach might be to compare medians and do a Pitman-Welch test on the medians.

13.7.All of the answers are available in previous discussions that can be summarized as follows:

13.7.1.The subtraction of means is direct, easy, and conventional. Alternative stochastic tests can be arranged, however, with ratios or other indexes of contrast.

13.7.2.Not necessarily. The discrepancies may be misleading if the data greatly depart from a Gaussian distribution. (In such circumstances, a better approach may be to use percentile and other “distribution-free” tactics.)

13.7.3.If deviations from the mean are added directly, the sum Σ(Xi − X̄) will always be zero.

13.7.4.Division by n − 1 offers a better estimate, on average, of the corresponding (and presumably larger) standard deviation in the larger parametric population from which the observed results are a “sample.”

13.7.5.The standard error is a purely analytic (and poorly labeled) term, denoting the standard deviation in a sampled series of means. The term has nothing to do with the original process of measurement.

13.7.6.Division produces a standardized result in a manner analogous to converting the number of deaths in a group to a rate of death.


13.7.7.When a parameter is estimated from n members of an observed group of data, the same parameter is presumably estimated from all subsequent theoretical samplings. To produce the same parametric value, the samplings lose a degree of freedom from the original n degrees. The term degree is an arbitrary label for n – k, where k is the number of parameters being estimated.

13.7.8.Under the null hypothesis that two groups are similar, the P value represents the random stochastic chance of getting a result that is at least as large as the observed difference. (Other types of P values, discussed in later chapters, have other meanings for other hypotheses.) The P value has become dominant because of intellectual inertia and professional conformity, and because investigators have not given suitable attention to the more important quantitative descriptive issues in “significance.”

13.7.9.In repeated theoretical parametric samplings, we can be confident that the true value of the parameter will appear in 1 − α (e.g., 95%) of the intervals constructed around the observed estimate, using the corresponding value of Zα or tα. The confidence is statistically unjustified if the samples are small, if they widely depart from a Gaussian distribution, or if the parameter is a binary proportion with values near 0 or 1, far from the meridian value of .5. The confidence is also unjustified, for substantive rather than statistical reasons, if the sample is biased, i.e., not adequately representative of what it allegedly represents.

13.7.10.If we reject the null hypothesis whenever P ≤ α, the hypothesis can be true for an α proportion of rejections, leading to false positive conclusions that a distinction is “significant” when, in fact, it is not. The P value is often, although not always, determined from standard errors, but has nothing to do with errors in the measurement process. The value of α is regularly set at .05 because of (again) intellectual inertia and professional conformity.

Chapter 14

14.1. For evaluating “clinical significance,” the death rates were .104 (or 10.4%) for timolol and .162 (or 16.2%) for placebo. This is an absolute reduction of 16.2 − 10.4 = 5.8% in deaths, and a proportionate reduction of 5.8%/16.2% = 36%. Many people would regard this as a quantitatively significant improvement. On the other hand, the survival rates were 89.6% and 83.8%, respectively. The proportionate improvement in survival was 5.8%/83.8% = 7%, which may not seem so impressive (unless you were one of the survivors). Another way of expressing the quantitative contrast is with the number needed to treat, discussed in Chapter 10. For the direct increment of .162 − .104 = .058, NNT = 1/.058 ≈ 17. Thus, about 1 extra person survived for every 17 actively treated. For statistical significance, X² can be calculated as

X² = [(98)²/945 + (152)²/939 − (250)²/1884] × (1884)²/[(250)(1634)] = 13.85

Alternatively, since the fourfold table will be

            Deaths   Survivors   TOTAL
Timolol        98       847        945
Placebo       152       787        939
TOTAL         250      1634       1884

the calculation can be

X² = [(98 × 787) − (847 × 152)]²(1884)/[(250)(1634)(945)(939)] = 13.85

At 1 degree of freedom, this value of X² has an associated 2P < .001. The authors seem justified in claiming “statistical significance.”
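The shortcut fourfold-table formula, X² = N(ad − bc)²/(r1 r2 c1 c2), reproduces the same value; this sketch also recomputes the NNT mentioned above:

```python
# Chi-square for the timolol fourfold table in 14.1.
a, b = 98, 847      # timolol: deaths, survivors (row total 945)
c, d = 152, 787     # placebo: deaths, survivors (row total 939)
n = a + b + c + d   # 1884

x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
print(round(x2, 2))  # 13.85

# NNT = 1 / (absolute difference in death rates)
nnt = 1 / (c / (c + d) - a / (a + b))
print(round(nnt))  # 17
```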


14.3. What Mendel observed was 74% (= 787/1064) and 26% in his proportions. Intuitively, this seems quite consistent with the proposed ratios. If we want mathematical confirmation for this intuition, we can proceed as follows.

At a ratio of 3:1, the expected values for 1064 crosses are (3/4)(1064) and (1/4)(1064), i.e., 798 and 266. Using the observed-expected formula, we have

X² = (787 − 798)²/798 + (277 − 266)²/266 = (−11)²/798 + (11)²/266 = 121/798 + 121/266 = .152 + .455 = .607

At 1 d.f., this is not stochastically significant (P > .3). Therefore, the result is consistent with the hypothesis.
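The observed-minus-expected arithmetic is a one-liner:

```python
# Goodness-of-fit chi-square for Mendel's 1064 crosses (3:1 hypothesis).
observed = [787, 277]
expected = [0.75 * 1064, 0.25 * 1064]   # 798 and 266

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(x2, 3))  # 0.607
```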

14.5. To use the mental calculation approach with Formula [14.8], note that we must get .001 multiplied by something that will make the result exceed 2. Since .001 = 1 × 10⁻³, the multiplier must be 2 × 10³ or 2000. The latter is the square root of N, which should therefore be 4 × 10⁶, or about 4 million. Consequently, each player should have batted about 2 million times.

For a more formal calculation, let Zα = 1.96. Let π = (.333 + .332)/2 = .3325. Then 1 − π = .6675. Substitute in Formula [14.19] to get n = (2)(.3325)(.6675)(1.96)²/(.001)² = 1,705,238.2 as the number of times at bat for each player. [To check the accuracy of this calculation, assume that N = 2n = 3,410,478 and apply Formula [14.7] with k = .5. The result is [.001/√((.3325)(.6675))] × (.5)√3,410,478 = 1.96, which is the Z needed for 2P < .05.]
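The formal sample-size arithmetic can be sketched directly:

```python
# Sample size per group to detect a .001 difference between proportions
# of .333 and .332 at two-tailed alpha = .05 (as in Formula [14.19]).
pi = (0.333 + 0.332) / 2      # common proportion, .3325
delta, z_alpha = 0.001, 1.96

n_per_group = 2 * pi * (1 - pi) * z_alpha**2 / delta**2
print(round(n_per_group))  # 1705238
```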

Chapter 15

15.1. Like the common cold, “fibrositis” is often a self-limited condition that gradually improves with time. Patients could therefore be expected to be generally “better” in the second month of treatment than in the first. Unless the “fibrositis” was distinctly chronic and stable, it was not a suitable condition for a crossover trial. In the entry criteria, however, the investigators made no demands for a minimum duration of symptoms.

15.3.

15.3.1.The data are “perfectly” aligned, so that the first 6 ranks are in Group A, and the next 6 ranks are in Group B. The sum of the first 6 ranks is (6)(7)/2 = 21. The total sum of ranks is (12)(13)/2 = 78. For the U test, the lower value will be 21 − [(6)(7)/2] = 0. In Table 15.4, for n1 = n2 = 6, a U value of 5 or less will have 2P < .05.

15.3.2.The data are not matched, but we could arbitrarily form sets of 6 matched pairs in which each of the 6 values in Group A is matched with each of the 6 values in Group B. We could determine the total number of possible matchings and then check results for all the associated sign tests. An enormously simpler approach, however, is to recognize that the value in Group B will always be larger than the value in Group A, no matter how the matching is done. Thus, the six matched pairs will always have 6 positive signs for B − A. For a “null hypothesis” probability of 1/2, this result has a P value of (1/2)⁶ = 1/64 = .016.

15.3.3.The median for the total data set is at 18 (between 17 and 19). For the 2 × 2 table, we get

            Below Median   Above Median   TOTAL
Group A           6              0           6
Group B           0              6           6
TOTAL             6              6          12

The Fisher test requires only 1 calculation, because the table is at the “extreme” value; and the calculation is quite simple (because 2 sets of values cancel for 6!). Thus, the P value for the table can be calculated as (6!)(6!)/12! = .001. For two tails, this is doubled to a 2P value of .002.

Note that in this instance, the Fisher test on the median gave a more “powerful,” i.e. lower, P value than the two other tests. Its result in this instance is also the same as the Pitman-Welch P value, which was obtained as 2/[12!/(6!)(6!)].
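The Fisher probability for this extreme table reduces to a single factorial ratio, since the other marginal factorials cancel against the cell factorials:

```python
# Fisher exact test for the extreme median-split table in 15.3.3
# (Group A: 6 below / 0 above; Group B: 0 below / 6 above).
from math import factorial as f

p = f(6) * f(6) / f(12)   # one-tailed point probability
print(round(p, 4), round(2 * p, 3))  # 0.0011 0.002
```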

15.5.Of the 19 paired comparisons, 11 are tied and 8 favor propranolol. Ignoring the 11 ties, we can do a sign test to get a P value for (1/2)⁸. Because (1/2)⁵ = 1/32 = .03 and (1/2)⁸ will be even smaller, we can state that P < .05. With greater mental effort, we can calculate (1/2)⁸ = 1/256, which is P < .01. (The authors reported “P < 0.04” for the comparison, but did not say which test was used.)

Chapter 16

16.1.

16.1.1.Since the symmetrical confidence-interval component is 10.9 − 6.0 = 4.9, and since Zα = 1.96 for a 95% C.I., the value of SED must have been 4.9/1.96 = 2.5. The coefficient of potential variation is SED/do = 2.5/6.0 = .42, which is below .5. If you use .5 as a boundary, you can be impressed.

16.1.2.Since the two groups have equal size, SED was calculated as √[(sA² + sB²)/n]. If we assume sA = sB = s, SED = √(2s²/n), and s = (SED)√(n/2). For SED = 2.5 and n = 100, s = 17.7. The respective ratios for C.V., which are 17.7/140.4 = .13 and 17.7/146.4 = .12, are not large enough to suggest that the means misrepresent the data.

16.1.3.In Figure 16.7, the confidence-interval component is 13.0 − 6.0 = 7.0, and so SED = 7.0/1.96 = 3.57. Using the same reasoning as in 16.1.2, and with n = 50, we now calculate s = (SED)√(n/2) = 3.57√(50/2) = 17.8, which is close enough to 17.7 to confirm the authors’ statement.
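The back-calculation from a reported confidence interval to SED and s can be sketched as below; note that exact arithmetic gives 17.9 for the second group, versus the text's 17.8, because the text rounds SED to 3.57 before multiplying:

```python
# Back-calculating SED and s from a 95% confidence-interval half-width,
# for two equal-sized groups with s_A = s_B = s (as in 16.1.1-16.1.3).
from math import sqrt

def sed_from_ci(half_width, z=1.96):
    return half_width / z

def s_from_sed(sed, n):
    # SED = sqrt(2 s^2 / n)  =>  s = SED * sqrt(n / 2)
    return sed * sqrt(n / 2)

sed1 = sed_from_ci(10.9 - 6.0)   # Figure 16.6, n = 100 per group
print(round(sed1, 2), round(s_from_sed(sed1, 100), 1))  # 2.5 17.7

sed2 = sed_from_ci(13.0 - 6.0)   # Figure 16.7, n = 50 per group
print(round(sed2, 2), round(s_from_sed(sed2, 50), 1))   # 3.57 17.9
```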

16.1.4.The diabetic group in Figure 16.7 contains a value >190 and the non-diabetic group contains a value >180. Neither of these values appeared in Figure 16.6. Therefore, the groups in Figure 16.7 could not have been sampled from those in Figure 16.6. Another suspicion of “different sources” could be evoked by the statement that the means and standard deviations in Figure 16.7 are the same as in Figure 16.6. This exact similarity is unlikely if the groups in Figure 16.7 were each randomly sampled from those in Figure 16.6. What probably happened is that the data in both figures were obtained via a “Monte Carlo” construction, using the assigned means of 146.4 and 140.4, the assigned standard deviation of 17.7, the assumption that both groups were Gaussian, and then doing random sampling from the theoretical Gaussian curve for each group. Alternatively, both groups could have been sampled from a much larger population.

 

 

 

 

 

 

 

 

 

16.3. If ±2 is a standard deviation, the value of 2 SD is about 4, and 95% of the data on the right side should extend as 4 ± 4, or from about 0 to 8, which corresponds roughly to what is shown in the graph. If ±2 is a standard error, the value of SD is about 18, since √88 ≈ 9. The data would extend as 4 ± 2(18), which they do not. Therefore, the ± values are SDs.

16.5. The lower quartile, median, and upper quartile points are connected with a solid line in Figure AE 16.1. Dotted lines show the extensions to the “0th” and “100th” percentiles at the extremes of the data for age in the African and Caucasian groups of Figure 16.12. The three quartile points show that Africans are older than Caucasians in the mid-zone of the data (although not at the extremes). The “bend” in the quantile-quantile line shows that the two groups have unequal variance (which could have been readily noted from the asymmetrical box for Caucasians in Figure 16.12).

[FIGURE AE 16.1. Quantile-quantile plot for Exercise 16.1: AGE: AFRICAN (vertical axis, 10 to 60) plotted against AGE: CAUCASIAN (horizontal axis, 0 to 70), showing the line of unitary ratio, the lower quartile, median, and upper quartile points, and dotted extensions to the “0th” and “100th” percentiles.]


16.7. Individual answers. As for the Supreme Court, as might be expected, it did not take a clear stand on the scientific issue. Instead, after acknowledging that the Frye rule might be too rigid, the Court said that local judges could use their own wisdom to evaluate scientific evidence and to decide what might be admissible, even if it did not comply with the Frye rule. Because the Court offered no guidelines for the local decisions, judges may have to start enrolling in appropriate courses of instruction.

Chapter 17

17.1.

17.1.1.With increased use of screening tests (such as cervical pap smears, mammography, and routine white blood counts) and more definitive technologic tests (such as ultrasound, CT scans, and MRI), many cancers are now detected that would formerly have been unidentified during the life of the patients. If incidence rates are determined not from death certificates, but from tumor registries or other special repositories of diagnostic data, the incidence rates will inevitably increase. The cancers that would formerly be found only as “surprise discoveries” at necropsy (if necropsy was done) will now be found and counted among the “vital statistics.”

17.1.2.Many of the cancers identified with the new screening and technologic procedures will be relatively asymptomatic and slow-growing. They will usually have better prognoses, even if left untreated, than the cancers that formerly were diagnosed because they had produced symptoms. Because the customary morphologic classifications do not distinguish these functional behaviors of cancer, the survival rates will rise because the relatively “benign” cancers are now included in the numerators and denominators. The relatively “malignant” cancers, however, may continue to be just as lethal as before. When referred to a community denominator, the cancer death rates may seem unchanged.

17.1.3.Because no unequivocal data (about nutrition, life style, smoking, alcohol, etc.) exist about risk factors for these cancers, it is difficult to choose a suitable public-health intervention that offers the prospect of more good than harm. Many nonmedical scientists — if professionally uncommitted to a particular “cause” or viewpoint — might therefore conclude that basic biomedical research has a better potential for preventing these cancers than any currently known public-health intervention.

17.3.

17.3.1.The denominator is 300, and 4 cases existed on July 1. Prevalence is 4/300 = .0133.

17.3.2.The incidence rate depends on whom you count as the eligible people and what you count as incidence. Three new episodes occurred during the cited time period. If we regard everyone as eligible, the denominator is 300, and the incidence of new episodes will be 3/300 = .01. If you insist that the eligible people are those who are free of the disease on July 1, the eligible group consists of 300 − 4 = 296 people. If you insist on counting only new instances of disease, the recurrence in case #3 is not counted as an episode. The incidence would be 2/296 = .0068. If you allow the denominator to include anyone who becomes disease-free during the interval, only case 1 is excluded. If the numerator includes any new episode of disease, there are 3 such episodes in the interval. Incidence would be 3/299 = .01.
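The competing numerator/denominator choices can be laid out side by side:

```python
# Prevalence and the candidate incidence rates of 17.3, using the counts
# stated in the text.
population = 300
prevalent_on_july_1 = 4
new_episodes = 3      # includes the recurrence in case #3
new_persons = 2       # first-ever instances only

prevalence = prevalent_on_july_1 / population
print(round(prevalence, 4))                          # 0.0133

print(round(new_episodes / population, 4))           # 0.01   everyone eligible
print(round(new_persons / (population - 4), 4))      # 0.0068 disease-free on July 1
print(round(new_episodes / (population - 1), 4))     # 0.01   only case 1 excluded
```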

17.3.3.Numerator depends on whether you count 7 episodes or 6 diseased people. Denominator is 300.

17.5.

17.5.1.Adding 0.5 to each cell produces (5.5)(8.5)/[(0.5)(0.5)] = 187.
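The correction can be sketched as below; the cell layout (5 exposed cases, 8 unexposed controls, zeros in the other two cells) is an assumption inferred from the stated arithmetic, since the original table is not shown in this excerpt:

```python
# 0.5 correction for a fourfold table with zero cells: the raw odds ratio
# ad/bc is undefined, so 0.5 is added to every cell before computing it.
a, b, c, d = 5, 0, 0, 8   # assumed cell layout (exact table not reproduced)

odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
print(round(odds_ratio))  # 187
```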

17.5.2.The main arguments were (1) the control groups were improperly chosen and should have contained women with the same pregnancy problems (threatened abortion, etc.) that might have evoked DES therapy in the exposed group; (2) the ascertainment of previous DES exposure may have been biased; (3) the Connecticut Tumor Registry shows an apparently unchanged annual incidence of clear-cell cancer despite little or no usage of DES for pregnancies of the past few decades.

©2002 by Chapman & Hall/CRC

17.5.3. If the occurrence rate of CCVC is very small, perhaps 1 × 10⁻⁵ in non-exposed women, the cohorts of 2000 people were too small to pick up any cases. On the other hand, if the rate is 1 × 10⁻⁴ (as some epidemiologists believe) and if the case-control odds ratios of 325 or 187 are correct, we should expect about 32.5 or 18.7 cases per thousand in the exposed group. This number is large enough for at least several cases to be detected in a cohort of 2000 people. Accordingly, either the odds ratios are wrong or the occurrence rate of CCVC is much smaller than currently believed. Either way, the cancerophobic terror seems unjustified.
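The expected-case arithmetic behind that argument can be sketched as follows, using the odds ratio as an approximation to the risk ratio (the usual rare-disease assumption):

```python
# Expected rate in the exposed group ~ background rate x rate ratio,
# with the case-control odds ratio standing in for the rate ratio.
def exposed_rate_per_thousand(background_rate, odds_ratio):
    return round(background_rate * odds_ratio * 1000, 1)

print(exposed_rate_per_thousand(1e-4, 325))  # 32.5 cases per thousand exposed
print(exposed_rate_per_thousand(1e-4, 187))  # 18.7 cases per thousand exposed
```

At 32.5 per thousand, a cohort of 2000 exposed women would be expected to show dozens of cases, which is why the absence of any cases undermines either the odds ratios or the assumed background rate.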

17.7.

1.What were the four numbers from which the odds ratio was calculated? (If they are small enough to make the result sufficiently unstable, i.e., far from stochastic significance, you can dismiss the study without any further questions. You can also get an idea, from the control group data, of the exposure rate in the general population.)

2.What is the particular adverse event? (If it is a relatively trivial phenomenon, major public-policy action may not be needed.)

3.What is the customary rate of occurrence of the adverse event? (If it has a relatively high rate, e.g., > .1, the odds ratio may be deceptively high. If it has a particularly low rate, e.g., .0001, the high odds ratio (or risk ratio) might be deceptively frightening. Examination of the NNE or some other index of contrast might give a better idea of how “impressive” the distinction may be.)

4.What is the particular “exposure”? (If it is sufficiently beneficial, such as a useful vaccination or surgical procedure, you can consider whether an alternative beneficial exposure is available if the current agent is “indicted” or “removed” from general usage.)

Chapter 18

18.1.

18.1.1.The simplest approach is to note that the intercept is 1.6 for the displayed formula Y = 1.07X + 1.6. This stated intercept appears to be the correct location of the point where the line crosses the Y axis when X = 0. (Note that the artist’s statement of “+1.6” for the intercept disagrees with the “+0.02” listed in the caption.) A more complicated, but still relatively simple, approach is to note that the vertical extent of the Y points is from about 2 to 86 units and the horizontal extent of X is from 0 to 80. Thus, the crude slope is (86 − 2)/(80 − 0) ≈ 1, consistent with the stated slope of 1.07.

18.1.2.The abscissa has a break and does not begin at 0. The intervals can be measured by eye and ruler, however. When 0 is reached, the line seems to intersect at about Y = −0.9. At X = 50, the value of Y is roughly 1, so that the line covers 2 units (from −1 to +1) of Y during an X-span of 50 units. The value of 2/50 = .04, which is the cited slope of the line. About 19 points are above the line and 12 points are below it, but the latter points have larger distances from the line. Thus, although the numbers of points are unequal above and below the line, the sum of values for Yi − Ŷi is probably equal and the line fits properly.

18.3. Individual answers.

Chapter 19

19.1.

19.1.1. The statistician first drew the graph for the 11 points shown in Figure AE.19.1. Drawing a new visual axis through the point containing both median values (10, 45), the statistician noted that all other points were in the positive covariance zones of Quadrants I or III. The range of points, from 1 to 19 on the X-axis and from 10 to 80 on the Y-axis, suggested a slope of about 70/18, which is about 4. Because the points seemed about equally distributed around the median values on both axes, the statistician guessed that the standard deviations for X and Y would each be about one fourth the corresponding range, i.e., sx ~ 18/4 = 4.5 and sy ~ 70/4 = 17.5, and so sx/sy was guessed as about 4.5/17.5, which is the inverse of the estimated slope, 70/18. Because r = b·sx/sy, the guesswork implies that r will roughly be (70/18)[(18/4)/(70/4)] = 1 if the slope is as high as 4. From the dispersion of points, the statistician knows that the slope will not be as high as 4 and that r will not be as high as 1, but the statistician also knows, from Table 19.3, that for 11 points of data, with n = 11, P will be < .05 if r > .6. Believing that r will be close to or exceed .6, the statistician then guessed that the result will be stochastically significant.

[A simpler approach is to note that the bi-median split produces a 2 × 2 table that is

	0	5
	5	0

This is at the extreme of a Fisher test for 10!/(5! 5!) = 252 possibilities. The two-tailed P will be 2/252 = .008.]

19.1.2.

Σ X = 110; X̄ = 10; Σ X² = 1430; Sxx = 1430 − (110²/11) = 330
Σ Y = 490; Ȳ = 44.55; Σ Y² = 26700; Syy = 26700 − (490)²/11 = 4872.73
Σ XY = 5635; (Σ X)(Σ Y)/N = 4900; Sxy = 5635 − 4900 = 735; b = Sxy/Sxx = 735/330 = 2.23
a = Ȳ − bX̄ = 44.55 − (2.23)(10) = 22.28

The graph of points is shown in Fig. AE19.1, and the line passes through (0, 22.28) and (10, 44.55). If you plotted the alternative line, b′ = Sxy/Syy = 735/4872.73 = .151; a′ = X̄ − b′Ȳ = 10 − (.151)(44.55) = 3.28. The line passes through (3.28, 0) and (10, 44.55).

r² = Sxy²/(Sxx Syy) = (735)²/[(330)(4872.73)] = .336; and r = .58

t = (r/√(1 − r²))·√(n − 2) = (.58/√(1 − .336))·√9 = 2.14

Because the required t.05 is 2.26 for P ≤ .05 at ν = 9, the result just misses being stochastically significant. On the other hand, because the investigator clearly specified an advance direction for the co-relationship, an argument can be offered that a one-tailed P value is warranted. If so, t.1 is 1.833, and “significance” is attained.

The visual guess of 4 for the slope was too high because the lowest and highest values for Y do not occur at the extreme ends of the range for X. The guess of 1/4 = .25 for the estimated sx/sy was quite good, however, because Sxx/Syy = 330/4872.73 = .068 and sx/sy = √(Sxx/Syy) = .26.
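The computations in 19.1.2 can be reproduced from the summary sums alone; a minimal sketch (small discrepancies in the last digit against the text come from the text's intermediate rounding):

```python
from math import sqrt

# Summary sums for the 11 points in Exercise 19.1, as given in 19.1.2.
n, sum_x, sum_x2 = 11, 110, 1430
sum_y, sum_y2, sum_xy = 490, 26700, 5635

Sxx = sum_x2 - sum_x**2 / n            # 330
Syy = sum_y2 - sum_y**2 / n            # 4872.73
Sxy = sum_xy - sum_x * sum_y / n       # 735

b = Sxy / Sxx                          # slope, about 2.23
a = sum_y / n - b * sum_x / n          # intercept, about 22.27 (22.28 with rounded b)
r = Sxy / sqrt(Sxx * Syy)              # about .58
t = r * sqrt(n - 2) / sqrt(1 - r**2)   # about 2.13 (2.14 with rounded r)

print(round(b, 2), round(a, 2), round(r, 2), round(t, 2))
```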

19.3. Official Answer: YES. If one-tailed P values are permissible for comparisons of two groups, they should also be permissible for comparisons of two variables.

19.5.

19.5.1. The top figure (for firearm homicide) shows too much scatter to have a correlation as high as .913. The lower figure (for firearm suicide) looks like its data could be fit well with two horizontal lines, one at about 5.5 for low values of X, and the other at about 6.5 for higher values of X. In fact, the upper figure might also be relatively well fit with two horizontal lines, one through the lower values of X and the other through higher values.

19.5.2. Neither of the two graphs shows a relationship “strong enough” to suggest that the correlations have r values as high as .64 and .74.

19.5.3. An excellent example of a correlation that got r = −.789 and P < .05 for only 7 data points. In view of what happens for values beyond x ≥ 120, however, the relationship doesn’t seem convincing.

19.5.4. Does this look like a good correlation? Nevertheless, it achieved r = .45.

[FIGURE AE.19.1 Graph (and regression line) for data in Exercise 19.1. Axes: SERUM OMPHALASE (Y, 0 to 80) vs. IMMUNOGLOBIN ZETA (X, 0 to 20).]
