
This idea is the source of the degrees of freedom concept. In general, a degree of freedom is lost for each parameter that has been estimated from a sample. Thus, when n − 1 is used in the calculation of σ̂ = s = √[SXX/(n − 1)], the principle is that the best estimate of the population variance comes from dividing the group variance (SXX) by the degrees of freedom in the sample.
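The division by n − 1 can be checked empirically. The short simulation below is only a sketch (the population parameters, sample size, and trial count are illustrative, not from the text): it repeatedly draws samples and shows that dividing the group variance by the degrees of freedom, rather than by n, gives the better average estimate of the population variance.

```python
import random
import statistics

# Draw repeated samples of size n from a known population and compare
# the average of Sxx/(n-1) against the average of Sxx/n.
random.seed(1)
population = [random.gauss(100, 15) for _ in range(100_000)]
true_var = statistics.pvariance(population)

n, trials = 5, 20_000
sum_div_n, sum_div_n1 = 0.0, 0.0
for _ in range(trials):
    sample = random.sample(population, n)
    m = sum(sample) / n
    sxx = sum((x - m) ** 2 for x in sample)   # group variance Sxx
    sum_div_n += sxx / n                      # biased: divides by n
    sum_div_n1 += sxx / (n - 1)               # unbiased: divides by d.f.

print(round(sum_div_n / trials, 1))   # tends to underestimate true_var
print(round(sum_div_n1 / trials, 1))  # close to true_var on average
```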

7.5.3 Using the t Distribution

For small samples, the critical ratio of (X̄ − µ)/(s/√n) is interpreted not as Zj, but as tν,j, from a t distribution, using the appropriate value of ν. When confidence intervals are constructed, the value that corresponds to Zα is tν,α, selected from a table showing the P values associated with the degrees of freedom, ν, and with the assigned value of α for each tν,α. The relationship of P values, t values, and degrees of freedom is shown in Table 7.3.

If you have calculated a tj value as the critical ratio from data for a sample of size n, you enter the table at ν = n − 1 degrees of freedom and find the value of P. If you have designated a confidence interval of size 1 − α, you enter the table at ν = n − 1 and find the value of tν,α in the location where P = α. Near the end of Section 7.3.7.3, a 95% confidence interval for a data set of 25 members was calculated with Zα = 1.96. A more appropriate calculation, however, would use the corresponding value of t. Since ν = 25 − 1 = 24 for those data, we would find the value for t24,.05 in Table 7.3 to be 2.064. The 95% confidence interval would then be 10 ± (2.064)(.4) = 10 ± .83. It would be slightly larger than before, extending from 9.17 to 10.83.

If you examine Table 7.3 closely, you will see that the t values become Z values as the sample sizes enlarge and become infinite. In particular, the t values are quite close to Z values for sample sizes ≥ 30. For example, at the external probability P value of .05, the critical ratios at 30 degrees of freedom are 1.960 for Z and 2.042 for t. If the sample sizes are quite small, the critical ratios are more disparate. Thus, at the .05 level of P, the critical values of t increase to 2.365 and 2.776 for 7 and 4 degrees of freedom, respectively. When group sizes have these small values, however, decisions about external probability might preferably be made by a bootstrap resampling method, as discussed earlier.
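The convergence of t toward Z can be displayed with the tabular values themselves. The sketch below uses a few two-tailed .05 entries taken from Table 7.3 together with Python's standard-library NormalDist for the Gaussian cut-point (the choice of rows is illustrative):

```python
from statistics import NormalDist

# Selected two-tailed .05 critical values of t, taken from Table 7.3.
t_05 = {4: 2.776, 7: 2.365, 30: 2.042, 120: 1.980}
z_05 = NormalDist().inv_cdf(0.975)   # the Gaussian counterpart, about 1.960

for df, t_crit in sorted(t_05.items()):
    # the t value shrinks toward Z as the degrees of freedom grow
    print(f"df={df:>3}  t={t_crit:.3f}  excess over Z={t_crit - z_05:+.3f}")
```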

7.5.4 Distinctions of t vs. Z

The achievement that made W. S. Gosset a statistical immortal was his revelation, as “Student,” that a t rather than a Gaussian distribution should be used for inference with the means of small samples. For reasonably large samples (i.e., ν ≥ 30), however, the distinction makes relatively little difference; and for smaller samples, the theoretical inferential method may be replaced by a different (empirical) resampling technique in the foreseeable future.

Consequently, despite all of the fame and popularity of the t distribution and t test (to be discussed later), the t technique is not substantially different from the counterpart Z technique. Nevertheless, as long as the t test retains its current popularity, the medical literature will often contain results that were referred to a t distribution. Furthermore, you will keep your “orthodox” statistical colleagues happier if you use tν,α or tν, j in most circumstances where you might have been tempted to use Zα or Zj.

7.6 Finite Population Correction for Unreplaced Samples

All of the discussion thus far in this chapter has been based either on an infinite parametric population or on resampling procedures in which each sampled item was replaced as soon as it was drawn. If the parent population is not enormous and the sampling is done without replacement, an important warning must be noted.

With random sampling from a parent population containing N items, the principles of probability depend on the idea that each item in the sample has a 1/N chance of being selected. If N is a very large number, or if the sampling is done with replacement, this principle holds true. When sampling is done without replacement in smaller populations, however, a correction must be made for the changes that occur in N as each member is removed. (This point is important to bear in mind when samples are taken from a roster of medical admissions, discharges, autopsies, etc.)

© 2002 by Chapman & Hall/CRC

TABLE 7.3

Distribution of 2-Tailed P Values for Values of t at Different Degrees of Freedom*

Degrees of          Probability of a Larger Value, Positive or Negative
Freedom     0.500   0.400   0.200   0.100   0.050   0.025   0.010   0.005   0.001
   1        1.000   1.376   3.078   6.314  12.706  25.452  63.657      —       —
   2        0.816   1.061   1.886   2.920   4.303   6.205   9.925  14.089  31.598
   3        0.765   0.978   1.638   2.353   3.182   4.176   5.841   7.453  12.941
   4        0.741   0.941   1.533   2.132   2.776   3.495   4.604   5.598   8.610
   5        0.727   0.920   1.476   2.015   2.571   3.163   4.032   4.773   6.859
   6        0.718   0.906   1.440   1.943   2.447   2.969   3.707   4.317   5.959
   7        0.711   0.896   1.415   1.895   2.365   2.841   3.499   4.029   5.405
   8        0.706   0.889   1.397   1.860   2.306   2.752   3.355   3.832   5.041
   9        0.703   0.883   1.383   1.833   2.262   2.685   3.250   3.690   4.781
  10        0.700   0.879   1.372   1.812   2.228   2.634   3.169   3.581   4.587
  11        0.697   0.876   1.363   1.796   2.201   2.593   3.106   3.497   4.437
  12        0.695   0.873   1.356   1.782   2.179   2.560   3.055   3.428   4.318
  13        0.694   0.870   1.350   1.771   2.160   2.533   3.012   3.372   4.221
  14        0.692   0.868   1.345   1.761   2.145   2.510   2.977   3.326   4.140
  15        0.691   0.866   1.341   1.753   2.131   2.490   2.947   3.286   4.073
  16        0.690   0.865   1.337   1.746   2.120   2.473   2.921   3.252   4.015
  17        0.689   0.863   1.333   1.740   2.110   2.458   2.898   3.222   3.965
  18        0.688   0.862   1.330   1.734   2.101   2.445   2.878   3.197   3.922
  19        0.688   0.861   1.328   1.729   2.093   2.433   2.861   3.174   3.883
  20        0.687   0.860   1.325   1.725   2.086   2.423   2.845   3.153   3.850
  21        0.686   0.859   1.323   1.721   2.080   2.414   2.831   3.135   3.819
  22        0.686   0.858   1.321   1.717   2.074   2.406   2.819   3.119   3.792
  23        0.685   0.858   1.319   1.714   2.069   2.398   2.807   3.104   3.767
  24        0.685   0.857   1.318   1.711   2.064   2.391   2.797   3.090   3.745
  25        0.684   0.856   1.316   1.708   2.060   2.385   2.787   3.078   3.725
  26        0.684   0.856   1.315   1.706   2.056   2.379   2.779   3.067   3.707
  27        0.684   0.855   1.314   1.703   2.052   2.373   2.771   3.056   3.690
  28        0.683   0.855   1.313   1.701   2.048   2.368   2.763   3.047   3.674
  29        0.683   0.854   1.311   1.699   2.045   2.364   2.756   3.038   3.659
  30        0.683   0.854   1.310   1.697   2.042   2.360   2.750   3.030   3.646
  35        0.682   0.852   1.306   1.690   2.030   2.342   2.724   2.996   3.591
  40        0.681   0.851   1.303   1.684   2.021   2.329   2.704   2.971   3.551
  45        0.680   0.850   1.301   1.680   2.014   2.319   2.690   2.952   3.520
  50        0.680   0.849   1.299   1.676   2.008   2.310   2.678   2.937   3.496
  55        0.679   0.849   1.297   1.673   2.004   2.304   2.669   2.925   3.476
  60        0.679   0.848   1.296   1.671   2.000   2.299   2.660   2.915   3.460
  70        0.678   0.847   1.294   1.667   1.994   2.290   2.648   2.899   3.435
  80        0.678   0.847   1.293   1.665   1.989   2.284   2.638   2.887   3.416
  90        0.678   0.846   1.291   1.662   1.986   2.279   2.631   2.878   3.402
 100        0.677   0.846   1.290   1.661   1.982   2.276   2.625   2.871   3.390
 120        0.677   0.845   1.289   1.658   1.980   2.270   2.617   2.860   3.373
 (z)        0.6745  0.8416  1.2816  1.6448  1.9600  2.2414  2.5758  2.8070  3.2905

* This table has been adapted from diverse sources.


For example, the first card selected from a deck of 52 cards has a 13/52 chance of being a spade. If the card is not replaced, the denominator for selecting the next card will be 51; and the probability of the next card being a spade will be either 13/51 or 12/51, according to what was chosen in the first card. Thus, the probability that four consecutive unreplaced cards will all be spades is not (13/52)⁴ = .004. It is (13/52)(12/51)(11/50)(10/49) = .003.
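A quick computational check of the card example, using exact fractions from Python's standard library (a minimal sketch):

```python
from fractions import Fraction

# Probability that four consecutive unreplaced cards are all spades,
# versus the with-replacement (independent) calculation.
with_replacement = Fraction(13, 52) ** 4
without_replacement = (Fraction(13, 52) * Fraction(12, 51)
                       * Fraction(11, 50) * Fraction(10, 49))

print(round(float(with_replacement), 3))     # .004
print(round(float(without_replacement), 3))  # .003
```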

With the jackknife reconstruction technique, in contrast to the bootstrap resampling method, the sampling is done without replacement. Each jackknifed “sample” contains n − 1 persons taken, without replacement, from the “parent population” of n persons. A mathematical principle can be derived for the shifting probabilities and necessary corrections of variance that occur when samples of size n are drawn without replacement from a “finite” population of size N. We can expect those samples to have a smaller variance than what would be found if they were taken each time from the intact population, but we need a quantitative indication of how much smaller the variance might be.

If σ² is the variance of the total but finite population, the variance of the means in samples of size n can be called σ²x̄. It can be calculated as

σ²x̄ = [(N − n)/(N − 1)](σ²/n)    [7.4]

Because N − n is smaller than N − 1, this value will be smaller than σ²/n. To illustrate the calculation, the parent population of the data set in Table 7.1 has 20 members, and their variance (calculated with N, rather than N − 1) is (78.44)². [The variance calculated with N − 1 was (80.48)².] Because each of the jackknife samples has n = 19, the foregoing formula will produce

σx̄ = √[(20 − 19)/(20 − 1)] × (78.44/√19) = 4.128

This is the same result obtained in Table 7.1, when a standard deviation was calculated for the n reduced means produced by the jackknife procedure.

Note that if N is very large, the value of (N − n)/(N − 1) in Formula [7.4] will be approximately 1. The standard deviation of means in general samples of size n will then be

σx̄ = σ/√n

This is the formula for the “standard error” or standard deviation of means when parametric sampling is done from a population of infinite size.
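The finite population correction of Formula [7.4] can be verified numerically for the jackknife situation just described (N = 20, n = 19, and σ = 78.44 as computed with N in the denominator); a minimal sketch:

```python
import math

# Standard error of the mean for samples of n = 19 drawn without
# replacement from the N = 20 member "parent population" of Table 7.1.
N, n, sigma = 20, 19, 78.44

se_infinite = sigma / math.sqrt(n)        # ordinary standard error
fpc = math.sqrt((N - n) / (N - 1))        # finite population correction
se_corrected = fpc * se_infinite

print(round(se_corrected, 3))             # 4.128, matching the jackknife result
```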

7.7 Confidence Intervals for Medians

Because the median is used so infrequently in current reports of medical research, most statistical texts give relatively little attention to finding a confidence interval for a median. The procedure is mentioned here because it may become more common when the median replaces the mean as a routine index of central tendency.

7.7.1 Parametric Method

With the parametric approach, the standard error of a median in a Gaussian distribution is mathematically cited as

sX̃ = √(π/2) (σ/√n)


where π is the conventional 3.14159… . For an actual group (or “sample”) of data, the value of σ would be approximated by the s calculated from the sample. Because √(π/2) = 1.253, this formula implies that the median is generally more unstable, i.e., has a larger standard error, than the mean. This distinction may be true for Gaussian distributions, but the median is particularly likely to be used as a central index for non-Gaussian data. Consequently, the formula has few practical merits to match its theoretical virtues.
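The √(π/2) factor can be checked by simulation. The sketch below (with illustrative choices of sample size and trial count) draws repeated Gaussian samples and compares the observed spread of sample medians with that of sample means:

```python
import math
import random
import statistics

# In Gaussian samples, the median's standard error should be roughly
# sqrt(pi/2) = 1.253 times the mean's standard error.
random.seed(7)
n, trials = 25, 4000
means, medians = [], []
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

ratio = statistics.stdev(medians) / statistics.stdev(means)
print(round(ratio, 2))                    # near 1.25 for Gaussian data
print(round(math.sqrt(math.pi / 2), 3))   # 1.253
```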

7.7.2 Bootstrap Method

The bootstrap method could be applied, if desired, to produce a confidence interval for the stability of a median. The approach will not be further discussed, however, because currently it is almost never used for this purpose.

7.7.3 Jackknife Method

The jackknife process is particularly simple for medians. Because the original group in Table 7.1 contains 20 members, the reduced group will contain 19 members, and the reduced median will always be in the 10th rank of the group. This value will have been the 11th rank in the original group. Counting from the lower end of the original data, the 11th ranked value is 96. It will become the reduced median after removal of any original item that lies in the first 10 ranks, i.e., between 62 and 91. The value of 91 will become the reduced median after removal of any items that lie in the original ranks from 11 to 20, i.e., from 96 to 400.

Thus, the reduced median will be either 91 or 96, as noted in the far right column of Table 7.1. We can also be 100% confident that the reduced median will lie in the zone from 91 to 96. Because the original median in these data was (91 + 96)/2 = 93.5, the reduced median will always be 2.5 units lower or higher than before. The proportional change will be 2.5/93.5 = .027, which is a respectably small difference and also somewhat smaller than the coefficients of stability for the mean. Furthermore, the proportional change of 2.7% for higher or lower values will occur in all values and zones of the reduced median. Accordingly, with the jackknife method, we could conclude that the median of these data is quite stable.

If the data set has an even number of members, the jackknife removal of one member will make the median vary between some of the middlemost values from which it was originally calculated. For an odd number of members, most of the reduced medians will vary to values that are just above or below the original median and the values on either side of it. Thus, if the data set in Table 7.1 had an extra item of 94, this value would be the median in the 21 items. When one member is removed from the data set, the reduced median would become either (94 + 96)/2 = 95 or (91 + 94)/2 = 92.5, according to whether the removed member lies below or above the value of 94. The median would become 95 with the removal of 91, (91 + 96)/2 = 93.5 with the removal of 94 itself, and 92.5 with the removal of 96. Thus, if Xm is the value at the rank m of the median, the maximum range of variation for the reduced median will be from (Xm−1 + Xm)/2 to (Xm + Xm+1)/2.
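The jackknife screening of a median is easy to program. The sketch below applies it to the 20 “Before” values of Table 7.4 (the data set of Table 7.1) and confirms that the reduced median is always 91 or 96:

```python
import statistics

# Leave-one-out (jackknife) medians for the Table 7.1 data set.
data = [62, 78, 79, 80, 82, 82, 83, 85, 87, 91,
        96, 97, 97, 97, 101, 120, 135, 180, 270, 400]

original_median = statistics.median(data)            # (91 + 96)/2 = 93.5
reduced = [statistics.median(data[:i] + data[i+1:])  # remove item i
           for i in range(len(data))]

print(original_median)       # 93.5
print(sorted(set(reduced)))  # [91, 96]
```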

Although seldom discussed in most statistical texts, this type of appraisal is an excellent screening test for stability of a median.

7.8 Inferential Evaluation of a Single Group

A single group of data is often evaluated inferentially to determine whether the central index of the group differs from a value of µ that is specified in advance by a particular hypothesis about the data.

Although “hypothesis testing” will be extensively discussed in Chapter 11, the illustration here can be regarded either as a preview of coming attractions or as a “bonus” for your labors in coming so far in this chapter.


7.8.1 One-Group t or Z Test

The test-statistic ratios for Z or t can be applied stochastically for inferring whether the mean of a single set of data differs from a previously specified value of µ. The test-statistic is calculated as

t or Z = (X̄ − µ)/(s/√n)

and the result is interpreted as a value for Zj or for tν,j according to the size of the group. The corresponding P value will denote the external probability that the observed or an even more extreme value of X̄ would arise by chance from a parent population whose mean is µ. If we assume that µ = 0, the test statistic becomes X̄/(s/√n), which is also the inverse of the coefficient of stability for the mean.

This application of the t or Z index produces what is called a one-group or paired “test of significance.” It can be used to determine whether the mean of a group (or sample) differs from a previously specified value, which serves as µ. More commonly, however, the test is done with “paired” data, such as a set of before-and-after treatment values for a particular variable in a collection of n people. If we let wi be the values before treatment and vi be the paired values after treatment, they can be subtracted to form a single increment, di = vi − wi, for each person.

The values of di will represent a single variable, with mean d and standard deviation

sd = √[Σ(di − d̄)²/(n − 1)]

If we believe that the value of d̄ is significantly different from 0 — i.e., that the after-treatment values are substantially different from the before values — we can establish the “null” hypothesis that the di values have been randomly sampled from a population for which µ = 0. The critical ratio of

(d̄ − 0)/(sd/√n) = d̄/(sd/√n)

can be converted to indicate the external probability, or P value, for the possibility that the observed value of d̄, or a more extreme value, has occurred by chance under the null hypothesis. Using principles to be discussed later, we can then decide whether to reject or concede the null hypothesis that the true mean for the di data is 0.

Note that the “one-group” procedure requires either a single group of data or two groups that can be “paired” to form the increments of a single sample.
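As a sketch of the paired procedure, the hypothetical before-and-after values below (illustrative, not from the text) are reduced to a single set of increments whose critical ratio is then computed against µ = 0:

```python
import math
import statistics

# Paired one-group critical ratio: di = after - before.
# These values are invented for illustration.
before = [140, 150, 145, 160, 155, 148, 152, 158]
after  = [135, 146, 144, 150, 151, 147, 149, 152]

d = [a - b for a, b in zip(after, before)]
d_mean = statistics.fmean(d)
s_d = statistics.stdev(d)                # uses n - 1, as in the text
n = len(d)

t = (d_mean - 0) / (s_d / math.sqrt(n))  # critical ratio under mu = 0
print(round(t, 3))
```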

7.8.2 Examples of Calculations

The one-group t test is illustrated here in application to a single group and to a set of paired data.

7.8.2.1 Single-Group t (or Z) Test — A clinical professor at our institution claims that the group of patients in Exercise 3.3 is highly atypical. She says that the customary mean of blood sugar values is 96, and that the observed mean of 120.1 implies that the group must be inordinately diabetic.

The claim can be stochastically evaluated using a one-group t test, with 96 assumed to be the parametric mean. We would calculate

t = (X̄ − µ)/(s/√n) = (120.1 − 96)/18.0 = 24.1/18.0 = 1.339

In Table 7.3, at 19 degrees of freedom, the two-tailed P value associated with this value of t is close to .2, lying between .2 and .1.


This type of result is commonly symbolized, with the lower values coming first, as .1 < P < .2. To indicate the two tails, a better symbolism would be .1 < 2P < .2. The two-tailed P value indicates the possibility that a mean at least 24.1 units higher or lower than 96 could arise by chance in the observed group of 20 patients. Because the original conjecture was that 120.1 was too high (rather than merely atypical in either direction), we might be particularly interested in the one-tail rather than two-tail probability. We would therefore take half of the 2P value and state that the external probability is .05 < P < .1 for the chance that a mean of 120.1 or higher would be observed in the selected 20 people if the true mean is 96.

Chapter 11 contains a further discussion of one-tailed and two-tailed probabilities, and their use in rejecting or conceding assumed hypotheses about the observed data.

7.8.2.2 Paired One-Group t Test — Suppose that the 20 patients in Exercise 3.3 were treated with various agents, including some aimed at reducing blood sugar. The values after treatment, shown in Table 7.4, yield a mean change in blood sugar of −23.3. The standard deviation of the increments, however, is 65.6, suggesting that the data are extremely dispersed. This point is also evident from inspecting the wide range of values (going from −254 to +13) for the increments of paired results shown in Table 7.4.

The wide range might elicit strong suspicion that the observed distinction is unstable. Nevertheless, our main question here is not whether −23.3 is a stable value, but whether the mean blood sugar was indeed lowered by more than the value of 0 that might occur by random chance alone. To answer the latter question, we can do a paired single-group t test on the incremental values. We assume the hypothesis that they came from a population of increments having 0 and 65.6 as the parameters for µ and σ respectively.

The calculation shows

t = (d̄ − µ)/(sd/√n) = −23.3/(65.6/√20) = −1.588

Interpreted at 19 d.f. in Table 7.3, this value of t has an associated two-tailed P value that is between 0.1 and 0.2. If given a one-tail interpretation (because we expected blood sugars to go downward), the P value would be between 0.05 and 0.1. If this chance is small enough to impress you, you might reject the conjecture that the group of increments came from a population whose true mean is 0. This rejection would not make the mean of −23.3 become stable. The inference would be that no matter how unstable the mean may be, it is unlikely to have come from a parent population whose parametric mean is 0.

TABLE 7.4

Before and After Values of Treatment of Blood Sugar for 20 Patients in Table 7.1

Before   After   Increment
  62       75      +13
  78       82       +4
  79       78       −1
  80       91      +11
  82       82        0
  82       84       +2
  83       80       −3
  85       79       −6
  87       94       +7
  91       90       −1
  96       99       +3
  97       91       −6
  97       85      −12
  97       98       +1
 101       97       −4
 120      112       −8
 135      123      −12
 180      140      −40
 270      110     −160
 400      146     −254

Total             −466
Mean             −23.3
s.d. (n − 1)      65.6

7.8.2.3 Confidence Interval for One-Group t Test — Another approach to the main question about changes in blood glucose is to calculate a parametric confidence interval for the observed mean of −23.3. Using t19,.05 = 2.093, the 95% confidence interval would be −23.3 ± (2.093)(65.6/√20) = −23.3 ± 30.70. Extending from −54.0 to +7.4, the interval includes 0, thus suggesting that the observed results are consistent with a no-change hypothesis.


You may now want to argue, however, that the confidence interval should be one-tailed rather than two-tailed, because of the original assumption that blood sugar values were being lowered. With a one-tailed hypothesis, the appropriate value for a 95% confidence interval is t19,.1 = 1.729. With this approach, the interval is −23.3 ± (1.729)(65.6/√20) = −23.3 ± 25.36. Because this interval also includes 0, we might feel more comfortable in acknowledging that the observed value of −23.3 is not unequivocally different from the hypothesized value of 0.
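Both intervals can be reproduced from the summary values of the increments (d̄ = −23.3, sd = 65.6, n = 20) and the Table 7.3 entries for 19 degrees of freedom; a minimal sketch:

```python
import math

# Parametric confidence intervals for the paired mean change in blood sugar.
d_mean, s_d, n = -23.3, 65.6, 20
se = s_d / math.sqrt(n)

t_two_tailed = 2.093   # t(19, .05) from Table 7.3
t_one_tailed = 1.729   # t(19, .10), used for a one-tailed 95% interval

ci_two = (d_mean - t_two_tailed * se, d_mean + t_two_tailed * se)
print(tuple(round(v, 1) for v in ci_two))   # both intervals include 0

ci_one_upper = d_mean + t_one_tailed * se
print(round(ci_one_upper, 1))
```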

Appendixes: Documentation and Proofs for Parametric Sampling Theory

This appendix contains documentation and “proofs” for the assertions made in Chapter 7 about standard errors, estimation of σ, and other inferential strategies. Because the proofs have been kept relatively simple (avoiding such complexities as “expectation theory”), they are reasonable, but not always mathematically rigorous.

A.7.1 Determining the Variance of a Sum or Difference of Two Variables

This concept is needed in Chapter 7 to help determine the “standard error” of a mean, but is also used later in Chapter 13 to find the variance for a difference in two means.

To demonstrate the process, suppose we add or subtract the values of two independent variables, Wi and Vi, each containing n members, having the respective means W̄ and V̄, group variances Sww and Svv, and variances s²w and s²v. The result will be a new variable, formed as Wi + Vi or Wi − Vi, having n members and its own new mean and new variance.

It is easy to show that the new mean will be W̄ + V̄ or W̄ − V̄. A more striking point is that the new group variance and variance will be the same, Sww + Svv and s²w + s²v, regardless of whether we subtract or add the two variables.

The latter point can be proved with the following algebra: For the addition of the two variables, the new group variance will be Σ[(Wi + Vi) − (W̄ + V̄)]² = Σ[(Wi − W̄) + (Vi − V̄)]² = Σ(Wi − W̄)² + 2Σ(Wi − W̄)(Vi − V̄) + Σ(Vi − V̄)². In the latter three expressions, the first and third are Sww and Svv. The middle term essentially vanishes because Wi and Vi are independent and because Σ(Wi − W̄) = 0 and Σ(Vi − V̄) = 0. Therefore, the group variance of Wi + Vi will be Sww + Svv, and the variance will be (Sww + Svv)/(n − 1), which is s²w + s²v.

If the two variables are subtracted rather than added, the new mean will be W̄ − V̄. The new group variance will be Σ[(Wi − Vi) − (W̄ − V̄)]² = Σ[(Wi − W̄) − (Vi − V̄)]² = Σ[(Wi − W̄)² − 2(Wi − W̄)(Vi − V̄) + (Vi − V̄)²] = Sww + Svv, when the middle term vanishes. Thus, when two independent variables, Vi and Wi, are either added or subtracted, the variance of the new variable is s²w + s²v.

A.7.2 Illustration of Variance for a Difference of Two Variables

Table 7.4 can be used to illustrate this point if we regard the “Before” values as a sample, {Wi}, and the “After” values as another sample, {Vi}, each containing 20 items. The 20 items in the Wi and Vi groups can then be subtracted to form the (Wi − Vi) group, shown (with the sign reversed, as Vi − Wi) as the “increment” in Table 7.4. The


results show means of W̄ = 120.1 and V̄ = 96.8. For the difference of the two variables, the mean is (as expected) 23.3 for (W̄ − V̄).

The respective variances are 6477.35 and 403.95 for s²w and s²v, with their sum being 6881.30. This sum is much larger than the variance of 4298.85 formed in the (Wi − Vi) group. The reason for the difference between observed and expected variances is that Wi and Vi are not independent samples. The basic principle demonstrated by the mathematics will hold true, on average, for samples that are not related. In this instance, however, the two sets of data are related. They are not really independent, and thus do not have a zero value for covariance. In other words, the statement in A.7.1 that “the middle term essentially vanishes” was not correct here. The term vanishes in each instance only if Σ(Wi − W̄)(Vi − V̄) is 0. A little algebra will show that this term becomes ΣWiVi − NW̄V̄. For samples that are not independent, this term is not zero. Thus, for the data in Table 7.4, ΣWiVi = 257047, and NW̄V̄ = 232513.6. Consequently, Σ(Wi − W̄)(Vi − V̄) = 257047 − 232513.6 = 24533.4, rather than 0. The value of 24533.4 is doubled to 49066.8 and then divided by 19 (= n − 1) to produce 2582.46 as the incremental contribution of the WV covariance component. Accordingly, 6881.30 − 2582.46 = 4298.84 for the variance of the difference of the two variables.

A.7.3 Variance of a Parametric Distribution

By definition, the parametric population has mean µ and standard deviation σ. Any individual item, Xi, in that population will deviate from the parametric mean by the amount Xi − µ. By definition of σ, the average of the Xi − µ deviations will be σ, and the average of the (Xi − µ)² values for variance will be σ².

In any individual sample of n items, the sum of squared deviations will be Sxx = Σ(Xi − X̄)² from the sample mean, and Σ(Xi − µ)² from the parametric mean. With σ² as the average value of (Xi − µ)², the average value of the summed squared deviations from the parametric mean will be Σσ² = nσ².

Because Xi − µ = (Xi − X̄) + (X̄ − µ), the average value of X̄ − µ will be the difference between σ and the standard deviation, s, in the sample. We can square both sides of the foregoing expression to get

(Xi − µ)² = (Xi − X̄)² + 2(Xi − X̄)(X̄ − µ) + (X̄ − µ)²

Summing both sides we get

Σ(Xi − µ)² = Σ(Xi − X̄)² + 2Σ(Xi − X̄)(X̄ − µ) + Σ(X̄ − µ)²

For any individual sample, X̄ − µ is a fixed value, and Σ(Xi − X̄) = 0. The middle term on the right side will therefore vanish, and we can substitute appropriately in the other terms to get the average expression:

nσ² = SXX + n(X̄ − µ)²    [A.7.1]

This expression shows that in a sample of n members the average group variance will be larger around the parametric mean, µ, than around the sample mean, X̄. The average parametric group variance, nσ², will exceed the sample group variance, SXX, by the magnitude of n(X̄ − µ)². This concept will be used later in Section A.7.5.

A.7.4 Variance of a Sample Mean

When n members, designated as X1, …, Xn, are sampled from a parametric distribution, their mean will be

X̄ = (X1 + … + Xn)/n


The deviation of this value from the parametric mean will be

X̄ − µ = (X1 + … + Xn)/n − µ = (X1 − µ)/n + … + (Xn − µ)/n

In repetitive samples, each constituent value of Xi will take on different values for X1, X2, X3, etc. We can therefore regard each of the (Xi − µ)/n terms as though it were a variable, with the X̄ − µ term being a sum of the n constituent variables.

According to Section A.7.1, the average variance of (X̄ − µ)² will be the sum of average variances for the constituent variables. Each constituent variable here has the variance

[(Xi − µ)/n]²

and the average value of (Xi − µ)², by definition, is σ². Therefore, the average value of (X̄ − µ)² can be cited as

(X̄ − µ)² = σ²/n² + … + σ²/n² = n(σ²/n²) = σ²/n

Thus, the average variance of a sampled mean is σ²/n. The square root of this value, σ/√n, is the special “standard deviation” that is called the “standard error” of the mean.
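The σ/√n result can be illustrated with a small simulation (the sample size, parameters, and trial count are arbitrary choices for the sketch):

```python
import random
import statistics

# The standard deviation of repeated sample means should approach
# sigma / sqrt(n), the "standard error" derived above.
random.seed(3)
mu, sigma, n, trials = 50, 10, 16, 10_000

sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]

observed_se = statistics.stdev(sample_means)
expected_se = sigma / n ** 0.5     # 10 / 4 = 2.5
print(round(observed_se, 2))       # close to 2.5
print(expected_se)
```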

A.7.5 Estimation of σ

If you understand the mathematics of Appendixes A.7.3 and A.7.4, the procedure here is simple. In Appendix A.7.3, Formula [A.7.1] showed that nσ² = SXX + n(X̄ − µ)²; and in Appendix A.7.4, we learned that σ²/n was the average value of (X̄ − µ)². When the latter value is substituted in Formula [A.7.1], we get

nσ² = SXX + n(σ²/n) = SXX + σ²

Therefore,

(n − 1)σ² = SXX.

This result tells us that, on average, the value of SXX/(n − 1) will be the value of σ². The result also explains why SXX is divided by n − 1, rather than n, to form the variance of a sample. When the sample standard deviation is calculated as

s = √[SXX/(n − 1)]

we get the best average “unbiased” estimate of the parametric standard deviation, σ .



Exercises

7.1. The hematocrit values for a group of eight men are as follows:

{31, 42, 37, 30, 29, 36, 39, 28}

For this set of data, the mean is 34 with sn−1 = 5.18 and sn = 4.84. The median is 33.5. [You should verify these statements before proceeding.]

7.1.1. What are the boundaries of a parametric 95% confidence interval around the mean?

7.1.2. What is the lower boundary of a parametric one-tailed 90% confidence interval around the mean?

7.1.3. What are the extreme values and their potential proportionate variations for the jackknifed reduced means in this group?

7.1.4. What are the extreme values and their potential proportionate variations for the reduced medians?

7.1.5. Do you regard the mean of this group as stable? Why? Does it seem more stable than the median? Why?

7.2. One of your colleagues believes the group of men in Exercise 7.1 is anemic because the customary mean for hematocrit should be 40. How would you evaluate this belief?

7.3. A hospital laboratory reports that its customary values in healthy people for serum licorice concentration have a mean of 50 units with a standard deviation of 8 units. From these values, the laboratory calculates its “range of normal” as 50 ± (2)(8), which yields the interval from 34 to 66. In the results sent to the practicing physicians, the laboratory says that 34–66 is the “95% confidence interval” for the range of normal. What is wrong with this statement?

7.4. Our clinic has an erratic weighing scale. When no one is being weighed, the scale should read 0, but it usually shows other values. During one set of observations, spread out over a period of time, the following values were noted with no weight on the scale: +2.1, −4.3, +3.5, +1.7, +4.2, 0, −0.8, +5.2, +1.3, +4.7. Two scholarly clinicians have been debating over the statistical diagnosis to be given to the scale's lesion. One of them says that the scale is inconsistent, i.e., its zero-point wavers about in a nonreproducible manner. The other clinician claims that the scale is biased upward, i.e., its zero point, on average, is significantly higher than the correct value of 0. As a new connoisseur of statistics, you have been asked to consult and to settle the dispute. With whom would you agree? What evidence would you offer to sustain your conclusion?
