An Introduction To Statistical Inference And Data Analysis
CHAPTER 11. K-SAMPLE LOCATION PROBLEMS
• The F-test of H0: θ = 0 is to reject if and only if

      P = P_H0( F(θ) ≥ f(θ) ) ≤ α,

  i.e. if and only if

      f(θ) ≥ q = qf(1-α, df1=1, df2=N-k),

  where f(θ) denotes the observed value of F(θ).
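The text's qf(1-α, df1=1, df2=N-k) is R-style notation for the (1−α) quantile of the F(1, N−k) distribution. A minimal sketch of computing this critical value with scipy (our choice of tool; Heyl's N = 16, k = 3 and α = .05 are assumed):

```python
# Critical value q = qf(1-alpha, df1=1, df2=N-k) for the contrast F-test,
# using scipy's f.ppf as a stand-in for the text's qf().
from scipy.stats import f

alpha, k, N = 0.05, 3, 16
q = f.ppf(1 - alpha, dfn=1, dfd=N - k)  # 95th percentile of F(1, 13)
print(round(q, 2))                      # 4.67
```

Reject H0: θ = 0 whenever the observed f(θ) meets or exceeds this value.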
• Definition 11.2 Two contrasts with coefficient vectors (c1, …, ck) and (d1, …, dk) are orthogonal if

      Σ_{i=1}^{k} c_i d_i / n_i = 0.
• Notice that, if n1 = ··· = nk, then the orthogonality condition simplifies to

      Σ_{i=1}^{k} c_i d_i = 0.
• In the Heyl (1930) example:

  - If n1 = n2 = n3, then θ1 and θ2 are orthogonal because

        1·1 + 1·(−1) + (−2)·0 = 0.

  - If n1 = 6 and n2 = n3 = 5, then θ1 and θ2 are not orthogonal because

        (1·1)/6 + (1·(−1))/5 + ((−2)·0)/5 = 1/6 − 1/5 ≠ 0.

    However, θ1 is orthogonal to θ3 = 18μ1 − 17μ2 − μ3 because

        (1·18)/6 + (1·(−17))/5 + ((−2)·(−1))/5 = 3 − 3.4 + 0.4 = 0.
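The orthogonality checks above can be verified numerically. A sketch using exact rational arithmetic, with the contrast coefficient vectors from the example (θ1 = (1, 1, −2), θ2 = (1, −1, 0), θ3 = (18, −17, −1)):

```python
# Two contrasts are orthogonal iff sum_i c_i d_i / n_i == 0.
# Exact fractions avoid floating-point round-off in the check.
from fractions import Fraction

def orthogonal(c, d, n):
    return sum(Fraction(ci * di, ni) for ci, di, ni in zip(c, d, n)) == 0

theta1, theta2, theta3 = (1, 1, -2), (1, -1, 0), (18, -17, -1)

print(orthogonal(theta1, theta2, (5, 5, 5)))  # True  (equal sample sizes)
print(orthogonal(theta1, theta2, (6, 5, 5)))  # False (1/6 - 1/5 != 0)
print(orthogonal(theta1, theta3, (6, 5, 5)))  # True  (3 - 3.4 + 0.4 = 0)
```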
• One can construct families of up to k − 1 mutually orthogonal contrasts. Such families have several very pleasant properties.
• First, any family of k − 1 mutually orthogonal contrasts partitions SSB into k − 1 separate components,

      SSB = SS_θ1 + ··· + SS_θ(k−1),

  each with one degree of freedom.
11.1. THE NORMAL K-SAMPLE LOCATION PROBLEM
• For example, Heyl (1930) collected the following data:

      Gold:      83  81  76  78  79  72
      Platinum:  61  61  67  67  64
      Glass:     78  71  75  72  74

  This results in the following ANOVA table:

      Source    SS     df  MS     F     P
      Between   565.1   2  282.6  26.1  .000028
        θ1       29.2   1   29.2   2.7  .124793
        θ3      535.9   1  535.9  49.5  .000009
      Within    140.8  13   10.8
      Total     705.9  15
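The Between/Within rows of the table can be reproduced from the raw data. A sketch using scipy's one-way ANOVA (the contrast rows θ1 and θ3 are not computed here):

```python
# One-way ANOVA on Heyl's (1930) data; f_oneway returns the between-groups
# F statistic and its p-value (df between = 2, df within = 13 here).
from scipy.stats import f_oneway

gold     = [83, 81, 76, 78, 79, 72]
platinum = [61, 61, 67, 67, 64]
glass    = [78, 71, 75, 72, 74]

F, P = f_oneway(gold, platinum, glass)
print(round(F, 1))  # 26.1, matching the table
print(P < 1e-4)     # True (P = .000028 in the table)
```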
• Definition 11.3 Given a family of contrasts, the family rate α0 of Type I error is the probability under H0: μ1 = ··· = μk of falsely rejecting at least one null hypothesis.
• A second pleasant property of mutually orthogonal contrasts is that the tests of the contrasts are mutually independent. This allows us to deduce the relation between the significance level(s) of the individual tests and the family rate of Type I error.
  - Let Er denote the event that H0: θr = 0 is falsely rejected. Then P(Er) = α is the rate of Type I error for an individual test.
  - Let E denote the event that at least one Type I error is committed, i.e.

        E = ∪_{r=1}^{k−1} Er.

    The family rate of Type I error is α0 = P(E).
  - The event that no Type I errors are committed is

        E^c = ∩_{r=1}^{k−1} Er^c,

    and the probability of this event is P(E^c) = 1 − α0.
  - By independence,

        1 − α0 = P(E^c) = P(E1^c) × ··· × P(E(k−1)^c) = (1 − α)^(k−1);

    hence,

        α0 = 1 − (1 − α)^(k−1).
• Notice that α0 > α, i.e. the family rate of Type I error is greater than the rate for an individual test. For example, if k = 3 and α = .05, then

      α0 = 1 − (1 − .05)^2 = .0975.
  This phenomenon is sometimes called "alpha slippage." To protect against alpha slippage, we usually prefer to specify the family rate of Type I error that will be tolerated and compute a significance level that will ensure the specified family rate. For example, if k = 3 and α0 = .05, then we solve

      .05 = 1 − (1 − α)^2

  to obtain a significance level of

      α = 1 − √.95 ≈ .0253.
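Both directions of this calculation can be sketched directly (Heyl's k = 3 assumed):

```python
# Family rate of Type I error for k - 1 mutually independent contrast
# tests, and the per-test level that attains a specified family rate.
k = 3

# forward: individual level alpha -> family rate alpha0
alpha = 0.05
alpha0 = 1 - (1 - alpha) ** (k - 1)
print(round(alpha0, 4))          # 0.0975

# backward: specified family rate alpha0 -> individual level alpha
alpha0_spec = 0.05
alpha_per_test = 1 - (1 - alpha0_spec) ** (1 / (k - 1))
print(round(alpha_per_test, 4))  # 0.0253
```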
Bonferroni t-Tests
• Now suppose that we plan m pairwise comparisons. These comparisons are defined by contrasts θ1, …, θm of the form μi − μj, not necessarily mutually orthogonal. Notice that each H0: θr = 0 vs. H1: θr ≠ 0 is a normal 2-sample location problem with equal variances.
• Fact: Under H0: μ1 = ··· = μk,

      Z = (X̄i· − X̄j·) / √( (1/ni + 1/nj) σ² ) ~ N(0, 1)

  and

      T(θr) = (X̄i· − X̄j·) / √( (1/ni + 1/nj) MSE ) ~ t(N − k).
• The t-test of H0: θr = 0 is to reject if and only if

      P = P_H0( |T(θr)| ≥ |t(θr)| ) ≤ α,

  i.e. if and only if

      |t(θr)| ≥ q = qt(1-α/2, df=N-k),

  where t(θr) denotes the observed value of T(θr).
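As a sketch of this test, the statistic T(θr) for one pairwise contrast (gold vs. glass in Heyl's data, our choice of illustration) can be computed from the pooled MSE:

```python
# Two-sample t statistic for mu_i - mu_j, using MSE pooled over all
# k samples (df = N - k), as in the text's T(theta_r).
import math
from scipy.stats import t

gold     = [83, 81, 76, 78, 79, 72]
platinum = [61, 61, 67, 67, 64]
glass    = [78, 71, 75, 72, 74]
samples = [gold, platinum, glass]

N = sum(len(s) for s in samples)
k = len(samples)
ssw = sum(sum((x - sum(s) / len(s)) ** 2 for x in s) for s in samples)
mse = ssw / (N - k)          # pooled within-sample variance

ni, nj = len(gold), len(glass)
diff = sum(gold) / ni - sum(glass) / nj
t_obs = diff / math.sqrt((1 / ni + 1 / nj) * mse)
p = 2 * t.sf(abs(t_obs), df=N - k)
print(round(t_obs, 2), round(p, 4))  # 2.09 0.0568
```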
• Unless the contrasts are mutually orthogonal, we cannot use the multiplication rule for independent events to compute the family rate of Type I error. However, it follows from the Bonferroni inequality that

      α0 = P(E) = P( ∪_{r=1}^{m} Er ) ≤ Σ_{r=1}^{m} P(Er) = mα;

  hence, we can ensure that the family rate of Type I error is no greater than a specified α0 by testing each contrast at significance level α = α0/m.
11.1.3 Post Hoc Comparisons
• We now consider situations in which we determine that a comparison is of interest after inspecting the data. For example, after inspecting Heyl's (1930) data, we might decide to define θ4 = μ1 − μ3 and test H0: θ4 = 0 vs. H1: θ4 ≠ 0.
Bonferroni t-Tests
• Suppose that only pairwise comparisons are of interest. Because we are testing after we have had the opportunity to inspect the data (and therefore to construct the contrasts that appear to be nonzero), we suppose that all pairwise contrasts were of interest a priori.
• Hence, whatever the number of pairwise contrasts actually tested a posteriori, we set

      m = (k choose 2) = k(k − 1)/2

  and proceed as before.
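A sketch of the resulting per-comparison level for Heyl's k = 3, with family rate α0 = .05 assumed:

```python
# Post hoc Bonferroni: take m = k(k-1)/2 (all pairwise contrasts) and
# test each at level alpha = alpha0 / m.
k = 3
m = k * (k - 1) // 2       # number of pairwise contrasts
alpha0 = 0.05              # tolerated family rate of Type I error
alpha = alpha0 / m         # per-comparison significance level
print(m, round(alpha, 4))  # 3 0.0167
```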
Scheffé F-Tests

• The most conservative of all multiple comparison procedures, Scheffé's procedure is predicated on the assumption that all possible contrasts were of interest a priori.
• Scheffé's F-test of H0: θr = 0 vs. H1: θr ≠ 0 is to reject H0 if and only if

      f(θr)/(k − 1) ≥ q = qf(1-α, df1=k-1, df2=N-k),

  where f(θr) denotes the observed value of the F(θr) defined for the method of planned orthogonal contrasts.
• Fact: No matter how many H0: θr = 0 are tested by Scheffé's F-test, the family rate of Type I error is no greater than α.
• Example: For Heyl's (1930) data, Scheffé's F-test produces

      Source    F     P
      θ1         1.3  .294217
      θ2        25.3  .000033
      θ3        24.7  .000037
      θ4         2.2  .151995

  For the first three comparisons, our conclusions are not appreciably affected by whether the contrasts were constructed before or after examining the data. However, if θ4 had been planned, we would have obtained f(θ4) = 4.4 and P = .056772.
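The Scheffé p-values above can be sketched in scipy, feeding in the observed planned-contrast F statistics for Heyl's data (2.7 and 49.5 from the ANOVA table; 4.4 for θ4 as reported above). Small discrepancies from the tabled p-values reflect rounding of the F statistics:

```python
# Scheffe's test: refer f(theta_r)/(k-1) to the F(k-1, N-k) distribution.
from scipy.stats import f

k, N = 3, 16  # Heyl: 3 samples, 16 observations

def scheffe_p(f_obs):
    """P-value for Scheffe's F-test given a planned-contrast F statistic."""
    return f.sf(f_obs / (k - 1), k - 1, N - k)

print(round(scheffe_p(2.7), 2))    # theta1: ~ .29
print(scheffe_p(49.5) < 1e-4)      # theta3: True (~ .000037)
print(round(scheffe_p(4.4), 2))    # theta4: ~ .15
```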
11.2 The k-Sample Location Problem for a General Shift Family

11.2.1 The Kruskal-Wallis Test

11.3 Exercises