The GLM Procedure

Class Level Information

Class     Levels    Values
origin         2    A N
sex            2    F M
grade          4    F0 F1 F2 F3
type           2    AL SL

Number of observations    154

The GLM Procedure

Dependent Variable: days

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model                6    4953.56458      825.59410       3.60    0.0023
Error              147   33752.57179      229.60933
Corrected Total    153   38706.13636

R-Square    Coeff Var    Root MSE    days Mean
0.127979     93.90508    15.15287     16.13636

Source    DF      Type I SS    Mean Square    F Value    Pr > F
origin     1    2645.652580    2645.652580      11.52    0.0009
sex        1     338.877090     338.877090       1.48    0.2264
grade      3    1837.020006     612.340002       2.67    0.0500
type       1     132.014900     132.014900       0.57    0.4495

Source    DF    Type III SS    Mean Square    F Value    Pr > F
origin     1    2403.606653    2403.606653      10.47    0.0015
sex        1     185.647389     185.647389       0.81    0.3700
grade      3    1917.449682     639.149894       2.78    0.0430
type       1     132.014900     132.014900       0.57    0.4495

The GLM Procedure

Class Level Information

Class	Levels	Values
origin	2	A N
sex	2	F M
grade	4	F0 F1 F2 F3
type	2	AL SL

Number of observations 154

The GLM Procedure

Dependent Variable: days

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model                6    4953.56458      825.59410       3.60    0.0023
Error              147   33752.57179      229.60933
Corrected Total    153   38706.13636

R-Square    Coeff Var    Root MSE    days Mean
0.127979     93.90508    15.15287     16.13636

Source    DF      Type I SS    Mean Square    F Value    Pr > F
grade      3    2277.172541     759.057514       3.31    0.0220
sex        1     124.896018     124.896018       0.54    0.4620
type       1     147.889364     147.889364       0.64    0.4235
origin     1    2403.606653    2403.606653      10.47    0.0015

The GLM Procedure

Class Level Information

Class	Levels	Values
origin	2	A N
sex	2	F M
grade	4	F0 F1 F2 F3
type	2	AL SL

Number of observations 154

The GLM Procedure

Dependent Variable: days

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model                6    4953.56458      825.59410       3.60    0.0023
Error              147   33752.57179      229.60933
Corrected Total    153   38706.13636

R-Square    Coeff Var    Root MSE    days Mean
0.127979     93.90508    15.15287     16.13636

Source    DF      Type I SS    Mean Square    F Value    Pr > F
type       1      19.502391      19.502391       0.08    0.7711
sex        1     336.215409     336.215409       1.46    0.2282
origin     1    2680.397094    2680.397094      11.67    0.0008
grade      3    1917.449682     639.149894       2.78    0.0430

The GLM Procedure

Class Level Information

Class	Levels	Values
origin	2	A N
sex	2	F M
grade	4	F0 F1 F2 F3
type	2	AL SL

Number of observations 154

The GLM Procedure

Dependent Variable: days

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model                6    4953.56458      825.59410       3.60    0.0023
Error              147   33752.57179      229.60933
Corrected Total    153   38706.13636

R-Square    Coeff Var    Root MSE    days Mean
0.127979     93.90508    15.15287     16.13636

Source    DF      Type I SS    Mean Square    F Value    Pr > F
sex        1     308.062554     308.062554       1.34    0.2486
origin     1    2676.467116    2676.467116      11.66    0.0008
type       1      51.585224      51.585224       0.22    0.6362
grade      3    1917.449682     639.149894       2.78    0.0430

Display 6.2

Next we fit a full factorial model to the data as follows:

proc glm data=ozkids;
   class origin sex grade type;
   model days=origin sex grade type origin|sex|grade|type /ss1 ss3;
run;

Joining variable names with a bar is a shorthand way of specifying an interaction and all the lower-order interactions and main effects implied by it. This is useful not only to save typing but to ensure that relevant terms in the model are not inadvertently omitted. Here we have explicitly specified the main effects so that they are entered before any interaction terms when calculating Type I sums of squares.
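As a minimal sketch of the bar shorthand, assuming the same ozkids data set and variables as in the text, the first step below spells out what origin|sex stands for, and the second uses the @ modifier to cap the expansion at two-way interactions. Neither step appears in the original text; both are illustrations only.

```sas
/* Sketch only: illustrates the bar operator; not taken from the text. */
proc glm data=ozkids;
   class origin sex;
   /* origin|sex is shorthand for: origin sex origin*sex */
   model days = origin|sex / ss1 ss3;
run;

proc glm data=ozkids;
   class origin sex grade type;
   /* @2 keeps the main effects and two-way interactions only */
   model days = origin|sex|grade|type@2 / ss1 ss3;
run;
```

Because the bar expansion interleaves main effects with interactions (origin, sex, origin*sex, grade, and so on), listing the main effects first in the model statement matters whenever Type I sums of squares are of interest.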

The output is shown in Display 6.3. Note first that the only Type I and Type III sums of squares that agree are those for the origin*sex*grade*type interaction. Now consider the origin main effect. The Type I sum of squares for origin is “corrected” only for the mean because origin appears first in the model statement. The effect is highly significant (P = 0.0003). But using Type III sums of squares, in which the origin effect is corrected for all other main effects and interactions, the corresponding F value has an associated P-value of 0.2736. Now origin is judged nonsignificant, but this may simply reflect the loss of power after “adjusting” for a lot of relatively unimportant interaction terms.
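As a quick check on the figures quoted from Display 6.3, each F value is the mean square for the effect divided by the error mean square of the full factorial model (192.84194 on 122 degrees of freedom). For origin, which has one degree of freedom so that its mean square equals its sum of squares:

```latex
F_{\text{Type I}}   = \frac{2645.652580}{192.84194} \approx 13.72,
\qquad
F_{\text{Type III}} = \frac{233.201138}{192.84194} \approx 1.21
```

The denominator is the same in both cases; only the sum of squares attributed to origin changes with the adjustment.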

Arriving at a final model for these data is not straightforward (see Aitkin [1978] for some suggestions), and the issue is not pursued here because the data set will be the subject of further analyses in Chapter 9. However, some of the exercises encourage readers to try some alternative analyses of variance.

The GLM Procedure

Class Level Information

Class     Levels    Values
origin         2    A N
sex            2    F M
grade          4    F0 F1 F2 F3
type           2    AL SL

Number of observations    154

The GLM Procedure

Dependent Variable: days

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model               31   15179.41930      489.65869       2.54    0.0002
Error              122   23526.71706      192.84194
Corrected Total    153   38706.13636

R-Square    Coeff Var    Root MSE    days Mean
0.392171     86.05876    13.88675     16.13636

Source                   DF      Type I SS    Mean Square    F Value    Pr > F
origin                    1    2645.652580    2645.652580      13.72    0.0003
sex                       1     338.877090     338.877090       1.76    0.1874
grade                     3    1837.020006     612.340002       3.18    0.0266
type                      1     132.014900     132.014900       0.68    0.4096
origin*sex                1     142.454554     142.454554       0.74    0.3918
origin*grade              3    3154.799178    1051.599726       5.45    0.0015
sex*grade                 3    2009.479644     669.826548       3.47    0.0182
origin*sex*grade          3     226.309848      75.436616       0.39    0.7596
origin*type               1      38.572890      38.572890       0.20    0.6555
sex*type                  1      69.671759      69.671759       0.36    0.5489
origin*sex*type           1     601.464327     601.464327       3.12    0.0799
grade*type                3    2367.497717     789.165906       4.09    0.0083
origin*grade*type         3     887.938926     295.979642       1.53    0.2089
sex*grade*type            3     375.828965     125.276322       0.65    0.5847
origin*sex*grade*type     3     351.836918     117.278973       0.61    0.6109

Source                   DF    Type III SS    Mean Square    F Value    Pr > F
origin                    1     233.201138     233.201138       1.21    0.2736
sex                       1     344.037143     344.037143       1.78    0.1841
grade                     3    1036.595762     345.531921       1.79    0.1523
type                      1     181.049753     181.049753       0.94    0.3345
origin*sex                1       3.261543       3.261543       0.02    0.8967
origin*grade              3    1366.765758     455.588586       2.36    0.0746
sex*grade                 3    1629.158563     543.052854       2.82    0.0420
origin*sex*grade          3      32.650971      10.883657       0.06    0.9823
origin*type               1      55.378055      55.378055       0.29    0.5930
sex*type                  1       1.158990       1.158990       0.01    0.9383
origin*sex*type           1     337.789437     337.789437       1.75    0.1881
grade*type                3    2037.872725     679.290908       3.52    0.0171
origin*grade*type         3     973.305369     324.435123       1.68    0.1743
sex*grade*type            3     410.577832     136.859277       0.71    0.5480
origin*sex*grade*type     3     351.836918     117.278973       0.61    0.6109

Display 6.3

Exercises

6.1 Investigate simpler models for the data used in this chapter by dropping interactions or sets of interactions from the full factorial model fitted in the text. Try several different orders of effects.

6.2 The outcome for the data in this chapter, number of days absent, is a count variable. Consequently, assuming normally distributed errors may not be entirely appropriate, as we will see in Chapter 9. Here, however, we might deal with this potential problem by way of a transformation. One possibility is a log transformation. Investigate this possibility.

6.3 Find a table of cell means and standard deviations for the data used in this chapter.

6.4 Construct a normal probability plot of the residuals from fitting a main-effects-only model to the data used in this chapter. Comment on the results.

Chapter 7

Analysis of Variance of Repeated Measures: Visual Acuity

7.1 Description of Data

The data used in this chapter are taken from Table 397 of SDS. They are reproduced in Display 7.1. Seven subjects had their response times measured when a light was flashed into each eye through lenses of powers 6/6, 6/18, 6/36, and 6/60. Measurements are in milliseconds, and the question of interest was whether or not the response time varied with lens strength. (A lens of power a/b means that the eye will perceive as being at “a” feet an object that is actually positioned at “b” feet.)

7.2 Repeated Measures Data

The observations in Display 7.1 involve repeated measures. Such data arise often, particularly in the behavioural sciences and related disciplines, and involve recording the value of a response variable for each subject under more than one condition and/or on more than one occasion.

Visual Acuity and Lens Strength

                   Left Eye                      Right Eye
Subject    6/6   6/18   6/36   6/60      6/6   6/18   6/36   6/60
1          116    119    116    124      120    117    114    122
2          110    110    114    115      106    112    110    110
3          117    118    120    120      120    120    120    124
4          112    116    115    113      115    116    116    119
5          113    114    114    118      114    117    116    112
6          119    115     94    116      100     99     94     97
7          110    110    105    118      105    105    115    115

Display 7.1
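For readers who want to follow along, the sketch below shows one way the values in Display 7.1 might be entered in SAS, with one row per subject and one column per eye and lens combination. The data set name vision and the variable names subject and x1-x8 are assumptions made for illustration, not names taken from the text.

```sas
/* Sketch only: the data set and variable names are hypothetical.      */
/* x1-x4: left eye, x5-x8: right eye; lens order 6/6, 6/18, 6/36, 6/60 */
data vision;
   input subject x1-x8;
   datalines;
1 116 119 116 124 120 117 114 122
2 110 110 114 115 106 112 110 110
3 117 118 120 120 120 120 120 124
4 112 116 115 113 115 116 116 119
5 113 114 114 118 114 117 116 112
6 119 115  94 116 100  99  94  97
7 110 110 105 118 105 105 115 115
;
run;

/* List the data to confirm they were read as intended */
proc print data=vision noobs;
run;
```

This wide layout keeps all eight repeated measurements for a subject on a single observation.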

Researchers typically adopt the repeated measures paradigm as a means of reducing error variability and/or as the natural way of measuring certain phenomena (e.g., developmental changes over time, learning and memory tasks). In this type of design, the effects of experimental factors giving rise to the repeated measures are assessed relative to the average response made by a subject on all conditions or occasions. In essence, each subject serves as his or her own control and, accordingly, variability due to differences in average responsiveness of the subjects is eliminated from the extraneous error variance. A consequence of this is that the power to detect the effects of within-subjects experimental factors is increased compared to testing in a between-subjects design.

Unfortunately, the advantages of a repeated measures design come at a cost, and that cost is the probable lack of independence of the repeated measurements. Observations made under different conditions involving the same subjects will very likely be correlated rather than independent. This violates one of the assumptions of the analysis of variance procedures described in Chapters 5 and 6, and accounting for the dependence between observations in a repeated measures design requires some thought. (In the visual acuity example, only within-subject factors occur, and it is possible, indeed likely, that the lens strengths under which a subject was observed were given in random order. However, in examples where time is the single within-subject factor, randomisation is not, of course, an option. This makes the type of study in which subjects are simply observed over time rather different from other repeated measures designs, and such studies are often given a different label: longitudinal designs. Owing to their different nature, we consider them specifically later in Chapters 10 and 11.)

7.3 Analysis of Variance for Repeated Measures Designs

Despite the lack of independence of the observations made within subjects in a repeated measures design, it remains possible to use relatively straightforward analysis of variance procedures to analyse the data if three particular assumptions about the observations are valid:

1. Normality: the data arise from populations with normal distributions.

2. Homogeneity of variance: the variances of the assumed normal distributions are equal.

3. Sphericity: the variances of the differences between all pairs of the repeated measurements are equal. Together with homogeneity of variance, this condition implies that the covariances, and hence the correlations, between pairs of repeated measures are also equal, the so-called compound symmetry pattern.
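The link between the second and third assumptions can be made explicit with a short calculation. The symbols used here (variances \sigma_j^2, covariance \sigma_{jk}, common variance \sigma^2, and common correlation \rho) are notation introduced for this sketch and do not appear in the original text. For two repeated measurements y_j and y_k on the same subject:

```latex
\operatorname{Var}(y_j - y_k) = \sigma_j^2 + \sigma_k^2 - 2\sigma_{jk}
```

If every measurement has the same variance \sigma^2 and every pair has the same correlation \rho, so that \sigma_{jk} = \rho\sigma^2, then

```latex
\operatorname{Var}(y_j - y_k) = 2\sigma^2 (1 - \rho) \quad \text{for all } j \neq k
```

so the variances of all pairwise differences are equal. Conversely, when the variances are all equal to \sigma^2, a constant value of \operatorname{Var}(y_j - y_k) forces \sigma_{jk} to be the same for every pair.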

It is the third assumption that is most critical for the validity of the analysis of variance F-tests. When the sphericity assumption is not regarded as likely, there are two alternatives to a simple analysis of variance: the use of correction factors and multivariate analysis of variance. All three possibilities will be considered in this chapter.

We begin by considering a simple model for the visual acuity observations, $y_{ijk}$, where $y_{ijk}$ represents the reaction time of the $i$th subject for eye $j$ and lens strength $k$. The model assumed is

$$y_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \gamma_i + (\gamma\alpha)_{ij} + (\gamma\beta)_{ik} + (\gamma\alpha\beta)_{ijk} + \epsilon_{ijk} \qquad (7.1)$$

where $\alpha_j$ represents the effect of eye $j$, $\beta_k$ is the effect of the $k$th lens strength, and $(\alpha\beta)_{jk}$ is the eye × lens strength interaction. The term $\gamma_i$ is a constant associated with subject $i$, and $(\gamma\alpha)_{ij}$, $(\gamma\beta)_{ik}$, and $(\gamma\alpha\beta)_{ijk}$ represent interaction effects of subject $i$ with each factor and their interaction. The terms $\alpha_j$, $\beta_k$, and $(\alpha\beta)_{jk}$ are assumed to be fixed effects, but the subject and error terms are assumed to be random variables from normal distributions with zero means and variances specific to each term. This is an example of a mixed model.

Equal correlations between the repeated measures arise as a consequence of the subject effects in this model; and if this structure is valid, a relatively straightforward analysis of variance of the data can be used. However, when the investigator thinks the assumption of equal correlations is too strong, there are two alternatives that can be used:
