One-way ANOVA	219
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> library(gplots)                                      # (f) Plots group means and confidence intervals
> plotmeans(response ~ trt, xlab="Treatment", ylab="Response",
            main="Mean Plot\nwith 95% CI")
> detach(cholesterol)


Looking at the output, you can see that 10 patients received each of the drug regimens (b). From the means, it appears that drugE produced the greatest cholesterol reduction, whereas 1time produced the least (c). Standard deviations were relatively constant across the five groups, ranging from 2.88 to 3.48 (d). The ANOVA F test for treatment (trt) is significant (p < .0001), providing evidence that the five treatments aren’t all equally effective (e).

The plotmeans() function in the gplots package can be used to produce a graph of group means and their confidence intervals (f). A plot of the treatment means, with 95% confidence limits, is provided in figure 9.1 and allows you to clearly see these treatment differences.

Figure 9.1 Treatment group means with 95% confidence intervals for five cholesterol-reducing drug regimens (mean plot of Response by Treatment for 1time, 2times, 4times, drugD, and drugE; n=10 per group)
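The group sizes, means, and standard deviations discussed above can be recomputed directly from the data. The following is a minimal sketch in base R, assuming the multcomp package is installed (it ships the cholesterol data frame used in this chapter). The helper name group_summary() is hypothetical, and the per-group t intervals are an assumption about how the limits are formed; plotmeans() may differ in detail.

```r
# Assumes multcomp is installed; it provides the cholesterol data frame
data(cholesterol, package = "multcomp")

# Hypothetical helper: per-group n, mean, SD, and t-based confidence limits
group_summary <- function(y, g, conf = 0.95) {
  n <- tapply(y, g, length)
  m <- tapply(y, g, mean)
  s <- tapply(y, g, sd)
  # Half-width of a t interval built from each group's own SD
  half <- qt(1 - (1 - conf) / 2, df = n - 1) * s / sqrt(n)
  data.frame(n = as.vector(n), mean = as.vector(m), sd = as.vector(s),
             lower = as.vector(m - half), upper = as.vector(m + half),
             row.names = names(m))
}

stats <- group_summary(cholesterol$response, cholesterol$trt)
print(round(stats, 2))
```

Each row corresponds to one treatment group, so the n, mean, and sd columns can be compared against the values quoted in the text.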

9.3.1 Multiple comparisons

The ANOVA F test for treatment tells you that the five drug regimens aren’t equally effective, but it doesn’t tell you which treatments differ from one another. You can use a multiple comparison procedure to answer this question. For example, the TukeyHSD() function provides a test of all pairwise differences between group means, as shown next.

Listing 9.2 Tukey HSD pairwise group comparisons

> TukeyHSD(fit)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = response ~ trt)

$trt
               diff    lwr   upr p adj
2times-1time   3.44 -0.658  7.54 0.138
4times-1time   6.59  2.492 10.69 0.000
drugD-1time    9.58  5.478 13.68 0.000
drugE-1time   15.17 11.064 19.27 0.000
4times-2times  3.15 -0.951  7.25 0.205
drugD-2times   6.14  2.035 10.24 0.001
drugE-2times  11.72  7.621 15.82 0.000
drugD-4times   2.99 -1.115  7.09 0.251
drugE-4times   8.57  4.471 12.67 0.000
drugE-drugD    5.59  1.485  9.69 0.003

> par(las=2)
> par(mar=c(5,8,4,2))
> plot(TukeyHSD(fit))

For example, the mean cholesterol reductions for 1time and 2times aren’t significantly different from each other (p = 0.138), whereas the means for 1time and 4times differ significantly (p < .001).

The pairwise comparisons are plotted in figure 9.2. The first par statement rotates the axis labels, and the second one increases the left margin area so that the labels fit (par options are covered in chapter 3). In this graph, confidence intervals that include 0 indicate treatments that aren’t significantly different (p > 0.05).
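The link between the adjusted p-values in listing 9.2 and the intervals in the plot can be checked programmatically. This sketch assumes the cholesterol data from the multcomp package and refits the model with a data argument instead of attach(); the column names p_adj, significant, and excludes_zero are introduced here for illustration only.

```r
# Assumes multcomp is installed; it provides the cholesterol data frame
data(cholesterol, package = "multcomp")
fit <- aov(response ~ trt, data = cholesterol)

# TukeyHSD() returns one matrix per factor with columns diff, lwr, upr, "p adj"
tuk <- as.data.frame(TukeyHSD(fit)$trt)
names(tuk) <- c("diff", "lwr", "upr", "p_adj")

# A pair is significant at the 5% family-wise level when p_adj < 0.05,
# which should coincide with a 95% interval that does not contain 0
tuk$significant   <- tuk$p_adj < 0.05
tuk$excludes_zero <- tuk$lwr > 0 | tuk$upr < 0

print(tuk)

# Pairs whose intervals include 0 (not significantly different)
print(rownames(tuk)[!tuk$significant])
```

The two logical columns should agree row by row, which is why the plot can be read simply by checking whether each interval crosses 0.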

Figure 9.2 Plot of Tukey HSD pairwise mean comparisons (95% family-wise confidence intervals for each pairwise difference; x-axis: differences in mean levels of trt)

Figure 9.3 Tukey HSD tests provided by the multcomp package (box plots of response for each trt group, 1time through drugE, with significance letters above each box)

The glht() function in the multcomp package provides a much more comprehensive set of methods for multiple mean comparisons that you can use for both linear models (such as those described in this chapter) and generalized linear models (covered in chapter 13). The following code reproduces the Tukey HSD test, along with a different graphical representation of the results (figure 9.3):

> library(multcomp)
> par(mar=c(5,4,6,2))
> tuk <- glht(fit, linfct=mcp(trt="Tukey"))
> plot(cld(tuk, level=.05),col="lightgrey")

In this code, the par statement increases the top margin to fit the letter array. The level option in the cld() function provides the significance level to use (0.05, or 95% confidence in this case).

Groups (represented by box plots) that have the same letter don’t have significantly different means. You can see that 1time and 2times aren’t significantly different (they both have the letter a) and that 2times and 4times aren’t significantly different (they both have the letter b); but that 1time and 4times are different (they don’t share a letter). Personally, I find figure 9.3 easier to read than figure 9.2. It also has the advantage of providing information on the distribution of scores within each group.
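The letter-sharing rule can also be applied in code rather than read off the plot. This is a minimal sketch, assuming the multcomp package is installed and that the object returned by cld() stores its letters in the mcletters$Letters component; the helper share_letter() is hypothetical and exists only for this illustration.

```r
library(multcomp)
data(cholesterol, package = "multcomp")

fit <- aov(response ~ trt, data = cholesterol)
tuk <- glht(fit, linfct = mcp(trt = "Tukey"))

# Assumption: the letters live in $mcletters$Letters, a named character vector
letters_by_group <- cld(tuk, level = 0.05)$mcletters$Letters
print(letters_by_group)

# Hypothetical helper: TRUE when two groups have at least one letter in common,
# meaning their means are not significantly different
share_letter <- function(a, b) {
  la <- strsplit(letters_by_group[[a]], "")[[1]]
  lb <- strsplit(letters_by_group[[b]], "")[[1]]
  length(intersect(la, lb)) > 0
}

print(share_letter("1time", "2times"))
print(share_letter("1time", "4times"))
```

Comparing any two group names this way gives the same answer as checking whether their boxes carry a common letter in figure 9.3.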

From these results, you can see that taking the cholesterol-lowering drug in 5 mg doses four times a day was better than taking a 20 mg dose once per day. The competitor drugD wasn’t superior to this four-times-per-day regimen. But competitor drugE was superior to both drugD and all three dosage strategies for the focus drug.

222	CHAPTER 9 Analysis of variance

Multiple comparisons methodology is a complex and rapidly changing area of study. To learn more, see Bretz, Hothorn, and Westfall (2010).

9.3.2 Assessing test assumptions

As you saw in the previous chapter, confidence in results depends on the degree to which your data satisfies the assumptions underlying the statistical tests. In a one-way ANOVA, the dependent variable is assumed to be normally distributed and have equal variance in each group. You can use a Q-Q plot to assess the normality assumption:

> library(car)
> qqPlot(lm(response ~ trt, data=cholesterol),
         simulate=TRUE, main="Q-Q Plot", labels=FALSE)

Note that qqPlot() requires an lm() fit. The graph is provided in figure 9.4. The data falls within the 95% confidence envelope, suggesting that the normality assumption has been met fairly well.
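To see what the plot is built from, a similar Q-Q plot can be sketched with base R alone. This version assumes the cholesterol data from the multcomp package and compares the studentized residuals with t quantiles on the residual degrees of freedom minus one; it draws a reference line but no simulated confidence envelope, so it is only a rough stand-in for the qqPlot() output.

```r
# Assumes multcomp is installed; it provides the cholesterol data frame
data(cholesterol, package = "multcomp")
fit_lm <- lm(response ~ trt, data = cholesterol)

# Externally studentized residuals and their reference t distribution
res <- rstudent(fit_lm)
df  <- df.residual(fit_lm) - 1

# Theoretical t quantiles at evenly spaced probability points
theo <- qt(ppoints(length(res)), df = df)

plot(theo, sort(res),
     xlab = "t Quantiles", ylab = "Studentized residuals",
     main = "Q-Q Plot")
abline(0, 1, lty = 2)  # points near this line suggest approximate normality
```

Points that stay close to the dashed line tell the same story as points that stay inside the envelope in figure 9.4.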

R provides several tests for the equality (homogeneity) of variances. For example, you can perform Bartlett’s test with this code:

> bartlett.test(response ~ trt, data=cholesterol)

        Bartlett test of homogeneity of variances

data:  response by trt
Bartlett's K-squared = 0.5797, df = 4, p-value = 0.9653
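As a cross-check on the output above, Bartlett's statistic can be computed by hand from the group variances. This sketch assumes the cholesterol data from the multcomp package; the variable names are illustrative, and the formula is the standard textbook form of the test, compared here against what bartlett.test() reports.

```r
# Assumes multcomp is installed; it provides the cholesterol data frame
data(cholesterol, package = "multcomp")

v <- tapply(cholesterol$response, cholesterol$trt, var)     # group variances
n <- tapply(cholesterol$response, cholesterol$trt, length)  # group sizes
k <- length(v)   # number of groups
N <- sum(n)      # total sample size

# Pooled variance, weighting each group by its degrees of freedom
pooled <- sum((n - 1) * v) / (N - k)

# Bartlett's K-squared: log of the pooled variance versus the weighted
# logs of the group variances, divided by a small-sample correction factor
num <- (N - k) * log(pooled) - sum((n - 1) * log(v))
den <- 1 + (sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
K2  <- num / den

# Under equal variances, K2 is approximately chi-squared with k - 1 df
p <- pchisq(K2, df = k - 1, lower.tail = FALSE)

bt <- bartlett.test(response ~ trt, data = cholesterol)
print(c(by_hand = K2, built_in = unname(bt$statistic), p_value = p))
```

A large p-value here means the test finds no evidence that the group variances differ.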

Figure 9.4 Test of normality (Q-Q plot of studentized residuals from lm(response ~ trt, data = cholesterol) against t quantiles)
