R in Action, Second Edition.pdf

Graphs such as these can help you estimate the impact of various conditions on your experimental design. For example, there appears to be little bang for the buck in increasing the sample size above 200 observations per group. We’ll look at another plotting example in the next section.

10.3 Creating power analysis plots

Before leaving the pwr package, let’s look at a more involved graphing example. Suppose you’d like to see the sample size necessary to declare a correlation coefficient statistically significant for a range of effect sizes and power levels. You can use the pwr.r.test() function and for loops to accomplish this task, as shown in the following listing.

Listing 10.2 Sample-size curves for detecting correlations of various sizes

library(pwr)

# (b) Sets the range of correlations and power values
r <- seq(.1,.5,.01)
nr <- length(r)
p <- seq(.4,.9,.1)
np <- length(p)

# (c) Obtains sample size
samsize <- array(numeric(nr*np), dim=c(nr,np))
for (i in 1:np){
  for (j in 1:nr){
    result <- pwr.r.test(n = NULL, r = r[j],
                         sig.level = .05, power = p[i],
                         alternative = "two.sided")
    samsize[j,i] <- ceiling(result$n)
  }
}

# (d) Sets up the graph
xrange <- range(r)
yrange <- round(range(samsize))
colors <- rainbow(length(p))
plot(xrange, yrange, type="n",
     xlab="Correlation Coefficient (r)",
     ylab="Sample Size (n)" )

# (e) Adds power curves
for (i in 1:np){
  lines(r, samsize[,i], type="l", lwd=2, col=colors[i])
}

# (f) Adds grid lines
abline(v=0, h=seq(0,yrange[2],50), lty=2, col="grey89")
abline(h=0, v=seq(xrange[1],xrange[2],.02), lty=2, col="gray89")

# (g) Adds annotations
title("Sample Size Estimation for Correlation Studies\nSig=0.05 (Two-tailed)")
legend("topright", title="Power", as.character(p),
       fill=colors)

Listing 10.2 uses the seq() function to generate a range of effect sizes r (correlation coefficients under H1) and power levels p (b). It then uses two for loops to cycle through these effect sizes and power levels, calculating the corresponding sample sizes required and saving them in the array samsize (c). The graph is set up with the appropriate horizontal and vertical axes and labels (d). Power curves are added using lines rather than points (e). Finally, a grid (f) and annotations (g), a title and a legend, are added to aid in reading the graph. The resulting graph is displayed in figure 10.3.

[Figure: "Sample Size Estimation for Correlation Studies, Sig=0.05 (Two-tailed)". x-axis: Correlation Coefficient (r), 0.1 to 0.5; y-axis: Sample Size (n), 0 to 1000; one curve per power level from 0.4 to 0.9.]

Figure 10.3 Sample size curves for detecting a significant correlation at various power levels
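The nested loops can also be written without explicit indexing. The following is a minimal sketch rather than the book's listing: n_for() and samsize_alt are hypothetical names introduced here, and the code assumes the pwr package is installed. It fills a matrix with the same layout as samsize (rows are correlations, columns are power levels).

```r
library(pwr)

r <- seq(.1, .5, .01)   # effect sizes (correlations under H1)
p <- seq(.4, .9, .1)    # power levels

# Hypothetical helper (not part of pwr): required n for one (r, power) pair
n_for <- function(r, power, sig.level = .05) {
  ceiling(pwr.r.test(r = r, power = power, sig.level = sig.level,
                     alternative = "two.sided")$n)
}

# outer() expands every combination of r and p; Vectorize() lets n_for()
# accept the resulting vectors one element at a time
samsize_alt <- outer(r, p, Vectorize(n_for))
dimnames(samsize_alt) <- list(r = format(r), power = format(p))

# Inspect the first few rows
head(samsize_alt)
```

Each column of samsize_alt can then be passed to lines() in the same way as the columns of samsize.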

As you can see from the graph, you’d need a sample size of approximately 75 to detect a correlation of 0.20 with a power of 0.40 (a 40% chance of detecting it). You’d need approximately 185 additional observations (n = 260) to detect the same correlation with a power of 0.90. With simple modifications, the same approach can be used to create sample size and power curve graphs for a wide range of statistical tests.
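The two readings can be checked without the graph by calling pwr.r.test() directly. The sketch below assumes the pwr package is installed, and the variable names are illustrative. The second part is a rough cross-check using the textbook Fisher z approximation, n ≈ ((z_{1-α/2} + z_power) / atanh(r))^2 + 3. That formula is an assumption added here and is not necessarily the computation pwr performs internally.

```r
library(pwr)

# Required n to detect r = 0.20 (two-sided, alpha = .05) at two power levels
n_low  <- pwr.r.test(r = .20, sig.level = .05, power = .40)$n
n_high <- pwr.r.test(r = .20, sig.level = .05, power = .90)$n
ceiling(c(power_0.4 = n_low, power_0.9 = n_high))

# Rough cross-check with the Fisher z approximation (illustrative helper,
# not a pwr function)
fisher_n <- function(r, power, sig.level = .05) {
  ((qnorm(1 - sig.level / 2) + qnorm(power)) / atanh(r))^2 + 3
}
ceiling(fisher_n(.20, c(.40, .90)))
```

Both calculations should land close to the values read from figure 10.3, with small differences expected because the approximation ignores bias corrections.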

We’ll close this chapter by briefly looking at other R functions that are useful for power analysis.

10.4 Other packages

There are several other packages in R that can be useful in the planning stages of studies (see table 10.4). Some contain general tools, whereas some are highly specialized. The last five in the table are particularly focused on power analysis in genetic studies. Genome-wide association studies (GWAS) are studies used to identify genetic associations with observable traits. For example, these studies would focus on why some people get a specific type of heart disease.

Table 10.4 Specialized power-analysis packages

Package               Purpose
--------------------  ------------------------------------------------------------
asypow                Power calculations via asymptotic likelihood ratio methods
longpower             Sample-size calculations for longitudinal data
PwrGSD                Power analysis for group sequential designs
pamm                  Power analysis for random effects in mixed models
powerSurvEpi          Power and sample-size calculations for survival analysis in
                      epidemiological studies
powerMediation        Power and sample-size calculations for mediation effects in
                      linear, logistic, Poisson, and Cox regression
powerpkg              Power analyses for the affected sib pair and the TDT
                      (transmission disequilibrium test) design
powerGWASinteraction  Power calculations for interactions for GWAS
pedantics             Functions to facilitate power analyses for genetic studies
                      of natural populations
gap                   Functions for power and sample-size calculations in
                      case-cohort designs
ssize.fdr             Sample-size calculations for microarray experiments

Finally, the MBESS package contains a wide range of functions that can be used for various forms of power analysis and sample size determination. The functions are particularly relevant for researchers in the behavioral, educational, and social sciences.

10.5 Summary

In chapters 7, 8, and 9, we explored a wide range of R functions for statistical hypothesis testing. In this chapter, we focused on the planning stages of such research. Power analysis helps you to determine the sample sizes needed to discern an effect of a given size with a given degree of confidence. It can also tell you the probability of detecting such an effect for a given sample size. You can directly see the tradeoff between limiting the likelihood of wrongly declaring an effect significant (a Type I error) and increasing the likelihood of rightly identifying a real effect (power).
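A short sketch of the two uses described above, assuming the pwr package is installed. The effect size r = .30 and sample size n = 100 are illustrative values chosen here, not taken from the chapter. pwr.r.test() solves for whichever of n or power is left out of the call.

```r
library(pwr)

# Sample size needed to detect r = .30 with power = .80
pwr.r.test(r = .30, sig.level = .05, power = .80)$n

# Power achieved with n = 100 observations for the same effect
pwr.r.test(n = 100, r = .30, sig.level = .05)$power

# Tradeoff: a stricter Type I error rate lowers power at a fixed n
alphas <- c(.10, .05, .01)
powers <- sapply(alphas, function(a)
  pwr.r.test(n = 100, r = .30, sig.level = a)$power)
data.frame(sig.level = alphas, power = round(powers, 3))
```

Reading down the resulting data frame shows power falling as the significance level is tightened, which is the tradeoff the paragraph refers to.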

The bulk of this chapter has focused on the use of functions provided by the pwr package. These functions can be used to carry out power and sample-size determinations for common statistical methods (including t-tests, chi-square tests, tests of proportions, ANOVA, and regression). Pointers to more specialized methods were provided in the final section.

CHAPTER 10 Power analysis

Power analysis is typically an interactive process. The investigator varies the parameters of sample size, effect size, desired significance level, and desired power to observe their impact on each other. The results are used to plan studies that are more likely to yield meaningful results. Information from past research (particularly regarding effect sizes) can be used to design more effective and efficient future research.
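One way to make this iteration concrete is to tabulate power over a small grid of sample sizes and effect sizes. The sketch below assumes the pwr package is installed and uses pwr.t.test() for a two-group t-test. The grid values are arbitrary illustrations, not recommendations.

```r
library(pwr)

# Candidate per-group sample sizes and standardized effect sizes (Cohen's d)
grid <- expand.grid(n = c(20, 50, 100, 200), d = c(.2, .5, .8))

# Power of a two-sided, two-sample t-test at alpha = .05 for each combination
grid$power <- mapply(function(n, d)
  pwr.t.test(n = n, d = d, sig.level = .05, type = "two.sample")$power,
  grid$n, grid$d)

# Rows are sample sizes, columns are effect sizes
round(xtabs(power ~ n + d, data = grid), 2)
```

Changing the vectors passed to expand.grid() and rerunning is a simple way to see how the parameters affect one another before committing to a design.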

An important side benefit of power analysis is the shift that it encourages, away from a singular focus on binary hypothesis testing (that is, does an effect exist or not), toward an appreciation of the size of the effect under consideration. Journal editors are increasingly requiring authors to include effect sizes as well as p values when reporting research results. This both helps you to determine the practical implications of the research and provides you with information that can be used to plan future studies.

In the next chapter, we’ll look at additional and novel ways to visualize multivariate relationships. These graphic methods can complement and enhance the analytic methods that we’ve discussed so far and prepare you for the advanced methods covered in part 3.
