Robert I. Kabacoff - R in Action

CHAPTER 10 Power analysis

the alternative hypothesis in a two-way contingency table. Here, P is a hypothesized two-way probability table.

As a simple example, let's assume that you're looking at the relationship between ethnicity and promotion. You anticipate that 70 percent of your sample will be Caucasian, 10 percent will be African American, and 20 percent will be Hispanic. Further, you believe that 60 percent of Caucasians tend to be promoted, compared with 30 percent for African Americans, and 50 percent for Hispanics. Your research hypothesis is that the probability of promotion follows the values in table 10.2.
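Each cell of table 10.2 is simply the product of a marginal ethnicity proportion and the corresponding conditional promotion rate. As a quick sketch in base R (the vector names here are illustrative, not from the text), you can build the joint probability table directly:

```r
# marginal ethnicity proportions
ethnicity <- c(Caucasian=.70, AfricanAmerican=.10, Hispanic=.20)
# conditional promotion rates: P(promoted | ethnicity)
p_promoted <- c(.60, .30, .50)

# joint probabilities: P(ethnicity and promoted), P(ethnicity and not promoted)
P <- cbind(Promoted    = ethnicity * p_promoted,
           NotPromoted = ethnicity * (1 - p_promoted))
P        # reproduces the cells of table 10.2
sum(P)   # the six cells form a probability table, so they sum to 1
```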

Table 10.2 Proportion of individuals expected to be promoted based on the research hypothesis

Ethnicity           Promoted    Not promoted
Caucasian           0.42        0.28
African American    0.03        0.07
Hispanic            0.10        0.10

For example, you expect that 42 percent of the population will be promoted Caucasians (.42 = .70 × .60) and 7 percent of the population will be nonpromoted African Americans (.07 = .10 × .70). Let's assume a significance level of 0.05 and a desired power of 0.90. The degrees of freedom in a two-way contingency table are (r-1)*(c-1), where r is the number of rows and c is the number of columns. You can calculate the hypothesized effect size with the following code:

> prob <- matrix(c(.42, .28, .03, .07, .10, .10), byrow=TRUE, nrow=3)
> ES.w2(prob)
[1] 0.1853198

Using this information, you can calculate the necessary sample size like this:

> pwr.chisq.test(w=.1853, df=2, sig.level=.05, power=.9)

     Chi squared power calculation

              w = 0.1853
              N = 368.5317
             df = 2
      sig.level = 0.05
          power = 0.9

NOTE: N is the number of observations

The results suggest that a study with 369 participants will be adequate to detect a relationship between ethnicity and promotion given the effect size, power, and significance level specified.
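As a sanity check (not shown in the text), you can turn the calculation around: supply the rounded-up sample size as N and leave power unspecified, and pwr.chisq.test() reports the power actually achieved:

```r
library(pwr)

# verify the achieved power at the rounded-up sample size of 369
result <- pwr.chisq.test(w=.1853, df=2, N=369, sig.level=.05)
result$power   # should come out at or slightly above the requested 0.90
```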


10.2.7 Choosing an appropriate effect size in novel situations

In power analysis, the expected effect size is the most difficult parameter to determine. It typically requires that you have experience with the subject matter and the measures employed. For example, the data from past studies can be used to calculate effect sizes, which can then be used to plan future studies.
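For instance, if a past study reported group means and standard deviations, Cohen's d for a two-group comparison can be computed by hand and then fed into a function such as pwr.t.test(). The summary statistics below are hypothetical, purely for illustration:

```r
# hypothetical summary statistics from a previous two-group study
m1 <- 25; s1 <- 8; n1 <- 40
m2 <- 20; s2 <- 7; n2 <- 40

# pooled standard deviation
sp <- sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2))

# Cohen's d, usable as the d= argument to pwr.t.test()
d <- (m1 - m2) / sp
d
```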

But what can you do when the research situation is completely novel and you have no past experience to call upon? In the area of behavioral sciences, Cohen (1988) attempted to provide benchmarks for “small,” “medium,” and “large” effect sizes for various statistical tests. These guidelines are provided in table 10.3.

Table 10.3 Cohen's effect size benchmarks

                                               Suggested guidelines for effect size
Statistical method     Effect size measure     Small    Medium    Large
t-test                 d                       0.20     0.50      0.80
ANOVA                  f                       0.10     0.25      0.40
Linear models          f2                      0.02     0.15      0.35
Test of proportions    h                       0.20     0.50      0.80
Chi-square             w                       0.10     0.30      0.50

When you have no idea what effect size may be present, this table may provide some guidance. For example, what’s the probability of rejecting a false null hypothesis (that is, finding a real effect), if you’re using a one-way ANOVA with 5 groups, 25 subjects per group, and a significance level of 0.05?

Using the pwr.anova.test() function and the suggestions in the f row of table 10.3, the power would be 0.118 for detecting a small effect, 0.574 for detecting a moderate effect, and 0.957 for detecting a large effect. Given the sample size limitations, you're only likely to find an effect if it's large.
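Those three power values can be reproduced by plugging Cohen's f benchmarks from table 10.3 into pwr.anova.test(), a sketch assuming the pwr package is installed:

```r
library(pwr)

# power at the small, medium, and large benchmarks for a one-way ANOVA
# with 5 groups, 25 subjects per group, and a 0.05 significance level
for (f in c(.10, .25, .40)) {
  result <- pwr.anova.test(k=5, n=25, f=f, sig.level=.05)
  cat("f =", f, " power =", round(result$power, 3), "\n")
}
```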

It’s important to keep in mind that Cohen’s benchmarks are just general suggestions derived from a range of social research studies and may not apply to your particular field of research. An alternative is to vary the study parameters and note the impact on such things as sample size and power. For example, again assume that you want to compare five groups using a one-way ANOVA and a 0.05 significance level. The following listing computes the sample sizes needed to detect a range of effect sizes and plots the results in figure 10.2.


[Figure: "One Way ANOVA with Power=.90 and Alpha=.05" - Effect Size (0.1 to 0.5) plotted against Sample Size per cell (50 to 300)]

Figure 10.2 Sample size needed to detect various effect sizes in a one-way ANOVA with five groups (assuming a power of 0.90 and significance level of 0.05)

Listing 10.1 Sample sizes for detecting significant effects in a one-way ANOVA

library(pwr)
es <- seq(.1, .5, .01)
nes <- length(es)
samsize <- NULL
for (i in 1:nes){
  result <- pwr.anova.test(k=5, f=es[i], sig.level=.05, power=.9)
  samsize[i] <- ceiling(result$n)
}
plot(samsize, es, type="l", lwd=2, col="red",
     ylab="Effect Size",
     xlab="Sample Size (per cell)",
     main="One Way ANOVA with Power=.90 and Alpha=.05")

Graphs such as these can help you estimate the impact of various conditions on your experimental design. For example, there appears to be little bang for the buck in increasing the sample size above 200 observations per group. We'll look at another plotting example in the next section.
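You can also invert the question: fix the per-group sample size and solve for the smallest effect size detectable with the desired power (leave f unspecified and pwr.anova.test() solves for it). A sketch, again assuming the pwr package:

```r
library(pwr)

# smallest effect size detectable with 90% power at n=200 vs. n=300 per group
f200 <- pwr.anova.test(k=5, n=200, sig.level=.05, power=.9)$f
f300 <- pwr.anova.test(k=5, n=300, sig.level=.05, power=.9)$f
round(c(f200, f300), 3)   # the gain from 100 extra subjects per group is modest
```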

10.3 Creating power analysis plots

Before leaving the pwr package, let’s look at a more involved graphing example. Suppose you’d like to see the sample size necessary to declare a correlation coefficient statistically significant for a range of effect sizes and power levels. You can use the pwr.r.test() function and for loops to accomplish this task, as shown in the following listing.


Listing 10.2 Sample size curves for detecting correlations of various sizes

library(pwr)

r <- seq(.1, .5, .01)                                  # Set range of correlations
nr <- length(r)                                        # & power values
p <- seq(.4, .9, .1)
np <- length(p)

samsize <- array(numeric(nr*np), dim=c(nr,np))         # Obtain sample sizes
for (i in 1:np){
  for (j in 1:nr){
    result <- pwr.r.test(n=NULL, r=r[j],
                         sig.level=.05, power=p[i],
                         alternative="two.sided")
    samsize[j,i] <- ceiling(result$n)
  }
}

xrange <- range(r)                                     # Set up graph
yrange <- round(range(samsize))
colors <- rainbow(length(p))
plot(xrange, yrange, type="n",
     xlab="Correlation Coefficient (r)",
     ylab="Sample Size (n)")

for (i in 1:np){                                       # Add power curves
  lines(r, samsize[,i], type="l", lwd=2, col=colors[i])
}

abline(v=0, h=seq(0, yrange[2], 50), lty=2, col="grey89")           # Add annotations
abline(h=0, v=seq(xrange[1], xrange[2], .02), lty=2, col="grey89")
title("Sample Size Estimation for Correlation Studies\n  Sig=0.05 (Two-tailed)")
legend("topright", title="Power", as.character(p), fill=colors)

Listing 10.2 uses the seq function to generate a range of effect sizes r (correlation coefficients under H1) and power levels p. It then uses two for loops to cycle through these effect sizes and power levels, calculating the corresponding sample sizes required and saving them in the array samsize. The graph is set up with the appropriate horizontal and vertical axes and labels. Power curves are added using lines rather than points. Finally, a grid and legend are added to aid in reading the graph. The resulting graph is displayed in figure 10.3.

As you can see from the graph, you'd need a sample size of approximately 75 to detect a correlation of 0.20 with 40 percent power. You'd need approximately 185 additional observations (n=260) to detect the same correlation with 90 percent power. With simple modifications, the same approach can be used to create sample size and power curve graphs for a wide range of statistical tests.
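For example, swapping in pwr.t.test() gives analogous curves for two-sample t-tests; only the effect size metric (Cohen's d) and the function call change. A sketch under the same design, using matplot() as a compact alternative to the explicit plotting loop:

```r
library(pwr)

d <- seq(.2, .8, .01)        # range of effect sizes (Cohen's d)
nd <- length(d)
p <- seq(.4, .9, .1)         # range of power levels
np <- length(p)

samsize <- array(numeric(nd*np), dim=c(nd, np))
for (i in 1:np){
  for (j in 1:nd){
    result <- pwr.t.test(n=NULL, d=d[j], sig.level=.05, power=p[i],
                         type="two.sample", alternative="two.sided")
    samsize[j,i] <- ceiling(result$n)
  }
}

matplot(d, samsize, type="l", lwd=2, lty=1, col=rainbow(np),
        xlab="Effect Size (d)", ylab="Sample Size (per group)",
        main="Sample Size Estimation for Two-Sample t-Tests\n  Sig=0.05 (Two-tailed)")
legend("topright", title="Power", as.character(p), fill=rainbow(np))
```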

We’ll close this chapter by briefly looking at other R functions that are useful for power analysis.


[Figure: "Sample Size Estimation for Correlation Studies, Sig=0.05 (Two-tailed)" - Sample Size (n), 0 to 1000, plotted against Correlation Coefficient (r), 0.1 to 0.5, with one curve per power level from 0.4 to 0.9]

Figure 10.3 Sample size curves for detecting a significant correlation at various power levels

10.4 Other packages

There are several other packages in R that can be useful in the planning stages of studies. Some contain general tools, whereas others are highly specialized.

The piface package (see figure 10.4) provides a Java GUI for sample-size methods that interfaces with R. The GUI allows the user to vary study parameters interactively and see their impact on other parameters.

Although the package is described as Pre-Alpha, it’s definitely worth checking out. You can download the package source and binaries for Windows and Mac OS X from http://r-forge.r-project.org/projects/piface/. In R, enter the code

Figure 10.4 Sample dialog boxes from the piface program

Summary

261

install.packages("piface", repos="http://R-Forge.R-project.org")
library(piface)
piface()

The package is particularly useful for exploring the impact of changes in sample size, effect size, significance levels, and desired power on the other parameters.

Other packages related to power analysis are described in table 10.4. The last five are particularly focused on power analysis in genetic studies. Genome-wide association studies (GWAS) are studies used to identify genetic associations with observable traits. For example, these studies would focus on why some people get a specific type of heart disease.

Table 10.4 Specialized power analysis packages

Package                 Purpose
asypow                  Power calculations via asymptotic likelihood ratio methods
PwrGSD                  Power analysis for group sequential designs
pamm                    Power analysis for random effects in mixed models
powerSurvEpi            Power and sample size calculations for survival analysis in epidemiological studies
powerpkg                Power analyses for the affected sib pair and the TDT (transmission disequilibrium test) design
powerGWASinteraction    Power calculations for interactions for GWAS
pedantics               Functions to facilitate power analyses for genetic studies of natural populations
gap                     Functions for power and sample size calculations in case-cohort designs
ssize.fdr               Sample size calculations for microarray experiments

Finally, the MBESS package contains a wide range of functions that can be used for various forms of power analysis. The functions are particularly relevant for researchers in the behavioral, educational, and social sciences.

10.5 Summary

In chapters 7, 8, and 9, we explored a wide range of R functions for statistical hypothesis testing. In this chapter, we focused on the planning stages of such research. Power analysis helps you to determine the sample sizes needed to discern an effect of a given size with a given degree of confidence. It can also tell you the probability of detecting such an effect for a given sample size. You can directly see the tradeoff between limiting the likelihood of wrongly declaring an effect significant (a Type I error) and the likelihood of rightly identifying a real effect (power).


The bulk of this chapter has focused on the use of functions provided by the pwr package. These functions can be used to carry out power and sample size determinations for common statistical methods (including t-tests, chi-square tests, and tests of proportions, ANOVA, and regression). Pointers to more specialized methods were provided in the final section.

Power analysis is typically an interactive process. The investigator varies the parameters of sample size, effect size, desired significance level, and desired power to observe their impact on each other. The results are used to plan studies that are more likely to yield meaningful results. Information from past research (particularly regarding effect sizes) can be used to design more effective and efficient future research.

An important side benefit of power analysis is the shift that it encourages, away from a singular focus on binary hypothesis testing (that is, does an effect exist or not), toward an appreciation of the size of the effect under consideration. Journal editors are increasingly requiring authors to include effect sizes as well as p values when reporting research results. This helps readers determine the practical implications of the research and provides information that can be used to plan future studies.

In the next chapter, we’ll look at additional and novel ways to visualize multivariate relationships. These graphic methods can complement and enhance the analytic methods that we’ve discussed so far and prepare you for the advanced methods covered in part 3.

11 Intermediate graphs

This chapter covers

Visualizing bivariate and multivariate relationships

Working with scatter and line plots

Understanding correlograms

Using mosaic and association plots

In chapter 6 (basic graphs), we considered a wide range of graph types for displaying the distribution of single categorical or continuous variables. Chapter 8 (regression) reviewed graphical methods that are useful when predicting a continuous outcome variable from a set of predictor variables. In chapter 9 (analysis of variance), we considered techniques that are particularly useful for visualizing how groups differ on a continuous outcome variable. In many ways, the current chapter is a continuation and extension of the topics covered so far.

In this chapter, we’ll focus on graphical methods for displaying relationships between two variables (bivariate relationships) and between many variables (multivariate relationships). For example:

What’s the relationship between automobile mileage and car weight? Does it vary by the number of cylinders the car has?

How can you picture the relationships among an automobile’s mileage, weight, displacement, and rear axle ratio in a single graph?


When plotting the relationship between two variables drawn from a large dataset (say 10,000 observations), how can you deal with the massive overlap of data points you’re likely to see? In other words, what do you do when your graph is one big smudge?

How can you visualize the multivariate relationships among three variables at once (given a 2D computer screen or sheet of paper, and a budget slightly less than that for Avatar)?

How can you display the growth of several trees over time?

How can you visualize the correlations among a dozen variables in a single graph? How does it help you to understand the structure of your data?

How can you visualize the relationship of class, gender, and age with passenger survival on the Titanic? What can you learn from such a graph?

These are the types of questions that can be answered with the methods described in this chapter. The datasets that we’ll use are examples of what’s possible. It’s the general techniques that are most important. If the topic of automobile characteristics or tree growth isn’t interesting to you, plug in your own data!

We’ll start with scatter plots and scatter plot matrices. Then, we’ll explore line charts of various types. These approaches are well known and widely used in research. Next, we’ll review the use of correlograms for visualizing correlations and mosaic plots for visualizing multivariate relationships among categorical variables. These approaches are also useful but much less well known among researchers and data analysts. You’ll see examples of how you can use each of these approaches to gain a better understanding of your data and communicate these findings to others.

11.1 Scatter plots

As you’ve seen in previous chapters, scatter plots describe the relationship between two continuous variables. In this section, we’ll start with a depiction of a single bivariate relationship (x versus y). We’ll then explore ways to enhance this plot by superimposing additional information. Next, we’ll learn how to combine several scatter plots into a scatter plot matrix so that you can view many bivariate relationships at once. We’ll also review the special case where many data points overlap, limiting our ability to picture the data, and we’ll discuss a number of ways around this difficulty. Finally, we’ll extend the two-dimensional graph to three dimensions, with the addition of a third continuous variable. This will include 3D scatter plots and bubble plots. Each can help you understand the multivariate relationship among three variables at once.

The basic function for creating a scatter plot in R is plot(x, y), where x and y are numeric vectors denoting the (x, y) points to plot. Listing 11.1 presents an example.


Listing 11.1 A scatter plot with best fit lines

attach(mtcars)
plot(wt, mpg,
     main="Basic Scatter plot of MPG vs. Weight",
     xlab="Car Weight (lbs/1000)",
     ylab="Miles Per Gallon ", pch=19)
abline(lm(mpg~wt), col="red", lwd=2, lty=1)
lines(lowess(wt, mpg), col="blue", lwd=2, lty=2)

The resulting graph is provided in figure 11.1.

The code in listing 11.1 attaches the mtcars data frame and creates a basic scatter plot using filled circles as the plotting symbol. As expected, miles per gallon decreases as car weight increases, though the relationship isn't perfectly linear. The abline() function adds a linear line of best fit, while the lowess() function adds a smoothed line. This smoothed line is a nonparametric fit based on locally weighted polynomial regression. See Cleveland (1981) for details on the algorithm.
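The span of the lowess smoother can be tuned via its f argument, the fraction of points that influence each local fit (the default is 2/3); smaller values follow the data more closely. A quick variation on listing 11.1:

```r
attach(mtcars)
plot(wt, mpg, pch=19,
     xlab="Car Weight (lbs/1000)", ylab="Miles Per Gallon")
lines(lowess(wt, mpg),        col="blue",      lwd=2)          # default span, f=2/3
lines(lowess(wt, mpg, f=1/3), col="darkgreen", lwd=2, lty=2)   # smaller span, wigglier fit
detach(mtcars)
```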

Figure 11.1 Scatter plot of car mileage versus weight, with superimposed linear and lowess fit lines.
