R in Action, Second Edition.pdf


6	BONUS CHAPTER 23 Advanced graphics with the lattice package

You can issue these options in the high-level function calls or within the panel functions discussed in section 23.3.

You can also use the update() function to modify a lattice graphic object. Continuing the singer example, the following

newgraph <- update(mygraph, col="red", pch=16, cex=.8, jitter=.05, lwd=2)

would modify mygraph to use red curves and symbols (col="red"), filled dots (pch=16), smaller (cex=.8) and more highly jittered (jitter=.05) points, and lines of double thickness (lwd=2). The resulting graph is saved as newgraph; mygraph itself is left unchanged. Now that we’ve reviewed the general structure of a high-level lattice function, let’s look at conditioning variables in more detail.
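As a rough sketch of how update() behaves, the snippet below builds a stand-in for mygraph and then modifies it. The densityplot() call is an assumption about how mygraph was created in the singer example (its definition isn't shown in this section), so treat it as illustrative only.

```r
library(lattice)

# Assumed stand-in for the singer graph built earlier in the chapter
mygraph <- densityplot(~height | voice.part, data = singer)

# update() returns a modified copy; the extra arguments are handed
# on to the panel function when the graph is drawn
newgraph <- update(mygraph, col = "red", pch = 16, cex = .8, lwd = 2)

class(newgraph)   # a lattice graph object of class "trellis"
# print(newgraph) draws the modified graph
```

Because the result is stored rather than drawn, you can keep several variants of the same graph and print whichever one you need.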

23.2 Conditioning variables

As you’ve seen, one of the most powerful features of lattice graphs is the ability to add conditioning variables. If one conditioning variable is present, a separate panel is created for each level. If two conditioning variables are present, a separate panel is created for each combination of levels for the two variables. It’s rarely useful to include more than two conditioning variables.

Typically, conditioning variables are factors. But what if you want to condition on a continuous variable? One approach would be to transform the continuous variable into a discrete variable using R’s cut() function. Alternatively, the lattice package provides functions for transforming a continuous variable into a data structure called a shingle. Specifically, the continuous variable is divided into a series of (possibly) overlapping ranges. For example, the function

myshingle <- equal.count(x, number=n, overlap=proportion)

takes the continuous variable x, divides it into n intervals that overlap by the fraction proportion and contain roughly equal numbers of observations, and returns the result as the variable myshingle (of class shingle). Printing or plotting this object (for example, plot(myshingle)) displays the shingle’s intervals.
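The two approaches just described can be compared side by side. The following is a minimal sketch using mtcars$disp as the continuous variable; the object names overlapping and disp_factor are made up for this illustration.

```r
library(lattice)

# Shingle: three non-overlapping ranges with roughly equal counts
myshingle <- equal.count(mtcars$disp, number = 3, overlap = 0)
class(myshingle)     # "shingle"
levels(myshingle)    # lower and upper bound of each range

# Allowing overlap lets an observation fall into more than one range
overlapping <- equal.count(mtcars$disp, number = 3, overlap = 0.25)
levels(overlapping)

# The cut() alternative: an ordinary factor with three equal-width bins
disp_factor <- cut(mtcars$disp, breaks = 3)
table(disp_factor)
```

Note the difference in intent: cut() with breaks=3 produces bins of equal width, whereas equal.count() aims for ranges with similar numbers of observations.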

Once a continuous variable has been converted to a shingle, you can use it as a conditioning variable. For example, let’s use the mtcars dataset to explore the relationship between miles per gallon and car weight conditioned on engine displacement. Because engine displacement is a continuous variable, first let’s convert it to a shingle variable with three levels:

displacement <- equal.count(mtcars$disp, number=3, overlap=0)

Next, use this variable in the xyplot() function:

xyplot(mpg~wt|displacement, data=mtcars,
       main = "Miles per Gallon vs. Weight by Engine Displacement",
       xlab = "Weight", ylab = "Miles per Gallon",
       layout=c(3, 1), aspect=1.5)

[Figure 23.2 image: title "Miles per Gallon vs. Weight by Engine Displacement"; three panels side by side, each with a "displacement" strip; x-axis: Weight (2 to 5); y-axis: Miles per Gallon]

Figure 23.2 Trellis plot of miles per gallon vs. car weight conditioned on engine displacement. Because engine displacement is a continuous variable, it has been converted to three non-overlapping shingles with equal numbers of observations.

The results are shown in figure 23.2. Note that I also used options to modify the layout of the panels (three columns and one row) and the aspect ratio (height/width) in order to make comparisons among the three groups easier.

You can see that the labels in the panel strips of figure 23.1 and figure 23.2 differ. The representation in figure 23.2 indicates the continuous nature of the conditioning variable, with the darker color indicating the range of values for the conditioning variable in the given panel. In the next section, you’ll use panel functions to customize the output further.

23.3 Panel functions

Each of the high-level plotting functions in table 23.1 employs a default function to draw the panels. These default functions follow the naming convention panel.graph_function, where graph_function is the high-level function. For example,

xyplot(mpg~wt|displacement, data=mtcars)

could also be written as

xyplot(mpg~wt|displacement, data=mtcars, panel=panel.xyplot)

This is a powerful feature because it allows you to replace the default panel function with a customized function of your own design. You can incorporate one or more of the 50+ default panel functions in the lattice package into your customized function as well. Customized panel functions give you a great deal of flexibility in designing output that meets your needs. Let’s look at some examples.
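To make the mechanism concrete, here is a small sketch of a custom panel function that does some bookkeeping and then defers to the default. The names counting_panel and panel_log are hypothetical and exist only for this illustration; the point is that lattice calls the panel function once per panel, passing only that panel's data.

```r
library(lattice)

displacement <- equal.count(mtcars$disp, number = 3, overlap = 0)

# Hypothetical bookkeeping environment, used only for this illustration
panel_log <- new.env()
panel_log$n_points <- integer(0)

# Custom panel function: record the panel's size, then draw as usual
counting_panel <- function(x, y, ...) {
    panel_log$n_points <- c(panel_log$n_points, length(x))
    panel.xyplot(x, y, ...)
}

p <- xyplot(mpg ~ wt | displacement, data = mtcars, panel = counting_panel)

pdf(NULL)               # throwaway device; drop this line and dev.off() to see the plot
print(p)                # the panel function runs when the graph is drawn
invisible(dev.off())

panel_log$n_points      # one entry per panel
```

The x and y that arrive inside the panel function are the subsets belonging to the current panel, which is why statistics computed there (means, regression lines, smoothers) are automatically per-panel.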


In the previous section, you plotted gas mileage by automobile weight, conditioned on engine displacement. What if you want to include regression lines, rug plots, and grid lines? You can do this by creating your own panel function (see the following listing). The resulting graph is provided in figure 23.3.

Listing 23.2 xyplot with custom panel function

library(lattice)

displacement <- equal.count(mtcars$disp, number=3, overlap=0)

mypanel <- function(x, y) {
    panel.xyplot(x, y, pch=19)                     # scatter plot, filled circles
    panel.rug(x, y)                                # rug plots on both axes
    panel.grid(h=-1, v=-1)                         # grid lines aligned with the axis labels
    panel.lmline(x, y, col="red", lwd=1, lty=2)    # dashed red regression line
}

xyplot(mpg~wt|displacement, data=mtcars,
       layout=c(3, 1),
       aspect=1.5,
       main = "Miles per Gallon vs. Weight by Engine Displacement",
       xlab = "Weight",
       ylab = "Miles per Gallon",
       panel = mypanel)                            # customized panel function

[Figure 23.3 image: title "Miles per Gallon vs. Weight by Engine Displacement"; three panels side by side, each with a "displacement" strip; x-axis: Weight (2 to 5); y-axis: Miles per Gallon (10 to 35)]

Figure 23.3 Trellis plot of miles per gallon vs. car weight conditioned on engine displacement. A custom panel function has been used to add regression lines, rug plots, and grid lines.

Here you wrap four separate building-block functions into your own mypanel() function and apply it within xyplot() through the panel= option. The panel.xyplot() function generates the scatter plot using a filled circle (pch=19). The panel.rug() function adds rug plots to both the x- and y-axes of each panel. panel.rug(x, FALSE) or panel.rug(FALSE, y) would have added rugs to just the horizontal or vertical axis, respectively. The panel.grid() function adds horizontal and vertical grid lines (using negative numbers forces them to line up with the axis labels). Finally, the panel.lmline() function adds a regression line rendered as a red (col="red"), dashed (lty=2) line of standard thickness (lwd=1). Each default panel function has its own structure and options. See the help page on each (for example, help(panel.lmline)) for further details.
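The building blocks can be recombined freely. The following sketch is a hypothetical variant (mypanel_xrug is not from the original listing) that draws the grid first, puts a rug on the horizontal axis only by supplying just x, and draws the regression line with panel.abline(lm(y ~ x)), which the lattice documentation describes as equivalent to panel.lmline(x, y).

```r
library(lattice)

displacement <- equal.count(mtcars$disp, number = 3, overlap = 0)

# Hypothetical variant of mypanel(): rug on the horizontal axis only
mypanel_xrug <- function(x, y) {
    panel.grid(h = -1, v = -1)        # grid first, so the points sit on top
    panel.xyplot(x, y, pch = 19)
    panel.rug(x = x)                  # y is omitted, so no vertical rug
    panel.abline(lm(y ~ x), col = "red", lwd = 1, lty = 2)
}

xrug_plot <- xyplot(mpg ~ wt | displacement, data = mtcars,
                    layout = c(3, 1), aspect = 1.5,
                    xlab = "Weight", ylab = "Miles per Gallon",
                    panel = mypanel_xrug)
# print(xrug_plot) draws the graph
```

The order of calls inside a panel function matters because each call paints over what came before; drawing the grid first keeps it behind the data.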

As a second example, you’ll graph the relationship between gas mileage and engine displacement (considered as a continuous variable), conditioned on type of automobile transmission. In addition to creating separate panels for automatic and manual transmission engines, you’ll add smoothed fit lines and horizontal mean lines. The code is given in the following listing.

Listing 23.3 xyplot with a custom panel function and additional options

library(lattice)

mtcars$transmission <- factor(mtcars$am, levels=c(0,1),
                              labels=c("Automatic", "Manual"))

panel.smoother <- function(x, y) {
    panel.grid(h=-1, v=-1)
    panel.xyplot(x, y)
    panel.loess(x, y)
    panel.abline(h=mean(y), lwd=2, lty=2, col="darkgreen")
}

xyplot(mpg~disp|transmission, data=mtcars,
       scales=list(cex=.8, col="red"),
       panel=panel.smoother,
       xlab="Displacement", ylab="Miles per Gallon",
       main="MPG vs Displacement by Transmission Type",
       sub = "Dotted lines are Group Means", aspect=1)

The graph produced by this code is provided in figure 23.4.

There are several things to note in this new code. The panel.xyplot() function plots the individual points, and the panel.loess() function plots nonparametric fit lines in each panel. The panel.abline() function adds horizontal reference lines at the mean mpg value for each level of the conditioning variable. (If you replaced h=mean(y) with h=mean(mtcars$mpg), a single reference line would be drawn at the mean mpg value for the entire sample.) The scales= option renders scale annotations (the axis numbers and tick marks) in red and at 80% of the default font size.
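The difference between h=mean(y) and h=mean(mtcars$mpg) can be checked numerically. This sketch computes the values at which the reference lines would be drawn; it uses a local factor named transmission so that mtcars itself is not modified.

```r
# Local factor so that mtcars itself is not modified
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("Automatic", "Manual"))

# h=mean(y): one mean per panel, that is, per transmission type
group_means <- tapply(mtcars$mpg, transmission, mean)
round(group_means, 2)

# h=mean(mtcars$mpg): a single mean for the whole sample
overall_mean <- mean(mtcars$mpg)
round(overall_mean, 2)
```

Inside the panel function, y holds only the mpg values for the current panel, which is why mean(y) differs between the two panels while mean(mtcars$mpg) does not.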

In the previous example, you could use scales=list(x=list(), y=list()) to specify separate options for the horizontal and vertical axes. See help(xyplot) for details on the many scale options available. In the next section, you’ll learn how to superimpose data from groups of observations, rather than presenting them in separate panels.
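A per-axis version of the scales= option mentioned above might look like the following sketch. The particular settings (rotated red x-axis labels, blue y-axis labels) and the data frame name mt are arbitrary choices for illustration, not taken from the original listing.

```r
library(lattice)

# Work on a copy so that mtcars itself is not modified
mt <- transform(mtcars,
                transmission = factor(am, levels = c(0, 1),
                                      labels = c("Automatic", "Manual")))

axes_plot <- xyplot(mpg ~ disp | transmission, data = mt,
                    scales = list(x = list(cex = .8, col = "red", rot = 45),
                                  y = list(cex = 1, col = "blue")),
                    xlab = "Displacement", ylab = "Miles per Gallon",
                    aspect = 1)
# print(axes_plot) draws the graph
```

Settings placed directly in scales=list(...) apply to both axes, while those inside the x= and y= sublists apply to one axis only.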
