R in Action, Second Edition.pdf

Specifying the plot type with geoms

you’ll be able to create a wide variety of interesting and useful plots with just a few lines of code.

Let’s start with a description of geom functions and the type of graphs they can create. Then we’ll look at the aes() function in more detail and how you can use it to group data. Next, we’ll consider faceting and the creation of trellis graphs. Finally, we’ll look at ways to tweak the appearance of ggplot2 graphs, including modifying axes and legends, changing color schemes, and adding annotations. The chapter will end with pointers to resources that can help you master the ggplot2 approach more fully.

19.3 Specifying the plot type with geoms

Whereas the ggplot() function specifies the data source and variables to be plotted, the geom functions specify how these variables are to be visually represented (using points, bars, lines, and shaded regions). Currently, 37 geoms are available. Table 19.2 lists the more common ones, along with frequently used options. The options are described more fully in table 19.3.

Table 19.2 Geom functions

Function	Adds	Options

geom_bar()	Bar chart	color, fill, alpha
geom_boxplot()	Box plot	color, fill, alpha, notch, width
geom_density()	Density plot	color, fill, alpha, linetype
geom_histogram()	Histogram	color, fill, alpha, linetype, binwidth
geom_hline()	Horizontal lines	color, alpha, linetype, size
geom_jitter()	Jittered points	color, size, alpha, shape
geom_line()	Line graph	color, alpha, linetype, size
geom_point()	Scatterplot	color, alpha, shape, size
geom_rug()	Rug plot	color, sides
geom_smooth()	Fitted line	method, formula, color, fill, linetype, size
geom_text()	Text annotations	Many; see the help for this function
geom_violin()	Violin plot	color, fill, alpha, linetype
geom_vline()	Vertical lines	color, alpha, linetype, size

Most of the graphs described in this book can be created using the geoms in table 19.2. For example, the code

library(ggplot2)
data(singer, package="lattice")

ggplot(singer, aes(x=height)) + geom_histogram()

CHAPTER 19 Advanced graphics with ggplot2

Figure 19.4 Histogram of singer heights (x-axis: height; y-axis: count)

produces the histogram in figure 19.4, and

ggplot(singer, aes(x=voice.part, y=height)) + geom_boxplot()

produces the box plot in figure 19.5.

From figure 19.5, it appears that basses tend to be taller and sopranos tend to be shorter. Although gender wasn’t measured, it probably accounts for much of the variation you see.

Figure 19.5 Box plot of singer heights by voice part (axes: voice.part and height; voice parts shown: Bass 2, Bass 1, Tenor 2, Tenor 1, Alto 2, Alto 1, Soprano 2, Soprano 1)

Note that only the x variable was specified when creating a histogram, but both an x and a y variable were specified for the box plot. The geom_histogram() function defaults to counts on the y-axis when no y variable is specified. See the documentation for each function for details and additional examples.
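
One way to see this default at work is to inspect the data that the histogram layer computes. The following is a minimal sketch, assuming the ggplot2 and lattice packages are installed. layer_data() is ggplot2's helper for extracting a layer's computed values, and the bin width of 1 inch is an illustrative choice that isn't taken from the text.

```r
library(ggplot2)
data(singer, package = "lattice")

# Only x is mapped; the histogram's stat computes a count for each bin.
p <- ggplot(singer, aes(x = height)) + geom_histogram(binwidth = 1)

# layer_data() returns the per-bin values that ggplot2 will draw.
bins <- layer_data(p)
head(bins[, c("x", "count")])

# Every singer falls in exactly one bin, so the counts should sum
# to the number of rows in the data frame.
sum(bins$count) == nrow(singer)
```

The count column is what ends up on the y-axis, even though no y variable appears in aes().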

Each geom function has a set of options that can be used to modify its representation. Common options are listed in table 19.3.

Table 19.3 Common options for geom functions

Option	Specifies

color	Color of points, lines, and borders around filled regions.
fill	Color of filled areas such as bars and density regions.
alpha	Transparency of colors, ranging from 0 (fully transparent) to 1 (opaque).
linetype	Pattern for lines (1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash).
size	Point size and line width.
shape	Point shapes (same as pch, with 0 = open square, 1 = open circle, 2 = open triangle, and so on). See figure 3.4 for examples.
position	Position of plotted objects such as bars and points. For bars, "dodge" places grouped bar charts side by side, "stack" vertically stacks grouped bar charts, and "fill" vertically stacks grouped bar charts and standardizes their heights to be equal. For points, "jitter" reduces point overlap.
binwidth	Bin width for histograms.
notch	Indicates whether box plots should be notched (TRUE/FALSE).
sides	Placement of rug plots on the graph ("b" = bottom, "l" = left, "t" = top, "r" = right, "bl" = both bottom and left, and so on).
width	Width of box plots.
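
The three bar positions are easiest to compare when applied to the same data. The sketch below uses the built-in mtcars data frame, which is an assumption made for illustration (it isn't used in this chapter), and converts two of its columns to factors so they can serve as grouping variables.

```r
library(ggplot2)

# mtcars ships with R; cyl and gear are treated as categorical here.
cars <- transform(mtcars, cyl = factor(cyl), gear = factor(gear))

base <- ggplot(cars, aes(x = cyl, fill = gear))

p_stack <- base + geom_bar(position = "stack")  # groups stacked vertically
p_dodge <- base + geom_bar(position = "dodge")  # groups placed side by side
p_fill  <- base + geom_bar(position = "fill")   # stacked, heights scaled to 1

print(p_stack)
print(p_dodge)
print(p_fill)
```

With "fill", the y-axis shows proportions rather than counts, which helps when comparing group composition across categories of unequal size.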

You can examine the use of many of these options using the Salaries dataset. The code

library(ggplot2)
data(Salaries, package="car")  # with car 3.0 or later, use package="carData"

ggplot(Salaries, aes(x=rank, y=salary)) +
  geom_boxplot(fill="cornflowerblue", color="black", notch=TRUE) +
  geom_point(position="jitter", color="blue", alpha=.5) +
  geom_rug(sides="l", color="black")

produces the plot in figure 19.6. The figure displays notched box plots of salary by academic rank. The actual observations (teachers) are overlaid and given some transparency so they don’t obscure the box plots. They’re also jittered to reduce their overlap. Finally, a rug plot is provided on the left to indicate the general spread of salaries.

Figure 19.6 Notched box plots with superimposed points describing the salaries of college professors by rank. A rug plot is provided on the vertical axis. (x-axis: rank, with levels AsstProf, AssocProf, Prof; y-axis: salary)

From figure 19.6, you can see that the salaries of assistant, associate, and full professors differ significantly from each other (there is no overlap in the box plot notches). Additionally, the variance in salaries increases with greater rank, with a large range of salaries for full professors. In fact, at least one full professor earns less than assistant professors. There are also three full professors whose salaries are so large as to make them outliers (as indicated by the black dots in the Prof box plot). Having been a full professor earlier in my career, the data suggests to me that I was clearly underpaid.
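
The notch comparison can also be checked numerically. As a rough sketch: ggplot2 documents its notches as extending 1.58 * IQR / sqrt(n) on either side of the median, and base R's boxplot.stats() reports a closely related interval in its conf element. Because boxplot.stats() works from hinges, the values may differ slightly from what ggplot2 draws. The code assumes the Salaries data is available from the carData package, where recent versions of car keep it.

```r
# Assumption: Salaries is loaded from carData (used by car 3.0 and later).
data(Salaries, package = "carData")

# For each rank, compute the notch limits around the median salary.
notches <- sapply(split(Salaries$salary, Salaries$rank),
                  function(x) boxplot.stats(x)$conf)
rownames(notches) <- c("lower", "upper")
notches

# Notches don't overlap when each rank's lower limit exceeds the
# upper limit of the rank just below it.
ranks <- colnames(notches)
notches["lower", ranks[-1]] > notches["upper", ranks[-length(ranks)]]
```

Non-overlapping notches are an informal indication that the medians differ, not a formal hypothesis test.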

The real power of the ggplot2 package is realized when geoms are combined to form new types of plots. Returning to the singer dataset, the code

library(ggplot2)
data(singer, package="lattice")

ggplot(singer, aes(x=voice.part, y=height)) +
  geom_violin(fill="lightblue") +
  geom_boxplot(fill="lightgreen", width=.2)

combines box plots with violin plots to create a new type of graph (displayed in figure 19.7). The box plots show the 25th, 50th, and 75th percentile scores for each voice part in the singer dataframe, along with any outliers. The violin plots provide more visual cues as to the distribution of scores over the range of heights for each voice part.
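
Layers are drawn in the order they're added, which is why the narrow box plots appear on top of the violins. A minimal sketch of inspecting that order follows. It relies on the layers element of the plot object, which is an internal detail of ggplot2 that's handy for exploration rather than something described in this chapter.

```r
library(ggplot2)
data(singer, package = "lattice")

p <- ggplot(singer, aes(x = voice.part, y = height)) +
  geom_violin(fill = "lightblue") +
  geom_boxplot(fill = "lightgreen", width = .2)

# Each geom call contributes one layer, stored in the order it was added.
length(p$layers)
sapply(p$layers, function(layer) class(layer$geom)[1])
```

Swapping the two geom calls would draw the violins over the box plots and hide them.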
