
Creating dynamic reports with R and Markdown


The examples in this chapter are based on descriptive statistics, regression, and ANOVA problems. None of them represent full analyses of the data. The goal in this chapter is to learn how to incorporate the R results into various types of reports. Feel free to jump around in this chapter, reading the sections that are most relevant to you.

Depending on the template file you start with and the functions used to process it, different report formats (HTML web pages, Microsoft Word documents, OpenOffice Writer documents, PDF reports, articles, and books) are created. The reports are dynamic in the sense that changing the data and reprocessing the template file will result in a new report.

In this chapter, you’ll work with four types of templates: an R Markdown template, an ODT template, a DOCX template, and a LaTeX template. R Markdown templates can be used to create HTML, PDF, and MS Word documents. ODT and DOCX templates are used to create Open Document and Microsoft Word documents, respectively. LaTeX templates are used to create publication-quality PDF documents, including reports, articles, and books. Let’s consider each in turn.

22.2 Creating dynamic reports with R and Markdown

In this section, you’ll use the rmarkdown package to create documents generated from Markdown syntax and R code. When the document is processed, the R code is executed, and the output is formatted and embedded in the finished document. You can use this approach to generate reports as HTML, Word, or PDF documents. Here are the steps:

1. Install the rmarkdown package (install.packages("rmarkdown")). This will install several other packages, including knitr. If you’re using a recent version of RStudio, you can skip this step because you already have the necessary packages.

2. Install the xtable package (install.packages("xtable")). The xtable() function in this package attractively formats data frames and matrices for inclusion in reports. xtable() can also format objects produced by the lm(), glm(), aov(), table(), ts(), and coxph() functions. After loading the package, use methods(xtable) to view a comprehensive list of the objects it can format.

3. Install Pandoc (http://johnmacfarlane.net/pandoc/index.html). Pandoc is a free application available for Windows, Mac OS X, and Linux. It converts files from one markup format to another. Again, RStudio users can skip this step.

4. If you want to create PDF documents, install a LaTeX compiler. A LaTeX compiler converts a LaTeX document into a high-quality typeset PDF document. I recommend MiKTeX (www.miktex.org) for Windows, MacTeX for Macs (http://tug.org/mactex), and TeX Live for Linux (www.tug.org/texlive).

With the software set up, you’re ready to go.
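Before moving on, it can help to confirm that each prerequisite from the steps above is actually in place. The following is a minimal sketch, not part of the original chapter: it assumes the LaTeX compiler is exposed as a pdflatex executable on the search path, which is typical but not guaranteed for every installation.

```r
# Check each prerequisite from the setup steps; nothing is installed here.
pkgs <- c("rmarkdown", "knitr", "xtable")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# pandoc_available() belongs to rmarkdown, so call it only if rmarkdown loaded.
has_pandoc <- installed[["rmarkdown"]] && rmarkdown::pandoc_available()

# Sys.which() returns "" when the executable is not on the search path.
# Assumption: the LaTeX compiler is reachable under the name "pdflatex".
has_latex <- nzchar(Sys.which("pdflatex"))

cat("Pandoc found:  ", has_pandoc, "\n")
cat("pdflatex found:", has_latex, "\n")
```

A FALSE for any entry points back to the corresponding installation step. LaTeX is needed only for PDF output.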

To incorporate R output (values, tables, graphs) in a document using Markdown syntax, first create a text document that contains

Report text

Markdown syntax

R code chunks (R code surrounded by delimiters)


CHAPTER 22 Creating dynamic reports

By convention, the text file has the filename extension .Rmd.

A sample file (named women.Rmd) is provided in listing 22.1. To generate an HTML document, process this file using

library(rmarkdown)
render("women.Rmd", "html_document")

The results are displayed in figure 22.1.
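To try the rendering step without the full listing, the sketch below writes a very small template to a temporary folder and renders it. The file name tiny.Rmd and its contents are illustrative assumptions rather than part of the chapter, and the render call is skipped when rmarkdown or Pandoc is missing.

```r
# Build a tiny R Markdown template in a temporary folder.
# The file name tiny.Rmd is illustrative, not part of the chapter.
fence <- strrep("`", 3)                  # chunk delimiter: three backticks
rmd <- file.path(tempdir(), "tiny.Rmd")
writeLines(c(
  "# Tiny Report",
  "",
  "The women data frame has `r nrow(women)` rows.",
  "",
  paste0(fence, "{r echo=FALSE}"),
  "summary(women$height)",
  fence
), rmd)

# Render only when rmarkdown and Pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd, "html_document", quiet = TRUE)
  print(out)   # path of the generated HTML file
} else {
  message("rmarkdown or Pandoc not found; template written to ", rmd)
}
```

render() returns the path of the file it created, so the result can be opened with browseURL(out) in an interactive session.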

Listing 22.1 women.Rmd: a Markdown template with embedded R code

# Regression Report

```{r echo=FALSE, results='hide'}
n <- nrow(women)
fit <- lm(weight ~ height, data=women)
sfit <- summary(fit)
b <- coefficients(fit)
```

Linear regression was used to model the relationship between weights and height in a sample of `r n` women. The equation **weight = `r b[[1]]` + `r b[[2]]` * height** accounted for `r round(100 * sfit$r.squared, 1)`% of the variance in weights. The table of regression coefficients is given below.

```{r echo=FALSE, results='asis'}
library(xtable)
options(xtable.comment=FALSE)
print(xtable(sfit), type="html", html.table.attributes="border=0")
```

The regression is plotted in the following figure.

```{r echo=FALSE, fig.width=5, fig.height=4}
library(ggplot2)
ggplot(data=women, aes(x=height, y=weight)) +
  geom_point() + geom_smooth(method="lm")
```

Callouts for listing 22.1:

b Markdown syntax (the header line # Regression Report)

c R code chunk (the first chunk, which fits the model)

d R inline code (the `r ...` expressions in the paragraph)

e Formats output with xtable (the second chunk)

The report starts with a first-level header (callout b), which indicates that “Regression Report” should be printed in a large, bold font. Examples of other Markdown syntax are given in table 22.1.

Table 22.1 Markdown code and the resulting output

Markdown syntax                          Resulting HTML output
---------------------------------------  ---------------------------------------
# Heading 1                              <h1>Heading 1</h1>
## Heading 2                             <h2>Heading 2</h2>
...                                      ...
###### Heading 6                         <h6>Heading 6</h6>
One or more blank lines between text     Separates text into paragraphs
Two or more spaces at the end of a line  Adds a line break
*I mean it*                              <em>I mean it</em>
**I really mean it**                     <strong>I really mean it</strong>
* item 1                                 <ul>
* item 2                                 <li> item 1 </li>
                                         <li> item 2 </li>
                                         </ul>
1. item 1                                <ol>
2. item 2                                <li> item 1 </li>
                                         <li> item 2 </li>
                                         </ol>
[Google](http://google.com)              <a href="http://google.com">Google</a>
![My text](path to image)                <img src="path to image" alt="My text">
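The mappings in table 22.1 can be checked directly from R. The sketch below is an illustration under a stated assumption: it uses the commonmark package (not mentioned in this chapter) to convert a few Markdown lines to HTML. rmarkdown itself relies on Pandoc for this conversion, so details such as whitespace in the generated HTML may differ.

```r
# A few lines of Markdown taken from table 22.1.
md <- c(
  "# Heading 1",
  "",
  "*I mean it* and **I really mean it**",
  "",
  "* item 1",
  "* item 2"
)

# Assumption: the commonmark package is installed; it is not required by rmarkdown.
html <- NULL
if (requireNamespace("commonmark", quietly = TRUE)) {
  html <- commonmark::markdown_html(paste(md, collapse = "\n"))
  cat(html)
} else {
  message("commonmark not installed; skipping the conversion")
}
```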

 

 

 

 

Next comes an R code chunk (callout c). R code in Markdown documents is delimited by ```{r options} and ```. When the file is processed, the R code is executed and the results are inserted. Code chunk options are described in table 22.2.

Table 22.2 Code chunk options

Option      Description
----------  ----------------------------------------------------------------------------
echo        Whether to include the R source code in the output (TRUE) or not (FALSE)
results     Whether to output raw results (asis) or hide the results (hide)
warning     Whether to include warnings in the output (TRUE) or not (FALSE)
message     Whether to include informational messages in the output (TRUE) or not (FALSE)
error       Whether to include error messages in the output (TRUE) or not (FALSE)
fig.width   Figure width for plots (inches)
fig.height  Figure height for plots (inches)
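When most chunks in a report share the same options, repeating them in every chunk header becomes tedious. One possible approach, sketched here as an addition to the chapter's method, is to set defaults once with knitr's opts_chunk object, typically in the first chunk of the .Rmd file. Options given in an individual chunk header still override these defaults.

```r
# Set report-wide defaults for the chunk options listed in table 22.2.
# Guarded so the sketch also runs where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo = FALSE,      # hide R source code
    warning = FALSE,   # drop warnings from the output
    message = FALSE,   # drop informational messages
    fig.width = 5,     # figure width in inches
    fig.height = 4     # figure height in inches
  )
  # Show the values now in effect.
  str(knitr::opts_chunk$get(c("echo", "warning", "message",
                              "fig.width", "fig.height")))
}
```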

 

 

Simple R output (a number or string) can also be placed directly within report text. This inline R code allows you to customize the text in individual sentences. Inline code is placed between `r and ` tags (callout d). In the regression example, the sample size, prediction equation, and R-squared value are embedded in the first paragraph.
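Inline expressions print values at full precision unless you format them, so it is worth previewing them in the console first. The sketch below assembles the sentence from the regression example with sprintf(); the rounding to two decimals and the variable name sentence are illustrative choices. Because R-squared is a proportion, it is multiplied by 100 before being followed by a percent sign.

```r
# Compute the quantities that the inline code in the report refers to.
fit  <- lm(weight ~ height, data = women)
sfit <- summary(fit)
b    <- coefficients(fit)

# Assemble the sentence with explicit formatting.
# %% prints a literal percent sign; R-squared is rescaled to a percentage.
sentence <- sprintf(
  "weight = %.2f + %.2f * height accounted for %.1f%% of the variance (n = %d).",
  b[[1]], b[[2]], 100 * sfit$r.squared, nrow(women)
)
cat(sentence, "\n")
```

In an .Rmd file, the same formatting can be applied inside each inline expression, for example with round() or format().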

Finally, you use the xtable() function to format the regression results (callout e). The statement options(xtable.comment=FALSE) suppresses the comment line that xtable would otherwise write at the top of its output. The type="html" option in the print() function outputs the xtable object as an HTML table. By default, this table has an unattractive 1-pixel border that’s removed by adding html.table.attributes="border=0". See help(print.xtable) for additional formatting options.

To render the file as a PDF document, you only have to make one change. Replace

print(xtable(sfit), type="html", html.table.attributes="border=0")

with

print(xtable(sfit), type="latex")

Then process the file using

library(rmarkdown)
render("women.Rmd", "pdf_document")

to get a nicely formatted PDF document.
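If you render the same template to both HTML and PDF, editing the print() call each time is easy to forget. One possible alternative, offered as a sketch rather than the chapter's method, is to let the chunk pick the table type from the format being produced. It relies on knitr's is_latex_output() helper, which returns FALSE outside of a LaTeX/PDF rendering, and the name tab_type is illustrative.

```r
# Choose the xtable output type from the format currently being rendered.
# Outside of a PDF rendering, is_latex_output() is FALSE and "html" is kept.
tab_type <- "html"
if (requireNamespace("knitr", quietly = TRUE) && knitr::is_latex_output()) {
  tab_type <- "latex"
}

sfit <- summary(lm(weight ~ height, data = women))

# Print the coefficient table in the chosen format (needs the xtable package).
if (requireNamespace("xtable", quietly = TRUE)) {
  options(xtable.comment = FALSE)
  print(xtable::xtable(sfit), type = tab_type)
}
```

Placed in a chunk with results='asis', this code lets one template serve both output formats.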

Unfortunately, the xtable() function doesn’t work for Word documents. You’ll have to get a bit more creative to render statistical output in an attractive fashion. One possibility is to replace xtable() with the kable() function in the knitr package. It can render matrices and data frames in a simple and appealing manner.

Replace

library(xtable)

options(xtable.comment=FALSE)

print(xtable(sfit), type="html", html.table.attributes="border=0")

with

library(knitr)
kable(sfit$coefficients)

Then render the file using

library(rmarkdown)
render("women.Rmd", "word_document")

The result is an attractive Word document that you can edit using Word. Note that you had to replace the sfit object with sfit$coefficients. The xtable() function can handle lm() objects, but the kable() function can only handle matrices and data frames. Therefore, you have to extract the parts you want to print from more complicated objects. See help(kable) for more details.
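The extraction step can be tried in the console before editing the template. This is a minimal sketch: the digits and caption arguments are optional kable() settings chosen here for illustration, and the variable name coefs is hypothetical.

```r
# summary.lm objects store the coefficient table as a plain numeric matrix,
# which is the kind of object kable() can format.
sfit  <- summary(lm(weight ~ height, data = women))
coefs <- sfit$coefficients
print(class(coefs))

# Format the matrix as a simple table (requires the knitr package).
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(coefs, digits = 3, caption = "Regression coefficients"))
}
```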

Using RStudio to create and process R Markdown documents

Throughout this book, I’ve tried to keep the presentation independent of the interface used to access R. Each of the techniques described will work in the basic R Console. But there are several other options, including RStudio (see appendix A). RStudio makes it particularly easy to render reports from Markdown documents.

If you choose File > New File > R Markdown from the GUI menu, you’ll see the dialog box shown next.

Dialog box for creating a new R Markdown document in RStudio

Choose the type of report you want to generate, and RStudio will create a skeleton file for you. Edit it with your text and code, and then select the rendering option from the Knit drop-down list. That’s it!

Drop-down menu for generating an HTML, PDF, or Word report from an R Markdown document

RStudio has many useful features for programmers. It’s by far my favorite way to work in R.

Markdown syntax is convenient for creating simple documents quickly. To learn more about Markdown, visit the homepage at http://daringfireball.net/projects/markdown and the rmarkdown documentation at http://rmarkdown.rstudio.com. If you want to create complex documents such as publication-quality articles and books, then you may want to look at using LaTeX as your markup language. In the next section, you’ll use LaTeX and the knitr package to create high-quality typeset documents.
