
Robert I. Kabacoff - R in action


APPENDIX D Creating publication-quality output

If you look at figure D.5, you’ll note that the ANOVA table isn’t attractively formatted (as it was in Sweave). Rather, the table is in the standard monospaced font produced by R. This is because odfWeave doesn’t have a formatting function for the objects

My Sample Report

Robert I. Kabacoff, Ph.D.

<<echo=false, results=hide>>=
library(multcomp)
library(xtable)
attach(cholesterol)
@

1 Results

Cholesterol reduction was assessed in a study that randomized \Sexpr{nrow(cholesterol)} patients to one of \Sexpr{length(unique(trt))} treatments. Summary statistics are provided in Table 1.

Table 1. Descriptive Statistics for each treatment group

<<echo = false, results = xml>>=
descTable <- data.frame("Treatment" = sort(unique(trt)),
    "N" = as.vector(table(trt)),
    "Mean" = tapply(response, list(trt), mean, na.rm=TRUE),
    "SD" = tapply(response, list(trt), sd, na.rm=TRUE)
)
odfTable(descTable)
@

The analysis of variance is provided Table 2.

Table 2. Analysis of Variance

<<echo=false>>=
fit <- aov(response ~ trt)
summary(fit)
@

and group differences are plotted in Figure 1.

<<fig=TRUE,echo=FALSE>>=
par(mar=c(5,4,6,2))
tuk <- glht(fit, linfct=mcp(trt="Tukey"))
plot(cld(tuk, level=.05),col="lightgrey",xlab="Treatment", ylab="Response")
box("figure")
@

Figure 1. Distribution of response times and pair-wise comparisons.

Figure D.4 Initial noweb file (example.odt) to be processed through odfWeave

Joining forces with OpenOffice using odfWeave


My Sample Report

Robert I. Kabacoff, Ph.D.

1 Results

Cholesterol reduction was assessed in a study that randomized 50 patients to one of 5 treatments. Summary statistics are provided in Table 1.

 

Table 1. Descriptive Statistics for each treatment group

        Treatment   N    Mean     SD
1time   1time      10   5.782  2.878
2times  2times     10   9.225  3.483
4times  4times     10  12.375  2.923
drugD   drugD      10  15.361  3.455
drugE   drugE      10  20.948  3.345

The analysis of variance is provided Table 2.

Table 2. Analysis of Variance

            Df  Sum Sq Mean Sq F value    Pr(>F)
trt          4 1351.37  337.84  32.433 9.819e-13 ***
Residuals   45  468.75   10.42
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

and group differences are plotted in Figure 1.

Figure D.5 Final report in ODF format (example-out.odt). Page 2 is similar to the second page of the Sweave output in figure D.2 and is omitted to save space

returned by lm(), glm(), and so forth. To properly format these results, we’d have to pull the components out of the object in question (fit in this case), and arrange them in a matrix or data frame.
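
As a rough sketch of that last step, the code below pulls the ANOVA table out of a fitted aov object and arranges it in a data frame. Because the cholesterol data requires the multcomp package, the built-in PlantGrowth data set is used as a stand-in (an assumption made only for illustration), and the names aovSummary and anovaTable and the column labels are likewise invented for this example.

```r
# Stand-in data: PlantGrowth ships with R (the report itself uses cholesterol)
fit <- aov(weight ~ group, data = PlantGrowth)

# summary() on an aov fit returns a list; its first element is the ANOVA table
aovSummary <- summary(fit)[[1]]

# Arrange the components in an ordinary data frame
anovaTable <- data.frame(
  Source = trimws(rownames(aovSummary)),          # row labels are space padded
  Df     = aovSummary[["Df"]],
  SumSq  = round(aovSummary[["Sum Sq"]], 2),
  MeanSq = round(aovSummary[["Mean Sq"]], 2),
  Fvalue = round(aovSummary[["F value"]], 3),     # NA on the Residuals row
  p      = signif(aovSummary[["Pr(>F)"]], 4)      # NA on the Residuals row
)
print(anovaTable)
```

Inside an odfWeave document, a data frame like this could then be handed to odfTable() in a results = xml chunk, in the same way descTable is handled in figure D.4.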

Once you have your report in ODF format, you can continue to edit it, tighten up the formatting, and save the results to an ODT, HTML, DOC, or DOCX file format. To learn more, read the odfWeave manual and vignette.


D.3 Comments

There are several advantages to the Sweave and odfWeave approaches described here. By embedding the code needed to perform the statistical analyses directly into the final report, you document exactly how the results were calculated. Six months from now, you can easily see what was done. You can also modify the statistical analyses or add new data and immediately regenerate the report with minimum effort. Additionally, you avoid the need to cut and paste and reformat the results.

Unfortunately, you gain these advantages by putting in significantly more work at the front end. There are other disadvantages as well. In the case of LaTeX, you need to learn a typesetting language. In the case of ODF, you need to use a program like OpenOffice that may not be standard in your work environment.

For good or ill, Microsoft Word and PowerPoint are the current report and presentation standards in the business world. The packages R2wd and R2PPT can be used to dynamically create Word and PowerPoint documents with inserted R output, but they are in their formative stages of development. I’m looking forward to seeing fully developed implementations.

APPENDIX E Matrix Algebra in R

Many of the functions described in this book operate on matrices. The manipulation of matrices is built deeply into the R language. Table E.1 describes operators and functions that are particularly important for solving linear algebra problems. In the following table, A and B are matrices, x and b are vectors, and k is a scalar.

Table E.1 R functions and operators for matrix algebra

| Operator or function | Description |
|---|---|
| + - * / ^ | Element-wise addition, subtraction, multiplication, division, and exponentiation, respectively. |
| A %*% B | Matrix multiplication. |
| A %o% B | Outer product. AB'. |
| cbind(A, B, …) | Combines matrices or vectors horizontally. |
| chol(A) | Choleski factorization of A. If R <- chol(A), then R contains the upper triangular factor, such that R'R = A. |
| colMeans(A) | Returns a vector containing the column means of A. |
| crossprod(A) | A'A. |
| crossprod(A,B) | A'B. |
| colSums(A) | Returns a vector containing the column sums of A. |
| diag(A) | Returns a vector containing the elements of the principal diagonal. |
| diag(x) | Creates a diagonal matrix with the elements of x in the principal diagonal. |
| diag(k) | If k is a scalar, this creates a k x k identity matrix. |
| eigen(A) | Eigenvalues and eigenvectors of A. If y <- eigen(A), then y$values are the eigenvalues of A and y$vectors are the eigenvectors of A. |
| ginv(A) | Moore-Penrose generalized inverse of A. (Requires the MASS package.) |
| qr(A) | QR decomposition of A. If y <- qr(A), then y$qr has an upper triangle containing the decomposition and a lower triangle that contains information on the decomposition, y$rank is the rank of A, y$qraux is a vector containing additional information on Q, and y$pivot contains information on the pivoting strategy used. |
| rbind(A, B, …) | Combines matrices or vectors vertically. |
| rowMeans(A) | Returns a vector containing the row means of A. |
| rowSums(A) | Returns a vector containing the row sums of A. |
| solve(A) | Inverse of A where A is a square matrix. |
| solve(A, b) | Solves for vector x in the equation b = Ax. |
| svd(A) | Singular value decomposition of A. If y <- svd(A), then y$d is a vector containing the singular values of A, y$u is a matrix with columns containing the left singular vectors of A, and y$v is a matrix with columns containing the right singular vectors of A. |
| t(A) | Transpose of A. |
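
To make a few of the entries in table E.1 concrete, here is a minimal sketch using base R only. The matrices A and B and the vector b are small made-up examples; A is chosen to be symmetric and positive definite so that chol() applies.

```r
A <- matrix(c(4, 2, 2, 3), nrow = 2)   # symmetric, positive definite
B <- matrix(1:4, nrow = 2)             # filled column by column
b <- c(2, 1)

elementwise <- A * B          # element-wise product
product     <- A %*% B        # matrix product
cp          <- crossprod(A, B)  # same result as t(A) %*% B

Ainv <- solve(A)      # inverse of A
x    <- solve(A, b)   # solves b = Ax for x

R <- chol(A)          # upper triangular factor, t(R) %*% R reproduces A
e <- eigen(A)         # e$values and e$vectors
s <- svd(A)           # s$d, s$u, and s$v

# Each check should report TRUE (up to floating-point tolerance)
print(all.equal(A %*% Ainv, diag(2)))
print(all.equal(as.vector(A %*% x), b))
print(all.equal(t(R) %*% R, A))
print(all.equal(s$u %*% diag(s$d) %*% t(s$v), A))
```

Note the difference between A * B and A %*% B: the first multiplies corresponding elements, whereas the second is the usual row-by-column matrix product.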

 

 

There are several user-contributed packages that are particularly useful for matrix algebra. The matlab package contains wrapper functions and variables used to replicate MATLAB function calls as closely as possible. These functions can help port MATLAB applications and code to R. There’s also a useful cheat sheet for converting MATLAB statements to R statements at http://mathesaurus.sourceforge.net/octave-r.html.

The Matrix package contains functions that extend R in order to support highly dense or sparse matrices. It provides efficient access to BLAS (Basic Linear Algebra Subprograms), LAPACK (dense matrix), TAUCS (sparse matrix), and UMFPACK (sparse matrix) routines.
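
The following sketch illustrates why a sparse representation can matter. It assumes the Matrix package is installed (it's distributed with most R installations as a recommended package) and uses its sparseMatrix() and nnzero() functions. The dimensions and values are arbitrary, chosen only for illustration.

```r
if (requireNamespace("Matrix", quietly = TRUE)) {
  library(Matrix)

  n   <- 1000
  idx <- c(1, 500, 1000)

  # n x n matrix with only three nonzero entries, all on the diagonal
  S <- sparseMatrix(i = idx, j = idx, x = c(2, 3, 4), dims = c(n, n))
  D <- as.matrix(S)            # the same matrix stored densely

  print(class(S))              # class of the sparse object
  print(nnzero(S))             # number of nonzero entries
  print(object.size(S))        # memory used by the sparse version
  print(object.size(D))        # memory used by the dense version

  # Sparse matrices work with the usual operators
  v <- rep(1, n)
  y <- as.vector(S %*% v)
  print(y[idx])
}
```

The dense copy stores all one million cells, whereas the sparse object stores only the nonzero values and their positions.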

Finally, the matrixStats package provides methods for operating on the rows and columns of matrices, including functions that calculate counts, sums, products, central tendency, dispersion, and more. Each is optimized for speed and efficient memory use.
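
As a small illustration, the sketch below computes row standard deviations and column medians with base R's apply() and, if matrixStats happens to be installed, compares them with that package's rowSds() and colMedians(). The matrix X is simulated, and the variable names are invented for this example.

```r
set.seed(1234)
X <- matrix(rnorm(20), nrow = 5)     # 5 x 4 matrix of simulated values

# Base R: apply() a function over rows (1) or columns (2)
rowSdBase  <- apply(X, 1, sd)
colMedBase <- apply(X, 2, median)

# The matrixStats equivalents, used only if the package is available
if (requireNamespace("matrixStats", quietly = TRUE)) {
  rowSdFast  <- matrixStats::rowSds(X)
  colMedFast <- matrixStats::colMedians(X)
  print(all.equal(rowSdBase, rowSdFast))
  print(all.equal(colMedBase, colMedFast))
}
```

On a matrix this small the two approaches are indistinguishable in speed; any benefit from the dedicated functions would show up on large matrices.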

APPENDIX F Packages used in this book

R derives much of its breadth and power from the contributions of selfless authors. Table F.1 lists the user-contributed packages described in this book, along with the chapter(s) in which they appear.
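
Before working through a chapter, it can be handy to check which of the packages it relies on are already installed. The sketch below does this for a handful of package names taken from table F.1; the selection and the variable names are arbitrary, and the install.packages() call is left commented out so that nothing is downloaded unintentionally.

```r
# A few package names from table F.1 (an arbitrary selection)
pkgs <- c("boot", "car", "lattice", "MASS", "multcomp")

# TRUE for each package found in the local library
installed <- pkgs %in% rownames(installed.packages())
names(installed) <- pkgs
print(installed)

# Packages still to be installed
missingPkgs <- pkgs[!installed]
print(missingPkgs)

# install.packages(missingPkgs)   # uncomment to install the missing ones
```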

Table F.1 Contributed packages used in this book

| Package | Authors | Description | Chapters |
|---|---|---|---|
| AER | Christian Kleiber and Achim Zeileis | Functions, data sets, examples, demos, and vignettes from the book Applied Econometrics with R by Christian Kleiber and Achim Zeileis | 13 |
| Amelia | James Honaker, Gary King, and Matthew Blackwell | Amelia II: A program for missing data via multiple imputation | 15 |
| arrayImpute | Eun-kyung Lee, Dankyu Yoon, and Taesung Park | Missing imputation for microarray data | 15 |
| arrayMissPattern | Eun-kyung Lee and Taesung Park | Exploratory analysis of missing patterns for microarray data | 15 |
| boot | S original by Angelo Canty. R port by Brian Ripley. | Bootstrap functions | 12 |
| ca | Michael Greenacre and Oleg Nenadic | Simple, multiple and joint correspondence analysis | 7 |
| car | John Fox and Sanford Weisberg | Companion to Applied Regression | 1, 8, 9, 10, 11 |
| cat | Ported to R by Ted Harding and Fernando Tusell. Original by Joseph L. Schafer. | Analysis of categorical-variable datasets with missing values | 15 |
| coin | Torsten Hothorn, Kurt Hornik, Mark A. van de Wiel, and Achim Zeileis | Conditional inference procedures in a permutation test framework | 12 |
| corrgram | Kevin Wright | Plot a correlogram | 11 |
| corrperm | Douglas M. Potter | Permutation tests of correlation with repeated measurements | 12 |
| doBy | Søren Højsgaard with contributions from Kevin Wright and Alessandro A. Leidi. | Group-wise computations of summary statistics, general linear contrasts and other utilities | 7 |
| effects | John Fox and Jangman Hong | Effect displays for linear, generalized linear, multinomial-logit, and proportional-odds logit models | 8, 9 |
| FactoMineR | Francois Husson, Julie Josse, Sebastien Le, and Jeremy Mazet | Multivariate exploratory data analysis and data mining with R | 14 |
| FAiR | Ben Goodrich | Factor analysis using a genetic algorithm | 14 |
| fCalendar | Diethelm Wuertz and Yohan Chalabi | Functions for chronological and calendarical objects | 4 |
| foreign | R-core members, Saikat DebRoy, Roger Bivand, and others | Read data stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, and others | 2 |
| gclus | Catherine Hurley | Clustering graphics | 1, 11 |
| glmPerm | Wiebke Werft and Douglas M. Potter | Permutation test for inference in generalized linear models | 12 |
| gmodels | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Thomas Lumley, and Randall C Johnson. Contributions from Randall C. Johnson are Copyright (2005) SAIC-Frederick, Inc. | Various R programming tools for model fitting | 7 |
| gplots | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Lodewijk Bonebakker, Robert Gentleman, Wolfgang Huber, Andy Liaw, Thomas Lumley, Martin Maechler, Arni Magnusson, Steffen Moeller, Marc Schwartz, and Bill Venables | Various R programming tools for plotting data | 6, 9 |
| grid | Paul Murrell | A rewrite of the graphics layout capabilities, plus some support for interaction | 16 |
| gvlma | Edsel A. Pena and Elizabeth H. Slate | Global validation of linear models assumptions | 8 |
| hdf5 | Marcus G. Daniels | Interface to the NCSA HDF5 library | 2 |
| hexbin | Dan Carr, ported by Nicholas Lewin-Koh and Martin Maechler | Hexagonal binning routines | 11 |
| HH | Richard M. Heiberger | Support software for Statistical Analysis and Data Display by Heiberger and Holland | 9 |
| Hmisc | Frank E Harrell Jr, with contributions from many other users | Harrell miscellaneous functions for data analysis, high-level graphics, utility operations, and more | 2, 3, 7 |
| kmi | Arthur Allignol | Kaplan-Meier multiple imputation for the analysis of cumulative incidence functions in the competing risks setting | 15 |
| lattice | Deepayan Sarkar | Lattice graphics | 16 |
| latticist | Felix Andrews | GUI for exploratory visualization | 16 |
| lavaan | Yves Rosseel | Functions for latent variable models, including confirmatory factor analysis, structural equation modeling, and latent growth curve models | 14 |
| lcda | Michael Buecker | Latent class discriminant analysis | 14 |
| leaps | Thomas Lumley using Fortran code by Alan Miller | Regression subset selection including exhaustive search | 8 |
| lmPerm | Bob Wheeler | Permutation tests for linear models | 12 |
| logregperm | Douglas M. Potter | Permutation test for inference in logistic regression | 12 |
| longitudinalData | Christophe Genolini | Tools for longitudinal data | 15 |
| lsa | Fridolin Wild | Latent semantic analysis | 14 |
| ltm | Dimitris Rizopoulos | Latent trait models under item response theory | 14 |
| lubridate | Garrett Grolemund and Hadley Wickham | Functions to identify and parse date-time data, extract and modify components of a date-time, perform accurate math on date-times, and handle time zones and Daylight Savings Time | 4 |
| MASS | S original by Venables and Ripley. R port by Brian Ripley, following earlier work by Kurt Hornik and Albrecht Gebhardt. | Functions and datasets to support Venables and Ripley’s Modern Applied Statistics with S (4th edition) | 4, 5, 7, 8, 9, 12 |
| mlogit | Yves Croissant | Estimation of the multinomial logit model | 13 |
| multcomp | Torsten Hothorn, Frank Bretz, Peter Westfall, Richard M. Heiberger, and Andre Schuetzenmeister | Simultaneous tests and confidence intervals for general linear hypotheses in parametric models, including linear, generalized linear, linear mixed effects, and survival models | 9, 12 |
| mvnmle | Kevin Gross, with help from Douglas Bates | ML estimation for multivariate normal data with missing values | 15 |
| mvoutlier | Moritz Gschwandtner and Peter Filzmoser | Multivariate outlier detection based on robust methods | 9 |
| ncdf, ncdf4 | David Pierce | Interface to Unidata netCDF data files | 2 |
| nFactors | Gilles Raiche | Parallel analysis and non graphical solutions to the Cattell scree test | 14 |
| npmc | Joerg Helms and Ullrich Munzel | Nonparametric multiple comparisons | 7 |
| OpenMx | Steven Boker, Michael Neale, Hermine Maes, Michael Wilde, Michael Spiegel, Timothy R. Brick, Jeffrey Spies, Ryne Estabrook, Sarah Kenny, Timothy Bates, Paras Mehta, and John Fox | Advanced structural equation modeling. | 14 |
| pastecs | Frederic Ibanez, Philippe Grosjean, and Michele Etienne | Package for the analysis of space-time ecological series | 7 |
| piface | Russell Lenth, R package interface by Tobias Verbeke | Java applets for power and sample size assessment | 10 |
| playwith | Felix Andrews | A GTK+ graphical user interface for editing and interacting with R plots | 16 |
| poLCA | Drew Linzer and Jeffrey Lewis | Polytomous variable latent class analysis | 14 |
