
Robert I. Kabacoff - R in Action
APPENDIX D Creating publication-quality output
If you look at figure D.5, you’ll note that the ANOVA table isn’t attractively formatted (as it was in Sweave). Rather, the table is in the standard monospaced font produced by R. This is because odfWeave doesn’t have a formatting function for the objects
My Sample Report
Robert I. Kabacoff, Ph.D.
<<echo=false, results=hide>>=
library(multcomp)
library(xtable)
attach(cholesterol)
@
1 Results
Cholesterol reduction was assessed in a study that randomized \Sexpr{nrow(cholesterol)} patients to one of \Sexpr{length(unique(trt))} treatments. Summary statistics are provided in Table 1.
Table 1. Descriptive Statistics for each treatment group
<<echo = false, results = xml>>=
descTable <- data.frame("Treatment" = sort(unique(trt)),
    "N" = as.vector(table(trt)),
    "Mean" = tapply(response, list(trt), mean, na.rm=TRUE),
    "SD" = tapply(response, list(trt), sd, na.rm=TRUE)
)
odfTable(descTable)
@
The analysis of variance is provided in Table 2.
Table 2. Analysis of Variance
<<echo=false>>=
fit <- aov(response ~ trt)
summary(fit)
@
and group differences are plotted in Figure 1.
<<fig=TRUE,echo=FALSE>>=
par(mar=c(5,4,6,2))
tuk <- glht(fit, linfct=mcp(trt="Tukey"))
plot(cld(tuk, level=.05),col="lightgrey",xlab="Treatment", ylab="Response")
box("figure")
@
Figure 1. Distribution of response times and pair-wise comparisons.
Figure D.4 Initial noweb file (example.odt) to be processed through odfWeave

Joining forces with OpenOffice using odfWeave
My Sample Report
Robert I. Kabacoff, Ph.D.
1 Results
Cholesterol reduction was assessed in a study that randomized 50 patients to one of 5 treatments. Summary statistics are provided in Table 1.
Table 1. Descriptive Statistics for each treatment group
       | Treatment | N  | Mean   | SD
1time  | 1time     | 10 | 5.782  | 2.878
2times | 2times    | 10 | 9.225  | 3.483
4times | 4times    | 10 | 12.375 | 2.923
drugD  | drugD     | 10 | 15.361 | 3.455
drugE  | drugE     | 10 | 20.948 | 3.345
The analysis of variance is provided in Table 2.
Table 2. Analysis of Variance
            Df  Sum Sq Mean Sq F value    Pr(>F)
trt          4 1351.37  337.84  32.433 9.819e-13 ***
Residuals   45  468.75   10.42
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
and group differences are plotted in Figure 1.
Figure D.5 Final report in ODF format (example-out.odt). Page 2 is similar to the second page of the Sweave output in figure D.2 and is omitted to save space
returned by lm(), glm(), and so forth. To properly format these results, we’d have to pull the components out of the object in question (fit in this case), and arrange them in a matrix or data frame.
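As a rough illustration of pulling the components out of a fitted object, the sketch below builds a plain data frame from an ANOVA summary. It assumes the built-in PlantGrowth data as a stand-in for the cholesterol data (which requires the multcomp package), and the name aovTable is made up for this example.

```r
# Sketch only: PlantGrowth (built into R) stands in for the cholesterol data.
fit <- aov(weight ~ group, data = PlantGrowth)

# summary() of an aov fit is a list; its first element holds the ANOVA table.
aovTable <- as.data.frame(summary(fit)[[1]])

# Move the space-padded row names into an ordinary column.
aovTable <- cbind(Source = trimws(rownames(aovTable)), aovTable)
rownames(aovTable) <- NULL

print(aovTable)

# Inside an odfWeave chunk with results = xml, a data frame like this could be
# passed to odfTable(), as was done for descTable in the noweb file.
```

Because the result is an ordinary data frame, columns can be rounded, renamed, or dropped before the table is written to the report.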
Once you have your report in ODF format, you can continue to edit it, tighten up the formatting, and save the results in ODT, HTML, DOC, or DOCX format. To learn more, read the odfWeave manual and vignette.
D.3 Comments
There are several advantages to the Sweave and odfWeave approaches described here. By embedding the code needed to perform the statistical analyses directly into the final report, you document exactly how the results were calculated. Six months from now, you can easily see what was done. You can also modify the statistical analyses or add new data and immediately regenerate the report with minimal effort. Additionally, you avoid the need to cut and paste and reformat the results.
Unfortunately, you gain these advantages by putting in significantly more work at the front end. There are other disadvantages as well. In the case of LaTeX, you need to learn a typesetting language. In the case of ODF, you need to use a program like OpenOffice that may not be standard in your work environment.
For good or ill, Microsoft Word and PowerPoint are the current report and presentation standards in the business world. The packages R2wd and R2PPT can be used to dynamically create Word and PowerPoint documents with inserted R output, but they are in their formative stages of development. I’m looking forward to seeing fully developed implementations.


APPENDIX E Matrix Algebra in R
Table E.1 R functions and operators for matrix algebra (continued)
Operator or Function | Description
colSums(A) | Returns a vector containing the column sums of A.
diag(A) | Returns a vector containing the elements of the principal diagonal.
diag(x) | Creates a diagonal matrix with the elements of x in the principal diagonal.
diag(k) | If k is a scalar, this creates a k x k identity matrix.
eigen(A) | Eigenvalues and eigenvectors of A. If y <- eigen(A), then y$val are the eigenvalues of A and y$vec are the eigenvectors of A.
ginv(A) | Moore-Penrose generalized inverse of A. (Requires the MASS package.)
qr(A) | QR decomposition of A. If y <- qr(A), then y$qr has an upper triangle containing the decomposition and a lower triangle that contains information on the decomposition, y$rank is the rank of A, y$qraux is a vector containing additional information on Q, and y$pivot contains information on the pivoting strategy used.
rbind(A, B, …) | Combines matrices or vectors vertically.
rowMeans(A) | Returns a vector containing the row means of A.
rowSums(A) | Returns a vector containing the row sums of A.
solve(A) | Inverse of A where A is a square matrix.
solve(A, b) | Solves for vector x in the equation b = Ax.
svd(A) | Singular value decomposition of A. If y <- svd(A), then y$d is a vector containing the singular values of A, y$u is a matrix with columns containing the left singular vectors of A, and y$v is a matrix with columns containing the right singular vectors of A.
t(A) | Transpose of A.
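The following sketch exercises a few of the functions in the table on a small matrix. The values of A and b are made up for illustration, and A is chosen to be symmetric so that the eigen decomposition can be reversed with a simple transpose. Note that the components of eigen() are named values and vectors; the shorter y$val and y$vec forms work through R's partial matching of list names.

```r
# A small symmetric, invertible matrix and a right-hand side (made-up values).
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(1, 2)

Ainv <- solve(A)       # inverse of A
x    <- solve(A, b)    # solves b = A x without forming the inverse

e <- eigen(A)          # components: e$values, e$vectors
s <- svd(A)            # components: s$d, s$u, s$v
q <- qr(A)             # components: q$qr, q$rank, q$qraux, q$pivot

# Each decomposition can be checked by rebuilding A from its pieces.
A_from_eigen <- e$vectors %*% diag(e$values) %*% t(e$vectors)  # holds because A is symmetric
A_from_svd   <- s$u %*% diag(s$d) %*% t(s$v)
A_from_qr    <- qr.Q(q) %*% qr.R(q)

print(all.equal(A, A_from_eigen))
print(all.equal(A, A_from_svd))
print(all.equal(A, A_from_qr))
print(q$rank)
```

Comparisons use all.equal() rather than == because the reconstructions differ from A by floating-point rounding error.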
There are several user-contributed packages that are particularly useful for matrix algebra. The matlab package contains wrapper functions and variables used to replicate MATLAB function calls as closely as possible. These functions can help port MATLAB applications and code to R. There’s also a useful cheat sheet for converting MATLAB statements to R statements at http://mathesaurus.sourceforge.net/octave-r.html.
The Matrix package contains functions that extend R in order to support highly dense or sparse matrices. It provides efficient access to BLAS (Basic Linear Algebra Subprograms), LAPACK (dense matrix), TAUCS (sparse matrix), and UMFPACK (sparse matrix) routines.
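A minimal sketch of sparse storage with the Matrix package follows. It assumes the package is installed (it normally ships with R as a recommended package); the matrix size and values are arbitrary choices for this example.

```r
library(Matrix)  # recommended package, normally installed with R

n <- 200

# Sparse diagonal matrix: only the n non-zero entries are stored.
S <- sparseMatrix(i = 1:n, j = 1:n, x = rep(2, n))

# The same matrix stored densely (all n * n values).
D <- as.matrix(S)

sparse_bytes <- as.numeric(object.size(S))
dense_bytes  <- as.numeric(object.size(D))
print(c(sparse = sparse_bytes, dense = dense_bytes))

v  <- rep(1, n)
Sv <- as.vector(S %*% v)      # sparse matrix-vector product
x  <- as.vector(solve(S, v))  # solves S x = v using sparse routines
```

The results are wrapped in as.vector() so that they are plain numeric vectors regardless of which Matrix class the operation returns.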
Finally, the matrixStats package provides methods for operating on the rows and columns of matrices, including functions that calculate counts, sums, products, central tendency, dispersion, and more. Each is optimized for speed and efficient memory use.


APPENDIX F Packages used in this book
Table F.1 Contributed packages used in this book (continued)
Package | Authors | Description | Chapters
boot | S original by Angelo Canty. R port by Brian Ripley. | Bootstrap functions | 12
ca | Michael Greenacre and Oleg Nenadic | Simple, multiple and joint correspondence analysis | 7
car | John Fox and Sanford Weisberg | Companion to Applied Regression | 1, 8, 9, 10, 11
cat | Ported to R by Ted Harding and Fernando Tusell. Original by Joseph L. Schafer. | Analysis of categorical-variable datasets with missing values | 15
coin | Torsten Hothorn, Kurt Hornik, Mark A. van de Wiel, and Achim Zeileis | Conditional inference procedures in a permutation test framework | 12
corrgram | Kevin Wright | Plot a correlogram | 11
corrperm | Douglas M. Potter | Permutation tests of correlation with repeated measurements | 12
doBy | Søren Højsgaard with contributions from Kevin Wright and Alessandro A. Leidi. | Group-wise computations of summary statistics, general linear contrasts and other utilities | 7
effects | John Fox and Jangman Hong | Effect displays for linear, generalized linear, multinomial-logit, and proportional-odds logit models | 8, 9
FactoMineR | Francois Husson, Julie Josse, Sebastien Le, and Jeremy Mazet | Multivariate exploratory data analysis and data mining with R | 14
FAiR | Ben Goodrich | Factor analysis using a genetic algorithm | 14
fCalendar | Diethelm Wuertz and Yohan Chalabi | Functions for chronological and calendarical objects | 4
foreign | R-core members, Saikat DebRoy, Roger Bivand, and others | Read data stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, and others | 2
gclus | Catherine Hurley | Clustering graphics | 1, 11
glmPerm | Wiebke Werft and Douglas M. Potter | Permutation test for inference in generalized linear models | 12
gmodels | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Thomas Lumley, and Randall C Johnson. Contributions from Randall C. Johnson are Copyright (2005) SAIC-Frederick, Inc. | Various R programming tools for model fitting | 7
gplots | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Lodewijk Bonebakker, Robert Gentleman, Wolfgang Huber, Andy Liaw, Thomas Lumley, Martin Maechler, Arni Magnusson, Steffen Moeller, Marc Schwartz, and Bill Venables | Various R programming tools for plotting data | 6, 9
grid | Paul Murrell | A rewrite of the graphics layout capabilities, plus some support for interaction | 16
gvlma | Edsel A. Pena and Elizabeth H. Slate | Global validation of linear models assumptions | 8
hdf5 | Marcus G. Daniels | Interface to the NCSA HDF5 library | 2
hexbin | Dan Carr, ported by Nicholas Lewin-Koh and Martin Maechler | Hexagonal binning routines | 11
HH | Richard M. Heiberger | Support software for Statistical Analysis and Data Display by Heiberger and Holland | 9
Hmisc | Frank E Harrell Jr, with contributions from many other users | Harrell miscellaneous functions for data analysis, high-level graphics, utility operations, and more | 2, 3, 7
kmi | Arthur Allignol | Kaplan-Meier multiple imputation for the analysis of cumulative incidence functions in the competing risks setting | 15
lattice | Deepayan Sarkar | Lattice graphics | 16
latticist | Felix Andrews | GUI for exploratory visualization | 16
lavaan | Yves Rosseel | Functions for latent variable models, including confirmatory factor analysis, structural equation modeling, and latent growth curve models | 14
lcda | Michael Buecker | Latent class discriminant analysis | 14
leaps | Thomas Lumley using Fortran code by Alan Miller | Regression subset selection including exhaustive search | 8
lmPerm | Bob Wheeler | Permutation tests for linear models | 12
logregperm | Douglas M. Potter | Permutation test for inference in logistic regression | 12
longitudinalData | Christophe Genolini | Tools for longitudinal data | 15
lsa | Fridolin Wild | Latent semantic analysis | 14
ltm | Dimitris Rizopoulos | Latent trait models under item response theory | 14
lubridate | Garrett Grolemund and Hadley Wickham | Functions to identify and parse date-time data, extract and modify components of a date-time, perform accurate math on date-times, and handle time zones and Daylight Savings Time | 4
MASS | S original by Venables and Ripley. R port by Brian Ripley, following earlier work by Kurt Hornik and Albrecht Gebhardt. | Functions and datasets to support Venables and Ripley’s Modern Applied Statistics with S (4th edition) | 4, 5, 7, 8, 9, 12
mlogit | Yves Croissant | Estimation of the multinomial logit model | 13
multcomp | Torsten Hothorn, Frank Bretz, Peter Westfall, Richard M. Heiberger, and Andre Schuetzenmeister | Simultaneous tests and confidence intervals for general linear hypotheses in parametric models, including linear, generalized linear, linear mixed effects, and survival models | 9, 12
mvnmle | Kevin Gross, with help from Douglas Bates | ML estimation for multivariate normal data with missing values | 15
mvoutlier | Moritz Gschwandtner and Peter Filzmoser | Multivariate outlier detection based on robust methods | 9
ncdf, ncdf4 | David Pierce | Interface to Unidata netCDF data files | 2
nFactors | Gilles Raiche | Parallel analysis and non graphical solutions to the Cattell scree test | 14
npmc | Joerg Helms and Ullrich Munzel | Nonparametric multiple comparisons | 7
OpenMx | Steven Boker, Michael Neale, Hermine Maes, Michael Wilde, Michael Spiegel, Timothy R. Brick, Jeffrey Spies, Ryne Estabrook, Sarah Kenny, Timothy Bates, Paras Mehta, and John Fox | Advanced structural equation modeling. | 14
pastecs | Frederic Ibanez, Philippe Grosjean, and Michele Etienne | Package for the analysis of space-time ecological series | 7
piface | Russell Lenth, R package interface by Tobias Verbeke | Java applets for power and sample size assessment | 10
playwith | Felix Andrews | A GTK+ graphical user interface for editing and interacting with R plots | 16
poLCA | Drew Linzer and Jeffrey Lewis | Polytomous variable latent class analysis | 14