R in Action, Second Edition.pdf

appendix D Matrix algebra in R

Many of the functions described in this book operate on matrices. The manipulation of matrices is built deeply into the R language. Table D.1 describes operators and functions that are particularly important for solving linear algebra problems. In the table, A and B are matrices, x and b are vectors, and k is a scalar.
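As a quick illustration of the operators and functions that Table D.1 describes, the following sketch applies several of them to small objects. The values and the variable names other than A, B, x, b, and k (for example, x_hat and A_inv) are made up for this example; only base R is assumed.

```r
# Example objects; the values are arbitrary and chosen for illustration only.
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # 2 x 2 symmetric matrix
B <- matrix(c(1, 2, 3, 4), nrow = 2)   # 2 x 2 matrix, filled column by column
x <- c(1, 2)                           # vector
b <- c(3, 5)                           # vector
k <- 3                                 # scalar

elementwise <- A * B            # element-wise product
product     <- A %*% B          # matrix product
outer_xb    <- x %o% b          # outer product xb' (a 2 x 2 matrix)
AtB         <- crossprod(A, B)  # A'B, computed without forming t(A)

wide <- cbind(A, B)             # combine horizontally: 2 x 4
tall <- rbind(A, B)             # combine vertically:   4 x 2

diag(A)                         # principal diagonal of A
diag(x)                         # diagonal matrix with x on the diagonal
I3 <- diag(k)                   # k x k identity matrix

A_inv <- solve(A)               # inverse of A
x_hat <- solve(A, b)            # solves b = Ax for x

# Sanity checks (up to floating-point rounding)
all.equal(A_inv %*% A, diag(2))
all.equal(as.vector(A %*% x_hat), b)
```

Note that `*` and `%*%` give different results for the same pair of matrices, which is a common source of errors when translating formulas into R.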

Table D.1 R functions and operators for matrix algebra

| Operator or function | Description |
|---|---|
| + - * / ^ | Element-wise addition, subtraction, multiplication, division, and exponentiation, respectively. |
| A %*% B | Matrix multiplication. |
| A %o% B | Outer product. For vectors x and b, x %o% b returns xb'. |
| cbind(A, B, ...) | Combines matrices or vectors horizontally. Returns a matrix. |
| chol(A) | Cholesky factorization of A. If R <- chol(A), then R contains the upper triangular factor, such that R'R = A. |
| colMeans(A) | Returns a vector containing the column means of A. |
| crossprod(A) | Returns A'A. |
| crossprod(A,B) | Returns A'B. |
| colSums(A) | Returns a vector containing the column sums of A. |
| diag(A) | Returns a vector containing the elements of the principal diagonal. |
| diag(x) | Creates a diagonal matrix with elements of x in the principal diagonal. |
| diag(k) | If k is a scalar, this creates a k × k identity matrix. Go figure. |
| eigen(A) | Eigenvalues and eigenvectors of A. If y <- eigen(A), then y$values are the eigenvalues of A and y$vectors are the eigenvectors of A. |
| ginv(A) | Moore-Penrose generalized inverse of A. (Requires the MASS package.) |
| qr(A) | QR decomposition of A. If y <- qr(A), then y$qr has an upper triangle that contains the decomposition and a lower triangle that contains information on the decomposition; y$rank is the rank of A; y$qraux is a vector that contains additional information on Q; y$pivot contains information on the pivoting strategy used. |
| rbind(A, B, ...) | Combines matrices or vectors vertically. Returns a matrix. |
| rowMeans(A) | Returns a vector containing the row means of A. |
| rowSums(A) | Returns a vector containing the row sums of A. |
| solve(A) | Inverse of A, where A is a square matrix. |
| solve(A, b) | Solves for vector x in the equation b = Ax. |
| svd(A) | Singular value decomposition of A. If y <- svd(A), then y$d is a vector containing the singular values of A; y$u is a matrix with columns containing the left singular vectors of A; y$v is a matrix with columns containing the right singular vectors of A. |
| t(A) | Transpose of A. |
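The decomposition functions in Table D.1 are easiest to understand by checking that their pieces multiply back to the original matrix. The following is a minimal sketch under that idea; the matrices A and C and the result names are invented for illustration. It uses base R plus the MASS package, which ships with standard R distributions. qr.Q() and qr.R() are the base R helpers that extract the Q and R factors from the object that qr() returns.

```r
library(MASS)                            # provides ginv()

A <- matrix(c(4, 2, 2, 3), nrow = 2)     # symmetric, positive definite

# Cholesky: R is upper triangular and R'R = A
R <- chol(A)
chol_ok <- all.equal(t(R) %*% R, A)

# Eigen decomposition: for a symmetric A, A = V diag(values) V'
e <- eigen(A)
eigen_ok <- all.equal(e$vectors %*% diag(e$values) %*% t(e$vectors), A)

# Singular value decomposition: A = U diag(d) V'
s <- svd(A)
svd_ok <- all.equal(s$u %*% diag(s$d) %*% t(s$v), A)

# QR decomposition: QR equals A with its columns in pivot order
q <- qr(A)
qr_ok <- all.equal(qr.Q(q) %*% qr.R(q), A[, q$pivot])
q$rank                                   # rank of A

# Moore-Penrose generalized inverse of a nonsquare matrix: C G C = C
C <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # 3 x 2
G <- ginv(C)                                  # 2 x 3
ginv_ok <- all.equal(C %*% G %*% C, C)

c(chol_ok, eigen_ok, svd_ok, qr_ok, ginv_ok)
```

Comparing with all.equal() rather than == matters here, because the reconstructed matrices agree with A only up to floating-point rounding.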

 

 

 

 

 

 

 

Several user-contributed packages are particularly useful for matrix algebra. The matlab package contains wrapper functions and variables used to replicate MATLAB function calls as closely as possible. These functions can help you port MATLAB applications and code to R. There’s also a useful cheat sheet for converting MATLAB statements to R statements at http://mathesaurus.sourceforge.net/octave-r.html.

The Matrix package contains functions that extend R to support dense and sparse matrices. It provides efficient access to BLAS (Basic Linear Algebra Subprograms), LAPACK (dense matrix), TAUCS (sparse matrix), and UMFPACK (sparse matrix) routines.
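To give a feel for the Matrix package, here is a small sketch that builds a sparse matrix from (row, column, value) triplets and solves a linear system with it. The matrix contents and the names S, M, b, and x_hat are assumptions made up for this example. sparseMatrix(), Diagonal(), and nnzero() are functions from the Matrix package.

```r
library(Matrix)

# A 5 x 5 sparse matrix with three nonzero entries, given as
# (row, column, value) triplets. Only the nonzeros are stored.
S <- sparseMatrix(i = c(1, 3, 5),
                  j = c(2, 3, 1),
                  x = c(10, 20, 30),
                  dims = c(5, 5))

n_nonzero <- nnzero(S)        # number of nonzero entries
D <- as.matrix(S)             # convert to an ordinary dense matrix

# Familiar operators work on sparse matrices too
M <- S + 2 * Diagonal(5)      # add twice the sparse identity matrix
b <- c(1, 2, 3, 4, 5)
x_hat <- solve(M, b)          # solve M x = b using sparse routines

# Largest absolute difference between M x_hat and b
residual <- max(abs(as.vector(M %*% x_hat) - b))
```

For a matrix this small the sparse representation offers no real advantage. The benefit appears when most entries of a large matrix are zero.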

Finally, the matrixStats package provides methods for operating on the rows and columns of matrices, including functions that calculate counts, sums, products, central tendency, dispersion, and more. Each is optimized for speed and efficient memory use.
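As a rough sketch of how matrixStats is used, the following compares a few of its functions with the equivalent apply() calls from base R. The package must be installed separately (for example, with install.packages("matrixStats")). The matrix X and the result names are invented for this example.

```r
library(matrixStats)

set.seed(1234)
X <- matrix(rnorm(20), nrow = 5)   # 5 x 4 matrix of random values

col_medians <- colMedians(X)       # median of each column
row_sds     <- rowSds(X)           # standard deviation of each row
col_prods   <- colProds(X)         # product of each column

# The same quantities using base R's apply()
col_medians_base <- apply(X, 2, median)
row_sds_base     <- apply(X, 1, sd)
col_prods_base   <- apply(X, 2, prod)

all.equal(col_medians, col_medians_base)
all.equal(row_sds, row_sds_base)
```

The results match the apply() versions. The reason to prefer the matrixStats functions is speed and memory use on large matrices.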

appendix E Packages used in this book

R derives much of its breadth and power from the contributions of selfless authors. Table E.1 lists the user-contributed packages described in this book, along with the chapter(s) in which they appear.

Table E.1 Contributed packages used in this book

| Package | Authors | Description | Chapter(s) |
|---|---|---|---|
| AER | Christian Kleiber and Achim Zeileis | Functions, data sets, examples, demos, and vignettes from the book Applied Econometrics with R by Christian Kleiber and Achim Zeileis (Springer, 2008) | 13 |
| Amelia | James Honaker, Gary King, and Matthew Blackwell | Amelia II: a program for missing data via multiple imputation | 18 |
| arrayImpute | Eun-kyung Lee, Dankyu Yoon, and Taesung Park | Missing imputation for microarray data | 18 |
| arrayMissPattern | Eun-kyung Lee and Taesung Park | Exploratory analysis of missing patterns for microarray data | 18 |
| boot | S original by Angelo Canty. R port by Brian Ripley | Bootstrap functions | 12 |
| ca | Michael Greenacre and Oleg Nenadic | Simple, multiple, and joint correspondence analysis | 7 |
| car | John Fox and Sanford Weisberg | Companion to Applied Regression | 1, 8, 9, 10, 11, 19, 22 |
| cat | Ported to R by Ted Harding and Fernando Tusell; original by Joseph L. Schafer | Analysis of categorical-variable datasets with missing values | 15 |
| coin | Torsten Hothorn, Kurt Hornik, Mark A. van de Wiel, and Achim Zeileis | Conditional inference procedures in a permutation test framework | 12 |
| corrgram | Kevin Wright | Plots a corrgram | 11 |
| corrperm | Douglas M. Potter | Permutation tests of correlation with repeated measurements | 12 |
| doBy | Søren Højsgaard with contributions from Kevin Wright and Alessandro A. Leidi | Group-wise computations of summary statistics, general linear contrasts and other utilities | 7 |
| doParallel | Revolution Analytics, Steve Weston | foreach parallel adaptor for the parallel package | 20 |
| effects | John Fox and Jangman Hong | Effect displays for linear, generalized linear, multinomial-logit, and proportional-odds logit models | 8, 9 |
| FactoMineR | Francois Husson, Julie Josse, Sebastien Le, and Jeremy Mazet | Multivariate exploratory data analysis and data mining with R | 14 |
| FAiR | Ben Goodrich | Factor analysis using a genetic algorithm | 14 |
| fCalendar | Diethelm Wuertz and Yohan Chalabi | Functions for chronological and calendrical objects | 4 |
| flexclust | Friedrich Leisch and Evgenia Dimitriadou | Flexible cluster algorithms | 16 |
| forecast | Rob J. Hyndman with contributions from George Athanasopoulos, Slava Razbash, Drew Schmidt, Zhenyu Zhou, Yousaf Khan, Christoph Bergmeir, and Earo Wang | Methods and tools for displaying and analyzing univariate time series forecasts, including exponential smoothing via state space models and automatic ARIMA modeling | 15 |
| foreach | Revolution Analytics, Steve Weston | foreach looping construct for R | 20 |
| foreign | R Core members Saikat DebRoy, Roger Bivand, and others | Reads data stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, and others | 2 |
| gclus | Catherine Hurley | Clustering graphics | 1, 11 |
| ggplot2 | Hadley Wickham | An implementation of the Grammar of Graphics | 19, 20 |
| glmPerm | Wiebke Werft and Douglas M. Potter | Permutation test for inference in generalized linear models | 12 |
| gmodels | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Thomas Lumley, and Randall C. Johnson. Contributions from Randall C. Johnson are copyright (2005) SAIC-Frederick, Inc. | Various R programming tools for model fitting | 7 |
| gplots | Gregory R. Warnes. Includes R source code and/or documentation contributed by Ben Bolker, Lodewijk Bonebakker, Robert Gentleman, Wolfgang Huber, Andy Liaw, Thomas Lumley, Martin Maechler, Arni Magnusson, Steffen Moeller, Marc Schwartz, and Bill Venables. | Various R programming tools for plotting data | 6, 9 |
| grid | Paul Murrell | A rewrite of the graphics layout capabilities, plus some support for interaction | 19 |
| gridExtra | Baptiste Auguie | Functions for grid graphics | 19 |
| gvlma | Edsel A. Pena and Elizabeth H. Slate | Global validation of linear models assumptions | 8 |
| rhdf5 | Bernd Fischer and Gregoire Pau | Interface to the NCSA HDF5 library | 2 |
| roxygen2 | Hadley Wickham | A Doxygen-like in-source documentation system | 21 |
| hexbin | Dan Carr, ported by Nicholas Lewin-Koh and Martin Maechler | Hexagonal binning routines | 11 |
| HH | Richard M. Heiberger | Support software for Statistical Analysis and Data Display by Heiberger and Holland (Springer, 2004) | 9 |
| kernlab | Alexandros Karatzoglou, Alex Smola, and Kurt Hornik | Kernel-based machine learning lab | 17 |
| knitr | Yihui Xie | A general-purpose package for dynamic report generation in R | 22 |
| Hmisc | Frank E. Harrell Jr., with contributions from many other users | Harrell miscellaneous functions for data analysis, high-level graphics, utility operations, and more | 2, 3, 7 |
| kmi | Arthur Allignol | Kaplan-Meier multiple imputation for the analysis of cumulative incidence functions in the competing risks setting | 18 |
| lattice | Deepayan Sarkar | Lattice graphics | 19 |
| lavaan | Yves Rosseel | Functions for latent variable models, including confirmatory factor analysis, structural equation modeling, and latent growth-curve models | 14 |
| lcda | Michael Buecker | Latent class-discriminant analysis | 14 |
| leaps | Thomas Lumley, using Fortran code by Alan Miller | Regression subset selection, including exhaustive search | 8 |
| lmPerm | Bob Wheeler | Permutation tests for linear models | 12 |
| logregperm | Douglas M. Potter | Permutation test for inference in logistic regression | 12 |
| longitudinalData | Christophe Genolini | Tools for longitudinal data | 18 |
| lsa | Fridolin Wild | Latent semantic analysis | 14 |
| ltm | Dimitris Rizopoulos | Latent trait models under item response theory | 14 |
| lubridate | Garrett Grolemund and Hadley Wickham | Functions to identify and parse date-time data, extract and modify components of a date-time, perform accurate math on date-times, and handle time zones and Daylight Savings Time | 4 |
| MASS | S original by Venables and Ripley. R port by Brian Ripley, following earlier work by Kurt Hornik and Albrecht Gebhardt. | Functions and datasets to support Venables’ and Ripley’s Modern Applied Statistics with S, 4th edition (Springer, 2003) | 4, 5, 7, 8, 9, 12 |
| mlogit | Yves Croissant | Estimation of the multinomial logit model | 13 |
| multcomp | Torsten Hothorn, Frank Bretz, Peter Westfall, Richard M. Heiberger, and Andre Schuetzenmeister | Simultaneous tests and confidence intervals for general linear hypotheses in parametric models, including linear, generalized linear, linear mixed effects, and survival models | 9, 12 |
| mvnmle | Kevin Gross, with help from Douglas Bates | ML estimation for multivariate normal data with missing values | 18 |
| mvoutlier | Moritz Gschwandtner and Peter Filzmoser | Multivariate outlier detection based on robust methods | 9 |
| NbClust | Malika Charrad, Nadia Ghazzali, Veronique Boiteau, and Azam Niknafs | An examination of indices for determining the number of clusters | 16 |
| ncdf, ncdf4 | David Pierce | Interface to Unidata netCDF data files | 2 |
| nFactors | Gilles Raiche | Parallel analysis and non-graphical solutions to the Cattell scree test | 14 |
| OpenMx | Steven Boker, Michael Neale, Hermine Maes, Michael Wilde, Michael Spiegel, Timothy R. Brick, Jeffrey Spies, Ryne Estabrook, Sarah Kenny, Timothy Bates, Paras Mehta, and John Fox | Advanced structural equation modeling | 14 |
| odfWeave | Max Kuhn, with contributions from Steve Weston, Nathan Coulter, Patrick Lenon, Zekai Otles, and the R Core Team | Sweave processing of Open Document Format (ODF) files | 22 |
| pastecs | Frederic Ibanez, Philippe Grosjean, and Michele Etienne | Package for the analysis of space-time ecological series | 7 |
| party | Torsten Hothorn, Kurt Hornik, Carolin Strobl, and Achim Zeileis | A laboratory for recursive partitioning | 17 |
| poLCA | Drew Linzer and Jeffrey Lewis | Polytomous variable latent-class analysis | 14 |
| psych | William Revelle | Procedures for psychological, psychometric, and personality research | 7, 14 |
| pwr | Stephane Champely | Basic functions for power analysis | 10 |
| qcc | Luca Scrucca | Quality-control charts | 13 |
| randomLCA | Ken Beath | Random effects latent-class analysis | 14 |
| randomForest | Fortran original by Leo Breiman and Adele Cutler, R port by Andy Liaw and Matthew Wiener | Breiman and Cutler's random forests for classification and regression | 17 |
| R2wd | Christian Ritter | Writes MS-Word documents from R | 22 |
| rattle | Graham Williams, Mark Vere Culp, Ed Cox, Anthony Nolan, Denis White, Daniele Medri, Akbar Waljee (OOB AUC for Random Forest), and Brian Ripley (original author of print.summary.nnet) | Graphical user interface for data mining in R | 16, 17 |
| Rcmdr | John Fox, with contributions from Liviu Andronic, Michael Ash, Theophilius Boye, Stefano Calza, Andy Chang, Philippe Grosjean, Richard Heiberger, G. Jay Kerns, Renaud Lancelot, Matthieu Lesnoff, Uwe Ligges, Samir Messad, Martin Maechler, Robert Muenchen, Duncan Murdoch, Erich Neuwirth, Dan Putler, Brian Ripley, Miroslav Ristic, and Peter Wolf | R Commander, a platform-independent, basic-statistics graphical user interface for R, based on the tcltk package | Appendix A |
| reshape2 | Hadley Wickham | Flexibly reshape data | 4, 5, 7, 20 |
| rgl | Daniel Adler and Duncan Murdoch | 3D visualization device system (OpenGL) | 11 |
| RJDBC | Simon Urbanek | Provides access to databases through the JDBC interface | 2 |
| rms | Frank E. Harrell, Jr. | Regression modeling strategies: about 225 functions that assist with and streamline regression modeling, testing, estimations, validation, graphics, prediction, and typesetting | 13 |
| robust | Jiahui Wang, Ruben Zamar, Alfio Marazzi, Victor Yohai, Matias Salibian-Barrera, Ricardo Maronna, Eric Zivot, David Rocke, Doug Martin, Martin Maechler, and Kjell Konis | A package of robust methods | 13 |
| RODBC | Brian Ripley and Michael Lapsley | ODBC database access | 2 |
| rpart | Terry Therneau, Beth Atkinson, and Brian Ripley (author of the initial R port) | Recursive partitioning and regression trees | 17 |
| ROracle | David A. James and Jake Luciani | Oracle database interface for R | 2 |
| rrcov | Valentin Todorov | Robust location and scatter estimation, and robust multivariate analysis with a high breakdown point | 9 |
| sampling | Yves Tillé and Alina Matei | Functions for drawing and calibrating samples | 4 |
| scatterplot3d | Uwe Ligges | Plots a three-dimensional (3D) point cloud | 11 |
| sem | John Fox, with contributions from Adam Kramer and Michael Friendly | Structural equation models | 14 |
| SeqKnn | Ki-Yeol Kim and Gwan-Su Yi, CSBio lab, Information and Communications University | Sequential KNN imputation method | 18 |
| sm | Adrian Bowman and Adelchi Azzalini. Ported to R by B. D. Ripley up to version 2.0, version 2.1 by Adrian Bowman and Adelchi Azzalini, version 2.2 by Adrian Bowman. | Smoothing methods for nonparametric regression and density estimation | 6, 9 |
| vcd | David Meyer, Achim Zeileis, and Kurt Hornik | Functions for visualizing categorical data | 1, 6, 7, 11, 12 |
| vegan | Jari Oksanen, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, R. B. O’Hara, Gavin L. Simpson, Peter Solymos, M. Henry H. Stevens, and Helene Wagner | Ordination methods, diversity analysis, and other functions for community and vegetation ecologists | 9 |
| VIM | Matthias Templ, Andreas Alfons, and Alexander Kowarik | Visualization and imputation of missing values | 18 |
| xlsx | Adrian A. Dragulescu | Reads, writes, and formats Excel 2007 (.xlsx) files | 2 |
| XML | Duncan Temple Lang | Tools for parsing and generating XML in R and S-Plus | 2 |
