Michael Greenacre

Raul Primicerio

Multivariate Analysis of Ecological Data

First published December 2013

©The authors, 2013

©Fundación BBVA, 2013

Plaza de San Nicolás, 4. 48005 Bilbao
www.fbbva.es
publicaciones@fbbva.es

The supporting website www.multivariatestatistics.org provides the data sets used in the book, R scripts for the computations, a glossary of terms, and additional material.

A digital copy of this book can be downloaded free of charge at www.fbbva.es.

The BBVA Foundation’s decision to publish this book does not imply any responsibility for its content, or for the inclusion therein of any supplementary documents or information facilitated by the authors.

Edition and production: Rubes Editorial

ISBN: 978-84-92937-50-9

Legal deposit no.: BI-1622-2013

Printed in Spain

Printed by Comgrafic on 100% recycled paper, manufactured to the most stringent European environmental standards.

To Zerrin and Maria

CONTENTS

Preface (Michael Greenacre and Raul Primicerio)

ECOLOGICAL DATA AND MULTIVARIATE METHODS

1. Multivariate Data in Environmental Science

2. The Four Corners of Multivariate Analysis

3. Measurement Scales, Transformation and Standardization

MEASURING DISTANCE AND CORRELATION

4. Measures of Distance between Samples: Euclidean

5. Measures of Distance between Samples: Non-Euclidean

6. Measures of Distance and Correlation between Variables

VISUALIZING DISTANCES AND CORRELATIONS

7. Hierarchical Cluster Analysis

8. Ward Clustering and k-means Clustering

9. Multidimensional Scaling

REGRESSION AND PRINCIPAL COMPONENT ANALYSIS

10. Regression Biplots

11. Multidimensional Scaling Biplots

12. Principal Component Analysis

CORRESPONDENCE ANALYSIS

13. Correspondence Analysis

14. Compositional Data and Log-ratio Analysis

15. Canonical Correspondence Analysis

INTERPRETATION, INFERENCE AND MODELLING

16. Variance Partitioning in PCA, LRA, CA and CCA

17. Inference in Multivariate Analysis

18. Statistical Modelling

CASE STUDIES

19. Case Study 1: Temporal Trends and Spatial Patterns across a Large Ecological Data Set

20. Case Study 2: Functional Diversity of Fish in the Barents Sea

APPENDICES

Appendix A: Aspects of Theory

Appendix B: Bibliography and Web Resources

Appendix C: Computational Note

List of Exhibits

Index

About the Authors


PREFACE

The world around us – and especially the biological world – is inherently multidimensional. Biological diversity is the product of the interaction between many species, be they marine, plant or animal life, and of the many limiting factors that characterize the environment in which the species live. The environment itself is a complex mix of natural and man-induced parameters: for example, meteorological parameters such as temperature or rainfall, physical parameters such as soil composition or sea depth, and chemical parameters such as level of carbon dioxide or heavy metal pollution.

The properties and patterns we focus on in ecology and environmental biology consist of these many covarying components. Evolutionary ecology has shown that phenotypic traits, with their functional implications, tend to covary due to correlational selection and trade-offs. Community ecology has uncovered gradients in community composition. Much of this biological variation is organized along axes of environmental heterogeneity, consisting of several correlated physical and chemical characteristics. Spectra of functional traits, ecological and environmental gradients, all imply correlated properties that will respond collectively to natural and human perturbations. Small wonder that scientific inference in these fields must rely on statistical tools that help discern structure in datasets with many variables (i.e., multivariate data). These methods are collectively referred to as multivariate analysis, or multivariate statistics, the topic of this book. Multivariate analysis uses relationships between variables to order the objects of study according to their collective properties, that is, to highlight spectra and gradients, and to classify the objects of study, that is, to group species or ecosystems in distinct classes, each containing entities with similar properties.

Although multivariate analysis is widely applied in ecology and environmental biology, thanks in part to statistical software that makes the variety of methods more accessible, its concepts, potentials and limitations are not always transparent to practitioners. A scattered methodological literature, heterogeneous terminology, and a paucity of introductory texts sufficiently comprehensive to provide a methodological overview and synthesis are partly responsible for this. Another reason is that the formal quantitative training biologists receive is often limited to univariate, parametric statistics (regression and analysis of variance), with some
