
CHAPTER 11

The state of the art

Introduction

Reliable, statistically sound methods are now available to estimate population size, the survival rate and fecundity of individuals, and the growth rate of populations. I reviewed these methods in Chapters 3, 4 and 5. Given a particular ecological situation, it is possible to make clear recommendations as to which of these methods is the most appropriate to use. Some of these methods, those for mark–recapture data in particular, require quite sophisticated and elaborate computer programs for their analysis. These programs are readily available on the World Wide Web. There is no doubt that the current rapid development of methods better able to handle the complexities of real data will continue. It will always be worth checking the major link sites for statistical analysis of wildlife data for the most recent developments.

The critical issue in using parameters estimated for a particular population at a particular time in a model is to ensure that they are the most appropriate parameters for the specific question the model is intended to address. For example, survival rates measured in a wild population of a particular species might well be a good description of the survival rates in that population over the time period in question. They would probably be entirely inappropriate for estimating the maximum potential rate of increase of that species, even in a similar environment, as the measured survival rates are likely to include density-dependent components.

Most of the second part of the book discusses the estimation of parameters to describe interactions. Chapter 6 describes the estimation of parameters concerned with density dependence, or interactions within species, and Chapters 8, 9 and 10 deal with interactions between species. In most cases, estimating interaction parameters is a matter of applying one of the basic parameter estimation techniques described in the first part of the book to one species, at differing values of some other variable, such as population size of the same species (for density dependence) or another species (for interspecific interactions). This process is not straightforward, for a number of reasons that are discussed throughout the book, and are reiterated below. In general, it is not possible to make prescriptive recommendations about the best methods for estimating particular interaction parameters. The principal problems are in experimental design, and the constraints on the design will depend on the use to which the parameters are to be put, the ecology of the organisms in question, and the resources available to the investigator. I have therefore taken a more descriptive approach in the second part of the book, critically reviewing methods that have been used in practice, and with which there have been varying degrees of success. Where possible, I have made suggestions and recommendations, but readers will have to develop their own experimental methods, for their own situation, in the light of the cases I have discussed.

Observation and process error

A recurring theme, particularly in the latter part of the book, is that most ecological data contain errors, or uncertainties, of two types. To reiterate, process error is uncertainty that occurs in the actual values of variables in ecological systems. For example, consider the type II functional response introduced in Table 9.1,

f(N) = αN/(1 + αN/β).    (11.1)

This equation describes the number of prey eaten per unit of time, per predator, as a function of prey density N. At low densities the prey consumption rate increases linearly with prey density, with a slope α, but at high prey densities the consumption rate tends to β per unit of time. If the actual number of prey taken by a particular individual predator per unit of time was recorded at a variety of prey densities, the points would be most unlikely to fall precisely along the curve described by eqn (11.1), even if the equation was a good description of the functional response in that situation. There would be random variation in the number of prey caught at any prey density, the parameters themselves might vary between sampling times, or the functional response might vary somewhat from eqn (11.1). All these sources of variation contribute to process error: the number of prey actually eaten per unit of time will vary from that predicted by the equation. There may also be observation error: the recorded number of prey eaten per unit of time may be only an estimate, with uncertainty, of the actual number of prey consumed. There may also be observation error in the independent (or predictor) variable, in this case, the population size of the prey N.
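
To make the distinction concrete, the following sketch simulates prey consumption under eqn (11.1) with both kinds of error. The parameter values, the Poisson form of the process error and the miscounting form of the observation error are assumptions made purely for illustration; they are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameter values (assumed): attack-rate slope alpha and
# maximum consumption rate beta.
alpha, beta = 0.5, 10.0

def functional_response(N):
    """Type II functional response of eqn (11.1): prey eaten per predator per unit time."""
    return alpha * N / (1.0 + alpha * N / beta)

prey_density = np.arange(2, 101, 2)

# Process error: the number actually eaten varies around the curve,
# here modelled (for illustration only) as a Poisson count.
actually_eaten = rng.poisson(functional_response(prey_density))

# Observation error: the recorded count is an imperfect estimate of the
# number actually eaten, here with a few prey missed or double-counted.
recorded = actually_eaten + rng.integers(-2, 3, size=actually_eaten.size)
recorded = np.clip(recorded, 0, None)

for N, a, r in zip(prey_density[:5], actually_eaten[:5], recorded[:5]):
    print(f"N = {N:3d}  curve = {functional_response(N):5.2f}  eaten = {a}  recorded = {r}")
```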

As I discuss in Chapters 5, 6, 8 and 9, the distinction between process and observation error is very important when dealing with time series data. The estimation procedures required where process errors predominate are quite different from those required where observation errors predominate. Process errors propagate in a time series: the variables themselves are affected by the error, so that process error at one time will influence the rest of the series. Observation errors do not propagate: the error is in the recorded value, and has no impact on the actual value of the time series, and thus no influence on the remainder of the series. In general terms, methods designed to handle process errors predict changes in the variables over one time step, whereas those for observation error aim to predict the actual values of the variables at each time they are recorded. More detail can be found in the relevant chapters.
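
The difference in how the two kinds of error propagate can be seen in a small simulation. The discrete logistic model and all numerical values below are assumptions chosen only to illustrate the point: the process-error series drifts away from the deterministic trajectory because each year's error feeds into the next, whereas the observation-error series merely scatters around it.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed logistic model and parameter values, for illustration only.
r, K, n0, T = 0.4, 100.0, 10.0, 30

def step(n):
    """One deterministic time step of discrete logistic growth."""
    return n + r * n * (1.0 - n / K)

# Process error: the noise enters the dynamics, so each year's error
# carries forward into all later years.
n_proc = np.empty(T); n_proc[0] = n0
for t in range(1, T):
    n_proc[t] = max(step(n_proc[t - 1]) + rng.normal(0, 5), 1.0)

# Observation error: the underlying trajectory is deterministic;
# noise is added only to what is recorded, and does not carry forward.
n_true = np.empty(T); n_true[0] = n0
for t in range(1, T):
    n_true[t] = step(n_true[t - 1])
n_obs = n_true + rng.normal(0, 5, size=T)

print("process-error series:    ", np.round(n_proc[-5:], 1))
print("observation-error series:", np.round(n_obs[-5:], 1))
```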

Unfortunately, almost all ecological data will contain both process and observation error. If the probability distribution of the observation errors is known, or their variance can be estimated separately, it is technically possible to fit a model with both observation and process errors. The papers by Carpenter et al. (1994), discussed in Chapter 9, and Pascual and Kareiva (1996), discussed in Chapter 8, show how this might be done, although the details are beyond the scope of this book. Replicated observations of the same time series at each point where the system is measured can provide an idea of the size of the observation error, independent of the process error. This is always useful information. Failing this, it is worth comparing the estimates from methods assuming only process error with estimates from methods assuming only observation error. As a crude generalization, for most ecological systems, and most estimation methods, process error is likely to be more important than observation error. Development of easy-to-use methods for the analysis of time series containing both process and observation error is another area in which there should be progress over the next few years.
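
As a sketch of the replication idea, the hypothetical data below are three replicate counts at each of five sampling occasions; the pooled within-occasion variance estimates the observation-error variance without being inflated by process error.

```python
import numpy as np

# Hypothetical data: three replicate counts of the same population at each
# of five sampling occasions. Variation among replicates within an occasion
# reflects observation error only; variation among occasions mixes
# observation and process error.
counts = np.array([
    [52, 47, 50],
    [61, 66, 58],
    [70, 64, 69],
    [55, 60, 57],
    [48, 52, 45],
], dtype=float)

# Pooled within-occasion variance estimates the observation-error variance.
obs_error_var = counts.var(axis=1, ddof=1).mean()
print("estimated observation-error variance:", round(obs_error_var, 2))
print("estimated observation-error SD:      ", round(obs_error_var ** 0.5, 2))
```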

Observation error is also important when it occurs in predictor variables. It is rare that the value of any ecological variable, including population size or density, can be measured without considerable error. I discussed the problem, but not possible solutions, in Chapters 6, 8, 9 and 10. There is a large statistical literature on the subject, where the issue is usually called ‘measurement error’ or ‘errors-in-variables’. Fuller (1987) summarizes the literature for linear models, and Carroll et al. (1995) extend the discussion to nonlinear models.

In most cases, if the objective is to predict the value of a variable Y, given a set of predictor variables X1, . . . , Xk, observation error does not change the method of analysis or its interpretation (Carroll et al., 1995). However, this book is concerned with parameter estimation, in which case there is a real problem. The simplest case is a univariate linear regression, where observation errors in the predictor are unbiased and are uncorrelated with the error in the response. Observation error then causes ‘bias towards the null’: whatever the true relationship is between the predictor and response, it tends to be lessened and obscured by measurement error. In more complicated cases, the true relationship may be distorted in other ways. For example, in multiple regression problems where the predictor variables are correlated, observation error in one or more of the predictor variables may result in overestimation of the effect of the predictor, or may even cause the sign of its effect to be estimated incorrectly. Nonlinear models with observation errors behave in a broadly similar way, but the effects of observation error are more complex still (Carroll et al., 1995). Some generalizations can be made about the likely effects of observation error for some classes of models. The best advice is to follow the suggestion in Hilborn and Mangel (1997), and to simulate data with properties like those of your system, thus exploring the consequences of observation error empirically.
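
The attenuation effect is easy to demonstrate by simulation. The slope, error variances and sample size below are arbitrary assumptions; the point is only that the naive regression of the response on the error-prone predictor W recovers roughly half the true slope, in line with the attenuation factor var(X)/(var(X) + var(U)).

```python
import numpy as np

rng = np.random.default_rng(3)

# A minimal simulation (assumed values) of 'bias towards the null': the true
# slope relating Y to X is 1.0, but X is recorded with additive measurement
# error U, and regressing Y on W = X + U gives an attenuated slope.
n = 5000
true_slope = 1.0
x = rng.normal(0.0, 1.0, n)              # true predictor, variance 1
u = rng.normal(0.0, 1.0, n)              # measurement error, variance 1
w = x + u                                # what is actually recorded
y = true_slope * x + rng.normal(0.0, 0.5, n)

naive_slope = np.polyfit(w, y, 1)[0]
# Expected attenuation factor: var(X) / (var(X) + var(U)) = 0.5
print("true slope:      ", true_slope)
print("naive slope on W:", round(naive_slope, 3))
```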

There are several possible ways to correct for observation error bias. These are discussed in Fuller (1987) and Carroll et al. (1995). One approach, which can be applied to a wide range of generalized linear models, including linear and logistic regression, is regression calibration (Carroll et al., 1995). The simplest version is applicable in the following situation:

(i) There is one predictor variable X measured with error, such that what is recorded is the random variable W = X + U, where U is the measurement error. U has a variance σ²u, which is estimated by σ̂²u.

(ii) There may be several covariates (other predictor variables) Z, but these are measured without error.

(iii) The objective is to predict a response Y, via a linear predictor of the form β0 + βxX + βzᵀZ. Here, X is a scalar variable, Z is a column vector of covariates, β0 and βx are scalar parameters, and βzᵀ is a row vector of regression coefficients. This linear predictor may generate the predicted response via one of a range of link functions (see Chapter 2). The problem is to estimate βx.

(iv) X, Z and W are normally distributed.

The regression calibration estimator, β̂x, can then be obtained by the following steps:

1 Obtain the ‘naïve’ estimator β̂*x, by fitting the model, ignoring the measurement error.

2 Regress W, the predictor with measurement error, on the other covariates Z. From this regression, obtain the mean square error. Call this σ̂²w|z. If there are no covariates Z, this will just be the sample variance of W.

3 Calculate the corrected parameter estimate

β̂x = β̂*x σ̂²w|z /(σ̂²w|z − σ̂²u).

This correction requires an estimate σ̂²u of the observation error variance. In some cases, it may readily be available. For example, if the problem is to estimate the effect of population size on a parameter, all the methods described in Chapter 3 will return an estimate of the standard error of the estimated population size. This can simply be squared to give an estimate of the error variance. However, a possible problem in this situation is that it is quite likely that the error will not be additive with constant variance, and may well be non-normal. Carroll et al. (1995) discuss some possible solutions, but these are beyond the scope of this book.
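
A minimal sketch of these steps for simple linear regression, with no additional covariates Z and the observation-error variance assumed known, might look like the following; all numerical values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed true parameters and a known observation-error variance,
# used only to illustrate steps 1-3 above.
n = 2000
beta0, beta_x = 2.0, 1.5
sigma2_u = 0.8                                   # observation-error variance

x = rng.normal(0.0, 1.0, n)                      # true predictor X
w = x + rng.normal(0.0, np.sqrt(sigma2_u), n)    # recorded predictor W = X + U
y = beta0 + beta_x * x + rng.normal(0.0, 0.5, n)

# Step 1: 'naive' estimate, ignoring the measurement error.
naive_bx = np.polyfit(w, y, 1)[0]

# Step 2: with no covariates Z, the mean square error of W regressed on Z
# reduces to the sample variance of W.
var_w = w.var(ddof=1)

# Step 3: corrected estimate, rescaling the naive slope by the ratio of the
# variance of W to the estimated variance of the true predictor X.
corrected_bx = naive_bx * var_w / (var_w - sigma2_u)

print("true beta_x:   ", beta_x)
print("naive estimate:", round(naive_bx, 3))
print("corrected:     ", round(corrected_bx, 3))
```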


Spatial structure and scale

There is no doubt that models dealing explicitly with spatial structure and dynamics will be of increasing importance, both to theoretical ecology and to practical problems of ecological management (Levin et al., 1997). I discuss methods for estimation of spatial parameters in Chapter 7, but this is again an area in which rapid developments can be expected over the next few years.

Even if spatial structure is not built explicitly into models, it introduces substantial problems into estimation of parameters, particularly the interaction parameters discussed in Chapters 9 and 10. The problem is straightforward: interactions are more likely to occur between close neighbours than between individuals that are far apart. If spatial separation is not included explicitly in a model, an estimate of the mean value of an interaction parameter is a function of the probability distribution of separation distances (or contact frequencies) of the interacting entities. This causes major problems if the spatial scale of the system in which the parameters are estimated differs from the spatial scale at which the parameter estimates are to be used. Further, even at the same spatial scale, the way in which contact frequencies scale with population size is not easy to determine (see the discussion of ‘true’ and ‘pseudo-’ mass action in Chapter 10).

A solution to this problem is to build interaction models in which the spatial relationships between the interacting entities are dealt with explicitly. I briefly discuss some host–pathogen models of this type in Chapter 10. The obvious difficulty is that the model itself becomes considerably more complex.

Levels of abstraction: the mismatch between data and models

The problem I have just discussed is an example of a much wider issue in model parameterization. The level of abstraction at which many models are built is different from the level of abstraction at which parameters can easily be estimated. A simple example occurs in Chapter 4. Models that do not include age structure use a single parameter to describe the average birth rate of individuals in a population. However, in no species can individuals reproduce immediately after birth. The average fecundity in a population is a function of at least four separate parameters: the average time from birth until individuals become reproductive; the proportion of individuals that survive to become reproductive; the proportion of individuals that reproduce once that age is reached; and the number of offspring produced per reproductive individual. These are the quantities that potentially can be estimated from an actual population, but they must be combined into a single parameter for use in a model without age structure. To do so, it is necessary to make simplifying assumptions (for example, that the population has reached a stable age distribution). These assumptions inevitably involve loss of information. It is also necessary to use one model (in this case, an age-structured model) to generate the summary parameter from the data. You need to ask whether it is better to use an age-structured model throughout the modelling process, rather than to use an age-structured model to generate a summary parameter for use in a non-age-structured model. The answer to this question depends on the intended use of the model. It is far easier to examine the qualitative behaviour of a non-age-structured model algebraically than it is to do so for an age-structured model. However, the age-structured model, with parameters directly related to the data, will probably better predict the quantitative behaviour of the system. If you only intend to analyse the model numerically, modern computing power is such that there is little point in using an oversimplified model structure.
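
As a hedged illustration of collapsing these quantities into a single parameter, the sketch below builds a small Leslie matrix from invented age-specific values, finds the stable age distribution, and reports the implied average per-capita birth rate. Every number in it is an assumption; a real application would use estimates from the population in question.

```python
import numpy as np

# Invented age-specific quantities (illustration only).
age_at_maturity = 2                     # years from birth until reproduction begins
survival = [0.5, 0.7, 0.8, 0.8]         # annual survival of age classes 0-3
prop_breeding = 0.9                     # proportion of mature individuals breeding each year
offspring = 2.0                         # offspring per breeding individual per year

n_classes = len(survival)
fecundity = [0.0 if a < age_at_maturity else prop_breeding * offspring
             for a in range(n_classes)]

# Leslie matrix: top row is fecundity, sub-diagonal is survival.
L = np.zeros((n_classes, n_classes))
L[0, :] = fecundity
for a in range(n_classes - 1):
    L[a + 1, a] = survival[a]

# Stable age distribution = right eigenvector of the dominant eigenvalue.
eigvals, eigvecs = np.linalg.eig(L)
dominant = np.argmax(eigvals.real)
stable = np.abs(eigvecs[:, dominant].real)
stable /= stable.sum()

# Summary parameter: average births per individual per year at the
# stable age distribution, for use in a non-age-structured model.
avg_birth_rate = float(np.dot(fecundity, stable))
print("population growth rate lambda:", round(eigvals[dominant].real, 3))
print("average per-capita birth rate:", round(avg_birth_rate, 3))
```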

The role of brute-force computing

The rate at which computer processing speed and memory capacity have increased over the last few years will be well known to every reader. Modelling approaches that were not feasible 10 years ago, because of the computer resources required, can now be used on a laptop computer. There is little doubt that methods that are impractical now will become commonplace in the near future. We can expect numerically intensive methods to be increasingly practical, both for parameter estimation (see bootstraps and jackknives in Chapter 2), and for analysing models themselves (for example, individual-based models, see Chapter 2; and spatially explicit models, see Chapter 7). In both cases, it will be increasingly unnecessary to make simplifying assumptions for the sake of mathematical tractability alone. Paradoxically, in many cases this means that the models can be conceptually much simpler. Rather than needing to assume a complex functional form for the probability distribution of some property in a population, one can model individuals or sampled units directly, and allow the computing power to work out empirically what the probability distribution should be.
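
As a small example of letting computation stand in for distributional assumptions, the bootstrap sketch below resamples hypothetical individual fates directly to obtain an interval for a survival rate, without assuming any particular error distribution.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical individual fates (1 = survived, 0 = died), invented for
# illustration. Resampling the observed units directly lets the computer
# work out the sampling distribution of the estimate empirically.
fates = np.array([1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1])

boot = np.array([rng.choice(fates, size=fates.size, replace=True).mean()
                 for _ in range(10_000)])

print("observed survival rate:", round(fates.mean(), 3))
print("bootstrap 95% interval:", np.round(np.percentile(boot, [2.5, 97.5]), 3))
```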

The challenge for the future is to use increased computer power wisely. As I stated at the outset of the book, the skill in model building is to produce a model that is as simple as possible, but which still performs its task. The ability to analyse highly complex models may tempt ecologists into building overelaborate models that require a plethora of parameter estimates, and from which it is impossible to obtain general insights. For example, Ruckelshaus et al. (1997) simulated spatially explicit models, and concluded that they were ‘staggeringly sensitive to the details of dispersal behaviour’. Alternatively, we should be able to develop models that are powerful, elegant and general, but are freed from the constraints of computational tractability. The best strategy for progress in ecological model building is to model a system at several different levels of abstraction and aggregation, and then to compare the results. In this way, you can determine the appropriate level of complexity for solution of a given problem.