
Appendix C. Estimation and Solution Options

Optimization Algorithms

Given the importance of the proper setting of EViews estimation options, it may prove useful to review briefly various basic optimization algorithms used in nonlinear estimation. Recall that the problem faced in nonlinear estimation is to find the values of parameters θ that optimize (maximize or minimize) an objective function F(θ).

Iterative optimization algorithms work by taking an initial set of values for the parameters, say θ(0), then performing calculations based on these values to obtain a better set of parameter values, θ(1). This process is repeated for θ(2), θ(3), and so on until the objective function F no longer improves between iterations.

There are three main parts to the optimization process: (1) obtaining the initial parameter values, (2) updating the candidate parameter vector θ at each iteration, and (3) determining when we have reached the optimum.
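As a rough illustration only (not EViews code), the Python sketch below shows how these three parts fit together; the names F, update_step, and the tolerance are purely illustrative:

import numpy as np

def iterate(F, update_step, theta0, tol=1e-8, max_it=500):
    # (1) start from an initial set of parameter values
    theta = np.asarray(theta0, dtype=float)
    f_old = F(theta)
    for _ in range(max_it):
        # (2) update the candidate parameter vector
        theta = update_step(F, theta)
        f_new = F(theta)
        # (3) stop when the objective no longer improves between iterations
        if abs(f_new - f_old) < tol:
            break
        f_old = f_new
    return theta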

If the objective function is globally concave so that there is a single maximum, any algorithm which improves the parameter vector at each iteration will eventually find this maximum (assuming that the size of the steps taken does not become negligible). If the objective function is not globally concave, different algorithms may find different local maxima, but all iterative algorithms suffer from the same problem of being unable to distinguish a local maximum from the global maximum.

The main thing that distinguishes different algorithms is how quickly they find the maximum. Unfortunately, there are no hard and fast rules. For some problems, one method may be faster, for other problems it may not. EViews provides different algorithms, and will often let you choose which method you would like to use.

The following sections outline these methods. The algorithms used in EViews may be broadly classified into three types: second derivative methods, first derivative methods, and derivative free methods. EViews’ second derivative methods evaluate current parameter values and the first and second derivatives of the objective function for every observation. First derivative methods use only the first derivatives of the objective function during the iteration process. As the name suggests, derivative free methods do not compute derivatives.

Second Derivative Methods

For binary, ordered, censored, and count models, EViews can estimate the model using Newton-Raphson or quadratic hill-climbing.


Newton-Raphson

Candidate values for the parameters θ(1) may be obtained using the method of Newton-Raphson by linearizing the first order conditions ∂F⁄∂θ at the current parameter values, θ(i):

g(i) + H(i)(θ(i+1) − θ(i)) = 0
θ(i+1) = θ(i) − H(i)⁻¹g(i)        (C.2)

where g is the gradient vector ∂F⁄∂θ, and H is the Hessian matrix ∂²F⁄∂θ².

If the function is quadratic, Newton-Raphson will find the maximum in a single iteration. If the function is not quadratic, the success of the algorithm will depend on how well a local quadratic approximation captures the shape of the function.
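As a minimal sketch of a single update (C.2) in Python, assuming user-supplied callables grad and hess that return ∂F⁄∂θ and ∂²F⁄∂θ² (illustrative names, not EViews internals):

import numpy as np

def newton_raphson_step(theta, grad, hess):
    # theta(i+1) = theta(i) - H(i)^(-1) g(i); solve H d = g rather than forming the inverse
    g = grad(theta)
    H = hess(theta)
    return np.asarray(theta, dtype=float) - np.linalg.solve(H, g)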

Quadratic hill-climbing (Goldfeld-Quandt)

This method, which is a straightforward variation on Newton-Raphson, is sometimes attributed to Goldfeld and Quandt. Quadratic hill-climbing modifies the Newton-Raphson algorithm by adding a correction matrix (or ridge factor) to the Hessian. The quadratic hill-climbing updating algorithm is given by:

θ(i+1) = θ(i) − H̃(i)⁻¹g(i)    where    −H̃(i) = −H(i) + αI        (C.3)

where I is the identity matrix and α is a positive number that is chosen by the algorithm.

The effect of this modification is to push the parameter estimates in the direction of the gradient vector. The idea is that when we are far from the maximum, the local quadratic approximation to the function may be a poor guide to its overall shape, so we may be better off simply following the gradient. The correction may provide better performance at locations far from the optimum, and allows for computation of the direction vector in cases where the Hessian is near singular.
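A sketch of the modified update (C.3), with the ridge factor α passed in as an ordinary argument (in EViews the algorithm chooses α itself):

import numpy as np

def hill_climbing_step(theta, grad, hess, alpha):
    # Solve (-H + alpha*I) d = g; for large alpha the step is roughly g/alpha,
    # i.e. a small move in the direction of the gradient.
    theta = np.asarray(theta, dtype=float)
    g = grad(theta)
    H = hess(theta)
    d = np.linalg.solve(-H + alpha * np.eye(theta.size), g)
    return theta + d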

For models which may be estimated using second derivative methods, EViews uses quadratic hill-climbing as its default method. You may elect to use traditional Newton-Raphson, or the first derivative methods described below, by selecting the desired algorithm in the Options menu.

Note that asymptotic standard errors are always computed from the unmodified Hessian once convergence is achieved.

First Derivative Methods

Second derivative methods may be computationally costly since we need to evaluate the k(k + 1)⁄2 elements of the second derivative matrix at every iteration. Moreover, the second derivatives may be difficult to compute accurately. An alternative is to employ methods which require only the first derivatives of the objective function at the parameter values.

For general nonlinear models (nonlinear least squares, ARCH and GARCH, nonlinear system estimators, GMM, State Space), EViews provides two first derivative methods: Gauss-Newton/BHHH or Marquardt.

Gauss-Newton/BHHH

This algorithm follows Newton-Raphson, but replaces the negative of the Hessian by an approximation formed from the sum of the outer product of the gradient vectors for each observation’s contribution to the objective function. For least squares and log likelihood functions, this approximation is asymptotically equivalent to the actual Hessian when evaluated at the parameter values which maximize the function. When evaluated away from the maximum, this approximation may be quite poor.

The algorithm is referred to as Gauss-Newton for general nonlinear least squares problems, and often attributed to Berndt, Hall, Hall, and Hausman (BHHH) for maximum likelihood problems.

The advantages of approximating the negative Hessian by the outer product of the gradient are that (1) we need to evaluate only the first derivatives, and (2) the outer product is necessarily positive semi-definite. The disadvantage is that, away from the maximum, this approximation may provide a poor guide to the overall shape of the function, so that more iterations may be needed for convergence.
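A sketch of one Gauss-Newton/BHHH update, assuming a user-supplied function scores(theta) that returns an observations-by-parameters matrix of per-observation gradient contributions (an assumed interface, not part of EViews):

import numpy as np

def bhhh_step(theta, scores):
    S = scores(theta)             # one row of first derivatives per observation
    g = S.sum(axis=0)             # total gradient
    B = S.T @ S                   # outer-product approximation to minus the Hessian
    return np.asarray(theta, dtype=float) + np.linalg.solve(B, g)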

Marquardt

The Marquardt algorithm modifies the Gauss-Newton algorithm in exactly the same manner as quadratic hill-climbing modifies the Newton-Raphson method: by adding a correction matrix (or ridge factor) to the Hessian approximation.

The ridge correction handles numerical problems when the outer product is near singular and may improve the convergence rate. As above, the algorithm pushes the updated parameter values in the direction of the gradient.
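The same sketch with the Marquardt ridge correction added to the outer-product matrix (again, α appears as an explicit argument purely for illustration):

import numpy as np

def marquardt_step(theta, scores, alpha):
    theta = np.asarray(theta, dtype=float)
    S = scores(theta)
    g = S.sum(axis=0)
    B = S.T @ S + alpha * np.eye(theta.size)   # ridge factor guards against a near-singular outer product
    return theta + np.linalg.solve(B, g)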

For models which may be estimated using first derivative methods, EViews uses Marquardt as its default method. You may elect to use traditional Gauss-Newton via the Options menu.

Note that asymptotic standard errors are always computed from the unmodified (Gauss-Newton) Hessian approximation once convergence is achieved.


Choosing the step size

At each iteration, we can search along the given direction for the optimal step size. EViews performs a simple trial-and-error search at each iteration to determine a step size λ that improves the objective function. This procedure is sometimes referred to as squeezing or stretching.

Note that while EViews will make a crude attempt to find a good step, λ is not actually optimized at each iteration since the computation of the direction vector is often more important than the choice of the step size. It is possible, however, that EViews will be unable to find a step size that improves the objective function. In this case, EViews will issue an error message.
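The exact squeezing/stretching rule is not documented here, but a trial-and-error search of this general kind can be sketched as follows (the candidate multipliers are arbitrary choices for illustration):

import numpy as np

def line_search(F, theta, direction, candidates=(1.0, 0.5, 0.25, 2.0, 4.0)):
    # Try a handful of step sizes lambda and return the first trial point that improves F.
    theta = np.asarray(theta, dtype=float)
    f0 = F(theta)
    for lam in candidates:
        trial = theta + lam * np.asarray(direction)
        if F(trial) > f0:         # improvement for a maximization problem
            return trial
    raise RuntimeError("no step size improved the objective function")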

EViews also performs a crude trial-and-error search to determine the scale factor α for Marquardt and quadratic hill-climbing methods.

Derivative free methods

Other optimization routines do not require the computation of derivatives. The grid search is a leading example. Grid search simply computes the objective function on a grid of parameter values and chooses the parameters with the highest values. Grid search is computationally costly, especially for multi-parameter models.
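A sketch of a plain grid search over a Cartesian product of candidate values (the grid used by EViews for exponential smoothing is not reproduced here):

import itertools
import numpy as np

def grid_search(F, grids):
    # grids: one sequence of candidate values per parameter
    best_theta, best_val = None, -np.inf
    for point in itertools.product(*grids):
        theta = np.array(point, dtype=float)
        val = F(theta)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta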

EViews uses (a version of) grid search for the exponential smoothing routine.

Nonlinear Equation Solution Methods

When solving a nonlinear equation system, EViews first analyzes the system to determine if the system can be separated into two or more blocks of equations which can be solved sequentially rather than simultaneously. Technically, this is done by using a graph representation of the equation system where each variable is a vertex and each equation provides a set of edges. A well known algorithm from graph theory is then used to find the strongly connected components of the directed graph.
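A sketch of this block analysis using the third-party networkx package; the edge convention (an edge from each right-hand side endogenous variable to the variable its equation determines) and the dictionary interface are assumptions for illustration, not the EViews implementation:

import networkx as nx

def solution_blocks(equations):
    # equations: {determined_variable: set of endogenous variables appearing on its right-hand side}
    G = nx.DiGraph()
    G.add_nodes_from(equations)
    for lhs, rhs in equations.items():
        G.add_edges_from((v, lhs) for v in rhs if v in equations)
    # Strongly connected components, listed in an order that can be solved block by block.
    C = nx.condensation(G)
    return [sorted(C.nodes[n]["members"]) for n in nx.topological_sort(C)]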

Once the blocks have been determined, each block is solved for in turn. If the block contains no simultaneity, each equation in the block is simply evaluated once to obtain values for each of the variables.

If a block contains simultaneity, the equations in that block are solved by either a Gauss-Seidel or Newton method, depending on how the solver options have been set.

Gauss-Seidel

By default, EViews uses the Gauss-Seidel method when solving systems of nonlinear equations. Suppose the system of equations is given by:


x_1 = f_1(x_1, x_2, …, x_N, z)
x_2 = f_2(x_1, x_2, …, x_N, z)
⋮
x_N = f_N(x_1, x_2, …, x_N, z)        (C.4)

where x are the endogenous variables and z are the exogenous variables.

The problem is to find a fixed point such that x = f(x, z). Gauss-Seidel employs an iterative updating rule of the form:

x(i+1) = f(x(i), z)        (C.5)

to find the solution. At each iteration, EViews solves the equations in the order that they appear in the model. If an endogenous variable that has already been solved for in that iteration appears later in some other equation, EViews uses the value as solved in that iteration. For example, the k-th variable in the i-th iteration is solved by:

x_k(i) = f_k(x_1(i), x_2(i), …, x_{k−1}(i), x_k(i−1), x_{k+1}(i−1), …, x_N(i−1), z)        (C.6)

The performance of the Gauss-Seidel method can be affected by the ordering of the equations. If the Gauss-Seidel method converges slowly or fails to converge, you should try moving the equations with relatively few and unimportant right-hand side endogenous variables so that they appear early in the model.
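A minimal Python sketch of the Gauss-Seidel iteration (C.6), assuming the equations are supplied as a list of functions f[k](x, z) in the order they appear in the model:

import numpy as np

def gauss_seidel(f, x0, z, tol=1e-8, max_it=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_it):
        x_prev = x.copy()
        for k, fk in enumerate(f):
            x[k] = fk(x, z)       # later equations in the same sweep see the values already updated
        if np.max(np.abs(x - x_prev)) < tol:
            return x
    raise RuntimeError("Gauss-Seidel failed to converge")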

Newton's Method

Newton’s method for solving a system of nonlinear equations consists of repeatedly solving a local linear approximation to the system.

Consider the system of equations written in implicit form:

F(x, z) = 0        (C.7)

where F is the set of equations, x is the vector of endogenous variables and z is the vector of exogenous variables.

In Newton's method, we take a linear approximation to the system around some values x* and z*:

F(x, z) = F(x*, z*) + ∂F(x*, z*)⁄∂x · ∆x = 0        (C.8)

and then use this approximation to construct an iterative procedure for updating our current guess for x:

x_{t+1} = x_t − [∂F(x_t, z)⁄∂x]⁻¹ F(x_t, z)        (C.9)

where raising to the power of -1 denotes matrix inversion.

The procedure is repeated until the changes in x_t between iterations are smaller than a specified tolerance.
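A sketch of the full procedure, using a forward-difference Jacobian purely for illustration (EViews' own derivative computation and convergence settings are not reproduced here):

import numpy as np

def newton_solve(F, x0, z, tol=1e-8, max_it=100, h=1e-6):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_it):
        Fx = np.asarray(F(x, z), dtype=float)
        J = np.empty((Fx.size, x.size))
        for j in range(x.size):               # numerical Jacobian dF/dx, built column by column
            xp = x.copy()
            xp[j] += h
            J[:, j] = (np.asarray(F(xp, z), dtype=float) - Fx) / h
        dx = np.linalg.solve(J, -Fx)          # x_{t+1} = x_t - [dF/dx]^(-1) F(x_t, z), as in (C.9)
        x = x + dx
        if np.max(np.abs(dx)) < tol:
            return x
    raise RuntimeError("Newton solver failed to converge")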

Note that in contrast to Gauss-Seidel, the ordering of equations under Newton does not affect the rate of convergence of the algorithm.
