
Appendix E. Long-run Covariance Estimation

The long-run (variance) covariance matrix (LRCOV) plays an important role in modern econometric analysis. This matrix is, for example, central to calculation of efficient GMM weighting matrices (Hansen 1982) and heteroskedasticity and autocorrelation consistent (HAC) robust standard errors (Newey and West 1987), and is employed in unit root (Phillips and Perron 1988) and cointegration analysis (Phillips and Hansen 1990, Hansen 1992b).

EViews offers tools for computing the symmetric LRCOV and the one-sided LRCOV using nonparametric kernel (Newey-West 1987, Andrews 1991), parametric VARHAC (Den Haan and Levin 1997), and prewhitened kernel (Andrews and Monahan 1992) methods. In addition, EViews supports the Andrews (1991) and Newey-West (1994) automatic bandwidth selection methods for kernel estimators, and information criteria-based lag length selection methods for VARHAC and prewhitening estimation.

Technical Discussion

Our basic discussion and notation follow the framework of Andrews (1991) and Hansen (1992a).

Consider a sequence of mean-zero random $p$-vectors $\{V_t(\theta)\}$ that may depend on a $K$-vector of parameters $\theta$, and let $V_t \equiv V_t(\theta_0)$, where $\theta_0$ is the true value of $\theta$. We are interested in estimating the LRCOV matrix $\Omega$,

$$\Omega = \sum_{j=-\infty}^{\infty} \Gamma(j) \qquad \text{(E.1)}$$

where

$$\Gamma(j) = \begin{cases} E(V_t V_{t-j}') & j \ge 0 \\ \Gamma(-j)' & j < 0 \end{cases} \qquad \text{(E.2)}$$

is the autocovariance matrix of $V_t$ at lag $j$. When $V_t$ is second-order stationary, $\Omega$ equals $2\pi$ times the spectral density matrix of $V_t$ evaluated at frequency zero (Hansen 1982, Andrews 1991, Hamilton 1994).

Closely related to $\Omega$ are two measures of the one-sided LRCOV matrix:

$$\Lambda_1 = \sum_{j=1}^{\infty} \Gamma(j)$$

$$\Lambda_0 = \sum_{j=0}^{\infty} \Gamma(j) = \Gamma(0) + \Lambda_1 \qquad \text{(E.3)}$$

The matrix $\Lambda_1$, which we term the strict one-sided LRCOV, is the sum of the lag covariances, while $\Lambda_0$ also includes the contemporaneous covariance $\Gamma(0)$. The two-sided LRCOV matrix $\Omega$ is related to the one-sided matrices through $\Omega = \Gamma(0) + \Lambda_1 + \Lambda_1'$ and $\Omega = \Lambda_0 + \Lambda_0' - \Gamma(0)$.

Despite the important role the one-sided LRCOV matrices play in the literature, we will focus our attention on $\Omega$, since results are generally applicable to all three measures; exceptions will be made for specific issues that require additional comment.
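To fix ideas with a concrete example not in the original text, consider a scalar AR(1) process $V_t = \rho V_{t-1} + \varepsilon_t$ with innovation variance $\sigma^2$ and $|\rho| < 1$. Its autocovariances are $\Gamma(j) = \sigma^2 \rho^{|j|} / (1 - \rho^2)$, so that

$$\Lambda_1 = \frac{\sigma^2 \rho}{(1 - \rho^2)(1 - \rho)}, \qquad \Omega = \Gamma(0) + \Lambda_1 + \Lambda_1' = \frac{\sigma^2}{(1 - \rho)^2}$$

illustrating how positive serial correlation ($\rho > 0$) inflates the long-run variance relative to the contemporaneous variance $\Gamma(0)$.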

In the econometric literature, methods for using a consistent estimator $\hat\theta$ and the corresponding $\hat V_t = V_t(\hat\theta)$ to form a consistent estimate of $\Omega$ are often referred to as heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators.

There have been three primary approaches to estimating $\Omega$:

1. The nonparametric kernel approach (Andrews 1991, Newey-West 1987) forms estimates of $\Omega$ by taking a weighted sum of the sample autocovariances of the observed data.

2. The parametric VARHAC approach (Den Haan and Levin 1997) specifies and fits a parametric time series model to the data, then uses the estimated model to obtain the implied autocovariances and corresponding $\Omega$.

3. The prewhitened kernel approach (Andrews and Monahan 1992) is a hybrid method that combines the first two approaches, using a parametric model to obtain residuals that "whiten" the data, and a nonparametric kernel estimator to obtain an estimate of the LRCOV of the whitened data. The estimate of $\Omega$ is obtained by "recoloring" the prewhitened LRCOV to undo the effects of the whitening transformation.

Below, we offer a brief description of each of these approaches, paying particular attention to issues of kernel choice, bandwidth selection, and lag selection.

Nonparametric Kernel

The class of kernel HAC covariance matrix estimators in Andrews (1991) may be written as:

$$\hat\Omega = \frac{T}{T-K} \sum_{j=-(T-1)}^{T-1} k(j / b_T)\, \hat\Gamma(j) \qquad \text{(E.4)}$$

where the sample autocovariances $\hat\Gamma(j)$ are given by

$$\hat\Gamma(j) = \begin{cases} \dfrac{1}{T} \displaystyle\sum_{t=j+1}^{T} \hat V_t \hat V_{t-j}' & j \ge 0 \\ \hat\Gamma(-j)' & j < 0 \end{cases} \qquad \text{(E.5)}$$

$k$ is a symmetric kernel (or lag window) function that, among other conditions, is continuous at the origin and satisfies $|k(x)| \le 1$ for all $x$ with $k(0) = 1$, and $b_T > 0$ is a bandwidth parameter. The leading $T/(T-K)$ term is an optional correction for the degrees-of-freedom associated with the estimation of the $K$ parameters in $\theta$.

The choice of a kernel function and a value for the bandwidth parameter completely characterizes the kernel HAC estimator.
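To make Equation (E.4) concrete, here is a minimal Python sketch (our own illustration, not EViews code; the function names are hypothetical) of the kernel HAC estimator using the Bartlett kernel. Negative-lag autocovariances are obtained by transposition, per Equation (E.5):

```python
import numpy as np

def bartlett_kernel(x):
    """Bartlett kernel: k(x) = 1 - |x| for |x| <= 1, and 0 otherwise."""
    return max(0.0, 1.0 - abs(x))

def kernel_hac(V, b_T, kernel=bartlett_kernel, K=0):
    """Kernel HAC estimate of the LRCOV matrix, as in Equation (E.4).

    V   : (T, p) array of mean-zero observations V_t
    b_T : bandwidth parameter
    K   : number of estimated parameters (optional d.o.f. correction)
    """
    T, p = V.shape
    omega = np.zeros((p, p))
    for j in range(-(T - 1), T):
        w = kernel(j / b_T)
        if w == 0.0:
            continue
        a = abs(j)
        # Gamma_hat(j) for j >= 0; Gamma_hat(-j) is its transpose (E.5)
        gamma = (V[a:].T @ V[:T - a]) / T
        omega += w * (gamma if j >= 0 else gamma.T)
    return (T / (T - K)) * omega
```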

Kernel Functions

There are a large number of kernel functions that satisfy the required conditions. EViews supports use of the following kernel shapes:

Truncated uniform:
$$k(x) = \begin{cases} 1 & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Bartlett:
$$k(x) = \begin{cases} 1 - |x| & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Bohman:
$$k(x) = \begin{cases} (1 - |x|)\cos(\pi x) + \sin(\pi |x|)/\pi & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Daniell:
$$k(x) = \sin(\pi x) / (\pi x)$$

Parzen:
$$k(x) = \begin{cases} 1 - 6x^2(1 - |x|) & \text{if } 0.0 \le |x| \le 0.5 \\ 2(1 - |x|)^3 & \text{if } 0.5 < |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Parzen-Riesz:
$$k(x) = \begin{cases} 1 - x^2 & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Parzen-Geometric:
$$k(x) = \begin{cases} 1/(1 + |x|) & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Parzen-Cauchy:
$$k(x) = \begin{cases} 1/(1 + x^2) & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Quadratic Spectral:
$$k(x) = \frac{25}{12\pi^2 x^2} \left( \frac{\sin(6\pi x / 5)}{6\pi x / 5} - \cos(6\pi x / 5) \right)$$

Tukey-Hamming:
$$k(x) = \begin{cases} 0.54 + 0.46\cos(\pi x) & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Tukey-Hanning:
$$k(x) = \begin{cases} 0.50 + 0.50\cos(\pi x) & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Tukey-Parzen:
$$k(x) = \begin{cases} 0.436 + 0.564\cos(\pi x) & \text{if } |x| \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$

Note that $k(x) = 0$ for $|x| > 1$ for all kernels with the exception of the Daniell and the Quadratic Spectral. The Daniell kernel is presented in truncated form in Neave (1972), but EViews uses the more common untruncated form. The Bartlett kernel is sometimes referred to as the Fejér kernel (Neave 1972).

A wide range of kernels has been employed in HAC estimation. The truncated uniform is used by Hansen (1982) and White (1984), the Bartlett kernel is used by Newey and West (1987), and the Parzen is used by Gallant (1987). The Tukey-Hanning and Quadratic Spectral were introduced to the econometrics literature by Andrews (1991), who shows that the latter is optimal in the sense of minimizing the asymptotic truncated MSE of the estimator (within a particular class of kernels). The remaining kernels are discussed in Parzen (1958, 1961, 1967).
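One implementation detail worth noting: the Quadratic Spectral expression is a $0/0$ form at $x = 0$, whose limit is $k(0) = 1$. A minimal sketch (our own illustration, not EViews code) handles this explicitly:

```python
import numpy as np

def quadratic_spectral(x):
    """Quadratic Spectral kernel; k(0) = 1 is the limit of the 0/0 form."""
    if x == 0.0:
        return 1.0
    z = 6.0 * np.pi * x / 5.0
    return (25.0 / (12.0 * np.pi**2 * x**2)) * (np.sin(z) / z - np.cos(z))
```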

Bandwidth

The bandwidth $b_T$ operates in concert with the kernel function to determine the weights for the various sample autocovariances in Equation (E.4). While some authors restrict the bandwidth values to integers, we follow Andrews (1991), who argues in favor of allowing real-valued bandwidths.

To construct an operational nonparametric kernel estimator, we must choose a value for the bandwidth $b_T$. Under general conditions (Andrews 1991), consistency of the kernel estimator requires that $b_T$ is chosen so that $b_T \to \infty$ and $b_T / T \to 0$ as $T \to \infty$. Alternately, Kiefer and Vogelsang (2002) propose setting $b_T = T$ in a testing context.

For the great majority of supported kernels, $k(j / b_T) = 0$ for $|j| > b_T$, so that the bandwidth acts indirectly as a lag truncation parameter. Relating $b_T$ to the corresponding integer number of included lags $m$ requires, however, examining the properties of the kernel at the endpoints ($|j| / b_T = 1$). For kernel functions where $k(1) \neq 0$ (e.g., Truncated, Parzen-Geometric, Tukey-Hanning), $b_T$ is simply a real-valued truncation lag, with at most $m = \operatorname{floor}(b_T)$ autocovariances having non-zero weight. Alternately, for kernel functions where $k(1) = 0$ (e.g., Bartlett, Bohman, Parzen), the relationship is slightly more complex, with $m = \operatorname{ceil}(b_T) - 1$ autocovariances entering the estimator with non-zero weights.

The varying relationship between the bandwidth and the lag-truncation parameter implies that one should examine the kernel function when choosing bandwidth values to match computations that are quoted in lag truncation form. For example, matching Newey-West's (1987) Bartlett kernel estimator, which uses $m$ weighted autocovariance lags, requires setting $b_T = m + 1$. In contrast, Hansen's (1982) or White's (1984) estimators, which sum the first $m$ unweighted autocovariances, should be implemented using the Truncated kernel with $b_T = m$.
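The following small sketch (our own illustration) restates the bandwidth-to-lag-count mapping just described:

```python
import math

def implied_lags(b_T, k_vanishes_at_one):
    """Number of autocovariances with non-zero weight for bandwidth b_T.

    k_vanishes_at_one is True for kernels with k(1) = 0 (e.g., Bartlett,
    Bohman, Parzen) and False for kernels with k(1) != 0 (e.g., Truncated
    uniform, Parzen-Geometric, Tukey-Hanning).
    """
    return math.ceil(b_T) - 1 if k_vanishes_at_one else math.floor(b_T)

# Example: implied_lags(5.0, True) == 4, so a Bartlett bandwidth of
# b_T = m + 1 = 5 reproduces Newey-West's m = 4 weighted lags.
```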

Automatic Bandwidth Selection

Theoretical results on the relationship between bandwidths and the asymptotic truncated MSE of the kernel estimator provide finer discrimination in the rates at which bandwidths should increase. The optimal bandwidths may be written in the form:

$$b_T = \gamma T^{1/(2q+1)} \qquad \text{(E.6)}$$

where $\gamma$ is a constant, and $q$ is a parameter that depends on the kernel function that you select (Parzen 1958, Andrews 1991). For the Bartlett and Parzen-Geometric kernels ($q = 1$), $b_T$ should grow (at most) at the rate $T^{1/3}$. The Truncated kernel does not have a theoretical optimal rate, but Andrews (1991) reports Monte Carlo simulations that suggest that $T^{1/5}$ works well. The remaining EViews-supported kernels have $q = 2$, so their optimal bandwidths grow at rate $T^{1/5}$ (though we point out that the Daniell kernel does not satisfy the conditions for the optimal bandwidth theorems).

While theoretically useful, knowledge of the rate at which bandwidths should increase as $T \to \infty$ does not tell us the optimal bandwidth for a given sample size, since the constant $\gamma$ remains unspecified.

Andrews (1991) and Newey and West (1994) offer two approaches to estimating $\gamma$. We may term these techniques automatic bandwidth selection methods, since they involve estimating the optimal bandwidth from the data, rather than specifying a value a priori. Both the Andrews and Newey-West estimators for $\gamma$ may be written as:

$$\hat\gamma(q) = c_k \left[ \hat\alpha(q) \right]^{1/(2q+1)} \qquad \text{(E.7)}$$

where $q$ and the constant $c_k$ depend on properties of the selected kernel and $\hat\alpha(q)$ is an estimator of $\alpha(q)$, a measure of the smoothness of the spectral density at frequency zero that depends on the autocovariances $\Gamma(j)$. Substituting into Equation (E.6), the resulting plug-in estimator for the optimal automatic bandwidth is given by:

$$\hat b_T = c_k \left[ \hat\alpha(q)\, T \right]^{1/(2q+1)} \qquad \text{(E.8)}$$

The $q$ that one uses depends on properties of the selected kernel function. The Bartlett and Parzen-Geometric kernels should use $\hat\alpha(1)$ since they have $q = 1$. $\hat\alpha(2)$ should be used for the other EViews-supported kernels, which have $q = 2$. The Truncated kernel does not have a theoretically prescribed choice, but Andrews recommends using $\hat\alpha(2)$. The Daniell kernel has $q = 2$, though we remind you that it does not satisfy the conditions for Andrews's theorems. "Kernel Function Properties" on page 785 summarizes the values of $c_k$ and $q$ for the various kernel functions.

It is of note that the Andrews and Newey-West estimators both require an estimate of $\alpha(q)$, which in turn requires forming preliminary estimates of $\Omega$ and the smoothness of $\Omega$. Andrews and Newey-West offer alternative methods for forming these estimates.

Andrews Automatic Selection

The Andrews (1991) method estimates $\alpha(q)$ parametrically: fitting a simple parametric time series model to the original data, then deriving the autocovariances $\Gamma(j)$ and corresponding $\alpha(q)$ implied by the estimated model.

Andrews derives $\hat\alpha(q)$ formulae for several parametric models, noting that the choice between specifications depends on a tradeoff between simplicity and parsimony on one hand and flexibility on the other. EViews employs the parsimonious approach used by Andrews in his Monte Carlo simulations, estimating $p$ univariate AR(1) models (one for each element of $\hat V_t$), then combining the estimated coefficients into an estimator for $\alpha(q)$.

For the univariate AR(1) approach, we have:

$$\hat\alpha(q) = \sum_{s=1}^{p} w_s \big(\hat f_s^{(q)}\big)^2 \Big/ \sum_{s=1}^{p} w_s \big(\hat f_s^{(0)}\big)^2 \qquad \text{(E.9)}$$

where $\hat f_s^{(q)}$ are parametric estimators of the smoothness of the spectral density for the $s$-th variable (Parzen's (1957) $q$-th generalized spectral derivatives) at frequency zero. Estimators for $\hat f_s^{(q)}$ are given by:

$$\hat f_s^{(q)} = \frac{1}{2\pi} \sum_{j=-\infty}^{\infty} |j|^{q}\, \tilde\Gamma_s(j) \qquad \text{(E.10)}$$

for $s = 1, \ldots, p$ and $q = 0, 1, 2$, where $\tilde\Gamma_s(j)$ are the estimated autocovariances at lag $j$ implied by the univariate AR(1) specification for the $s$-th variable.

Substituting the univariate AR(1) estimated coefficients $\hat\rho_s$ and standard errors $\hat\sigma_s$ into the theoretical expressions for $\tilde\Gamma_s(j)$, we have:

$$\hat\alpha(1) = \sum_{s=1}^{p} w_s \frac{4 \hat\sigma_s^4 \hat\rho_s^2}{(1 - \hat\rho_s)^6 (1 + \hat\rho_s)^2} \Big/ \sum_{s=1}^{p} w_s \frac{\hat\sigma_s^4}{(1 - \hat\rho_s)^4}$$

$$\hat\alpha(2) = \sum_{s=1}^{p} w_s \frac{4 \hat\sigma_s^4 \hat\rho_s^2}{(1 - \hat\rho_s)^8} \Big/ \sum_{s=1}^{p} w_s \frac{\hat\sigma_s^4}{(1 - \hat\rho_s)^4} \qquad \text{(E.11)}$$

which may be inserted into Equation (E.8) to obtain expressions for the optimal bandwidths.

 

Lastly, we note that the expressions for $\hat\alpha(q)$ depend on the weighting vector $w$, which governs how we combine the individual $\hat f_s^{(q)}$ into a single measure of relative smoothness. Andrews suggests using either $w_s = 1$ for all $s$, or $w_s = 1$ for all but the instrument corresponding to the intercept in regression settings. EViews adopts the first suggestion, setting $w_s = 1$ for all $s$.
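As an illustration, the following sketch (our own, not EViews code) computes the Andrews plug-in bandwidth for the Bartlett kernel ($q = 1$) using Equations (E.8) and (E.11) with $w_s = 1$; the Bartlett constant $c_k = 1.1447$ is taken from Andrews (1991):

```python
import numpy as np

def andrews_bandwidth_bartlett(V):
    """Andrews (1991) AR(1) plug-in bandwidth for the Bartlett kernel."""
    T, p = V.shape
    num = den = 0.0
    for s in range(p):
        y, ylag = V[1:, s], V[:-1, s]
        rho = (ylag @ y) / (ylag @ ylag)        # AR(1) coefficient rho_hat_s
        resid = y - rho * ylag
        sig2 = (resid @ resid) / (T - 1)        # innovation variance sigma_hat_s^2
        # Terms of alpha_hat(1) in Equation (E.11), with w_s = 1
        num += 4.0 * sig2**2 * rho**2 / ((1.0 - rho)**6 * (1.0 + rho)**2)
        den += sig2**2 / (1.0 - rho)**4
    alpha1 = num / den
    return 1.1447 * (alpha1 * T) ** (1.0 / 3.0)  # Equation (E.8) with q = 1
```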

Newey-West Automatic Selection

Newey-West (1994) employ a nonparametric approach to estimating $\alpha(q)$. In contrast to Andrews, who computes parametric estimates of the individual $f_s^{(q)}$, Newey-West use a Truncated kernel estimator to estimate the $f^{(q)}$ corresponding to aggregated data.

First, Newey and West define, for various lags, the scalar autocovariance estimators:

$$\hat\varphi_j = \frac{1}{T} \sum_{t=j+1}^{T} w' \hat V_t \hat V_{t-j}' w = w' \hat\Gamma(j)\, w \qquad \text{(E.12)}$$

The $\hat\varphi_j$ may either be viewed as the sample autocovariance of a weighted linear combination of the data using weights $w$, or as a weighted combination of the sample autocovariances.

Next, Newey and West use the $\hat\varphi_j$ to compute nonparametric Truncated kernel estimators of the Parzen measures of smoothness:

$$\hat f^{(q)} = \frac{1}{2\pi} \sum_{j=-n}^{n} |j|^{q}\, \hat\varphi_j \qquad \text{(E.13)}$$

for $q = 0, 1, 2$. These nonparametric estimators are weighted sums of the scalar autocovariances $\hat\varphi_j$ obtained above for $j$ from $-n$ to $n$, where $n$, which Newey and West term the lag selection parameter, may be viewed as the bandwidth of a kernel estimator for $f^{(q)}$.

The Newey and West estimator for $\hat\alpha(q)$ may then be written as:

$$\hat\alpha(q) = \big( \hat f^{(q)} / \hat f^{(0)} \big)^2 \qquad \text{(E.14)}$$

for $q = 1, 2$. This expression may be inserted into Equation (E.8) to obtain the expression for the plug-in optimal bandwidth estimator.

In comparing the Andrews estimator in Equation (E.11) with the Newey-West estimator in Equation (E.14), we see two very different methods of distilling results from the $p$ dimensions of the original data into a scalar measure $\alpha(q)$. Andrews computes parametric estimates of the generalized derivatives for the $p$ individual elements, then aggregates the estimates into a single measure. In contrast, Newey and West aggregate early, forming linear combinations of the autocovariance matrices, then use the scalar results to compute nonparametric estimators of the Parzen smoothness measures.

To implement the Newey-West optimal bandwidth selection method, we require a value for $n$, the lag-selection parameter, which governs how many autocovariances to use in forming the nonparametric estimates of $f^{(q)}$. Newey and West show that $n$ should increase at (less than) a rate that depends on the properties of the kernel. For the Bartlett and the Parzen-Geometric kernels, the rate is $T^{2/9}$. For the Quadratic Spectral kernel, the rate is $T^{2/25}$. For the remaining kernels, the rate is $T^{4/25}$ (with the exception of the Truncated and the Daniell kernels, for which the Newey-West theorems do not apply).

In addition, one must choose a weight vector $w$. Newey-West (1994) leave open the choice of $w$, but follow Andrews's (1991) suggestion of $w_s = 1$ for all but the intercept in their Monte Carlo simulations. EViews differs from this choice slightly, setting $w_s = 1$ for all $s$.
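A parallel sketch of the Newey-West selection for the Bartlett kernel (again our own illustration; the lag-selection rule $n = \operatorname{floor}(4 (T/100)^{2/9})$ is one common choice consistent with the $T^{2/9}$ rate, not necessarily the EViews default):

```python
import numpy as np

def newey_west_bandwidth_bartlett(V):
    """Newey-West (1994) automatic bandwidth for the Bartlett kernel."""
    T, p = V.shape
    u = V @ np.ones(p)                           # weighted combination, w_s = 1
    n = int(4.0 * (T / 100.0) ** (2.0 / 9.0))    # lag-selection parameter
    # Scalar autocovariances phi_hat_j of Equation (E.12)
    phi = np.array([(u[j:] @ u[:T - j]) / T for j in range(n + 1)])
    # Smoothness measures of Equation (E.13); the 1/(2*pi) factor
    # cancels in the ratio of Equation (E.14)
    f0 = phi[0] + 2.0 * phi[1:].sum()
    f1 = 2.0 * (np.arange(1, n + 1) * phi[1:]).sum()
    alpha1 = (f1 / f0) ** 2                      # Equation (E.14), q = 1
    return 1.1447 * (alpha1 * T) ** (1.0 / 3.0)  # Equation (E.8)
```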

Parametric VARHAC

Den Haan and Levin (1997) advocate the use of parametric methods, notably VARs, for LRCOV estimation. The VAR spectral density estimator, which they term VARHAC, involves estimating a parametric VAR model to filter the $\hat V_t$, computing the contemporaneous covariance of the filtered data, then using the estimates from the VAR model to obtain the implied autocovariances and corresponding LRCOV matrix of the original data.

Suppose we fit a VAR($q$) model to the $\{\hat V_t\}$. Let $\hat A_j$ be the $p \times p$ matrix of estimated $j$-th order AR coefficients, $j = 1, \ldots, q$. Then we may define the innovation (filtered) data and estimated innovation covariance matrix as:

 

 

$$\hat V_t^{*} = \hat V_t - \sum_{j=1}^{q} \hat A_j \hat V_{t-j} \qquad \text{(E.15)}$$

and

$$\hat\Gamma^{*}(0) = \frac{1}{T - q} \sum_{t=q+1}^{T} \hat V_t^{*} \hat V_t^{*\prime} \qquad \text{(E.16)}$$

where a star marks quantities computed from the whitened (innovation) data. Given an estimate of the innovation contemporaneous variance matrix $\hat\Gamma^{*}(0)$ and the VAR coefficients $\hat A_j$, we can compute the implied theoretical autocovariances $\hat\Gamma(j)$ of $V_t$. Summing the autocovariances yields a parametric estimator for $\Omega$, given by:

$$\hat\Omega = \frac{T - q}{T - q - K}\, \hat D\, \hat\Gamma^{*}(0)\, \hat D' \qquad \text{(E.17)}$$

where

$$\hat D = \Big( I_p - \sum_{j=1}^{q} \hat A_j \Big)^{-1} \qquad \text{(E.18)}$$

Implementing VARHAC requires a specification for $q$, the order of the VAR. Den Haan and Levin use model selection criteria (AIC or BIC-Schwarz) with a maximum lag of $T^{1/3}$ to determine the lag order, and provide simulations of the performance of the estimator using data-dependent lag orders.
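The following sketch (our own illustration; lag-order selection by AIC/BIC is omitted for brevity) implements Equations (E.15) through (E.18) for a fixed VAR order $q$, estimated equation-by-equation with least squares:

```python
import numpy as np

def varhac(V, q, K=0):
    """VARHAC LRCOV estimate for a fixed VAR order q (Den Haan-Levin 1997)."""
    T, p = V.shape
    # Stack lagged regressors [V_{t-1}, ..., V_{t-q}] for t = q+1, ..., T
    X = np.hstack([V[q - j:T - j] for j in range(1, q + 1)])
    Y = V[q:]
    B = np.linalg.lstsq(X, Y, rcond=None)[0]      # (p*q, p) stacked A_j'
    resid = Y - X @ B                             # whitened data (E.15)
    gamma0 = (resid.T @ resid) / (T - q)          # innovation covariance (E.16)
    A_sum = sum(B[j * p:(j + 1) * p].T for j in range(q))
    D = np.linalg.inv(np.eye(p) - A_sum)          # (E.18)
    return ((T - q) / (T - q - K)) * D @ gamma0 @ D.T   # (E.17)
```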

The corresponding VARHAC estimators for the one-sided matrices $\Lambda_1$ and $\Lambda_0$ do not have simple expressions in terms of $\hat A_j$ and $\hat\Gamma^{*}(0)$. We can, however, obtain insight into the construction of the one-sided VARHAC LRCOVs by examining results for the VAR(1) case. Given estimation of a VAR(1) specification, the estimators for the one-sided long-run variances may be written as:

$$\hat\Lambda_1 = \frac{T - q}{T - q - K} \sum_{j=1}^{\infty} (\hat A_1)^{j}\, \hat\Gamma(0) = \frac{T - q}{T - q - K}\, \hat A_1 (I_p - \hat A_1)^{-1}\, \hat\Gamma(0)$$

$$\hat\Lambda_0 = \frac{T - q}{T - q - K} \sum_{j=0}^{\infty} (\hat A_1)^{j}\, \hat\Gamma(0) = \frac{T - q}{T - q - K}\, (I_p - \hat A_1)^{-1}\, \hat\Gamma(0) \qquad \text{(E.19)}$$

Both estimators require estimates of the VAR(1) coefficients $\hat A_1$, as well as an estimate of $\Gamma(0)$, the contemporaneous covariance matrix of $\hat V_t$.

One could, as in Park and Ogaki (1991) and Hansen (1992b), use the sample covariance matrix $\hat\Gamma(0) = (1/T) \sum \hat V_t \hat V_t'$, so that the estimates of $\Lambda_1$ and $\Lambda_0$ employ a mix of parametric and non-parametric autocovariance estimates. Alternately, in keeping with the spirit of the parametric methodology, EViews constructs a parametric estimator $\tilde\Gamma(0)$ using the estimated VAR(1) coefficients $\hat A_1$ and $\hat\Gamma^{*}(0)$.

Prewhitened Kernel

Andrews and Monahan (1992) propose a simple modification of the kernel estimator which performs a parametric VAR prewhitening step to reduce autocorrelation in the data, followed by kernel estimation performed on the whitened data. The resulting prewhitened LRCOV estimate is then recolored to undo the effects of the transformation. The Andrews and Monahan approach is a hybrid that combines the parametric VARHAC and nonparametric kernel techniques.

There is evidence (Andrews and Monahan 1992, Newey-West 1994) that this prewhitening approach has desirable properties, reducing bias, improving confidence interval coverage probabilities and improving sizes of test statistics constructed using the kernel HAC estimators.

The Andrews and Monahan estimator follows directly from our earlier discussion. As in VARHAC, we first fit a VAR($q$) model to the $\hat V_t$ and obtain the whitened data (residuals):

$$\hat V_t^{*} = \hat V_t - \sum_{j=1}^{q} \hat A_j \hat V_{t-j} \qquad \text{(E.20)}$$

In contrast to the VAR specification in the VARHAC estimator, the prewhitening VAR specification is not necessarily believed to be the true time series model, but is merely a tool for obtaining $\hat V_t^{*}$ values that are closer to white noise. (In addition, Andrews and Monahan adjust their VAR(1) estimates to avoid singularity when the VAR is near unstable, but EViews does not perform this eigenvalue adjustment.)

Next, we obtain an estimate of the LRCOV of the whitened data by applying a kernel estimator to the residuals:

$$\hat\Omega^{*} = \sum_{j=-(T-q-1)}^{T-q-1} k(j / b_T)\, \hat\Gamma^{*}(j) \qquad \text{(E.21)}$$

where the sample autocovariances $\hat\Gamma^{*}(j)$ are given by

$$\hat\Gamma^{*}(j) = \begin{cases} \dfrac{1}{T - q} \displaystyle\sum_{t=j+q+1}^{T} \hat V_t^{*} \hat V_{t-j}^{*\prime} & j \ge 0 \\ \hat\Gamma^{*}(-j)' & j < 0 \end{cases} \qquad \text{(E.22)}$$

Lastly, we recolor the estimator to obtain the VAR prewhitened kernel LRCOV estimator:

$$\hat\Omega = \frac{T - q}{T - q - K}\, \hat D\, \hat\Omega^{*}\, \hat D' \qquad \text{(E.23)}$$

The prewhitened kernel procedure differs from VARHAC only in the computation of the LRCOV of the residuals. The VARHAC estimator in Equation (E.17) assumes that the residuals $\hat V_t^{*}$ are white noise, so that the LRCOV may be estimated using the contemporaneous variance matrix $\hat\Gamma^{*}(0)$, while the prewhitened kernel estimator in Equation (E.21) allows for residual heteroskedasticity and serial dependence through its use of the HAC estimator $\hat\Omega^{*}$. Accordingly, it may be useful to view the VARHAC procedure as a special case of the prewhitened kernel with $k(0) = 1$ and $k(x) = 0$ for $x \neq 0$.
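Combining the earlier sketches, a VAR(1)-prewhitened kernel estimate (our own illustration; `kernel_hac` is the sketch function from the nonparametric kernel section above) can be written as:

```python
import numpy as np

def prewhitened_kernel_lrcov(V, b_T, K=0):
    """VAR(1)-prewhitened kernel LRCOV estimate (Andrews-Monahan 1992)."""
    T, p = V.shape
    # VAR(1) prewhitening step (E.20); no eigenvalue adjustment is applied
    A1 = np.linalg.lstsq(V[:-1], V[1:], rcond=None)[0].T
    resid = V[1:] - V[:-1] @ A1.T
    # Kernel LRCOV of the whitened data (E.21)
    omega_star = kernel_hac(resid, b_T)
    # Recoloring step (E.23) with q = 1
    D = np.linalg.inv(np.eye(p) - A1)
    return ((T - 1) / (T - 1 - K)) * D @ omega_star @ D.T
```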

The recoloring step for one-sided prewhitened kernel estimators is complicated when we allow for HAC estimation of $\hat\Lambda_1$ (Park and Ogaki, 1991). As in the VARHAC setting, the expressions for the one-sided LRCOVs are quite involved, but the VAR(1) specification may be used to provide insight. Suppose that the VARHAC estimators of the one-sided LRCOV matrices defined in Equation (E.19) are given by $\hat\Lambda_1$ and $\hat\Lambda_0$, and let $\hat\Lambda_1^{*}$ be the strict one-sided kernel estimator computed using the prewhitened data:

$$\hat\Lambda_1^{*} = \sum_{j=1}^{T-q-1} k(j / b_T)\, \hat\Gamma^{*}(j) \qquad \text{(E.24)}$$

Then the prewhitened kernel one-sided LRCOV estimators are given by:
