
Chapter 13. Statistical Graphs from Series and Groups

Note that we select the Exact method option since there are only 69 observations at which to evaluate the kernel. The kernel density result is depicted below:

[Figure: Kernel Density (Normal, h = 0.0800) — estimated density of CDRATE; horizontal axis runs from 7.4 to 8.8, vertical axis from 0.0 to 2.0.]

This density estimate has about the right degree of smoothing. Interestingly enough, this density has a trimodal shape with modes at the “focal” numbers 7.5, 8.0, and 8.5.
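For readers who want to see what the Exact computation involves, the following is a minimal Python sketch (not EViews code) of a normal-kernel density estimate evaluated at every grid point using every observation; the cdrate sample and grid below are illustrative stand-ins, not the actual data.

```python
import numpy as np

def kernel_density_exact(x, grid, h):
    """Normal-kernel density estimate evaluated exactly at each grid point:
    f(g) = (1/(N*h)) * sum_i K((g - x_i)/h), with K the standard normal
    density. Cost is O(N * len(grid)), which is why exact evaluation is
    reasonable for small samples such as the 69 observations here."""
    u = (grid[:, None] - x[None, :]) / h            # scaled distances, (grid x N)
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # normal kernel weights
    return k.sum(axis=1) / (len(x) * h)

# Illustrative usage (cdrate is a stand-in, not the CD rate series):
cdrate = np.random.normal(8.0, 0.3, size=69)
grid = np.linspace(7.4, 8.8, 200)
density = kernel_density_exact(cdrate, grid, h=0.08)
```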

Scatter Diagrams with Fit Lines

The view menu of a group includes four variants of scatterplot diagrams. Click on View/Graph/Scatter, then select Simple Scatter to plot a scatter diagram with the first series on the horizontal axis and the remaining series on the vertical axis. The XY Pairs form of the scatterplot graph plots scatter diagrams in pairs, with the first series plotted against the second, the third plotted against the fourth, and so on.


The remaining three graphs, Scatter with Regression, Scatter with Nearest Neighbor Fit, and Scatter with Kernel Fit plot fitted lines for the scatterplot of the first series against the second series.

Scatter with Regression

This view fits a bivariate regression of transformations of the second series in the group Y on transformations of the first series in the group X (and a constant).

The following transformations of the series are available for the bivariate fit:

Transformation    Y transform        X transform
None              y                  x
Logarithmic       log(y)             log(x)
Inverse           1/y                1/x
Power             y^a                x^b
Box-Cox           (y^a − 1)/a        (x^b − 1)/b
Polynomial        y                  1, x, x^2, …, x^b

where you specify the parameters a and b in the edit field. Note that the Box-Cox transformation with parameter zero is the same as the log transformation.

If any of the transformed values are not available, EViews returns an error message. For example, if you take logs of negative values, noninteger powers of nonpositive values, or inverses of zeros, EViews will stop processing and issue an error message.


If you specify a high-order polynomial, EViews may be forced to drop some of the high-order terms to avoid collinearity.

When you click OK, EViews displays a scatter diagram of the series together with a line connecting the fitted values from the regression. You may optionally save the fitted values as a series. Type a name for the fitted series in the Fitted Y series edit field.
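As a sketch of what such a fitted line involves, here is a hedged Python illustration (not EViews internals) of a bivariate fit on Box-Cox-transformed series; boxcox and fit_transformed are hypothetical helper names.

```python
import numpy as np

def boxcox(v, lam):
    """Box-Cox transform; the lam = 0 case is the log transformation."""
    return np.log(v) if lam == 0 else (v**lam - 1.0) / lam

def fit_transformed(y, x, a=1.0, b=1.0):
    """OLS of boxcox(y, a) on a constant and boxcox(x, b); returns the
    coefficients and the fitted transformed values for the fit line."""
    ty, tx = boxcox(y, a), boxcox(x, b)
    X = np.column_stack([np.ones_like(tx), tx])
    coef, *_ = np.linalg.lstsq(X, ty, rcond=None)
    return coef, X @ coef

# Log-log fit (a = b = 0); the data must be positive, else the transform
# fails, mirroring the error EViews raises for logs of negative values.
x = np.random.uniform(1.0, 10.0, 200)
y = np.exp(2.0 + 0.5 * np.log(x) + np.random.normal(0.0, 0.1, 200))
coef, fitted = fit_transformed(y, x, a=0.0, b=0.0)
```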

Robustness Iterations

The least squares method is very sensitive to the presence of even a few outlying observations. The Robustness Iterations option carries out a form of weighted least squares where outlying observations are given relatively less weight in estimating the coefficients of the regression.

For any given transformation of the series, the Robustness Iterations option carries out robust fitting with bisquare weights. Robust fitting estimates the parameters a and b to minimize the weighted sum of squared residuals:

\[ \sum_{i=1}^{N} r_i \, (y_i - a - b x_i)^2 \tag{13.7} \]

where $y_i$ and $x_i$ are the transformed series and the bisquare robustness weights $r_i$ are given by:

\[ r_i = \begin{cases} \left( 1 - e_i^2 / (36 m^2) \right)^2 & \text{for } \left| e_i / (6m) \right| < 1 \\ 0 & \text{otherwise} \end{cases} \tag{13.8} \]

where $e_i = y_i - a - b x_i$ is the residual from the previous iteration (the first-iteration weights are determined by the OLS residuals), and $m$ is the median of $|e_i|$. Observations with large residuals (outliers) are given small weights when forming the weighted sum of squared residuals.

To choose robustness iterations, click on the check box for Robustness Iterations and specify an integer for the number of iterations.

See Cleveland (1993) for additional discussion.
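A minimal Python sketch of this reweighting scheme, assuming a simple linear fit on the (transformed) series; robust_line_fit is a hypothetical helper, not an EViews routine.

```python
import numpy as np

def bisquare_weights(e):
    """Bisquare weights from equation (13.8): with u = e/(6m) and
    m = median(|e|), r = (1 - u^2)^2 when |u| < 1 and 0 otherwise."""
    u = e / (6.0 * np.median(np.abs(e)))
    return np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)

def robust_line_fit(y, x, iterations=2):
    """Iteratively reweighted fit of y on a constant and x. The first pass
    is OLS; each later pass downweights large-residual observations."""
    X = np.column_stack([np.ones_like(x), x])
    r = np.ones_like(y)                      # unit weights -> OLS first pass
    coef = None
    for _ in range(iterations + 1):
        sw = np.sqrt(r)
        coef, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        r = bisquare_weights(y - X @ coef)   # weights for the next pass
    return coef
```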

Scatter with Nearest Neighbor Fit

This view displays local polynomial regressions with bandwidth based on nearest neighbors. Briefly, for each data point in a sample, we fit a locally weighted polynomial regression. It is a local regression since we use only the subset of observations which lie in a neighborhood of the point to fit the regression model; it may be weighted so that observations further from the given data point are given less weight.


This class of regressions includes the popular Loess (also known as Lowess) techniques described by Cleveland (1993, 1994). Additional discussion of these techniques may be found in Fan and Gijbels (1996) and in Chambers, Cleveland, Kleiner, and Tukey (1983).

Method

You should choose between computing the local regression at each data point in the sample, or using a subsample of data points.

Exact (full sample) fits a local regression at every data point in the sample.

Cleveland subsampling performs the local regression at only a subset of points. You should provide the size of the subsample M in the edit box.

The number of points at which the local regressions are computed is approximately equal to M . The actual number of points will depend on the distribution of the explanatory variable.

Since the exact method computes a regression at every data point in the sample, it may be quite time consuming when applied to large samples. For samples with over 100 observations, you may wish to consider subsampling.

The idea behind subsampling is that the local regression computed at two adjacent points should differ by only a small amount. Cleveland subsampling provides an adaptive algorithm for skipping nearby points in such a way that the subsample includes all of the representative values of the regressor.

It is worth emphasizing that at each point in the subsample, EViews uses the entire sample in determining the neighborhood of points. Thus, each regression in the Cleveland subsample corresponds to an equivalent regression in the exact computation. For large data sets, the computational savings are substantial, with very little loss of information.
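The text does not spell out Cleveland's adaptive skipping rule, so the Python sketch below substitutes a much cruder quantile-based rule to convey the idea of picking roughly M representative regressor values; it is an assumption for illustration, not EViews' algorithm.

```python
import numpy as np

def representative_points(x, M):
    """Pick roughly M representative values of the regressor by taking
    empirical quantiles, so dense regions of x contribute more points.
    Local regressions would be run only at these values, with each one
    still using the full sample to form its neighborhood."""
    probs = np.linspace(0.0, 1.0, M)
    return np.unique(np.quantile(x, probs))
```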

Specification

For each point in the sample selected by the Method option, we compute the fitted value by running a local regression using data around that point. The Specification option determines the rules employed in identifying the observations to be included in each local regression, and the functional form used for the regression.

Bandwidth span determines which observations should be included in the local regressions. You should specify a number α between 0 and 1. The span controls the smoothness of the local fit; a larger fraction α gives a smoother fit. The fraction α instructs EViews to


include the αN observations nearest to the given point, where αN is 100α % of the total sample size, truncated to an integer.

Note that this standard definition of nearest neighbors implies that the number of points need not be symmetric around the point being evaluated. If desired, you can force symmetry by selecting the Symmetric neighbors option.

Polynomial degree specifies the degree of polynomial to fit in each local regression.

If you mark the Bracket bandwidth span option, EViews displays three nearest neighbor fits with spans of 0.5α, α, and 1.5α.
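A small Python sketch of the bandwidth-span rule described above (the helper name is an assumption):

```python
import numpy as np

def neighborhood(x, x0, alpha):
    """Return the indices of the floor(alpha * N) observations nearest to
    x0. The set need not be symmetric around x0: it simply takes the
    closest points, as the standard nearest-neighbor definition implies."""
    k = int(alpha * len(x))                # 100*alpha percent, truncated
    return np.argsort(np.abs(x - x0))[:k]

# With alpha = 0.3 and N = 200, each local regression at x0 uses the 60
# observations closest to x0.
x = np.random.uniform(0.0, 1.0, 200)
idx = neighborhood(x, x0=0.5, alpha=0.3)
```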

Other Options

Local Weighting (Tricube) weights the observations of each local regression. The weighted regression minimizes the weighted sum of squared residuals:

\[ \sum_{i=1}^{N} w_i \left( y_i - a - b_1 x_i - b_2 x_i^2 - \cdots - b_k x_i^k \right)^2 . \tag{13.9} \]

The tricube weights $w_i$ are given by:

\[ w_i = \begin{cases} \left( 1 - \left( d_i / d_{(\alpha N)} \right)^3 \right)^3 & \text{for } d_i / d_{(\alpha N)} < 1 \\ 0 & \text{otherwise} \end{cases} \tag{13.10} \]

where $d_i = |x_i - x|$ and $d_{(\alpha N)}$ is the $\alpha N$-th smallest such distance. Observations that are relatively far from the point being evaluated get small weights in the sum of squared residuals. If you turn this option off, each local regression will be unweighted, with $w_i = 1$ for all $i$.
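Putting the span and the tricube weights together, here is a hedged Python sketch of a single local fit (hypothetical helper names, not EViews internals); with degree=1 and tricube weighting it matches the Loess default noted below.

```python
import numpy as np

def tricube_weights(x, x0, alpha):
    """Tricube weights from equation (13.10), based on distances to x0
    relative to the alpha*N-th smallest distance d(alpha*N)."""
    d = np.abs(x - x0)
    k = int(alpha * len(x))
    u = d / np.sort(d)[k - 1]              # d_i / d(alpha*N)
    return np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)

def local_poly_fit(y, x, x0, alpha=0.3, degree=1):
    """Tricube-weighted polynomial regression around x0; returns the fitted
    value at x0. Observations outside the neighborhood get zero weight."""
    w = tricube_weights(x, x0, alpha)
    X = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^degree
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return float(np.array([x0**j for j in range(degree + 1)]) @ coef)
```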

Robustness Iterations iterates the local regressions by adjusting the weights to downweight outlier observations. The initial fit is obtained using weights $w_i$, where $w_i$ is tricube if you choose Local Weighting and 1 otherwise. The residuals $e_i$ from the initial fit are used to compute the robustness bisquare weights $r_i$ as given in equation (13.8). In the second iteration, the local fit is obtained using weights $w_i r_i$. We repeat this process for the user-specified number of iterations, where at each iteration the robustness weights $r_i$ are recomputed using the residuals from the last iteration.

Symmetric Neighbors forces the local regression to include the same number of observations to the left and to the right of the point being evaluated. This approach violates the definition, though not the spirit, of nearest neighbor regression.

To save the fitted values as a series, type a name in the Fitted series field box. If you have specified subsampling, EViews will linearly interpolate to find the fitted value of y for the actual value of x. If you have marked the Bracket bandwidth span option, EViews saves three series with _L, _M, _H appended to the name, each corresponding to bandwidths of 0.5α, α, and 1.5α, respectively.

Note that Loess is a special case of nearest neighbor fit, with a polynomial of degree 1, and local tricube weighting. The default EViews options are set to provide Loess estimation.

Scatter with Kernel Fit

This view displays fits of local polynomial kernel regressions of the second series in the group Y on the first series in the group X. Both the nearest neighbor fit, described above, and the kernel fit are nonparametric regressions that fit local polynomials. The two differ in how they define “local” in the choice of bandwidth. The effective bandwidth in nearest neighbor regression varies, adapting to the observed distribution of the regressor. For the kernel fit, the bandwidth is fixed but the local observations are weighted according to a kernel function.

Extensive discussion may be found in Simonoff (1996), Hardle (1991), and Fan and Gijbels (1996).

Local polynomial kernel regressions fit $Y$ at each value $x$ by choosing the parameters $\beta$ to minimize the weighted sum of squared residuals:

\[ m(x) \;=\; \min_{\beta}\; \sum_{i=1}^{N} \left( Y_i - \beta_0 - \beta_1 (x - X_i) - \cdots - \beta_k (x - X_i)^k \right)^2 K\!\left( \frac{x - X_i}{h} \right) \tag{13.11} \]

where $N$ is the number of observations, $h$ is the bandwidth (or smoothing parameter), and $K$ is a kernel function that integrates to one. Note that the minimizing estimates of $\beta$ will differ for each $x$.
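A hedged Python sketch of the minimization in (13.11) at a single evaluation point (hypothetical helper names; the Epanechnikov kernel shown is EViews' default):

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: (3/4)(1 - u^2) on |u| <= 1, zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_poly_fit(y, x, x0, h, degree=1, kernel=epanechnikov):
    """Local polynomial kernel regression at x0: weighted least squares in
    powers of (x0 - X_i) with weights K((x0 - X_i)/h). The intercept is
    the fitted value m(x0); degree=0 gives Nadaraya-Watson, degree=1 the
    local linear fit."""
    w = kernel((x0 - x) / h)
    Z = np.vander(x0 - x, degree + 1, increasing=True)  # 1, (x0 - X_i), ...
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(sw[:, None] * Z, sw * y, rcond=None)
    return beta[0]
```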

When you select the Scatter with Kernel Fit view, the Kernel Fit dialog appears.

You will need to specify the form of the local regression, the kernel, the bandwidth, and other options to control the fit procedure.

Regression

Specify the order of polynomial $k$ to be fit at each data point. The Nadaraya-Watson option sets $k = 0$ and locally fits a constant at each $x$. Local Linear sets $k = 1$ at each $x$. For higher-order polynomials, mark the Local Polynomial option and type an integer in the field box to specify the order of the polynomial.


Kernel

The kernel is the function used to weight the observations in each local regression. EViews provides the option of selecting one of the following kernel functions:

Epanechnikov (default): $\frac{3}{4} (1 - u^2) \, I(|u| \le 1)$

Triangular: $(1 - |u|) \, I(|u| \le 1)$

Uniform (Rectangular): $\frac{1}{2} \, I(|u| \le 1)$

Normal (Gaussian): $\frac{1}{\sqrt{2\pi}} \exp(-u^2 / 2)$

Biweight (Quartic): $\frac{15}{16} (1 - u^2)^2 \, I(|u| \le 1)$

Triweight: $\frac{35}{32} (1 - u^2)^3 \, I(|u| \le 1)$

Cosinus: $\frac{\pi}{4} \cos\!\left(\frac{\pi}{2} u\right) I(|u| \le 1)$

where $u$ is the argument of the kernel function and $I$ is the indicator function that takes a value of one if its argument is true, and zero otherwise.
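The same menu can be written down compactly in Python; this is an illustrative transcription of the table above, not EViews code.

```python
import numpy as np

# Each kernel maps the scaled argument u to a weight; I(|u| <= 1) becomes
# a np.where mask. Only the Normal kernel has unbounded support.
KERNELS = {
    "Epanechnikov": lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0),
    "Triangular":   lambda u: np.where(np.abs(u) <= 1, 1.0 - np.abs(u), 0.0),
    "Uniform":      lambda u: np.where(np.abs(u) <= 1, 0.5, 0.0),
    "Normal":       lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi),
    "Biweight":     lambda u: np.where(np.abs(u) <= 1, 15/16 * (1 - u**2)**2, 0.0),
    "Triweight":    lambda u: np.where(np.abs(u) <= 1, 35/32 * (1 - u**2)**3, 0.0),
    "Cosinus":      lambda u: np.where(np.abs(u) <= 1,
                                       np.pi/4 * np.cos(np.pi/2 * u), 0.0),
}
```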

Bandwidth

The bandwidth $h$ determines the weights to be applied to observations in each local regression. The larger the $h$, the smoother the fit. By default, EViews arbitrarily sets the bandwidth to:

\[ h = 0.15 \, (X_U - X_L) \tag{13.12} \]

where $(X_U - X_L)$ is the range of $X$.

For nearest neighbor bandwidths, see Scatter with Nearest Neighbor Fit.

To specify your own bandwidth, mark User Specified and enter a nonnegative number for the bandwidth in the edit box.

The Bracket Bandwidth option fits three kernel regressions using bandwidths 0.5h, h, and 1.5h.


Number of grid points

You must specify the number of points $M$ at which to evaluate the local polynomial regression. The default is $M = 100$ points; you can specify any integer in the field. Suppose the range of the series $X$ is $[X_L, X_U]$. Then the polynomial is evaluated at $M$ equispaced points:

\[ x_i = X_L + i \cdot \frac{X_U - X_L}{M} \qquad \text{for } i = 0, 1, \ldots, M - 1 \tag{13.13} \]

Method

Given a number of evaluation points, EViews provides you with two additional computational options: exact computation and linear binning.

The Exact method performs a regression at each $x_i$, using all of the data points $(X_j, Y_j)$ for $j = 1, 2, \ldots, N$. Since the exact method computes a regression at every grid point, it may be quite time consuming when applied to large samples. In these settings, you may wish to consider the linear binning method.

The Linear Binning method (Fan and Marron 1994) approximates the kernel regression by binning the raw data Xj fractionally to the two nearest evaluation points, prior to evaluating the kernel estimate. For large data sets, the computational savings may be substantial, with virtually no loss of precision.
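A hedged Python sketch of the fractional assignment step (the helper name and grid handling are assumptions; only the binning is shown, not the full binned kernel regression):

```python
import numpy as np

def linear_binning(x, y, grid):
    """Assign each observation fractionally to its two nearest grid points.
    A point 30% of the way from grid[i] to grid[i+1] contributes weight 0.7
    at grid[i] and 0.3 at grid[i+1]. The local regressions are then computed
    from these binned counts and y-sums instead of the raw data."""
    delta = grid[1] - grid[0]                        # equispaced grid assumed
    pos = np.clip((x - grid[0]) / delta, 0.0, len(grid) - 1.000001)
    lo = pos.astype(int)                             # left grid neighbor index
    frac = pos - lo                                  # fractional distance to it
    count, ysum = np.zeros(len(grid)), np.zeros(len(grid))
    np.add.at(count, lo, 1.0 - frac)
    np.add.at(count, lo + 1, frac)
    np.add.at(ysum, lo, (1.0 - frac) * y)
    np.add.at(ysum, lo + 1, frac * y)
    return count, ysum
```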

To save the fitted values as a series, type a name in the Fitted Series field box. EViews will save the fitted Y to the series, linearly interpolating points computed on the grid to find the appropriate value. If you have marked the Bracket Bandwidth option, EViews saves three series with “_L”, “_M”, “_H” appended to the name, each corresponding to bandwidths 0.5h, h, and 1.5h, respectively.

Example

As an example, we estimate a bivariate relation for a simulated data set of the type used by Hardle (1991). The data were generated by:

scalar pi = @atan(1)*4                       ' pi = 4*arctan(1)
series x = rnd                               ' uniform draws on [0, 1)
series y = sin(2*pi*x^3)^3 + nrnd*(0.1^.5)   ' true mean plus N(0, 0.1) noise

The simple scatter of Y and the “true” conditional mean of Y against X looks as follows:


[Figure: scatter of Y (points) and YTRUE (the “true” conditional mean, “+” symbols) against X, with X running from 0.0 to 1.0 and Y roughly from -1.5 to 2.0.]

The “+” shapes in the middle of the scatterplot trace out the “true” conditional mean of Y. Note that the true mean reaches a peak around x = 0.6 , a valley around x = 0.9 , and a saddle around x = 0.8 .

To fit a nonparametric regression of Y on X, you first create a group containing the series Y and X. The order in which you enter the series is important; the explanatory variable must be the first series in the group. Highlight the series name X and then Y, double click in the highlighted area, select Open Group, then select View/Graph/Scatter/Scatter with Nearest Neighbor Fit, and repeat the procedure for Scatter with Kernel Fit.

The two fits, computed using the EViews default settings, are shown below:

[Figure: two fitted scatterplots of Y against X on [0, 1] — left panel: LOESS Fit (degree = 1, span = 0.3000); right panel: Kernel Fit (Epanechnikov, h = 0.1488).]

Both local regression lines seem to capture the peak, but the kernel fit is more sensitive to the upturn in the neighborhood of X=1. Of course, the fitted lines change as we modify the options, particularly when we adjust the bandwidth h and window width α .
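To reproduce a comparable Loess fit outside EViews, the sketch below uses statsmodels' lowess, which, like the EViews default, runs degree-1 local regressions with tricube weights; the seed and sample size are arbitrary illustrative choices, not values from the text.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(12345)
x = rng.uniform(0.0, 1.0, 100)                    # mirrors: series x = rnd
y = np.sin(2.0 * np.pi * x**3)**3 + rng.normal(0.0, np.sqrt(0.1), 100)

# frac=0.3 mirrors the LOESS panel's reported span; it=2 adds two
# bisquare robustness iterations on top of the initial fit.
fit = lowess(y, x, frac=0.3, it=2)                # sorted (x, yhat) pairs
```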
