
- Preface
- Part IV. Basic Single Equation Analysis
  - Chapter 18. Basic Regression Analysis
    - Equation Objects
    - Specifying an Equation in EViews
    - Estimating an Equation in EViews
    - Equation Output
    - Working with Equations
    - Estimation Problems
    - References
  - Chapter 19. Additional Regression Tools
    - Special Equation Expressions
    - Robust Standard Errors
    - Weighted Least Squares
    - Nonlinear Least Squares
    - Stepwise Least Squares Regression
    - References
  - Chapter 20. Instrumental Variables and GMM
    - Background
    - Two-stage Least Squares
    - Nonlinear Two-stage Least Squares
    - Limited Information Maximum Likelihood and K-Class Estimation
    - Generalized Method of Moments
    - IV Diagnostics and Tests
    - References
  - Chapter 21. Time Series Regression
    - Serial Correlation Theory
    - Testing for Serial Correlation
    - Estimating AR Models
    - ARIMA Theory
    - Estimating ARIMA Models
    - ARMA Equation Diagnostics
    - References
  - Chapter 22. Forecasting from an Equation
    - Forecasting from Equations in EViews
    - An Illustration
    - Forecast Basics
    - Forecasts with Lagged Dependent Variables
    - Forecasting with ARMA Errors
    - Forecasting from Equations with Expressions
    - Forecasting with Nonlinear and PDL Specifications
    - References
  - Chapter 23. Specification and Diagnostic Tests
    - Background
    - Coefficient Diagnostics
    - Residual Diagnostics
    - Stability Diagnostics
    - Applications
    - References
- Part V. Advanced Single Equation Analysis
  - Chapter 24. ARCH and GARCH Estimation
    - Basic ARCH Specifications
    - Estimating ARCH Models in EViews
    - Working with ARCH Models
    - Additional ARCH Models
    - Examples
    - References
  - Chapter 25. Cointegrating Regression
    - Background
    - Estimating a Cointegrating Regression
    - Testing for Cointegration
    - Working with an Equation
    - References
  - Chapter 26. Discrete and Limited Dependent Variable Models
    - Binary Dependent Variable Models
    - Ordered Dependent Variable Models
    - Censored Regression Models
    - Truncated Regression Models
    - Count Models
    - Technical Notes
    - References
  - Chapter 27. Generalized Linear Models
    - Overview
    - How to Estimate a GLM in EViews
    - Examples
    - Working with a GLM Equation
    - Technical Details
    - References
  - Chapter 28. Quantile Regression
    - Estimating Quantile Regression in EViews
    - Views and Procedures
    - Background
    - References
  - Chapter 29. The Log Likelihood (LogL) Object
    - Overview
    - Specification
    - Estimation
    - LogL Views
    - LogL Procs
    - Troubleshooting
    - Limitations
    - Examples
    - References
- Part VI. Advanced Univariate Analysis
  - Chapter 30. Univariate Time Series Analysis
    - Unit Root Testing
    - Panel Unit Root Test
    - Variance Ratio Test
    - BDS Independence Test
    - References
- Part VII. Multiple Equation Analysis
  - Chapter 31. System Estimation
    - Background
    - System Estimation Methods
    - How to Create and Specify a System
    - Working With Systems
    - Technical Discussion
    - References
  - Chapter 32. Vector Autoregression and Error Correction Models
    - Vector Autoregressions (VARs)
    - Estimating a VAR in EViews
    - VAR Estimation Output
    - Views and Procs of a VAR
    - Structural (Identified) VARs
    - Vector Error Correction (VEC) Models
    - A Note on Version Compatibility
    - References
  - Chapter 33. State Space Models and the Kalman Filter
    - Background
    - Specifying a State Space Model in EViews
    - Working with the State Space
    - Converting from Version 3 Sspace
    - Technical Discussion
    - References
  - Chapter 34. Models
    - Overview
    - An Example Model
    - Building a Model
    - Working with the Model Structure
    - Specifying Scenarios
    - Using Add Factors
    - Solving the Model
    - Working with the Model Data
    - References
- Part VIII. Panel and Pooled Data
  - Chapter 35. Pooled Time Series, Cross-Section Data
    - The Pool Workfile
    - The Pool Object
    - Pooled Data
    - Setting up a Pool Workfile
    - Working with Pooled Data
    - Pooled Estimation
    - References
  - Chapter 36. Working with Panel Data
    - Structuring a Panel Workfile
    - Panel Workfile Display
    - Panel Workfile Information
    - Working with Panel Data
    - Basic Panel Analysis
    - References
  - Chapter 37. Panel Estimation
    - Estimating a Panel Equation
    - Panel Estimation Examples
    - Panel Equation Testing
    - Estimation Background
    - References
- Part IX. Advanced Multivariate Analysis
  - Chapter 38. Cointegration Testing
    - Johansen Cointegration Test
    - Single-Equation Cointegration Tests
    - Panel Cointegration Testing
    - References
  - Chapter 39. Factor Analysis
    - Creating a Factor Object
    - Rotating Factors
    - Estimating Scores
    - Factor Views
    - Factor Procedures
    - Factor Data Members
    - An Example
    - Background
    - References
- Appendix B. Estimation and Solution Options
  - Setting Estimation Options
  - Optimization Algorithms
  - Nonlinear Equation Solution Methods
  - References
- Appendix C. Gradients and Derivatives
  - Gradients
  - Derivatives
  - References
- Appendix D. Information Criteria
  - Definitions
  - Using Information Criteria as a Guide to Model Selection
  - References
- Appendix E. Long-run Covariance Estimation
  - Technical Discussion
  - Kernel Function Properties
  - References
- Index
  - Symbols
  - Numerics
where N − p is the degrees of freedom under the alternative and φ̂ is an estimate of the dispersion. EViews will estimate φ̂ under the alternative hypothesis using the method specified in your equation.
We point out that the Ramsey test results (and all other GLM LR test statistics) presented here may be problematic since they rely on the GLM variance assumption. Papke and Wooldridge offer a robust LM formulation for the Ramsey RESET test. This test is not currently built into EViews, but may be constructed with some effort using auxiliary results provided by EViews (see Papke and Wooldridge, p. 625 for details on the test construction).
Technical Details
The following discussion offers a brief technical summary of GLMs, describing specification, estimation, and hypothesis testing in this framework. Those wishing greater detail should consult McCullagh and Nelder's (1989) monograph or the book-length survey by Hardin and Hilbe (2007).
Distribution
A GLM assumes that the Y_i are independent random variables following a linear exponential family distribution with density:

    f(y_i, θ_i, φ, w_i) = exp[ (y_i θ_i − b(θ_i)) / (φ/w_i) + c(y_i, φ, w_i) ]    (27.6)

where b and c are distribution-specific functions. θ_i = θ(μ_i), which is termed the canonical parameter, fully parameterizes the distribution in terms of the conditional mean; the dispersion value φ is a possibly known scale nuisance parameter; and w_i is a known prior weight that corrects for unequal scaling between observations with otherwise constant φ.
The exponential family assumption implies that the mean and variance of Y_i may be written as

    E(Y_i) = b′(θ_i) = μ_i
    Var(Y_i) = (φ/w_i) b″(θ_i) = (φ/w_i) V_μ(μ_i)    (27.7)

where b′(θ_i) and b″(θ_i) are the first and second derivatives of the b function, respectively, and V_μ is a distribution-specific variance function that depends only on μ_i.
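The identities in (27.7) are easy to verify numerically for a particular family. The following Python sketch (purely illustrative, not an EViews feature; the value μ = 2.5 is arbitrary) checks them for the Poisson member of the family, where θ = log(μ) and b(θ) = e^θ:

```python
import math

# Poisson member of the linear exponential family: theta = log(mu), b(theta) = exp(theta).
def b(theta):
    return math.exp(theta)

mu = 2.5
theta = math.log(mu)  # canonical parameter
h = 1e-5

# Numerical first and second derivatives of b at theta.
b1 = (b(theta + h) - b(theta - h)) / (2 * h)
b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2

print(round(b1, 3))  # E(Y) = b'(theta) = mu        -> 2.5
print(round(b2, 3))  # V_mu(mu) = b''(theta) = mu   -> 2.5 (Poisson variance function)
```

Both derivatives recover μ, confirming that for the Poisson the mean equals the variance function.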
EViews supports the following exponential family distributions:
| Family | θ_i | b(θ_i) | V_μ(μ_i) | φ |
|---|---|---|---|---|
| Normal | μ_i | θ_i²/2 | 1 | σ² |
| Gamma | −1/μ_i | −log(−θ_i) | μ_i² | ν |
| Inverse Gaussian | −1/(2μ_i²) | −(−2θ_i)^(1/2) | μ_i³ | λ |
| Poisson | log(μ_i) | e^(θ_i) | μ_i | 1 |
| Binomial Proportion (n_i trials) | log(μ_i/(1 − μ_i)) | log(1 + e^(θ_i)) | μ_i(1 − μ_i) | 1 |
| Negative Binomial (k_i is known) | log(k_iμ_i/(1 + k_iμ_i)) | −log(1 − e^(θ_i))/k_i | μ_i(1 + k_iμ_i) | 1 |
The corresponding density functions for each of these distributions are given by:
• Normal

    f(y_i, μ_i, σ, w_i) = (2πσ²/w_i)^(−1/2) exp[ −(y_i² − 2y_iμ_i + μ_i²) / (2σ²/w_i) ]    (27.8)

  for −∞ < y_i < ∞.

• Gamma

    f(y_i, μ_i, r_i) = (y_i r_i/μ_i)^(r_i) exp(−y_i/(μ_i/r_i)) / (y_i Γ(r_i))    (27.9)

  for y_i > 0, where r_i = w_i/ν.

• Inverse Gaussian

    f(y_i, μ_i, λ, w_i) = (2π y_i³ λ/w_i)^(−1/2) exp[ −(y_i − μ_i)² / (2 y_i μ_i² (λ/w_i)) ]    (27.10)

  for y_i > 0.

• Poisson

    f(y_i, μ_i) = μ_i^(y_i) exp(−μ_i) / y_i!    (27.11)

  for y_i = 0, 1, 2, …. The dispersion is restricted to be 1 and prior weighting is not permitted.
• Binomial Proportion
    f(y_i, n_i, μ_i) = C(n_i, n_i y_i) μ_i^(n_i y_i) (1 − μ_i)^(n_i(1 − y_i))    (27.12)

  for 0 ≤ y_i ≤ 1, where C(n, m) denotes the binomial coefficient and n_i = 1, 2, … is the number of binomial trials. The dispersion is restricted to be 1 and the prior weights are w_i = n_i.
• Negative Binomial

    f(y_i, μ_i, k_i) = [Γ(y_i + 1/k_i) / (Γ(y_i + 1) Γ(1/k_i))] (k_iμ_i/(1 + k_iμ_i))^(y_i) (1/(1 + k_iμ_i))^(1/k_i)    (27.13)

  for y_i = 0, 1, 2, …. The dispersion is restricted to be 1 and prior weighting is not permitted.
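As a sanity check on a density such as (27.13), one can confirm numerically that it defines a proper probability mass function with mean μ. The Python sketch below (illustrative only; the values μ = 3, k = 0.5 and the name `negbin_pmf` are our own) evaluates the negative binomial density in logs via `math.lgamma`:

```python
import math

def negbin_pmf(y, mu, k):
    # f(y; mu, k) from (27.13), evaluated in logs for numerical stability.
    log_f = (math.lgamma(y + 1.0 / k) - math.lgamma(y + 1) - math.lgamma(1.0 / k)
             + y * math.log(k * mu / (1 + k * mu))
             + (1.0 / k) * math.log(1.0 / (1 + k * mu)))
    return math.exp(log_f)

mu, k = 3.0, 0.5
total = sum(negbin_pmf(y, mu, k) for y in range(500))
mean = sum(y * negbin_pmf(y, mu, k) for y in range(500))
print(round(total, 6), round(mean, 6))  # -> 1.0 3.0 (sums to one, with mean mu)
```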
In addition, EViews offers support for the following quasi-likelihood families:
| Quasi-Likelihood Family | V_μ(μ) |
|---|---|
| Poisson | μ |
| Binomial Proportion | μ(1 − μ) |
| Negative Binomial (k) | μ(1 + kμ) |
| Power Mean (r) | μ^r |
| Exponential Mean | e^μ |
| Binomial Squared | μ²(1 − μ)² |
The first three entries in the table correspond to overdispersed or prior weighted versions of the specified distribution. The last three entries are pure quasi-likelihood distributions that do not correspond to exponential family distributions. See “Quasi-likelihoods,” beginning on page 323 for additional discussion.
Link
The following table lists the names, functions, and associated range restrictions for the supported links:
| Name | Link Function g(μ) | Range of μ |
|---|---|---|
| Identity | μ | (−∞, ∞) |
| Log | log(μ) | (0, ∞) |
| Log-Complement | log(1 − μ) | (−∞, 1) |
| Logit | log(μ/(1 − μ)) | (0, 1) |
| Probit | Φ⁻¹(μ) | (0, 1) |
| Log-Log | −log(−log(μ)) | (0, 1) |
| Complementary Log-Log | log(−log(1 − μ)) | (0, 1) |
| Inverse | 1/μ | (−∞, ∞) |
| Power (p) | μ^p if p ≠ 0; log(μ) if p = 0 | (0, ∞) |
| Power Odds Ratio (p) | (μ/(1 − μ))^p if p ≠ 0; log(μ/(1 − μ)) if p = 0 | (0, 1) |
| Box-Cox (p) | (μ^p − 1)/p if p ≠ 0; log(μ) if p = 0 | (0, ∞) |
| Box-Cox Odds Ratio (p) | ((μ/(1 − μ))^p − 1)/p if p ≠ 0; log(μ/(1 − μ)) if p = 0 | (0, 1) |
EViews does not restrict the link choices associated with a given distributional family. Thus, it is possible for you to choose a link function that returns invalid mean values for the specified distribution at some parameter values, in which case your likelihood evaluation and estimation will fail.
One important role of the inverse link function is to map the real number domain of the linear index into the range of the dependent variable. Consequently, the choice of link function is often governed in part by the desire to enforce range restrictions on the fitted mean. For example, the mean of a binomial proportion model must lie between 0 and 1, while the Poisson, Gamma, and negative binomial distributions require a positive mean value. Accordingly, the use of a Logit, Probit, Log-Log, Complementary Log-Log, Power Odds Ratio, or Box-Cox Odds Ratio link is common with a binomial distribution, while the Log, Power, and Box-Cox families are generally viewed as more appropriate for Poisson or Gamma distribution data.
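The range-enforcing role of the inverse link is easy to see in code. The Python sketch below (an illustration, not EViews functionality; the dictionary of links is our own construction) shows that any real-valued index η is mapped into the appropriate mean range:

```python
import math

# Inverse link functions: eta (any real number) -> fitted mean in the required range.
inverse_links = {
    "identity": lambda eta: eta,                       # mu in (-inf, inf)
    "log":      lambda eta: math.exp(eta),             # mu in (0, inf)
    "logit":    lambda eta: 1 / (1 + math.exp(-eta)),  # mu in (0, 1)
}

for eta in (-4.0, 0.0, 4.0):
    mu = inverse_links["logit"](eta)
    assert 0 < mu < 1                     # logit keeps a binomial mean in (0, 1)
    assert inverse_links["log"](eta) > 0  # log link keeps a Poisson/Gamma mean positive
```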
EViews will default to use the canonical link for a given distribution. The canonical link is the function that equates the canonical parameter θ of the exponential family distribution and the linear predictor, η = g(μ) = θ(μ). The canonical links for relevant distributions are given by:
| Family | Canonical Link |
|---|---|
| Normal | Identity |
| Gamma | Inverse |
| Inverse Gaussian | Power (p = −2) |
| Poisson | Log |
| Binomial Proportion | Logit |
The negative binomial canonical link is not supported in EViews so the log link is used as the default choice in this case. We note that while the canonical link offers computational and conceptual convenience, it is not necessarily the best choice for a given problem.
Quasi-likelihoods
Wedderburn (1974) proposed the method of maximum quasi-likelihood for estimating regression parameters when one has knowledge of a mean-variance relationship for the response, but is unwilling or unable to commit to a valid fully specified distribution function.
Under the assumption that the Y_i are independent with mean μ_i and variance Var(Y_i) = (φ/w_i)V_μ(μ_i), the function
    U_i = u(μ_i, y_i, φ, w_i) = (y_i − μ_i) / [(φ/w_i) V_μ(μ_i)]    (27.14)

has the properties of an individual contribution to a score. Accordingly, the integral,

    Q(μ_i, y_i, φ, w_i) = ∫ from y_i to μ_i of (y_i − t) / [(φ/w_i) V_μ(t)] dt    (27.15)
if it exists, should behave very much like a log-likelihood contribution. We may use the individual contributions Q_i to define the quasi-log-likelihood and the scaled and unscaled quasi-deviance functions
    q(μ, y, φ, w) = ∑_{i=1}^{N} Q(μ_i, y_i, φ, w_i)
    D*(μ, y, φ, w) = −2 q(μ, y, φ, w)    (27.16)
    D(μ, y, w) = φ D*(μ, y, φ, w)
We may obtain estimates of the coefficients by treating the quasi-likelihood q(μ, y, φ, w) as though it were a conventional likelihood and maximizing it with respect to β. As with conventional GLM likelihoods, the quasi-ML estimate of β does not depend on the value of the dispersion parameter φ. The dispersion parameter is conventionally estimated using the Pearson χ² statistic, but if the mean-variance assumption corresponds to a valid exponential family distribution, one may also employ the deviance statistic.
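For the Poisson variance assumption V_μ(μ) = μ (with φ = w = 1), the integral (27.15) has the closed form Q = y log(μ/y) − (μ − y), which is the Poisson log-likelihood kernel. The Python sketch below (illustrative, with arbitrary values y = 2, μ = 4; the function names are our own) confirms the numerical integral against the closed form:

```python
import math

def Q_numeric(mu, y, n=100000):
    # Trapezoidal evaluation of (27.15) with V(t) = t and phi = w = 1 (Poisson variance).
    a, b = y, mu
    h = (b - a) / n
    f = lambda t: (y - t) / t
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

def Q_closed(mu, y):
    # Integrating (y - t)/t from y to mu gives y*log(mu/y) - (mu - y).
    return y * math.log(mu / y) - (mu - y)

mu, y = 4.0, 2.0
print(round(Q_numeric(mu, y), 6), round(Q_closed(mu, y), 6))  # both -> -0.613706
```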
For some mean-variance specifications, the quasi-likelihood function corresponds to an ordinary likelihood in the linear exponential family, and the method of maximum quasi-likelihood is equivalent to ordinary maximum likelihood. For other specifications, there is no corresponding likelihood function. In both cases, the distributional properties of the maximum quasi-likelihood estimator will be analogous to those obtained from maximizing a valid likelihood (McCullagh 1983).
We emphasize the fact that quasi-likelihoods offer flexibility in the mean-variance specification, allowing for variance assumptions that extend beyond those implied by exponential family distribution functions. One important example occurs when we modify the variance function for a Poisson, Binomial Proportion, or Negative Binomial distribution to allow a free dispersion parameter.
Furthermore, since the quasi-likelihood framework only requires specification of the mean and variance, it may be used to relax distributional restrictions on the form of the response data. For example, while we are unable to evaluate the Poisson likelihood for non-integer data, there are no such problems for the corresponding quasi-likelihood based on mean-variance equality.
A list of common quasi-likelihood mean-variance assumptions is provided below, along with names for the corresponding exponential family distribution:
| V_μ(μ) | Restrictions | Distribution |
|---|---|---|
| 1 | None | Normal |
| μ | μ > 0, y ≥ 0 | Poisson |
| μ² | μ > 0, y > 0 | Gamma |
| μ^r | μ > 0, r ≠ 0, 1, 2 | --- |
| e^μ | None | --- |
| μ(1 − μ) | 0 < μ < 1, 0 ≤ y ≤ 1 | Binomial Proportion |
| μ²(1 − μ)² | 0 < μ < 1, 0 ≤ y ≤ 1 | --- |
| μ(1 + kμ) | μ > 0, y ≥ 0 | Negative Binomial |
Note that the power mean μ^r, exponential mean e^μ, and squared binomial proportion μ²(1 − μ)² variance assumptions do not correspond to exponential family distributions.
Estimation
Estimation of GLM models may be divided into the estimation of three basic components: the β coefficients, the coefficient covariance matrix Σ, and the dispersion parameter φ.
Coefficient Estimation
The estimation of β is accomplished using the method of maximum likelihood (ML). Let y = (y_1, …, y_N)′ and μ = (μ_1, …, μ_N)′. We may write the log-likelihood function as
    l(μ, y, φ, w) = ∑_{i=1}^{N} log f(y_i, θ_i, φ, w_i)    (27.17)

Differentiating l(μ, y, φ, w) with respect to β yields

    ∂l/∂β = ∑_{i=1}^{N} [∂log f(y_i, θ_i, φ, w_i)/∂θ_i] (∂θ_i/∂β)

          = ∑_{i=1}^{N} [(y_i − b′(θ_i))/(φ/w_i)] (∂θ_i/∂μ) (∂μ_i/∂η) (∂η_i/∂β)    (27.18)

          = ∑_{i=1}^{N} (w_i/φ) [(y_i − μ_i)/V_μ(μ_i)] (∂μ_i/∂η) X_i

where the last equality uses the fact that ∂θ_i/∂μ = V_μ(μ_i)^(−1). Since the scalar dispersion parameter φ is incidental to the first-order conditions, we may ignore it when estimating β. In practice this is accomplished by evaluating the likelihood function at φ = 1.
It will prove useful in our discussion to define the scaled deviance D* and the unscaled deviance D as

    D*(μ, y, φ, w) = −2 {l(μ, y, φ, w) − l(y, y, φ, w)}    (27.19)
    D(μ, y, w) = φ D*(μ, y, φ, w)

respectively. The scaled deviance D* compares the likelihood function for the saturated (unrestricted) log-likelihood, l(y, y, φ, w), with the log-likelihood function evaluated at an arbitrary μ, l(μ, y, φ, w).

The unscaled deviance D is simply the scaled deviance multiplied by the dispersion, or equivalently, the scaled deviance evaluated at φ = 1. It is easy to see that minimizing either deviance with respect to β is equivalent to maximizing the log-likelihood with respect to β.
In general, solving for the first-order conditions for b requires an iterative approach. EViews offers four different algorithms for obtaining solutions: Quadratic Hill Climbing, Newton-Raphson, BHHH, and IRLS - Fisher Scoring. All of these methods are variants of Newton’s method but differ in the method for computing the gradient weighting matrix used in coefficient updates. The first three methods are described in “Optimization Algorithms” on page 755.
IRLS, which stands for Iterated Reweighted Least Squares, is a commonly used algorithm for estimating GLM models. IRLS is equivalent to Fisher Scoring, a Newton-method variant that employs the Fisher Information (negative of the expected Hessian matrix) as the update weighting matrix in place of the negative of the observed Hessian matrix used in standard Newton-Raphson, or the outer-product of the gradients (OPG) used in BHHH.
In the GLM context, the IRLS-Fisher Scoring coefficient updates have a particularly simple form that may be implemented using weighted least squares, where the weights are known functions of the fitted mean that are updated at each iteration. For this reason, IRLS is particularly attractive in cases where one does not have access to custom software for estimating GLMs. Moreover, in cases where one’s preference is for an observed-Hessian Newton method, the least squares nature of the IRLS updates make the latter well-suited to refining starting values prior to employing one of the other methods.
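As an illustration of the idea (a bare-bones sketch, not EViews code), the following Python routine fits a one-regressor Poisson GLM with log link by IRLS. With the canonical log link the IRLS weights reduce to μ_i, and each update is a weighted least squares fit of the working response z = η + (y − μ)/μ. The data are made up:

```python
import math

def irls_poisson(x, y, iters=25):
    # IRLS / Fisher scoring for a Poisson GLM with log link:
    # eta = b0 + b1*x, mu = exp(eta), weights w = mu,
    # working response z = eta + (y - mu)/mu; update by weighted least squares.
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        w = mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Solve the 2x2 weighted normal equations by hand.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1

# Made-up count data with roughly log-linear growth in x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [2, 2, 3, 4, 5, 7]
b0, b1 = irls_poisson(x, y)
```

At convergence the first-order conditions (27.18) hold, i.e. the weighted residuals are orthogonal to the regressors.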
Coefficient Covariance Estimation
You may choose from a variety of estimators for Σ, the covariance matrix of β̂. In describing the various approaches, it will be useful to have expressions at hand for the expected Hessian (I), the observed Hessian (H), and the outer-product of the gradients (J) for GLM models. Let X = (X_1, X_2, …, X_N)′. Then given estimates of β̂ and the dispersion φ̂ (see "Dispersion Estimation," on page 327), we may write

    Î = −E[∂²l/∂β∂β′] evaluated at β̂ = X′Λ̂_I X
    Ĥ = −[∂²l/∂β∂β′] evaluated at β̂ = X′Λ̂_H X    (27.20)
    Ĵ = ∑_{i=1}^{N} [∂log f_i/∂β][∂log f_i/∂β′] evaluated at β̂ = X′Λ̂_J X

where Λ̂_I, Λ̂_H, and Λ̂_J are diagonal matrices with corresponding i-th diagonal elements

    λ̂_{I,i} = (w_i/φ̂) V_μ(μ̂_i)^(−1) (∂μ̂_i/∂η)²
    λ̂_{H,i} = λ̂_{I,i} + (w_i/φ̂)(y_i − μ̂_i) [ V_μ(μ̂_i)^(−2) (∂V_μ(μ̂_i)/∂μ) (∂μ̂_i/∂η)² − V_μ(μ̂_i)^(−1) (∂²μ̂_i/∂η²) ]    (27.21)
    λ̂_{J,i} = (w_i/φ̂)² (y_i − μ̂_i)² V_μ(μ̂_i)^(−2) (∂μ̂_i/∂η)²
Given correct specification of the likelihood, asymptotically consistent estimators for Σ may be obtained by taking the inverse of one of these estimators of the information matrix. In practice, one typically matches the covariance matrix estimator with the method of estimation (i.e., using the inverse of the expected information estimator Σ̂_I = Î^(−1) when estimation is performed using IRLS), but mirroring is not required. By default, EViews will pair the estimation and covariance methods, but you are free to mix and match as you see fit.
If the variance function is incorrectly specified, the GLM inverse information covariance estimators are no longer consistent for Σ. The Huber-White sandwich estimator (Huber 1967, White 1980) permits non-GLM variances and is robust to misspecification of the variance function. EViews offers two forms for the estimator; you may choose between one that employs the expected information (Σ̂_IJ = Î^(−1) Ĵ Î^(−1)) or one that uses the observed Hessian (Σ̂_HJ = Ĥ^(−1) Ĵ Ĥ^(−1)).
Lastly, you may choose to estimate the coefficient covariance with or without a degree-of-freedom correction. In practical terms, this computation is most easily handled by using a non-d.f.-corrected version of φ̂ in the basic calculation, then multiplying the coefficient covariance matrix by N/(N − k) when you want to apply the correction.
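For intuition, consider the simplest possible case: an intercept-only Poisson model with w_i = 1 and φ = 1, where Î = ∑μ̂ and Ĵ = ∑(y_i − μ̂)², so the matrices in (27.20) reduce to scalars. The Python sketch below (our own made-up counts, deliberately more variable than Poisson) computes the inverse expected information variance, the Σ̂_IJ sandwich, and the d.f.-corrected sandwich:

```python
import math

# Intercept-only Poisson GLM: mu = exp(b0), and the MLE gives mu_hat = mean(y).
y = [1, 3, 0, 2, 9, 4, 0, 5]             # made-up, overdispersed counts
n, k = len(y), 1
mu = sum(y) / n                           # fitted mean (MLE), here 3.0

I_hat = n * mu                            # expected information: sum of lambda_I = mu
J_hat = sum((yi - mu) ** 2 for yi in y)   # OPG: squared scores sum to sum (y - mu)^2

var_I = 1.0 / I_hat                       # inverse expected information
var_IJ = J_hat / I_hat ** 2               # Huber-White sandwich I^-1 J I^-1
var_IJ_df = var_IJ * n / (n - k)          # optional degree-of-freedom correction

se_I, se_IJ = math.sqrt(var_I), math.sqrt(var_IJ_df)
```

Here the sandwich variance exceeds the information-based variance, reflecting the extra variability in the data relative to the Poisson assumption.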
Dispersion Estimation
Recall that the dispersion parameter φ may be ignored when estimating β. Once we have obtained β̂, we may turn our attention to obtaining an estimate of φ. With respect to the estimation of φ, we may divide the distribution families into two classes: distributions with a free dispersion parameter, and distributions where the dispersion is fixed.
For distributions with a free dispersion parameter (Normal, Gamma, Inverse Gaussian), we must estimate φ. An estimate of the free dispersion parameter may be obtained using the generalized Pearson χ² statistic (Wedderburn 1972, McCullagh 1983),
    φ̂_P = [1/(N − k)] ∑_{i=1}^{N} [w_i (y_i − μ̂_i)² / V_μ(μ̂_i)]    (27.22)
where k is the number of estimated coefficients. In linear exponential family settings, φ may also be estimated using the unscaled deviance statistic (McCullagh 1983),
    φ̂_D = D(μ, y, w) / (N − k)    (27.23)
For distributions where the dispersion is fixed (Poisson, Binomial, Negative Binomial), φ is naturally set to the theoretically prescribed value of 1.0.
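A short Python sketch of the Pearson estimator (27.22) for an intercept-only Poisson fit (made-up, deliberately overdispersed counts; w_i = 1 and V_μ(μ) = μ):

```python
# Pearson chi^2 dispersion estimate (27.22) for a fitted Poisson mean, V(mu) = mu.
y = [1, 3, 0, 2, 9, 4, 0, 5]
n, k = len(y), 1
mu = sum(y) / n                  # intercept-only fit, mu_hat = mean(y) = 3
pearson = sum((yi - mu) ** 2 / mu for yi in y)
phi_hat = pearson / (n - k)
print(round(phi_hat, 4))         # -> 3.0476; > 1 signals overdispersion
```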
In fixed dispersion settings, the theoretical restriction on the dispersion is sometimes violated in the data. This situation is generically termed overdispersion since φ typically exceeds 1.0 (though underdispersion is a possibility). At a minimum, unaccounted-for overdispersion leads to invalid inference, with estimated standard errors of the β̂ typically understating the variability of the coefficient estimates.
The easiest way to correct for overdispersion is by allowing a free dispersion parameter in the variance function, estimating φ using one of the methods described above, and using the estimate when computing the covariance matrix as described in "Coefficient Covariance Estimation," on page 326. The resulting covariance matrix yields what are sometimes termed GLM standard errors.
Bear in mind that estimating φ̂ given a fixed dispersion distribution violates the assumptions of the likelihood, so that standard ML theory does not apply. This approach is, however, consistent with a quasi-likelihood estimation framework (Wedderburn 1974), under which the coefficient estimator and covariance calculations are theoretically justified (see "Quasi-likelihoods," beginning on page 323). We also caution that overdispersion may be evidence of more serious problems with your specification. You should take care to evaluate the appropriateness of your model.
Computational Details
The following provides additional details for the computation of results:
Residuals
There are several different types of residuals that are computed for a GLM specification:
• The ordinary or response residuals are defined as
    ê_oi = y_i − μ̂_i    (27.24)
The ordinary residuals are simply the deviations from the mean in the original scale of the responses.
• The weighted or Pearson residuals are given by

    ê_pi = [(1/w_i) V_μ(μ̂_i)]^(−1/2) (y_i − μ̂_i)    (27.25)

The weighted residuals divide the ordinary residuals by the square root of the unscaled variance. For models with fixed dispersion, the resulting residuals should have unit variance. For models with free dispersion, the weighted residuals may be used to form an estimator of φ.
• The standardized or scaled Pearson residuals are computed as

    ê_si = [(φ̂/w_i) V_μ(μ̂_i)]^(−1/2) (y_i − μ̂_i)    (27.26)
The standardized residuals are constructed to have approximately unit variance.
• The generalized or score residuals are given by

    ê_gi = [(φ̂/w_i) V_μ(μ̂_i)]^(−1) (∂μ̂_i/∂η) (y_i − μ̂_i)    (27.27)
The scores of the GLM specification are obtained by multiplying the explanatory variables by the generalized residuals (Equation (27.18)). Not surprisingly, the generalized residuals may be used in the construction of LM hypothesis tests.
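The four residual types differ only in their scaling, as the following Python sketch shows for a single made-up Poisson observation (y = 5, fitted μ̂ = 3, log link so ∂μ/∂η = μ, and φ = w = 1):

```python
import math

# The four GLM residual types for one observation of a Poisson fit
# (V(mu) = mu, w = 1, phi = 1, and d mu/d eta = mu under the log link).
y_i, mu_i, phi, w = 5.0, 3.0, 1.0, 1.0
V = mu_i
dmu_deta = mu_i

e_ordinary = y_i - mu_i                                  # response residual (27.24)
e_pearson = (y_i - mu_i) / math.sqrt(V / w)              # Pearson residual (27.25)
e_standardized = (y_i - mu_i) / math.sqrt(phi * V / w)   # scaled Pearson (27.26)
e_generalized = dmu_deta * (y_i - mu_i) / (phi * V / w)  # score residual (27.27)
```

With φ = 1 the Pearson and scaled Pearson residuals coincide, and for the canonical Poisson link the score residual collapses back to y − μ̂.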
Sum of Squared Residuals
EViews reports two different sums-of-squared residuals: a basic sum of squared residuals, SSR = ∑ ê²_oi, and the Pearson SSR, SSR_P = ∑ ê²_pi.
Dividing the Pearson SSR by (N − k) produces the Pearson χ² statistic, which may be used as an estimator of φ ("Dispersion Estimation" on page 327) and, in some cases, as a measure of goodness-of-fit.
Log-likelihood and Information Criteria
EViews always computes GLM log-likelihoods using the full specification of the density function: scale factors, inessential constants, and all. The likelihood functions are listed in “Distribution,” beginning on page 319.
If your dispersion specification calls for a fixed value for φ, the fixed value will be used to compute the likelihood. If the distribution and dispersion specification call for φ to be estimated, φ̂ will be used in the evaluation of the likelihood. If the specified distribution calls for a fixed value for φ but you have asked EViews to estimate the dispersion, or if the specified value is not consistent with a valid likelihood, the log-likelihood will not be computed.
The AIC, SIC, and Hannan-Quinn information criteria are computed using the log-likelihood value and the usual definitions (Appendix D. “Information Criteria,” on page 771).
It is worth mentioning that the computed GLM likelihood value for the normal family will differ slightly from the likelihood reported by the corresponding LS estimator. The GLM likelihood follows convention in using a degree-of-freedom corrected estimator for the dispersion, while the LS likelihood uses the uncorrected ML estimator of the residual variance. Accordingly, you should take care not to compare likelihood values estimated using the two methods.
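The difference is easy to quantify: the two reported values are the same Gaussian log-likelihood evaluated at different dispersion estimates. A Python sketch with arbitrary illustrative values (SSR = 40, N = 20, k = 3):

```python
import math

# Gaussian log-likelihood evaluated at two dispersion estimates:
# the d.f.-corrected estimator SSR/(N-k) (GLM convention) versus
# the uncorrected ML estimator SSR/N (LS convention).
def normal_loglik(ssr, n, sigma2):
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ssr / (2 * sigma2)

ssr, n, k = 40.0, 20, 3
ll_glm = normal_loglik(ssr, n, ssr / (n - k))
ll_ls = normal_loglik(ssr, n, ssr / n)
print(round(ll_ls - ll_glm, 4))  # -> 0.1252; the ML dispersion maximizes the likelihood
```

The gap is small but nonzero, which is exactly why the two reported likelihoods should not be compared directly.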
Deviance and Quasi-likelihood
EViews reports the unscaled deviance D(μ, y, w) or quasi-deviance. The quasi-deviance and quasi-likelihood will be reported if the evaluation of the likelihood function is invalid. You may divide the reported deviance by (N − k) to obtain an estimator of the dispersion, or use the deviance to construct likelihood ratio or F-tests.
In addition, you may divide the deviance by the dispersion to obtain the scaled deviance. In some cases, the scaled deviance may be used as a measure of goodness-of-fit.
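For the Poisson family the unscaled deviance takes the familiar closed form D = 2∑[y_i log(y_i/μ̂_i) − (y_i − μ̂_i)]. A Python sketch with made-up counts and an intercept-only fit:

```python
import math

# Unscaled Poisson deviance: D = 2 * sum[ y*log(y/mu) - (y - mu) ],
# with the convention y*log(y/mu) = 0 when y = 0 (saturated fit has mu = y).
def poisson_deviance(y, mu):
    d = 0.0
    for yi, mi in zip(y, mu):
        if yi > 0:
            d += yi * math.log(yi / mi)
        d -= (yi - mi)
    return 2.0 * d

y = [1, 3, 0, 2, 9, 4, 0, 5]
mu = [sum(y) / len(y)] * len(y)   # intercept-only fit: mu_hat = mean(y) = 3
D = poisson_deviance(y, mu)
phi = 1.0                          # Poisson dispersion is fixed at 1
scaled_D = D / phi                 # scaled deviance D* = D / phi
```

Dividing D by (N − k) here gives the deviance-based dispersion estimate (27.23) for these data.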
Restricted Deviance and LR Statistic
The restricted deviance and restricted quasi-likelihood reported on the main page are the values for the constant only model.