CHAPTER 6. ASYMPTOTIC THEORY FOR LEAST SQUARES
The product $x_i e_i$ is iid (since the observations are iid) and mean zero (since $\mathrm{E}(x_i e_i) = 0$). Define the $k \times k$ covariance matrix
\[
\Omega = \mathrm{E}\left(x_i x_i' e_i^2\right). \qquad (6.8)
\]
We require the elements of $\Omega$ to be finite, or equivalently that $\mathrm{E}\left\|x_i e_i\right\|^2 < \infty$. Using $\left\|x_i e_i\right\|^2 = \left\|x_i\right\|^2 e_i^2$ and the Cauchy-Schwarz Inequality (B.20),
\[
\mathrm{E}\left\|x_i e_i\right\|^2 = \mathrm{E}\left(\left\|x_i\right\|^2 e_i^2\right) \le \left(\mathrm{E}\left\|x_i\right\|^4\right)^{1/2} \left(\mathrm{E} e_i^4\right)^{1/2} \qquad (6.9)
\]
which is finite if $x_i$ and $e_i$ have finite fourth moments. As $e_i$ is a linear combination of $y_i$ and $x_i$, it is sufficient that the observables have finite fourth moments (Theorem 3.16.1.6).
Assumption 6.4.1 In addition to Assumption 3.16.1, $\mathrm{E}\left(y_i^4\right) < \infty$ and $\mathrm{E}\left\|x_i\right\|^4 < \infty$.
Under Assumption 6.4.1 the CLT (Theorem 2.8.1) can be applied.
Theorem 6.4.1 Under Assumption 1.5.1 and Assumption 6.4.1, as $n \to \infty$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_i e_i \xrightarrow{d} \mathrm{N}\left(0, \Omega\right) \qquad (6.10)
\]
where $\Omega = \mathrm{E}\left(x_i x_i' e_i^2\right)$.
Putting together (6.1), (6.7), and (6.10),
\[
\sqrt{n}\left(\widehat{\beta} - \beta\right) \xrightarrow{d} Q_{xx}^{-1} \, \mathrm{N}\left(0, \Omega\right) = \mathrm{N}\left(0, Q_{xx}^{-1} \Omega Q_{xx}^{-1}\right)
\]
as $n \to \infty$, where the final equality follows from the property that linear combinations of normal vectors are also normal (Theorem B.9.1).
We have derived the asymptotic normal approximation to the distribution of the least-squares estimator.
Theorem 6.4.2 Asymptotic Normality of Least-Squares Estimator
Under Assumption 1.5.1 and Assumption 6.4.1, as $n \to \infty$,
\[
\sqrt{n}\left(\widehat{\beta} - \beta\right) \xrightarrow{d} \mathrm{N}\left(0, V_{\beta}\right)
\]
where
\[
V_{\beta} = Q_{xx}^{-1} \Omega Q_{xx}^{-1}, \qquad (6.11)
\]
$Q_{xx} = \mathrm{E}\left(x_i x_i'\right)$, and $\Omega = \mathrm{E}\left(x_i x_i' e_i^2\right)$.
In the stochastic order notation, Theorem 6.4.2 implies that
\[
\widehat{\beta} = \beta + O_p(n^{-1/2}) \qquad (6.12)
\]
and
\[
\widehat{\beta} - \beta = O_p(n^{-1/2}),
\]
which is stronger than (6.6).
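Theorem 6.4.2 can be checked numerically. The sketch below is illustrative and not from the text: it uses a design of my own choosing in which $Q_{xx} = I$, so the sandwich $Q_{xx}^{-1}\Omega Q_{xx}^{-1}$ reduces to $\Omega = \mathrm{diag}(3.5, 1.5)$ (computed from the stated moments of the design), and compares this with the Monte Carlo covariance of $\sqrt{n}(\widehat{\beta} - \beta)$.

```python
import numpy as np

def sandwich_demo(n=500, reps=2000, seed=0):
    """Compare the Monte Carlo covariance of sqrt(n)*(bhat - beta)
    with the asymptotic sandwich V = Qxx^{-1} Omega Qxx^{-1}."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0])
    draws = np.empty((reps, 2))
    for r in range(reps):
        x1 = rng.normal(size=n)
        X = np.column_stack([x1, np.ones(n)])
        # heteroskedastic error: conditional variance 0.5 + x1^2
        e = rng.normal(size=n) * np.sqrt(0.5 + x1**2)
        y = X @ beta + e
        bhat = np.linalg.lstsq(X, y, rcond=None)[0]
        draws[r] = np.sqrt(n) * (bhat - beta)
    V_mc = np.cov(draws.T)
    # population moments for x1 ~ N(0,1): Qxx = I, and
    # Omega = diag(E[x1^2(0.5+x1^2)], E[0.5+x1^2]) = diag(3.5, 1.5)
    V_true = np.diag([3.5, 1.5])
    return V_mc, V_true
```

With a few thousand replications the Monte Carlo covariance should be close to `V_true`, entry by entry.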
The matrix $V_{\beta} = \mathrm{avar}(\widehat{\beta})$ is the variance of the asymptotic distribution of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$. Consequently, $V_{\beta}$ is often referred to as the asymptotic covariance matrix of $\widehat{\beta}$. The expression $V_{\beta} = Q_{xx}^{-1} \Omega Q_{xx}^{-1}$ is called a sandwich form. It might be worth noticing that there is a difference between the variance of the asymptotic distribution given in (6.11) and the finite-sample conditional variance in the CEF model as given in (5.11):
\[
V_{\widehat{\beta}} = \frac{1}{n}\left(\frac{1}{n} X'X\right)^{-1} \left(\frac{1}{n} X'DX\right) \left(\frac{1}{n} X'X\right)^{-1}.
\]
While $V_{\widehat{\beta}}$ and $V_{\beta}$ are different, the two are close if $n$ is large. Indeed, as $n \to \infty$,
\[
n V_{\widehat{\beta}} \xrightarrow{p} V_{\beta}.
\]
There is a special case where $\Omega$ and $V_{\beta}$ simplify. We say that $e_i$ is a Homoskedastic Projection Error when
\[
\mathrm{cov}\left(x_i x_i', e_i^2\right) = 0. \qquad (6.13)
\]
Condition (6.13) holds in the homoskedastic linear regression model, but is somewhat broader. Under (6.13) the asymptotic variance formulas simplify as
\[
\Omega = \mathrm{E}\left(x_i x_i'\right) \mathrm{E}\left(e_i^2\right) = Q_{xx} \sigma^2 \qquad (6.14)
\]
\[
V_{\beta} = Q_{xx}^{-1} \Omega Q_{xx}^{-1} = Q_{xx}^{-1} \sigma^2 \equiv V_{\beta}^{0}. \qquad (6.15)
\]
In (6.15) we define $V_{\beta}^{0} = Q_{xx}^{-1} \sigma^2$ whether (6.13) is true or false. When (6.13) is true then $V_{\beta} = V_{\beta}^{0}$; otherwise $V_{\beta} \ne V_{\beta}^{0}$. We call $V_{\beta}^{0}$ the homoskedastic asymptotic covariance matrix.
Theorem 6.4.2 states that the sampling distribution of the least-squares estimator, after rescaling, is approximately normal when the sample size $n$ is sufficiently large. This holds true for all joint distributions of $(y_i, x_i)$ which satisfy the conditions of Assumption 6.4.1, and is therefore broadly applicable. Consequently, asymptotic normality is routinely used to approximate the finite sample distribution of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$.
A difficulty is that for any fixed $n$ the sampling distribution of $\widehat{\beta}$ can be arbitrarily far from the normal distribution. In Figure 6.1 we have already seen a simple example where the least-squares estimate is quite asymmetric and non-normal even for reasonably large sample sizes. The normal approximation improves as $n$ increases, but how large should $n$ be in order for the approximation to be useful? Unfortunately, there is no simple answer to this reasonable question. The trouble is that no matter how large is the sample size, the normal approximation is arbitrarily poor for some data distribution satisfying the assumptions. We illustrate this problem using a simulation.
Let $y_i = \beta_1 x_i + \beta_2 + e_i$ where $x_i$ is $\mathrm{N}(0, 1)$, and $e_i$ is independent of $x_i$ with the Double Pareto density $f(e) = \frac{\alpha}{2}|e|^{-\alpha - 1}$, $|e| \ge 1$. If $\alpha > 2$ the error $e_i$ has zero mean and variance $\alpha/(\alpha - 2)$. As $\alpha$ approaches 2, however, its variance diverges to infinity. In this context the normalized least-squares slope estimator
\[
\sqrt{n \, \frac{\alpha - 2}{\alpha}} \left(\widehat{\beta}_1 - \beta_1\right)
\]
has the $\mathrm{N}(0, 1)$ asymptotic distribution for any $\alpha > 2$. In Figure 6.3 we display the finite sample densities of the normalized estimator $\sqrt{n \, \frac{\alpha - 2}{\alpha}}\left(\widehat{\beta}_1 - \beta_1\right)$, setting $n = 100$ and varying the parameter $\alpha$. For $\alpha = 3.0$ the density is very close to the $\mathrm{N}(0, 1)$ density. As $\alpha$ diminishes the density changes significantly, concentrating most of the probability mass around zero.

Figure 6.3: Density of Normalized OLS estimator with Double Pareto Error
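The Figure 6.3 experiment is straightforward to replicate. In the sketch below (my own construction, not the author's code), Double Pareto errors are drawn by inverse CDF: $|e|$ has CDF $1 - |e|^{-\alpha}$ on $[1, \infty)$, so $|e| = U^{-1/\alpha}$ for uniform $U$, with a random sign; the coefficient values $\beta_1 = 1$, $\beta_2 = 2$ are illustrative assumptions.

```python
import numpy as np

def double_pareto(alpha, size, rng):
    """Draw from f(e) = (alpha/2)|e|^(-alpha-1), |e| >= 1:
    |e| = U^(-1/alpha) with an independent random sign."""
    u = rng.uniform(size=size)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * u ** (-1.0 / alpha)

def normalized_slope(alpha, n=100, reps=2000, seed=0):
    """Replications of sqrt(n*(alpha-2)/alpha)*(bhat1 - beta1), which
    should be approximately N(0,1) for alpha well above 2."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=n)
        e = double_pareto(alpha, n, rng)
        y = 1.0 * x + 2.0 + e          # beta1 = 1, beta2 = 2 (illustrative)
        X = np.column_stack([x, np.ones(n)])
        bhat = np.linalg.lstsq(X, y, rcond=None)[0]
        out[r] = np.sqrt(n * (alpha - 2) / alpha) * (bhat[0] - 1.0)
    return out
```

A histogram of `normalized_slope(3.0)` should look roughly standard normal, while draws for $\alpha$ near 2 concentrate near zero with occasional extreme values.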
Another example is shown in Figure 6.4. Here the model is $y_i = \beta + e_i$ where
\[
e_i = \frac{u_i^k - \mathrm{E}\left(u_i^k\right)}{\left(\mathrm{E}\left(u_i^{2k}\right) - \left(\mathrm{E}\left(u_i^k\right)\right)^2\right)^{1/2}} \qquad (6.16)
\]
and $u_i \sim \mathrm{N}(0, 1)$. We show the sampling distribution of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$ setting $n = 100$, for $k = 1$, 4, 6 and 8. As $k$ increases, the sampling distribution becomes highly skewed and non-normal. The lesson from Figures 6.3 and 6.4 is that the $\mathrm{N}(0, 1)$ asymptotic approximation is never guaranteed to be accurate.
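The error process (6.16) is easy to generate, since the moments of a standard normal are $\mathrm{E}(u^m) = (m-1)!!$ for even $m$ and zero for odd $m$. The sketch below (my own helper, using those exact moments) constructs $e_i$ for several $k$; by design every $e_i$ has mean zero and unit variance, but for large $k$ the distribution is extremely right-skewed.

```python
import numpy as np

def skewed_error(u, k):
    """Construct e_i from (6.16): a standardized k-th power of a
    standard normal, using exact N(0,1) moments."""
    def normal_moment(m):
        # E(u^m) for u ~ N(0,1): 0 if m odd, (m-1)!! if m even
        if m % 2 == 1:
            return 0.0
        out = 1.0
        for j in range(m - 1, 0, -2):
            out *= j
        return out
    mk = normal_moment(k)
    m2k = normal_moment(2 * k)
    return (u**k - mk) / np.sqrt(m2k - mk**2)

rng = np.random.default_rng(0)
u = rng.normal(size=200_000)
for k in (1, 4, 8):
    e = skewed_error(u, k)
    print(k, round(float(e.mean()), 3), round(float((e**3).mean()), 2))
```

For $k = 1$ this reduces to $e_i = u_i$ (symmetric); for $k = 4$ and $k = 8$ the third moment is large and positive, which is what drives the skewness visible in Figure 6.4.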
6.5 Joint Distribution
Theorem 6.4.2 gives the joint asymptotic distribution of the coefficient estimates. We can use the result to study the covariance between the coefficient estimates. For example, suppose $k = 2$ and write the estimates as $(\widehat{\beta}_1, \widehat{\beta}_2)$. For simplicity suppose that the regressors are mean zero. Then we can write
\[
Q_{xx} = \begin{bmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{bmatrix}
\]
where $\sigma_1^2$ and $\sigma_2^2$ are the variances of $x_{1i}$ and $x_{2i}$, and $\rho$ is their correlation. If the error is homoskedastic, then the asymptotic variance matrix for $(\widehat{\beta}_1, \widehat{\beta}_2)$ is $V_{\beta} = Q_{xx}^{-1} \sigma^2$. By the formula for inversion of a $2 \times 2$ matrix,
\[
Q_{xx}^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 \left(1 - \rho^2\right)} \begin{bmatrix} \sigma_2^2 & -\rho \sigma_1 \sigma_2 \\ -\rho \sigma_1 \sigma_2 & \sigma_1^2 \end{bmatrix}.
\]
Thus if $x_{1i}$ and $x_{2i}$ are positively correlated ($\rho > 0$) then $\widehat{\beta}_1$ and $\widehat{\beta}_2$ are negatively correlated (and vice-versa).
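A quick numerical check of this sign result (my own, not part of the text): with unit-variance regressors and $\rho = 0.5$, the homoskedastic asymptotic variance $\sigma^2 Q_{xx}^{-1}$ has a negative off-diagonal, and the implied correlation of the estimates works out to exactly $-\rho$.

```python
import numpy as np

# Q_xx for unit-variance regressors with correlation rho = 0.5,
# and the homoskedastic asymptotic variance V = sigma^2 * Q_xx^{-1}
rho, sigma2 = 0.5, 1.0
Q = np.array([[1.0, rho], [rho, 1.0]])
V = sigma2 * np.linalg.inv(Q)
corr = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])
print(V)
print(corr)   # equals -rho = -0.5 in the unit-variance case
```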
Figure 6.4: Density of Normalized OLS estimator with error process (6.16)
For illustration, Figure 6.5 displays the probability contours of the joint asymptotic distribution of $\widehat{\beta}_1 - \beta_1$ and $\widehat{\beta}_2 - \beta_2$ when $\sigma_1^2 = \sigma_2^2 = \sigma^2 = 1$ and $\rho = 0.5$. The coefficient estimates are negatively correlated since the regressors are positively correlated. This means that if $\widehat{\beta}_1$ is unusually negative, it is likely that $\widehat{\beta}_2$ is unusually positive, or conversely. It is also unlikely that we will observe both $\widehat{\beta}_1$ and $\widehat{\beta}_2$ unusually large and of the same sign.
This finding that the correlation of the regressors is of opposite sign of the correlation of the coefficient estimates is sensitive to the assumption of homoskedasticity. If the errors are heteroskedastic then this relationship is not guaranteed.
This can be seen through a simple constructed example. Suppose that $x_{1i}$ and $x_{2i}$ only take the values $\{-1, +1\}$, symmetrically, with $\Pr(x_{1i} = x_{2i} = 1) = \Pr(x_{1i} = x_{2i} = -1) = 3/8$, and $\Pr(x_{1i} = 1, x_{2i} = -1) = \Pr(x_{1i} = -1, x_{2i} = 1) = 1/8$. You can check that the regressors are mean zero, unit variance and correlation 0.5, which is identical with the setting displayed in Figure 6.5 when the error is homoskedastic.
Now suppose that the error is heteroskedastic. Specifically, suppose that $\mathrm{E}\left(e_i^2 \mid x_{1i} = x_{2i}\right) = \frac{5}{4}$ and $\mathrm{E}\left(e_i^2 \mid x_{1i} \ne x_{2i}\right) = \frac{1}{4}$. You can check that $\mathrm{E}\left(e_i^2\right) = 1$, $\mathrm{E}\left(x_{1i}^2 e_i^2\right) = \mathrm{E}\left(x_{2i}^2 e_i^2\right) = 1$ and $\mathrm{E}\left(x_{1i} x_{2i} e_i^2\right) = \frac{7}{8}$. Therefore
\[
V_{\beta} = Q_{xx}^{-1} \Omega Q_{xx}^{-1}
= \frac{16}{9}
\begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}
\begin{bmatrix} 1 & \frac{7}{8} \\ \frac{7}{8} & 1 \end{bmatrix}
\begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}
= \frac{2}{3}
\begin{bmatrix} 1 & \frac{1}{4} \\ \frac{1}{4} & 1 \end{bmatrix}.
\]
Thus the coefficient estimates $\widehat{\beta}_1$ and $\widehat{\beta}_2$ are positively correlated (their correlation is $1/4$.) The joint probability contours of their asymptotic distribution is displayed in Figure 6.6. We can see how the two estimates are positively associated.
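The matrix algebra in this example can be verified by enumerating the four states of $(x_{1i}, x_{2i})$ directly. The check below (my own code, using the same probabilities and conditional variances as above) reproduces $\Omega$, $V_{\beta}$, and the correlation of $1/4$.

```python
import numpy as np

# the four states: (x1, x2, probability, E[e^2 | x])
states = [
    ( 1,  1, 3/8, 5/4),
    (-1, -1, 3/8, 5/4),
    ( 1, -1, 1/8, 1/4),
    (-1,  1, 1/8, 1/4),
]
Q = np.zeros((2, 2))
Omega = np.zeros((2, 2))
for x1, x2, p, s2 in states:
    x = np.array([[float(x1)], [float(x2)]])
    Q += p * (x @ x.T)          # Qxx = E(x x')
    Omega += p * s2 * (x @ x.T) # Omega = E(x x' e^2)
Qinv = np.linalg.inv(Q)
V = Qinv @ Omega @ Qinv
corr = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])
print(Omega)   # [[1, 7/8], [7/8, 1]]
print(V)       # (2/3) * [[1, 1/4], [1/4, 1]]
print(corr)    # 0.25
```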
Figure 6.5: Contours of Joint Distribution of $(\widehat{\beta}_1, \widehat{\beta}_2)$, homoskedastic case
What we found through this example is that in the presence of heteroskedasticity there is no simple relationship between the correlation of the regressors and the correlation of the parameter estimates.
We can extend the above analysis to study the covariance between coefficient sub-vectors. For example, partitioning $x_i' = \left(x_{1i}', x_{2i}'\right)$ and $\beta' = \left(\beta_1', \beta_2'\right)$, we can write the general model as
\[
y_i = x_{1i}' \beta_1 + x_{2i}' \beta_2 + e_i
\]
and the coefficient estimates as $\widehat{\beta}' = \left(\widehat{\beta}_1', \widehat{\beta}_2'\right)$. Make the partitions
\[
Q_{xx} = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}, \qquad
\Omega = \begin{bmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{bmatrix}. \qquad (6.17)
\]
From (3.37)
\[
Q_{xx}^{-1} = \begin{bmatrix} Q_{11 \cdot 2}^{-1} & -Q_{11 \cdot 2}^{-1} Q_{12} Q_{22}^{-1} \\ -Q_{22 \cdot 1}^{-1} Q_{21} Q_{11}^{-1} & Q_{22 \cdot 1}^{-1} \end{bmatrix}
\]
where $Q_{11 \cdot 2} = Q_{11} - Q_{12} Q_{22}^{-1} Q_{21}$ and $Q_{22 \cdot 1} = Q_{22} - Q_{21} Q_{11}^{-1} Q_{12}$. Thus when the error is homoskedastic,
\[
\mathrm{cov}\left(\widehat{\beta}_1, \widehat{\beta}_2\right) = -\sigma^2 Q_{11 \cdot 2}^{-1} Q_{12} Q_{22}^{-1} \qquad (6.18)
\]
which is a matrix generalization of the two-regressor case. In the general case, you can show that (Exercise 6.5)
\[
V_{\beta} = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}
\]
where
\[
V_{11} = Q_{11 \cdot 2}^{-1}\left(\Omega_{11} - Q_{12} Q_{22}^{-1} \Omega_{21} - \Omega_{12} Q_{22}^{-1} Q_{21} + Q_{12} Q_{22}^{-1} \Omega_{22} Q_{22}^{-1} Q_{21}\right) Q_{11 \cdot 2}^{-1} \qquad (6.19)
\]
\[
V_{21} = Q_{22 \cdot 1}^{-1}\left(\Omega_{21} - Q_{21} Q_{11}^{-1} \Omega_{11} - \Omega_{22} Q_{22}^{-1} Q_{21} + Q_{21} Q_{11}^{-1} \Omega_{12} Q_{22}^{-1} Q_{21}\right) Q_{11 \cdot 2}^{-1} \qquad (6.20)
\]
\[
V_{22} = Q_{22 \cdot 1}^{-1}\left(\Omega_{22} - Q_{21} Q_{11}^{-1} \Omega_{12} - \Omega_{21} Q_{11}^{-1} Q_{12} + Q_{21} Q_{11}^{-1} \Omega_{11} Q_{11}^{-1} Q_{12}\right) Q_{22 \cdot 1}^{-1} \qquad (6.21)
\]
Unfortunately, these expressions are not easily interpretable.
Figure 6.6: Contours of Joint Distribution of $\widehat{\beta}_1$ and $\widehat{\beta}_2$, heteroskedastic case
6.6 Uniformly Consistent Residuals*
We have described the least-squares residuals $\widehat{e}_i$ as estimates of the errors $e_i$. Are $\widehat{e}_i$ consistent for $e_i$? Notice that we can write the residual as
\[
\widehat{e}_i = y_i - x_i' \widehat{\beta}
= e_i + x_i' \beta - x_i' \widehat{\beta}
= e_i - x_i' \left(\widehat{\beta} - \beta\right). \qquad (6.22)
\]
Since $\widehat{\beta} - \beta \xrightarrow{p} 0$ it seems reasonable to guess that $\widehat{e}_i$ will be close to $e_i$ if $n$ is large.
We can bound the difference in (6.22) using the Schwarz inequality (A.7) to find
\[
\left|\widehat{e}_i - e_i\right| = \left|x_i' \left(\widehat{\beta} - \beta\right)\right| \le \left\|x_i\right\| \left\|\widehat{\beta} - \beta\right\|. \qquad (6.23)
\]
To bound (6.23) we can use $\widehat{\beta} - \beta = O_p(n^{-1/2})$ from Theorem 6.4.2, but we also need to bound the random variable $\left\|x_i\right\|$. The key is Theorem 2.12.1, which shows that $\mathrm{E}\left\|x_i\right\|^4 < \infty$ implies $x_i = o_p\left(n^{1/4}\right)$ uniformly in $i$, or
\[
n^{-1/4} \max_{1 \le i \le n} \left\|x_i\right\| \xrightarrow{p} 0.
\]
Applied to (6.23) we obtain
\[
\max_{1 \le i \le n} \left|\widehat{e}_i - e_i\right| \le \max_{1 \le i \le n} \left\|x_i\right\| \left\|\widehat{\beta} - \beta\right\| = o_p\left(n^{1/4}\right) O_p(n^{-1/2}) = o_p(n^{-1/4}).
\]
We have shown the following.
Theorem 6.6.1 Under Assumptions 1.5.1 and 6.4.1, uniformly in $1 \le i \le n$,
\[
\widehat{e}_i = e_i + o_p(n^{-1/4}). \qquad (6.24)
\]
What about the squared residuals $\widehat{e}_i^2$? Squaring the two sides of (6.24) we obtain
\[
\widehat{e}_i^2 = \left(e_i + o_p(n^{-1/4})\right)^2
= e_i^2 + 2 e_i \, o_p(n^{-1/4}) + o_p(n^{-1/2})
= e_i^2 + o_p(1) \qquad (6.25)
\]
uniformly in $1 \le i \le n$, since $e_i = o_p\left(n^{1/4}\right)$ when $\mathrm{E}|e_i|^4 < \infty$ by Theorem 2.12.1.
Theorem 6.6.2 Under Assumptions 1.5.1 and 6.4.1, uniformly in $1 \le i \le n$,
\[
\widehat{e}_i^2 = e_i^2 + o_p(1).
\]
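Theorem 6.6.1 can be illustrated in simulation: the worst-case gap $\max_i |\widehat{e}_i - e_i|$ shrinks as $n$ grows. The sketch below assumes a simple normal design of my own choosing.

```python
import numpy as np

def max_residual_gap(n, seed=0):
    """Simulate a linear model and return max_i |ehat_i - e_i|."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    beta = np.array([1.0, -2.0])
    e = rng.normal(size=n)
    y = X @ beta + e
    bhat = np.linalg.lstsq(X, y, rcond=None)[0]
    ehat = y - X @ bhat
    # by (6.23) the gap is at most max||x_i|| * ||bhat - beta||
    return float(np.max(np.abs(ehat - e)))

for n in (100, 1000, 10000):
    print(n, max_residual_gap(n))
```

The printed gaps should decline with $n$, consistent with the $o_p(n^{-1/4})$ rate (in fact faster here, since the design has light tails).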
6.7 Asymptotic Leverage*
Recall the definition of leverage from (4.21)
\[
h_{ii} = x_i' \left(X'X\right)^{-1} x_i.
\]
These are the diagonal elements of the projection matrix $P$ and appear in the formula for leave-one-out prediction errors and several covariance matrix estimators. We can show that under iid sampling the leverage values are uniformly asymptotically small.
Let $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote the smallest and largest eigenvalues of a symmetric square matrix $A$, and note that $\lambda_{\max}\left(A^{-1}\right) = \left(\lambda_{\min}(A)\right)^{-1}$.
Since $\frac{1}{n} X'X \xrightarrow{p} Q_{xx} > 0$ then by the CMT, $\lambda_{\min}\left(\frac{1}{n} X'X\right) \xrightarrow{p} \lambda_{\min}\left(Q_{xx}\right) > 0$. (The latter is positive since $Q_{xx}$ is positive definite and thus all its eigenvalues are positive.) Then by the Trace Inequality (A.10)
\[
h_{ii} = x_i' \left(X'X\right)^{-1} x_i
= \mathrm{tr}\left(\left(\frac{1}{n} X'X\right)^{-1} \frac{1}{n} x_i x_i'\right)
\le \lambda_{\max}\left(\left(\frac{1}{n} X'X\right)^{-1}\right) \mathrm{tr}\left(\frac{1}{n} x_i x_i'\right)
= \left(\lambda_{\min}\left(\frac{1}{n} X'X\right)\right)^{-1} \frac{1}{n} \left\|x_i\right\|^2
\le \left(\lambda_{\min}\left(Q_{xx}\right) + o_p(1)\right)^{-1} \frac{1}{n} \max_{1 \le i \le n} \left\|x_i\right\|^2. \qquad (6.26)
\]
Theorem 2.12.1 shows that $\mathrm{E}\left\|x_i\right\|^2 < \infty$ implies
\[
n^{-1/2} \max_{1 \le i \le n} \left\|x_i\right\| \xrightarrow{p} 0
\]
and thus
\[
n^{-1} \max_{1 \le i \le n} \left\|x_i\right\|^2 \xrightarrow{p} 0.
\]
It follows that (6.26) is $o_p(1)$, uniformly in $i$.
Theorem 6.7.1 Under Assumption 1.5.1 and $\mathrm{E}\left\|x_i\right\|^2 < \infty$, uniformly in $1 \le i \le n$, $h_{ii} = o_p(1)$.
Theorem 6.7.1 implies that under random sampling with finite variances and large samples, no individual observation should have a large leverage value. Consequently individual observations should not be influential, unless one of these conditions is violated.
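The conclusion of Theorem 6.7.1 is easy to see numerically: for an iid design, the largest leverage value shrinks as $n$ grows. A sketch (my own design and function names, with iid normal regressors):

```python
import numpy as np

def max_leverage(n, k=3, seed=0):
    """Maximum diagonal element of P = X (X'X)^{-1} X'
    for an n x k matrix of iid N(0,1) regressors."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, k))
    XtX_inv = np.linalg.inv(X.T @ X)
    # h_ii = x_i' (X'X)^{-1} x_i, computed row by row
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    return float(h.max())

for n in (50, 500, 5000):
    print(n, max_leverage(n))
```

Since the diagonal of $P$ sums to $k$, the average leverage is $k/n$; the printed maxima should fall toward zero roughly in proportion.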
6.8 Consistent Covariance Matrix Estimation
In Sections 5.7 and 5.8 we introduced estimators of the finite-sample covariance matrix of the least-squares estimator in the regression model. In this section we show that these estimators are consistent for the asymptotic covariance matrix.
First, consider the covariance matrix estimate constructed under the assumption of homoskedasticity:
\[
\widehat{V}_{\beta}^{0} = \left(\frac{1}{n} X'X\right)^{-1} s^2 = \widehat{Q}_{xx}^{-1} s^2.
\]
Since $\widehat{Q}_{xx} \xrightarrow{p} Q_{xx}$ (Theorem 6.2.1), $s^2 \xrightarrow{p} \sigma^2$ (Theorem 6.3.1), and $Q_{xx}$ is invertible (Assumption 3.16.1), it follows that
\[
\widehat{V}_{\beta}^{0} = \widehat{Q}_{xx}^{-1} s^2 \xrightarrow{p} Q_{xx}^{-1} \sigma^2 = V_{\beta}^{0}
\]
so that $\widehat{V}_{\beta}^{0}$ is consistent for $V_{\beta}^{0}$, the homoskedastic covariance matrix.
Theorem 6.8.1 Under Assumption 1.5.1 and Assumption 3.16.1, $\widehat{V}_{\beta}^{0} \xrightarrow{p} V_{\beta}^{0}$ as $n \to \infty$.
Now consider the heteroskedasticity-robust covariance matrix estimators $\widehat{V}_{\beta}$, $\widetilde{V}_{\beta}$, and $\overline{V}_{\beta}$. Writing
\[
\widehat{\Omega} = \frac{1}{n} \sum_{i=1}^{n} x_i x_i' \widehat{e}_i^2, \qquad (6.27)
\]
\[
\widetilde{\Omega} = \frac{1}{n} \sum_{i=1}^{n} \left(1 - h_{ii}\right)^{-2} x_i x_i' \widehat{e}_i^2
\]
and
\[
\overline{\Omega} = \frac{1}{n} \sum_{i=1}^{n} \left(1 - h_{ii}\right)^{-1} x_i x_i' \widehat{e}_i^2
\]
as moment estimators for $\Omega = \mathrm{E}\left(x_i x_i' e_i^2\right)$, then the covariance matrix estimators are
\[
\widehat{V}_{\beta} = \widehat{Q}_{xx}^{-1} \widehat{\Omega} \widehat{Q}_{xx}^{-1},
\]
\[
\widetilde{V}_{\beta} = \widehat{Q}_{xx}^{-1} \widetilde{\Omega} \widehat{Q}_{xx}^{-1},
\]
and
\[
\overline{V}_{\beta} = \widehat{Q}_{xx}^{-1} \overline{\Omega} \widehat{Q}_{xx}^{-1}.
\]
We can show that $\widehat{\Omega}$, $\widetilde{\Omega}$, and $\overline{\Omega}$ are consistent for $\Omega$. Combined with the consistency of $\widehat{Q}_{xx}$ for $Q_{xx}$ and the invertibility of $Q_{xx}$ we find that $\widehat{V}_{\beta}$, $\widetilde{V}_{\beta}$, and $\overline{V}_{\beta}$ converge in probability to $Q_{xx}^{-1} \Omega Q_{xx}^{-1} = V_{\beta}$. The complete proof is given in Section 6.18.
Theorem 6.8.2 Under Assumption 1.5.1 and Assumption 6.4.1, as $n \to \infty$, $\widehat{\Omega} \xrightarrow{p} \Omega$, $\widetilde{\Omega} \xrightarrow{p} \Omega$, $\overline{\Omega} \xrightarrow{p} \Omega$, $\widehat{V}_{\beta} \xrightarrow{p} V_{\beta}$, $\widetilde{V}_{\beta} \xrightarrow{p} V_{\beta}$, and $\overline{V}_{\beta} \xrightarrow{p} V_{\beta}$.
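The three moment estimators and their sandwich forms take only a few lines of code. The sketch below is my own implementation of the formulas above, applied to an illustrative heteroskedastic design (the same one used earlier, for which $Q_{xx} = I$ and the population $V_{\beta} = \mathrm{diag}(3.5, 1.5)$); since leverage is uniformly small here, all three estimators nearly coincide.

```python
import numpy as np

def robust_covariances(X, y):
    """Qxx_hat and the three Omega estimators of Section 6.8
    (unadjusted, (1-h)^-2 weighted, (1-h)^-1 weighted), returned
    as the corresponding sandwich estimates of V_beta."""
    n = X.shape[0]
    bhat = np.linalg.lstsq(X, y, rcond=None)[0]
    ehat = y - X @ bhat
    Qinv = np.linalg.inv(X.T @ X / n)
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    def omega(w):
        # (1/n) sum_i w_i * ehat_i^2 * x_i x_i'
        return (X * (w * ehat**2)[:, None]).T @ X / n
    return {
        "hat":   Qinv @ omega(np.ones(n)) @ Qinv,
        "tilde": Qinv @ omega((1.0 - h) ** -2) @ Qinv,
        "bar":   Qinv @ omega((1.0 - h) ** -1) @ Qinv,
    }

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([x, np.ones(n)])
e = rng.normal(size=n) * np.sqrt(0.5 + x**2)   # heteroskedastic errors
y = X @ np.array([1.0, 2.0]) + e
V = robust_covariances(X, y)
print(V["hat"])   # near diag(3.5, 1.5) for this design
```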
6.9 Functions of Parameters
Sometimes we are interested in a lower-dimensional function of the parameter vector $\beta = (\beta_1, \ldots, \beta_k)$. For example, we may be interested in a single coefficient $\beta_j$ or a ratio $\beta_j / \beta_l$. In these cases we can write the parameter of interest as a function of $\beta$. Let $h : \mathbb{R}^k \to \mathbb{R}^q$ denote this function and let
\[
\theta = h(\beta)
\]
denote the parameter of interest. The estimate of $\theta$ is
\[
\widehat{\theta} = h(\widehat{\beta}).
\]
By the continuous mapping theorem (Theorem 2.9.1) and the fact $\widehat{\beta} \xrightarrow{p} \beta$ we can deduce that $\widehat{\theta}$ is consistent for $\theta$.
Theorem 6.9.1 Under Assumption 1.5.1 and Assumption 3.16.1, if $h(\beta)$ is continuous at the true value of $\beta$, then as $n \to \infty$, $\widehat{\theta} \xrightarrow{p} \theta$.
Furthermore, by the Delta Method (Theorem 2.10.3) we know that $\widehat{\theta}$ is asymptotically normal.
Theorem 6.9.2 Asymptotic Distribution of Functions of Parameters
Under Assumption 1.5.1 and Assumption 6.4.1, if $h(\beta)$ is continuously differentiable at the true value of $\beta$, then as $n \to \infty$,
\[
\sqrt{n}\left(\widehat{\theta} - \theta\right) \xrightarrow{d} \mathrm{N}\left(0, V_{\theta}\right) \qquad (6.28)
\]
where
\[
V_{\theta} = H_{\beta}' V_{\beta} H_{\beta} \qquad (6.29)
\]
and
\[
H_{\beta} = \frac{\partial}{\partial \beta} h(\beta)'.
\]
In many cases, the function $h(\beta)$ is linear:
\[
h(\beta) = R' \beta
\]
for some $k \times q$ matrix $R$. In this case, $H_{\beta} = R$. In particular, if $R$ is a “selector matrix”
\[
R = \begin{bmatrix} I \\ 0 \end{bmatrix} \qquad (6.30)
\]
so that $\theta = R' \beta = \beta_1$ for $\beta = \left(\beta_1', \beta_2'\right)'$, then
\[
V_{\theta} = \begin{bmatrix} I & 0 \end{bmatrix} V_{\beta} \begin{bmatrix} I \\ 0 \end{bmatrix} = V_{11},
\]
where $V_{11}$ is given in (6.19). Under homoskedasticity the covariance matrix (6.19) simplifies to
\[
V_{11}^{0} = Q_{11 \cdot 2}^{-1} \sigma^2.
\]
We have shown that for the case (6.30) of a subset of coefficients, (6.28) is
\[
\sqrt{n}\left(\widehat{\beta}_1 - \beta_1\right) \xrightarrow{d} \mathrm{N}\left(0, V_{11}\right)
\]
with $V_{11}$ given in (6.19).
6.10 Asymptotic Standard Errors
How do we estimate the covariance matrix $V_{\theta}$ for $\widehat{\theta}$? From (6.29) we see we need estimates of $H_{\beta}$ and $V_{\beta}$. We already have an estimate of the latter, $\widehat{V}_{\beta}$ (or $\widetilde{V}_{\beta}$ or $\overline{V}_{\beta}$). To estimate $H_{\beta}$ we use
\[
\widehat{H}_{\beta} = \frac{\partial}{\partial \beta} h(\widehat{\beta})'.
\]
Putting the parts together we obtain
\[
\widehat{V}_{\theta} = \widehat{H}_{\beta}' \widehat{V}_{\beta} \widehat{H}_{\beta}
\]
as the covariance matrix estimator for $\widehat{\theta}$. As the primary justification for $\widehat{V}_{\theta}$ is the asymptotic approximation (6.28), $\widehat{V}_{\theta}$ is often called an asymptotic covariance matrix estimator.
In particular, when $h(\beta)$ is linear $h(\beta) = R' \beta$ then
\[
\widehat{V}_{\theta} = R' \widehat{V}_{\beta} R.
\]
When $R$ takes the form of a selector matrix as in (6.30) then
\[
\widehat{V}_{\theta} = \widehat{V}_{11} = \left[\widehat{V}_{\beta}\right]_{11},
\]
the upper-left block of the covariance matrix estimate $\widehat{V}_{\beta}$.
When $q = 1$ (so $h(\beta)$ is real-valued), the standard error for $\widehat{\theta}$ is the square root of $n^{-1} \widehat{V}_{\theta}$; that is,
\[
s(\widehat{\theta}) = n^{-1/2} \sqrt{\widehat{V}_{\theta}} = n^{-1/2} \sqrt{\widehat{H}_{\beta}' \widehat{V}_{\beta} \widehat{H}_{\beta}}.
\]
This is known as an asymptotic standard error for $\widehat{\theta}$.
The estimator $\widehat{V}_{\theta}$ is consistent for $V_{\theta}$ under the conditions of Theorem 6.9.2 since $\widehat{V}_{\beta} \xrightarrow{p} V_{\beta}$ by Theorem 6.8.2, and
\[
\widehat{H}_{\beta} = \frac{\partial}{\partial \beta} h(\widehat{\beta})' \xrightarrow{p} \frac{\partial}{\partial \beta} h(\beta)' = H_{\beta}
\]
since $\widehat{\beta} \xrightarrow{p} \beta$ and the function $\frac{\partial}{\partial \beta} h(\beta)'$ is continuous.
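For a concrete instance of these formulas, take the ratio $\theta = \beta_j / \beta_l$, whose gradient has entries $1/\beta_l$ and $-\beta_j/\beta_l^2$. The helper below is my own sketch (not from the text) of the delta-method standard error $s(\widehat{\theta}) = n^{-1/2}\sqrt{\widehat{H}_{\beta}'\widehat{V}_{\beta}\widehat{H}_{\beta}}$, taking as inputs the coefficient estimate and any of the covariance matrix estimates from Section 6.8.

```python
import numpy as np

def ratio_standard_error(bhat, V_hat, n, j=0, l=1):
    """Delta-method standard error for theta = beta_j / beta_l,
    given an estimate V_hat of the asymptotic covariance matrix
    of sqrt(n)*(bhat - beta)."""
    H = np.zeros(len(bhat))
    H[j] = 1.0 / bhat[l]             # d(theta)/d(beta_j)
    H[l] = -bhat[j] / bhat[l] ** 2   # d(theta)/d(beta_l)
    V_theta = H @ V_hat @ H          # H' V H (scalar since q = 1)
    return float(np.sqrt(V_theta / n))

# example with assumed inputs: bhat = (2, 4), V_hat = diag(1, 4), n = 100
se = ratio_standard_error(np.array([2.0, 4.0]), np.diag([1.0, 4.0]), 100)
print(se)
```

Here $H = (1/4, -1/8)'$, so $\widehat{V}_{\theta} = (1/16)(1) + (1/64)(4) = 1/8$ and $s(\widehat{\theta}) = \sqrt{0.125/100}$.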