
4.17 Stationarity

times, not on where they begin. Thus in this special case, knowing a process is weakly stationary is sufficient to conclude it is stationary. In general, stationarity can be quite difficult to prove, even for simple processes.

Strict Stationarity

In fact the above is not the definition of stationarity used in the mathematical and statistical literature, but it is equivalent to it. We pause for a moment to describe the more fundamental (but abstract) definition and its relation to the above definition, but the reader should keep in mind that it is the above definition that is the important one for practice: it is the definition that is almost always used to verify whether a process is stationary or not.

To state the alternative definition, recall that a random process {Xt; t ∈ T} can be considered to be a mapping from a probability space (Ω, F, P) into a space of sequences or waveforms {xt; t ∈ T}, and that the inverse image formula implies a probability measure on this complicated space, called a process distribution, say PX, i.e., PX(F) = P({ω : {Xt(ω); t ∈ T} ∈ F}). The abstract definition of stationarity places a condition on the process distribution: a random process {Xt; t ∈ T} is stationary if the process distribution PX is unchanged by shifting, that is, if

PX({{xt; t ∈ T} : {xt; t ∈ T} ∈ F}) = PX({{xt; t ∈ T} : {xt+τ; t ∈ T} ∈ F}); all F, τ.    (4.127)

The only difference between the left and right hand sides is that the right hand side takes every sample waveform and shifts it by a common amount τ. If the abstract definition is applied to finite-dimensional events, that is, events which actually depend only on a finite number of sample times, then this definition reduces to that of (4.126). Conversely, it turns out that having this property hold only for finite-dimensional events is enough to imply that the property holds for all possible events, even those depending on an infinite number of samples (such as the event that one gets an infinite binary sequence with limiting relative frequency of heads exactly p). Thus the two definitions of strict stationarity are equivalent.
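As a purely illustrative aside (added here, not part of the original text), the finite-dimensional form of the condition can be checked empirically for a simple process. The sketch below, which assumes NumPy is available and uses a hypothetical iid equiprobable binary process (known to be stationary), estimates the joint pmf of the pair of samples at times (0, 1) and at the shifted times (τ, 1 + τ); for a stationary process the two estimates should agree up to sampling error.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example process: iid, equally likely +1/-1 samples.
n_trials, length, tau = 100_000, 32, 7
x = rng.choice([-1, 1], size=(n_trials, length))

def joint_pmf(a, b):
    """Estimate the joint pmf of two +/-1 valued samples from empirical frequencies."""
    pmf = {}
    for va in (-1, 1):
        for vb in (-1, 1):
            pmf[(va, vb)] = np.mean((a == va) & (b == vb))
    return pmf

# Joint pmf of (X_0, X_1) versus the shifted pair (X_tau, X_{1+tau}).
p_orig = joint_pmf(x[:, 0], x[:, 1])
p_shift = joint_pmf(x[:, tau], x[:, 1 + tau])
for pair in p_orig:
    print(pair, round(p_orig[pair], 3), round(p_shift[pair], 3))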

Why is stationarity important? Are processes that are not stationary interesting? The answer to the first question is that this property leads to the most famous of the laws of large numbers, which will be quoted without proof later. The answer to the second question is yes; nonstationary processes play an important role in theory and practice, as will be seen by example. In particular, some nonstationary processes will have a form of law of


large numbers, and others will have no such property, yet be quite useful in modeling real phenomena. Keep in mind that strict stationarity is stronger than weak stationarity. Thus if a process is not even weakly stationary then the process is also not strictly stationary. Two examples of nonstationary processes already encountered are the Binomial counting process and the discrete time Wiener process. These processes have marginal distributions which change with time and hence the processes cannot be stationary. We shall see in chapter 5 that these processes are also not weakly stationary.

4.18 Asymptotically Uncorrelated Processes

We close this chapter with a generalization of the mean ergodic theorem and the weak law of large numbers that demonstrates that weak stationarity plus an asymptotic form of uncorrelation is sufficient to yield a weak law of large numbers by a fairly modest variation of the earlier proof. The class of asymptotically uncorrelated processes is often encountered in practice. Only the result itself is important; the proof is a straightforward but tedious extension of the proof for the uncorrelated case.

An advantage of this more general result over the result for uncorrelated discrete time random processes is that it extends in a sensible way to continuous time processes.

A discrete time weakly stationary process {Xn; n ∈ Z} is said to be asymptotically uncorrelated if its covariance function is absolutely summable, that is, if

Σ_{k=−∞}^{∞} |KX(k)| < ∞.    (4.128)

This condition implies that also

lim_{k→∞} KX(k) = 0,    (4.129)

 

and hence this property can be considered as a weak form of uncorrelation, a generalization of the fact that a weakly stationary process is uncorrelated if KX(k) = 0 when k = 0. If a process is process is uncorrelated, then Xn and Xn+k are uncorrelated random variables for all nonzero k, if it is asymptotically uncorrelated, the correlation between the two random variables decreases to zero as k grows. We use (4.128) rather than (4.129) as the definition as it also ensures the existence of a Fourier transform of KX, which will be useful later, and simplifies the proof of the resulting law of large numbers.
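As an illustrative aside (not in the original text), the following sketch contrasts the two conditions numerically. It assumes NumPy and uses two hypothetical covariance shapes: KX(k) = ρ^|k| is absolutely summable, while KX(k) = 1/(1 + |k|) tends to zero yet its partial sums of |KX(k)| keep growing, so it satisfies (4.129) but not (4.128).

import numpy as np

rho = 0.9

for n in [10, 100, 1_000, 10_000, 100_000]:
    k = np.arange(-n, n + 1)
    summable = np.sum(np.abs(rho ** np.abs(k)))        # approaches (1 + rho) / (1 - rho)
    not_summable = np.sum(1.0 / (1.0 + np.abs(k)))     # grows roughly like 2 log n
    print(f"n={n:>6}  sum of rho^|k| terms={summable:8.3f}  sum of 1/(1+|k|) terms={not_summable:8.3f}")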


Theorem 4.14 (A mean ergodic theorem): Let {Xn} be a weakly stationary asymptotically uncorrelated discrete time random process such that EXn = X̄ is finite and σ²_{Xn} = σ²_X < ∞ for all n. Then

l.i.m._{n→∞} (1/n) Σ_{i=0}^{n−1} Xi = X̄,

that is, (1/n) Σ_{i=0}^{n−1} Xi → X̄ in mean square.

Note that the theorem is indeed a generalization of the previous mean ergodic theorem since a weakly stationary uncorrelated process is trivially an asymptotically uncorrelated process. Note also that the Tchebychev inequality and this theorem immediately imply convergence in probability and hence a weak law of large numbers for weakly stationary asymptotically uncorrelated processes. A common example of asymptotically uncorrelated processes is the class of processes with exponentially decreasing covariance, i.e., of the form KX(k) = σ²_X ρ^{|k|} for |ρ| < 1.
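To make the theorem concrete, the following sketch (an illustration added here, not taken from the text) simulates a hypothetical first-order autoregressive process Xn = m + ρ(Xn−1 − m) + Wn with iid Gaussian Wn, whose covariance decays like ρ^{|k|}, and shows the sample mean approaching the true mean m as n grows. NumPy is assumed.

import numpy as np

rng = np.random.default_rng(1)
rho, m, sigma_w, n = 0.8, 2.0, 1.0, 200_000

# Generate a (hypothetical) AR(1) process with covariance proportional to rho^|k|,
# starting from its stationary marginal distribution.
x = np.empty(n)
x[0] = m + rng.normal(scale=sigma_w / np.sqrt(1 - rho ** 2))
for i in range(1, n):
    x[i] = m + rho * (x[i - 1] - m) + rng.normal(scale=sigma_w)

# Sample means S_n = (1/n) * sum of the first n samples, for increasing n.
for n_used in [100, 1_000, 10_000, 200_000]:
    print(n_used, x[:n_used].mean())   # should approach m = 2.0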

Proof:

Exactly as in the proof of Theorem 4.11 we have, with Sn = (1/n) Σ_{i=0}^{n−1} Xi, that

E[(Sn − X̄)²] = E[(Sn − E Sn)²] = σ²_{Sn} .

From (4.104) we have that

σ²_{Sn} = (1/n²) Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} KX(i − j).    (4.130)

This sum can be rearranged as in Lemma B.1 of appendix B as

σ²_{Sn} = (1/n) Σ_{k=−n+1}^{n−1} (1 − |k|/n) KX(k).    (4.131)

From Lemma B.2,

lim_{n→∞} Σ_{k=−n+1}^{n−1} (1 − |k|/n) KX(k) = Σ_{k=−∞}^{∞} KX(k),


which is finite by assumption, hence dividing by n yields

lim_{n→∞} (1/n) Σ_{k=−n+1}^{n−1} (1 − |k|/n) KX(k) = 0,

so that σ²_{Sn} → 0 as n → ∞, proving the theorem.
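As an added numerical illustration (not in the original), the sketch below evaluates the right-hand side of (4.131) for the hypothetical exponential covariance KX(k) = ρ^|k| and shows that σ²_{Sn} shrinks roughly like (Σ_k KX(k))/n, which is exactly the mechanism used in the proof. NumPy is assumed.

import numpy as np

rho = 0.9

def K(k):
    # Hypothetical absolutely summable covariance, K_X(k) = rho^|k|.
    return rho ** np.abs(k)

total = (1 + rho) / (1 - rho)   # closed-form value of the sum of K_X(k) over all k

for n in [10, 100, 1_000, 10_000]:
    k = np.arange(-(n - 1), n)
    var_sn = np.sum((1 - np.abs(k) / n) * K(k)) / n    # equation (4.131)
    print(f"n={n:>6}  variance of S_n={var_sn:.5f}  (sum of K_X)/n={total / n:.5f}")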

In a similar manner, a continuous time weakly stationary process {X(t); t ∈ R} is said to be asymptotically uncorrelated if its covariance function is absolutely integrable,

∫_{−∞}^{∞} |KX(τ)| dτ < ∞,    (4.132)

which implies that

lim_{τ→∞} KX(τ) = 0.    (4.133)

 

No sensible continuous time random process can be uncorrelated (why not?), but many are asymptotically uncorrelated. For a continuous time process a sample or time average can be defined by replacing the sum operation by an integral, that is, by

 

ST = (1/T) ∫_0^T X(t) dt.    (4.134)

(We will ignore the technical difficulties that must be considered to assure that the integral exists in a suitable fashion. Suffice it to say that an integral can be considered as a limit of sums, and we have seen ways to make such limits of random variables precise.) The definition of weakly stationary extends immediately to continuous time processes. The following result can be proved by extending the discrete time result to continuous time and integrals.
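As an added sketch (not part of the text), the integral in (4.134) can be approximated in just this spirit, as a Riemann sum over samples on a fine grid. The code assumes NumPy and uses a hypothetical randomly phased sinusoid plus noise purely as a convenient waveform; it illustrates the numerical approximation of S_T rather than the theorem below.

import numpy as np

rng = np.random.default_rng(2)
dt = 0.01
theta = rng.uniform(-np.pi, np.pi)   # random phase, fixed for the whole realization

for T in [10.0, 100.0, 1_000.0]:
    t = np.arange(0.0, T, dt)
    # Hypothetical waveform: randomly phased sinusoid plus broadband noise.
    x = np.cos(2 * np.pi * 0.2 * t + theta) + rng.normal(scale=0.5, size=t.size)
    # Riemann-sum approximation of S_T = (1/T) * integral of X(t) over [0, T], as in (4.134).
    s_T = np.sum(x) * dt / T
    print(T, s_T)   # drifts toward 0, the process mean, as T grows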

Theorem 4.15 (A mean ergodic theorem): Let {X(t)} be a weakly stationary asymptotically uncorrelated continuous time random process such that EX(t) = X̄ is finite and σ²_{X(t)} = σ²_X < ∞ for all t. Then

l.i.m._{T→∞} (1/T) ∫_0^T X(t) dt = X̄,

that is, (1/T) ∫_0^T X(t) dt → X̄ in mean square.

As in the discrete time case, convergence in mean square immediately implies convergence in probability, but much additional work is required to


prove convergence with probability one. Also as in the discrete case, we can define a limiting time average

< X(t) > = lim_{T→∞} (1/T) ∫_0^T X(t) dt    (4.135)

and interpret the law of large numbers as stating that the time average < X(t) > exists in some sense and equals the expectation.

4.19 Problems

1. The Cauchy pdf is defined by

fX(x) = 1/(π(1 + x²)) ; x ∈ R.

Find EX. Hint: This is a trick question. Check the definition of Riemann integration over (−∞, ∞) before deciding on a final answer.

2. Suppose that Z is a discrete random variable with probability mass function

pZ(k) = C a^k / (1 + a)^{k+1} , k = 0, 1, · · · .

(This is sometimes called “Pascal’s distribution.”) Find the constant

C and the mean, characteristic function, and variance of Z.

3. State and prove the fundamental theorem of expectation for the case where a discrete random variable X is defined on a probability space where the probability measure is described by a pdf f.

4. Suppose that X is a random variable with pdf fX(α) and characteristic function MX(ju) = E[e^{juX}]. Define the new random variable Y = aX + b, where both a and b are positive constants. Find the pdf fY and characteristic function MY(ju) in terms of fX and MX, respectively.

5. X, Y and Z are iid Gaussian random variables with N(1, 1) distributions. Define the random variables:

V = 2X + Y
W = 3X − 2Z + 5.

(a) Find E[V W ].


(b) Find the two parameters that completely specify the random variable V + W.

(c) Find the characteristic function of the random vector [V W]^t, where t denotes “transpose.”

(d) Find the linear estimator V̂(W) of V, given W.

(e) Is this an optimal estimator? Why?

 

 

(f) The zero-mean random variables X − X̄, Y − Ȳ and Z − Z̄ are the inputs to a black box. There are 2 outputs, A and B. It is determined that the covariance matrix of the vector of its outputs [A B]^t should be

ΛAB =
[ 3  2 ]
[ 2  5 ]

Find expressions for A and B in terms of the black box inputs so that this is in fact the case (design the black box). Your answer does not necessarily have to be unique.

(g) You are told that a different black box results in an output vector [C D]^t with the following covariance matrix:

ΛCD =
[ 2  0 ]
[ 0  7 ]

How much information about output C does output D give you? Briefly but fully justify your answer.

6. Assume that {Xn} is an iid process with Poisson marginal pmf

pX(l) = λ^l e^{−λ} / l! ; l = 0, 1, 2, . . .

and define the process {Nk; k = 0, 1, 2, . . . } by

Nk = 0 for k = 0, and Nk = Σ_{l=1}^{k} Xl for k = 1, 2, . . .

Define the process {Yk} by Yk = (−1)^{Nk} for k = 0, 1, 2, . . . .

(a) Find the mean E[Nk], characteristic function M_{Nk}(ju) = E[e^{juNk}], and pmf p_{Nk}(m).

(b) Find the mean E[Yk] and variance σ²_{Yk}.


(c) Find the conditional pmfs p_{Nk|N1,N2,...,Nk−1}(nk|n1, n2, . . . , nk−1) and p_{Nk|Nk−1}(nk|nk−1). Is {Nk} a Markov process?

7. Let {Xn} be an iid binary random process with equal probability of +1 or −1 occurring at any time n. Show that if Yn is the standardized sum

Yn = n^{−1/2} Σ_{k=0}^{n−1} Xk ,

then

M_{Yn}(ju) = e^{n log cos(u/√n)} .

Find the limit of this expression as n → ∞.

8. Suppose that a fair coin is flipped 1,000,000 times. Write an exact expression for the probability that between 400,000 and 500,000 heads occur. Next use the central limit theorem to find an approximation to this probability. Use tables to evaluate the resulting integral.
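As an optional numerical aside (added here, not part of the original problem), the central limit approximation can also be evaluated with nothing more than the standard normal cdf. The sketch below uses Python's standard-library math.erf together with the binomial mean np and standard deviation √(np(1 − p)); it is only a check on the table lookup the problem asks for, not a substitute for the exact expression.

import math

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 1_000_000, 0.5
mean = n * p                        # 500,000
std = math.sqrt(n * p * (1 - p))    # 500

# CLT approximation to Pr(400,000 <= number of heads <= 500,000).
approx = normal_cdf((500_000 - mean) / std) - normal_cdf((400_000 - mean) / std)
print(approx)   # roughly 0.5, since 400,000 lies 200 standard deviations below the mean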

9. Using an expansion of the form of equation (4.102), show directly that the central limit theorem is satisfied for a sequence of iid random variables with pdf

p(x) = 2/(π(1 + x²)²) , x ∈ R.

Try to use the same expansion for

p(x) = 1/(π(1 + x²)) , x ∈ R.

Explain your result.

10. Suppose that {Xn} is a weakly stationary random process with a marginal pdf fX(α) = 1 for 0 < α < 1 and a covariance function

KX(k) = (1/12) ρ^{|k|}

for all integer k (|ρ| < 1). What is

l.i.m._{n→∞} (1/n) Σ_{k=1}^{n} Xk ?

What is

l.i.m._{n→∞} (1/n²) Σ_{k=1}^{n} Xk ?


11. If {Xn} is an uncorrelated process with constant first and second moments, does it follow for an arbitrary function g that

(1/n) Σ_{i=0}^{n−1} g(Xi) → E[g(X)] as n → ∞

in mean square? (E[g(X)] denotes the unchanging value of E[g(Xn)].) Show that it does follow if the process is iid.

12. Apply problem 4.11 to indicator functions to prove that relative frequencies of order n converge to pmf's in mean square and in probability for iid random processes. That is, if ra(n) is defined as in the chapter, then ra(n) → pX(a) as n → ∞ in both senses for any a in the range space of X.

13. Define the subsets of the real line

Fn = {r : |r| > 1/n} , n = 1, 2, . . .

and

F = {0} .

Show that

F^c = ∪_{n=1}^{∞} Fn .

Use this fact, the Tchebychev inequality, and the continuity of probability to show that if a random variable X has variance 0, then Pr(|X − EX| ≥ ε) = 0 for every ε > 0 and hence Pr(X = EX) = 1.

14. True or False? Given a nonnegative random variable X, for any ε > 0 and a > 0,

Pr(X ≥ ε) ≤ E[e^{aX}] / e^{aε} .

15. Show that for a discrete random variable X,

|E(X)| ≤ E(|X|) .

Repeat for a continuous random variable.

16. This problem considers some useful properties of autocorrelation and covariance functions.


(a) Use the fact that E[(Xt − Xs)²] ≥ 0 to prove that if EXt = EX0 for all t and E(Xt²) = RX(t, t) = RX(0, 0) for all t (that is, if the mean and variance do not depend on time), then

|RX(t, s)| ≤ RX(0, 0)

and

|KX(t, s)| ≤ KX(0, 0) .

Thus both functions take on their maximum value when t = s. This can be interpreted as saying that no random variable can be more correlated with a given random variable than it is with itself.

(b) Show that autocorrelation and covariance functions are symmetric functions, e.g., RX(t, s) = RX(s, t).

17. The Cauchy-Schwarz Inequality: Given random variables X and Y, define a = E(X²)^{1/2} and b = E(Y²)^{1/2}. By considering the quantity E[(X/a ± Y/b)²] prove the following inequality:

|E(XY)| ≤ E(X²)^{1/2} E(Y²)^{1/2} .

18. Given two random processes {Xt; t ∈ T} and {Yt; t ∈ T} defined on the same probability space, the cross-correlation function RXY(t, s); t, s ∈ T is defined as

RXY(t, s) = E(XtYs) .

Note that RX(t, s) = RXX(t, s). Show that RXY is not, in general, a symmetric function of its arguments. Use the Cauchy-Schwarz inequality of problem 4.17 to find an upper bound to |RXY(t, s)| in terms of the autocorrelation functions RX and RY.

19. Let Θ be a random variable described by a uniform pdf on [−π, π] and let Y be a random variable with mean m and variance σ²; assume that Θ and Y are independent. Define the random process {X(t); t ∈ R} by X(t) = Y cos(2πf0t + Θ), where f0 is a fixed frequency in hertz. Find the mean and autocorrelation function of this process. Find the limiting time average

lim_{T→∞} (1/T) ∫_0^T X(t) dt .

(Only in trivial processes such as this can one find exactly such a limiting time average.)


20. Suppose that {Xn} is an iid process with a uniform pdf on [0,1). Does Yn = X1X2 · · · Xn converge in mean square as n → ∞? If so, to what?

21. Let r(n)(a) denote the relative frequency of the letter a in a sequence x0, . . . , xn−1. Show that if we define q(a) = r(n)(a), then q(a) is a valid pmf. (This pmf is called the “sample distribution,” or “empirical distribution.”)

One measure of the distance or difference between two pmf's p and q is

||p − q||_1 = Σ_a |p(a) − q(a)|.

Show that if the underlying process is iid with marginal pmf p, then the empirical pmf will converge to the true pmf in the sense that

lim_{n→∞} ||p − r(n)||_1 = 0.

22. Given two sequences of random variables {Xn; n = 1, 2, . . . } and {Yn; n = 1, 2, . . . } and a random variable X, suppose that with probability one |Xn − X| ≤ Yn for all n and that EYn → 0 as n → ∞. Prove that EXn → EX and that Xn converges to X in probability as n → ∞.

23. This problem provides another example of the use of covariance functions. Say that we have a discrete time random process {Xn} with a covariance function KX(t, s) and a mean function mn = EXn. Say that we are told the value of the past sample, say Xn−1 = α, and we are asked to make a good guess of the next sample on the basis of the old sample. Furthermore, we are required to make a linear guess or estimate, called a prediction, of the form

X̂n(α) = aα + b ,

for some constants a and b. Use ordinary calculus techniques to find the values of a and b that are “best” in the sense of minimizing the mean squared error

E[(Xn − X̂n(Xn−1))²] .

Give your answer in terms of the mean and covariance function. Generalize to a linear prediction of the form

X̂n(Xn−1, Xn−m) = a1Xn−1 + amXn−m + b ,

where m is an arbitrary integer, m ≥ 2. When is am = 0?
