
that is, the limit exists. If the process is also ergodic, then X̂ = mX, and hence

$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} X_k = m_X \quad\text{with probability 1.}\tag{6.55}$$
The conditions also imply convergence in mean square (an L2 or mean ergodic theorem); that is,
$$\operatorname*{l.i.m.}_{n\to\infty}\,\frac{1}{n}\sum_{k=0}^{n-1} X_k = \hat{X},\tag{6.56}$$
but we shall focus on the convergence with probability 1 form. There are also continuous time versions of the theorem to the effect that under suitable conditions
$$\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} X(t)\,dt = \hat{X}\quad\text{with probability 1,}\tag{6.57}$$
but these are much more complicated to describe because special conditions are needed to ensure the existence of the time average integrals.
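As a concrete illustration of (6.55), the following minimal simulation may help; it is a sketch in Python with numpy, and the Bernoulli parameter and sample sizes are arbitrary choices for illustration, not values from the text.

    import numpy as np

    rng = np.random.default_rng(0)

    # An iid process is stationary and ergodic, so (6.55) applies:
    # the sample average (1/n) * sum_{k=0}^{n-1} X_k should approach m_X.
    m_X = 0.7                                  # illustrative mean
    X = rng.binomial(1, m_X, size=100_000)     # Bernoulli(m_X) samples

    for n in (10, 1_000, 100_000):
        print(n, X[:n].mean())                 # sample averages tending to m_X

Running this shows the sample average wandering for small n and settling near mX = 0.7 as n grows, which is exactly the content of the strong law for this simple ergodic case.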
The strong law of large numbers shows that for stationary and ergodic processes, time averages converge with probability one to the corresponding expectation. Suppose that a process is stationary but not ergodic. Then the theorem says that time averages still converge, but possibly not to the expectation.

Consider the following example of a random process which exhibits this behavior. Suppose that nature at the beginning of time flips a fair coin. If the coin ends up heads, she sends thereafter a Bernoulli process with parameter p1, that is, an iid sequence of coin flips with a probability p1 of getting a head. If the original coin comes up tails, however, nature sends thereafter a Bernoulli process with parameter p0 ≠ p1. In other words, you the observer are looking at the output of one of two iid processes, but you do not know which one. This is an example of a mixture random process, also sometimes called a doubly stochastic random process because of the random selection of a parameter followed by the random generation of a sequence using that parameter. Another way to view the process is as follows. Let {Un} denote the Bernoulli process with parameter p1 and {Wn} denote the Bernoulli process with parameter p0. Then the mixture process {Xn} is formed by connecting a switch at the beginning of time to either the {Un} process or the {Wn} process, and soldering the switch shut. The point is that you either see {Un} forever with probability 1/2, or you see {Wn} forever. A little elementary conditional probability shows that for any dimension k,
$$p_{X_0,\dots,X_{k-1}}(x) = \frac{p_{U_0,\dots,U_{k-1}}(x) + p_{W_0,\dots,W_{k-1}}(x)}{2}.\tag{6.58}$$
Thus, for example, the probability of getting a head in the mixture process is p_{X_0}(1) = (p_0 + p_1)/2. Similarly, the probability of getting two heads in a row is p_{X_0,X_1}(1, 1) = (p_0^2 + p_1^2)/2. Since the joint pmf's for the two Bernoulli processes are not changed by shifting, neither is the joint pmf for the mixture process. Hence the mixture process is stationary, and from the strong law of large numbers its relative frequencies will converge to something. Is the mixture process ergodic? It is certainly not iid. For example, the probability of getting two heads in a row was found to be p_{X_0,X_1}(1, 1) = (p_0^2 + p_1^2)/2, which is not the same as p_{X_0}(1) p_{X_1}(1) = [(p_0 + p_1)/2]^2 (unless p_0 = p_1), so X_0 and X_1 are not independent! It could conceivably be ergodic, but is it? Suppose that {Xn} were indeed ergodic; then the strong law would say that the relative frequency of heads would have to converge to the probability of a head, i.e., to (p_0 + p_1)/2. But this is clearly not true, since if you observe the outputs of {Xn} you are observing a Bernoulli process with bias either p_0 or p_1, and hence you should expect to compute a limiting relative frequency of heads that is either p_0 or p_1, depending on which of the Bernoulli processes you are looking at. In other words, your limiting relative frequency is a random variable which depends on Nature's original choice of which process to let you observe. This illustrates the behavior underlying the general strong law: you observe a mixture of stationary and ergodic processes, that is, you observe a randomly selected stationary and ergodic process, but you do not know a priori which process it is. Since conditioned on this selection the strong law holds, relative frequencies will converge, but they do not converge to an overall expectation. They converge to a random variable, which is in fact just the conditional expectation given knowledge of which stationary and ergodic random process is actually being observed! Thus the strong law of large numbers can be quite useful in such a stationary but nonergodic case, since one can estimate which stationary ergodic process is actually being observed by measuring the relative frequencies.
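The following sketch (Python/numpy; the biases p0 = 0.2 and p1 = 0.8 and the sample size are illustrative choices, not from the text) simulates the mixture process and shows the relative frequency of heads settling at p0 or p1 rather than at the overall expectation (p0 + p1)/2:

    import numpy as np

    rng = np.random.default_rng(1)
    p0, p1 = 0.2, 0.8            # illustrative biases, p0 != p1
    n = 100_000

    # Nature's single fair coin flip selects which Bernoulli process you
    # see; the switch is then "soldered shut" for the rest of time.
    p = p1 if rng.random() < 0.5 else p0
    X = rng.binomial(1, p, size=n)

    # The limiting relative frequency is a random variable taking the
    # value p0 or p1, not the overall expectation (p0 + p1)/2 = 0.5.
    print("relative frequency of heads:", X.mean())

Rerunning with different seeds produces relative frequencies clustering near 0.2 on some runs and near 0.8 on others, but never near 0.5.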
A perhaps surprising fundamental result of random processes is that this special example is in a sense typical of all stationary nonergodic processes. The result, called the ergodic decomposition theorem, states that under quite general assumptions any nonergodic stationary process is in fact a mixture of stationary and ergodic processes; hence you are always observing a stationary and ergodic process, you just do not know in advance which one. In our coin example you know you are observing one of two Bernoulli processes, but we could equally consider an infinite mixture by selecting p from a uniform distribution on (0, 1). You do not know p in advance, but you can estimate it from relative frequencies. The interested reader can find a development of the ergodic decomposition theorem and its history in chapter 7 of [22].
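A sketch of this infinite-mixture version (again Python/numpy; the sample size is an arbitrary choice): draw p uniformly on (0, 1), observe the selected Bernoulli process, and estimate p from the relative frequency.

    import numpy as np

    rng = np.random.default_rng(2)

    # Ergodic decomposition view: a stationary ergodic component (a
    # Bernoulli process with parameter p) is selected at random, and the
    # observer identifies it from relative frequencies.
    p = rng.uniform(0.0, 1.0)                  # unknown to the observer
    X = rng.binomial(1, p, size=100_000)

    p_hat = X.mean()                           # relative frequency estimates p
    print(f"true p = {p:.4f}, estimate = {p_hat:.4f}")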

The previous discussion implies that ergodicity is not required for the strong law of large numbers to be useful. The next question is whether or not stationarity is required. Again the answer is no. Given that the main concern is the convergence of sample averages and relative frequencies, it should be reasonable to expect that a random process could exhibit transient or short term behavior that violates the stationarity definition yet eventually dies out, so that if one waited long enough the process would look increasingly stationary. In fact one can make precise the notion of asymptotically stationary (in several possible ways), and the strong law extends to this case. Again the interested reader is referred to chapter 7 of [22]. The point is that the notions of stationarity and ergodicity should not be taken too seriously: ergodicity can easily be dispensed with, and stationarity can be significantly weakened, while still yielding processes for which laws of large numbers hold, so that time averages and relative frequencies have well defined limits.
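As an illustration of this point, consider a hypothetical process built by adding a decaying deterministic transient to an iid sequence; it is not stationary, but the transient dies out and the sample average still converges (Python/numpy sketch; the transient and all parameters are invented for illustration):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000

    # An iid sequence with mean 1 plus a transient that dies out, so the
    # process violates stationarity at small times but is asymptotically
    # stationary in an intuitive sense.
    W = rng.normal(loc=1.0, scale=1.0, size=n)
    transient = 5.0 * 0.999 ** np.arange(n)
    X = W + transient

    # Running sample averages: dominated by the transient early on, but
    # settling at the limiting mean 1 as the transient dies out.
    csum = np.cumsum(X)
    for k in (100, 10_000, 100_000):
        print(k, csum[k - 1] / k)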
6.14 Problems
1. Let {Xn} be an iid process with a Poisson marginal pmf with parameter λ. Let {Yn} denote the induced sum process as in equation (6.6). Find the pmf for Yn and find σ_{Y_n}^2, EYn, and KY(t, s).
2. Let {Xn} be an iid process. Define a new process {Un} by

U_n = X_n − X_{n−1}; n = 1, 2, . . . .

Find the characteristic function and the pmf for Un. Find RU(t, s). Is {Un} an independent increment process?
3. Let {Xn} be a ternary iid process with pXn(+1) = pXn(−1) = ε/2 and pXn(0) = 1 − ε. Fix an integer N and define the “sliding average”

$$Y_n = \frac{1}{N}\sum_{i=0}^{N-1} X_{n-i}.$$

(a) Find EXn, σ_{X_n}^2, M_{X_n}(ju), and KX(t, s).
(b) Find EYn, σ_{Y_n}^2, and M_{Y_n}(ju).
(c) Find the cross-correlation RX,Y(t, s) ≡ E[XtYs].
(d) Given δ > 0, find a simple upper bound to Pr(|Yn| > δ) in terms of N and ε.
4. Find the characteristic function M_{U_n}(ju) for the {Un} process of exercise 5.2.

5. Find a complete specification of the binary autoregressive process of exercise 5.11. Prove that the process is Markov. (One name for this process is the binary symmetric Markov source.)
6. A stationary continuous time random process {X(t)} switches randomly between the values of 0 and 1. We have that

Pr(X(t) = 1) = Pr(X(t) = 0) = 1/2,

and if Nt is the number of changes of output during (0, t], then
$$p_{N_t}(k) = \frac{1}{1+\alpha t}\left(\frac{\alpha t}{1+\alpha t}\right)^{k};\quad k = 0, 1, 2, \dots,$$
where α > 0 is a fixed parameter. (This is called the Bose-Einstein distribution.)
(a) Find M_{N_t}(ju), ENt, and σ_{N_t}^2.
(b) Find EX(t) and RX(t, s).
7. Given two random processes {Xt}, called the signal process, and {Nt}, called the noise process, define the process {Yt} by

Yt = Xt + Nt .

The {Yt} process can be considered as the output of a channel with additive noise, where the {Xt} process is the input. This is a common model for dealing with noisy linear communication systems; e.g., the noise may be due to atmospheric effects or to front-end noise in a receiver. Assume that the signal and noise processes are independent; that is, any vector of samples of the X process is independent of any vector of samples of the N process. Find the characteristic function, mean, and variance of Yt in terms of those for Xt and Nt. Find the covariance of the output process in terms of the covariances of the input and noise process.
8. Find the inverse of the covariance matrix of the discrete time Wiener process, that is, the inverse of the matrix {min(k, j); k = 1, 2, . . . , n, j = 1, 2, . . . , n}.
9. Let {X(t)} be a Gaussian random process with zero mean and autocorrelation function

$$R_X(\tau) = \frac{N_0}{2}\, e^{-|\tau|}.$$

Is the process Markov? Find its power spectral density. Let Y(t) be the process formed by DSB-SC modulation of X(t) as in (5.37) with a0 = 0. If the phase angle Θ is assumed to be 0, is the resulting modulated process Gaussian? Letting Θ be uniformly distributed, sketch the power spectral density of the modulated process. Find M_{Y(0)}(ju).
10. Let {X(t)} and {Y(t)} be the two continuous time random processes of exercise 5.14 and let

W(t) = X(t) cos(2πf0t) + Y(t) sin(2πf0t) ,

as in that exercise. Find the marginal probability density function f_{W(t)} and the joint pdf f_{W(t),W(s)}(u, v). Is {W(t)} a Gaussian process? Is it strictly stationary?
11. Let {Nk} be the binomial counting process and define the discrete time random process {Yn} by

$$Y_n = (-1)^{N_n}.$$

(This is the discrete time analog to the random telegraph wave.) Find the autocorrelation, mean, and power spectral density of the given process. Is the process Markov?
12. Find the power spectral density of the random telegraph wave. Is this process a Markov process? Sketch the spectrum of an amplitude modulated random telegraph wave.
13. Suppose that (U, W) is a Gaussian random vector with EU = EW = 0, E(U^2) = E(W^2) = σ^2, and E(UW) = ρσ^2. (The parameter ρ has magnitude less than or equal to 1 and is called the correlation coefficient.) Define the new random variables

S = U + W
D = U − W .

(a) Find the marginal pdf's for S and D.
(b) Find the joint pdf fS,D(α, β) or the joint characteristic function MS,D(ju, jν). Are S and D independent?
14. Suppose that K is a random variable with a Poisson distribution; that is, for a fixed parameter λ

$$\Pr(K = k) = p_K(k) = \frac{\lambda^k e^{-\lambda}}{k!};\quad k = 0, 1, 2, \dots$$
(c) Is {Yn} an autoregressive process? A moving average process? Is it weakly stationary? Is Vn an autoregressive process? A moving average process? Is it weakly stationary? (Note: answers to parts (a) and (b) are sufficient to answer the stationarity questions; no further computations are necessary.)
(d) Find the conditional pmf

$$p_{V_n|V_{n-1},V_{n-2},\dots,V_0}(\nu_n|\nu_{n-1},\dots,\nu_0).$$

Is {Vn} a Markov process?
16. Suppose that {Zn} and {Wn} are two mutually independent two-sided zero mean iid Gaussian processes with variances σ_Z^2 and σ_W^2, respectively. Zn is put into a linear time-invariant filter to form an output process {Xn} defined by

X_n = Z_n − rZ_{n−1},
where 0 < r < 1. (Such a filter is sometimes called a preemphasis filter in speech processing.) This process is then used to form a new process
Yn = Xn + Wn,
which can be viewed as a noisy version of the preemphasized Zn process. Lastly, the Yn process is put through a “deemphasis filter” to form an output process Un defined by
U_n = rU_{n−1} + Y_n.
(a) Find the autocorrelation RZ and the power spectral density SZ. Recall that for a weakly stationary discrete time process with zero mean, RZ(k) = E(Z_n Z_{n+k}) and

$$S_Z(f) = \sum_{k=-\infty}^{\infty} R_Z(k)\, e^{-j2\pi f k},$$
the discrete time Fourier transform of RZ.
(b) Find the autocorrelation RX and the power spectral density SX.
(c) Find the autocorrelation RY and the power spectral density SY.
(d) Find the conditional pdf f_{Y_n|X_n}(y|x).
(e) Find the pdf f_{U_n,W_n} (or the corresponding characteristic function M_{U_n,W_n}(ju, jv)).

(f) Find the overall mean squared error E[(U_n − Z_n)^2].
17. Suppose that {Nt} is a binomial counting process and that {Xn} is an iid process that is mutually independent of {Nt}. Assume that the Xn have zero mean and variance σ^2. Let Yk denote the compound process

$$Y_k = \sum_{i=1}^{N_k} X_i.$$
Use iterated expectation to evaluate the autocorrelation function RY (t, s).
18. Suppose that {Wn} is a discrete time Wiener process. What is the minimum mean squared estimate of Wn given W_{n−1}, W_{n−2}, . . .? What is the linear least squares estimator?
19. Let {Xn} be an iid binary random process with Pr(Xn = ±1) = 1/2 and let {Nt} be a Poisson counting process. A continuous time random walk Y(t) can be defined by

$$Y_t = \sum_{i=1}^{N_t} X_i;\quad t > 0.$$
Find the expectation, covariance, and characteristic function of Yt.
20. Are compound processes independent increment processes?
21. Suppose that {Nt; t ≥ 0} is a process with independent and stationary increments and that

$$p_{N_t}(k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!};\quad k = 0, 1, 2, \dots.$$

Suppose also that {Lt; t ≥ 0} is a process with independent and stationary increments and that

$$p_{L_t}(k) = \frac{(\nu t)^k e^{-\nu t}}{k!};\quad k = 0, 1, 2, \dots.$$
Assume that the two processes Nt and Lt are mutually independent and define for each t the random variable It = Nt + Lt. It might model, for example, the number of requests for cpu cycles arriving from two independent sources, each of which produces requests according to a Poisson process.
(a) What is the characteristic function for It? What is the corresponding pmf?

(b) Find the mean and covariance function of {It}.
(c) Is {It; t ≥ 0} an independent increment process?
(d) Suppose that Z is a discrete random variable, independent of Nt, with probability mass function

$$p_Z(k) = \frac{a^k}{(1+a)^{k+1}},\quad k = 0, 1, \dots,$$

as in the first problem. Find the probability P(Z = Nt).
(e) Suppose that {Zn} is an iid process with marginal pmf pZ(k) as in the previous part. Define the compound process

$$Y_t = \sum_{k=0}^{N_t} Z_k.$$

Find the mean E(Yt) and variance σ_{Y_t}^2.
22. Suppose that {Xn; n ∈ Z} is a discrete time iid Gaussian random process with 0 mean and variance σ_X^2 = E[X_0^2]. We consider this an input signal to a signal processing system. Suppose also that {Wn; n ∈ Z} is a discrete time iid Gaussian random process with 0 mean and variance σ_W^2 and that the two processes are mutually independent. Wn is considered to be noise. Suppose that Xn is put into a linear filter with unit pulse response h, where
$$h_k = \begin{cases} 1 & k = 0 \\ -1 & k = -1 \\ 0 & \text{otherwise} \end{cases}$$

to form an output U = X ∗ h, the convolution of the input signal and the unit pulse response. The final output signal is then formed by adding the noise to the filtered input signal, Yn = Un + Wn.
(a) Find the mean, power spectral density, and marginal pdf for Un.
(b) Find the joint pdf f_{U_1,U_2}(α, β). You can leave your answer in terms of an inverse matrix Λ^{-1}, but you must accurately describe Λ.
(c) Find the mean, covariance, and power spectral density for Yn.
(d) Find E[YnXn].
(e) Does the mean ergodic theorem hold for {Yn}?

23. Suppose that {X(t); t ∈ R} is a weakly stationary continuous time Gaussian random process with 0 mean and autocorrelation function

$$R_X(\tau) = E[X(t)X(t+\tau)] = \sigma_X^2\, e^{-|\tau|}.$$
(a) Define the random process {Y(t); t ∈ R} by

$$Y(t) = \int_{t-T}^{t} X(\alpha)\, d\alpha,$$
where T > 0 is a fixed parameter. (This is a short term integrator.) Find the mean and power spectral density of {Y (t)}.
(b) For fixed t > s, find the characteristic function and the pdf for the random variable X(t) − X(s).
24. Consider the process {Xk; k = 0, 1, . . .} defined by X0 = 0 and

$$X_{k+1} = aX_k + W_k,\quad k \ge 0,\tag{6.59}$$
where a is a constant and {Wk; k = 0, 1, . . .} is a sequence of iid Gaussian random variables with E(Wk) = 0 and E(W_k^2) = σ^2.
(a) Calculate E(Xk) for k ≥ 0.
(b) Show that Xk and Wk are uncorrelated for k ≥ 0.
(c) By squaring both sides of (6.59) and taking expectations, obtain a recursive equation for KX(k, k).
(d) Solve for KX(k, k) in terms of a and σ. Hint: distinguish between a = 1 and a ≠ 1.
(e) Is the process {Xk; k = 1, 2, . . .} weakly stationary?
(f) Is the process {Xk; k = 1, 2, . . .} Gaussian?
(g) For −1 < a < 1, show that

$$P(|X_n| > 1) \le \frac{\sigma^2}{1 - a^2}.$$
25. A distributed system consists of N sensors which view a common random variable corrupted by different observation noises. In particular, suppose that the ith sensor measures a random variable

Wi = X + Yi, i = 0, 1, 2, . . . , N − 1,

where the random variables X, Y1, . . . , YN are all mutually independent Gaussian random variables with 0 mean. The variance of X is