Gray R. M., Entropy and Information Theory, 1990, 284 pp.

10.5. D-BAR CONTINUOUS CHANNELS

length one, is a stationary channel, and g is a length m sliding block decoder. The probability of error for the resulting hookup is defined by

$$P_e(\mu,\nu,f,g) = \Pr(\hat U_0 \ne U_0) = \mu\nu(E) = \int d\mu(u)\,\nu_{f(u)}(E_u),$$

where $E$ is the error event $\{u,y : u_0 \ne g_m(y_{-q}^m)\}$ and $E_u = \{y : (u,y)\in E\}$ is the section of $E$ at $u$.
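As a concrete illustration of this integral, the error probability of the simplest possible hookup — a uniform binary source, identity encoder and decoder ($m = 1$, $q = 0$), and a memoryless binary symmetric channel — can be computed by summing $\mu(u)\,\nu_{f(u)}(E_u)$ directly. The source, channel, and codes below are hypothetical toy choices, not examples from the text:

```python
# Toy sketch: Pe(mu, nu, f, g) = sum_u mu(u) * nu_{f(u)}(E_u) for a
# single-letter hookup: uniform binary source, identity encoder f,
# memoryless BSC(eps) channel, identity decoder g (m = 1, q = 0).
# All concrete choices here are hypothetical illustrations.

def error_probability(mu, channel, f, g):
    """Sum mu(u) * P(g(Y) != u | X = f(u)) over source letters."""
    pe = 0.0
    for u, pu in mu.items():
        x = f(u)
        for y, py in channel[x].items():
            if g(y) != u:          # y lies in the section E_u of the error event
                pe += pu * py
    return pe

eps = 0.1
mu = {0: 0.5, 1: 0.5}                      # uniform binary source
bsc = {0: {0: 1 - eps, 1: eps},            # BSC(eps) transition rows
       1: {0: eps, 1: 1 - eps}}
identity = lambda s: s

pe = error_probability(mu, bsc, identity, identity)
print(pe)   # the decoder errs exactly when the channel flips a bit
```

For this degenerate hookup the integral collapses to the crossover probability of the channel.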

Lemma 10.5.2: Given a stationary channel $\nu$, a stationary source $[G,\mu,U]$, a length $m$ sliding block decoder $g$, and two encoders $f$ and $\phi$, then for any positive integer $r$

$$|P_e(\mu,\nu,f,g) - P_e(\mu,\nu,\phi,g)| \le \frac{m}{r} + r\Pr(f\ne\phi) + m\max_{a^r\in A^r}\ \sup_{x,x'\in c(a^r)} \bar d_r(\nu_x^r,\nu_{x'}^r).$$

Proof: Define $\Lambda = \{u : f(u) = \phi(u)\}$ and

$$\Lambda_r = \{u : f(T^iu) = \phi(T^iu);\ i = 0,1,\cdots,r-1\} = \bigcap_{i=0}^{r-1} T^{-i}\Lambda.$$

From the union bound

$$\mu(\Lambda_r^c) \le r\,\mu(\Lambda^c) = r\Pr(f\ne\phi). \tag{10.12}$$

From stationarity, if $\hat U_n = g_m(Y_{n-q}^m)$, then

$$P_e(\mu,\nu,f,g) = \int d\mu(u)\,\nu_{f(u)}(y : g_m(y_{-q}^m)\ne u_0) = \frac{1}{r}\sum_{i=0}^{r-1}\int d\mu(u)\,\nu_{f(u)}(y : g_m(y_{i-q}^m)\ne u_i)$$
$$\le \frac{m}{r} + \frac{1}{r}\sum_{i=q}^{r-q}\int_{\Lambda_r} d\mu(u)\,\nu_{f^r(u)}^r(y^r : g_m(y_{i-q}^m)\ne u_i) + \mu(\Lambda_r^c). \tag{10.13}$$

Fix $u\in\Lambda_r$ and let $p_u$ yield $\bar d_r(\nu_{f(u)}^r,\nu_{\phi(u)}^r)$; that is,

$$\sum_{w^r} p_u(y^r,w^r) = \nu_{f(u)}^r(y^r), \qquad \sum_{y^r} p_u(y^r,w^r) = \nu_{\phi(u)}^r(w^r),$$

and

$$\frac{1}{r}\sum_{i=0}^{r-1} p_u(y^r,w^r : y_i\ne w_i) = \bar d_r(\nu_{f(u)}^r,\nu_{\phi(u)}^r). \tag{10.14}$$

We have that

$$\frac{1}{r}\sum_{i=q}^{r-q}\nu_{f^r(u)}^r(y^r : g_m(y_{i-q}^m)\ne u_i) = \frac{1}{r}\sum_{i=q}^{r-q} p_u(y^r,w^r : g_m(y_{i-q}^m)\ne u_i)$$
$$\le \frac{1}{r}\sum_{i=q}^{r-q} p_u(y^r,w^r : g_m(y_{i-q}^m)\ne g_m(w_{i-q}^m)) + \frac{1}{r}\sum_{i=q}^{r-q} p_u(y^r,w^r : g_m(w_{i-q}^m)\ne u_i)$$
$$\le \frac{1}{r}\sum_{i=q}^{r-q} p_u(y^r,w^r : y_{i-q}^m\ne w_{i-q}^m) + P_e(\mu,\nu,\phi,g)$$
$$\le \frac{1}{r}\sum_{i=q}^{r-q}\ \sum_{j=i-q}^{i-q+m} p_u(y^r,w^r : y_j\ne w_j) + P_e(\mu,\nu,\phi,g)$$
$$\le m\,\bar d_r(\nu_{f(u)}^r,\nu_{\phi(u)}^r) + P_e(\mu,\nu,\phi,g),$$

which with (10.12)-(10.14) proves the lemma. $\Box$

The following corollary states that the probability of error using sliding block codes over a $\bar d$-continuous channel is a continuous function of the encoder as measured by the metric on encoders given by the probability of disagreement of the outputs of two encoders.

Corollary 10.5.1: Given a stationary $\bar d$-continuous channel $\nu$ and a finite length decoder $g_m : B^m\to\hat A$, then given $\epsilon > 0$ there is a $\delta > 0$ so that if $f$ and $\phi$ are two stationary encoders such that $\Pr(f\ne\phi)\le\delta$, then

$$|P_e(\mu,\nu,f,g) - P_e(\mu,\nu,\phi,g)| \le \epsilon.$$

Proof: Fix $\epsilon > 0$ and choose $r$ so large that

$$\max_{a^r\in A^r}\ \sup_{x,x'\in c(a^r)}\bar d_r(\nu_x^r,\nu_{x'}^r) \le \frac{\epsilon}{3m}, \qquad \frac{m}{r} \le \frac{\epsilon}{3},$$

and choose $\delta = \epsilon/(3r)$. Then Lemma 10.5.2 implies that

$$|P_e(\mu,\nu,f,g) - P_e(\mu,\nu,\phi,g)| \le \epsilon.\ \Box$$
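For bookkeeping, substituting these choices into the bound of Lemma 10.5.2 (writing $\phi$ for the second encoder) makes each of the three terms at most $\epsilon/3$:

```latex
|P_e(\mu,\nu,f,g) - P_e(\mu,\nu,\phi,g)|
  \le \frac{m}{r} + r\Pr(f\ne\phi)
      + m\max_{a^r\in A^r}\sup_{x,x'\in c(a^r)}\bar d_r(\nu_x^r,\nu_{x'}^r)
  \le \frac{\epsilon}{3} + r\cdot\frac{\epsilon}{3r} + m\cdot\frac{\epsilon}{3m}
  = \epsilon .
```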

Given an arbitrary channel $[A,\nu,B]$, we can define for any block length $N$ a closely related CBI channel $[A,\tilde\nu,B]$ as the CBI channel with the same probabilities on output $N$-blocks, that is, the same conditional probabilities for $Y_{kN}^N$ given $x$, but having conditionally independent blocks. We shall call $\tilde\nu$ the $N$-CBI approximation to $\nu$. A channel $\nu$ is said to be conditionally almost block independent or CABI if given $\epsilon$ there is an $N_0$ such that for any $N\ge N_0$ there is an $M_0$ such that for any $x$ and any $N$-CBI approximation $\tilde\nu$ to $\nu$

$$\bar d(\tilde\nu_x^M,\nu_x^M) \le \epsilon,\quad \text{all } M\ge M_0,$$

where $\nu_x^M$ denotes the restriction of $\nu_x$ to $B_{B^M}$, that is, the output distribution on $Y^M$ given $x$. A CABI channel is one such that the output distribution is close (in a $\bar d$ sense) to that of the $N$-CBI approximation provided that $N$ is big enough. CABI channels were introduced by Neuhoff and Shields [110] who provided


several examples and alternative characterizations of the class. In particular they showed that finite memory channels are both $\bar d$-continuous and CABI. Their principal result, however, requires the notion of the $\bar d$ distance between channels.

Given two channels $[A,\nu,B]$ and $[A,\nu',B]$, define the $\bar d$ distance between the channels to be

$$\bar d(\nu,\nu') = \limsup_{n\to\infty}\ \sup_x\ \bar d_n(\nu_x^n,{\nu'}_x^n).$$

Neuhoff and Shields [110] showed that the class of CABI channels is exactly the class of primitive channels together with the $\bar d$ limits of such channels.
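The per-letter distance $\bar d_1$ underlying these definitions is an optimal-coupling (transportation) distance: the minimum probability of disagreement over all joint distributions with the two prescribed marginals. The following sketch, a hypothetical illustration rather than code from the text, finds that minimum for two Bernoulli output distributions, where the optimal coupling achieves $|p-q|$:

```python
# Sketch: dbar_1 between two binary output distributions under Hamming
# distortion = min over couplings pi of Pr(Y != Y').  For 2x2 couplings
# with marginals Bernoulli(p) and Bernoulli(q) there is one free
# parameter t = pi(1,1), so the minimum can be found by a fine scan.
# Purely illustrative; not code from the text.

def dbar1_bernoulli(p, q, steps=10**5):
    best = 1.0
    for k in range(steps + 1):
        t = min(p, q) * k / steps   # pi(Y=1, Y'=1)
        pi11 = t
        pi10 = p - t                # pi(Y=1, Y'=0)
        pi01 = q - t                # pi(Y=0, Y'=1)
        pi00 = 1 - p - q + t        # pi(Y=0, Y'=0)
        if min(pi11, pi10, pi01, pi00) < 0:
            continue                # not a valid coupling
        best = min(best, pi10 + pi01)   # Pr(Y != Y')
    return best

print(dbar1_bernoulli(0.3, 0.5))  # optimal coupling gives |p - q| = 0.2
```

Maximizing the mass on the diagonal (taking $t = \min(p,q)$) is exactly the monotone coupling, which is why the scan bottoms out at $|p-q|$.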

10.6 The Distortion-Rate Function

We close this chapter on distortion, approximation, and performance with the introduction and discussion of Shannon's distortion-rate function. This function (or functional) of the source and distortion measure will play a fundamental role in evaluating the OPTA functions. In fact, it can be considered as a form of information theoretic OPTA. Suppose now that we are given a source $[A,\mu]$ and a fidelity criterion $\rho_n;\ n = 1,2,\cdots$ defined on $A^n\times\hat A^n$, where $\hat A$ is called the reproduction alphabet. Then the Shannon distortion-rate function (DRF) is defined in terms of a nonnegative parameter $R$ called rate by

$$D(R,\mu) = \limsup_{N\to\infty}\frac{1}{N} D_N(R,\mu^N)$$

where

$$D_N(R,\mu^N) = \inf_{p^N\in\mathcal R_N(R,\mu^N)} E_{p^N}\rho_N(X^N,Y^N),$$

where $\mathcal R_N(R,\mu^N)$ is the collection of all distributions $p^N$ for the coordinate random vectors $X^N$ and $Y^N$ on the space $(A^N\times\hat A^N, B_{A^N}\times B_{\hat A^N})$ with the properties that

(1) $p^N$ induces the given marginal $\mu^N$; that is, $p^N(F\times\hat A^N) = \mu^N(F)$ for all $F\in B_{A^N}$, and

(2) the mutual information satisfies

$$\frac{1}{N} I_{p^N}(X^N;Y^N) \le R.$$

If $\mathcal R_N(R,\mu^N)$ is empty, then $D_N(R,\mu^N)$ is $\infty$. $D_N$ is called the $N$th order distortion-rate function.
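For a concrete feel, the first order function $D_1(R,\mu)$ can be approximated numerically for a uniform binary source with Hamming distortion by searching over the two conditional probabilities that define a joint distribution with the required input marginal and rate constraint; the known closed form for this source, $D(R) = h^{-1}(1-R)$ with $h$ the binary entropy function, serves as a sanity check. This numeric sketch is an illustration, not part of the text:

```python
# Sketch: brute-force approximation of the first order DRF D_1(R, mu)
# for a uniform binary source, Hamming distortion rho(x, y) = 1{x != y}.
# Search over test channels q0 = P(Y=1|X=0), q1 = P(Y=1|X=1); each pair
# defines a joint p with the required marginal, and we minimize E[rho]
# subject to I(X;Y) <= R.  Illustrative only.
from math import log2

def mutual_info_and_distortion(q0, q1):
    # joint p(x, y) with input marginal P(X=0) = P(X=1) = 1/2
    p = {(0, 0): 0.5 * (1 - q0), (0, 1): 0.5 * q0,
         (1, 0): 0.5 * (1 - q1), (1, 1): 0.5 * q1}
    py = {0: p[0, 0] + p[1, 0], 1: p[0, 1] + p[1, 1]}
    info = sum(v * log2(v / (0.5 * py[y]))
               for (x, y), v in p.items() if v > 0)
    dist = p[0, 1] + p[1, 0]     # expected Hamming distortion
    return info, dist

def D1(R, grid=200):
    best = 1.0
    for i in range(grid + 1):
        for j in range(grid + 1):
            info, dist = mutual_info_and_distortion(i / grid, j / grid)
            if info <= R:
                best = min(best, dist)
    return best

# For R = 0.5 the closed form gives D = h^{-1}(0.5), roughly 0.110
print(D1(0.5))
```

The grid search recovers the symmetric test channel $q_0 = d$, $q_1 = 1-d$ as the minimizer, in agreement with the closed form.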

 

Lemma 10.6.1: DN (R; „) and D(R; „) are nonnegative convex

functions

of R and hence are continuous in R for R > 0.

S

Proof: Nonnegativity is obvious from the nonnegativity of distortion. Suppose that pi 2 RN (Ri; „N ); i = 1; 2 yields

Epi N (XN ; Y N ) • DN (Ri; „) + †:


From Corollary 5.5.5 mutual information is a convex $\cup$ function of the conditional distribution and hence if $\bar p = \lambda p_1 + (1-\lambda)p_2$, then

$$I_{\bar p} \le \lambda I_{p_1} + (1-\lambda)I_{p_2} \le \lambda R_1 + (1-\lambda)R_2$$

and hence $\bar p\in\mathcal R_N(\lambda R_1 + (1-\lambda)R_2)$ and therefore

$$D_N(\lambda R_1 + (1-\lambda)R_2) \le E_{\bar p}\rho_N(X^N,Y^N) = \lambda E_{p_1}\rho_N(X^N,Y^N) + (1-\lambda)E_{p_2}\rho_N(X^N,Y^N)$$
$$\le \lambda D_N(R_1,\mu) + (1-\lambda)D_N(R_2,\mu) + \epsilon.$$

Since $\epsilon$ is arbitrary, $D_N$ is convex. Since $D(R,\mu)$ is the limit of $N^{-1}D_N(R,\mu)$, it too is convex. It is well known from real analysis that convex functions are continuous except possibly at their end points. $\Box$

The following lemma shows that when the underlying source is stationary and the fidelity criterion is subadditive (e.g., additive), then the limit defining $D(R,\mu)$ is an infimum.

Lemma 10.6.2: If the source is stationary and the fidelity criterion is subadditive, then

$$D(R,\mu) = \lim_{N\to\infty}\frac{1}{N}D_N(R,\mu) = \inf_N\frac{1}{N}D_N(R,\mu).$$

Proof: Fix $N$ and $n < N$ and let $p^n\in\mathcal R_n(R,\mu^n)$ yield

$$E_{p^n}\rho_n(X^n,Y^n) \le D_n(R,\mu^n) + \frac{\epsilon}{2}$$

and let $p^{N-n}\in\mathcal R_{N-n}(R,\mu^{N-n})$ yield

$$E_{p^{N-n}}\rho_{N-n}(X^{N-n},Y^{N-n}) \le D_{N-n}(R,\mu^{N-n}) + \frac{\epsilon}{2}.$$

$p^n$ together with $\mu^n$ implies a regular conditional probability $q(F|x^n)$, $F\in B_{\hat A^n}$. Similarly $p^{N-n}$ and $\mu^{N-n}$ imply a regular conditional probability $r(G|x^{N-n})$. Define now a regular conditional probability $t(\cdot|x^N)$ by its values on rectangles as

$$t(F\times G|x^N) = q(F|x^n)\,r(G|x_n^{N-n});\quad F\in B_{\hat A^n},\ G\in B_{\hat A^{N-n}}.$$

Note that this is the finite dimensional analog of a block memoryless channel with two blocks. Let $p^N = \mu^N t$ be the distribution induced by $\mu^N$ and $t$. Then exactly as in Lemma 9.4.2 we have because of the conditional independence that

$$I_{p^N}(X^N;Y^N) \le I_{p^N}(X^n;Y^n) + I_{p^N}(X_n^{N-n};Y_n^{N-n})$$

and hence from stationarity

$$I_{p^N}(X^N;Y^N) \le I_{p^n}(X^n;Y^n) + I_{p^{N-n}}(X^{N-n};Y^{N-n})$$


 

 

$$\le nR + (N-n)R = NR$$

so that $p^N\in\mathcal R_N(R,\mu^N)$. Thus

$$D_N(R,\mu^N) \le E_{p^N}\rho_N(X^N,Y^N) \le E_{p^N}\left(\rho_n(X^n,Y^n) + \rho_{N-n}(X_n^{N-n},Y_n^{N-n})\right)$$
$$= E_{p^n}\rho_n(X^n,Y^n) + E_{p^{N-n}}\rho_{N-n}(X^{N-n},Y^{N-n}) \le D_n(R,\mu^n) + D_{N-n}(R,\mu^{N-n}) + \epsilon.$$

Thus since $\epsilon$ is arbitrary we have shown that if $d_n = D_n(R,\mu^n)$, then

$$d_N \le d_n + d_{N-n},\quad n\le N;$$

that is, the sequence $d_n$ is subadditive. The lemma then follows immediately from Lemma 7.5.1 of [50]. $\Box$
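The cited result is the standard subadditive-limit (Fekete) lemma: if $d_N \le d_n + d_{N-n}$, then $d_n/n$ converges to $\inf_n d_n/n$. A quick numerical illustration with the hypothetical subadditive sequence $d_n = n + \sqrt n$:

```python
# Sketch of the subadditive-limit (Fekete) lemma used to finish the
# proof: for a subadditive sequence d_n, lim d_n / n = inf d_n / n.
# The sequence d_n = n + sqrt(n) is a hypothetical example, subadditive
# because sqrt(m + n) <= sqrt(m) + sqrt(n).
from math import sqrt

def d(n):
    return n + sqrt(n)

# check subadditivity d_{m+n} <= d_m + d_n on a range of pairs
for m in range(1, 50):
    for n in range(1, 50):
        assert d(m + n) <= d(m) + d(n) + 1e-12

ratios = [d(n) / n for n in range(1, 10001)]
# the ratios d_n / n = 1 + 1/sqrt(n) decrease toward the infimum 1
print(min(ratios), ratios[-1])
```

Here the minimum over the range always equals the latest term, exactly the "limit equals infimum" behavior the lemma guarantees.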

As with the $\bar\rho$ distance, there are alternative characterizations of the distortion-rate function when the process is stationary. The remainder of this section is devoted to developing these results. The idea of an SBM channel will play an important role in relating $N$th order distortion-rate functions to the process definitions. We henceforth assume that the input source is stationary and we confine interest to additive fidelity criteria based on a per-letter distortion $\rho = \rho_1$.

The basic process DRF is defined by

$$D_s(R,\mu) = \inf_{p\in\mathcal R_s(R,\mu)} E_p\rho(X_0,Y_0),$$

where $\mathcal R_s(R,\mu)$ is the collection of all stationary processes $p$ having $\mu$ as an input distribution and having mutual information rate $\bar I_p = \bar I_p(X;Y) \le R$. The original idea of a process rate-distortion function was due to Kolmogorov and his colleagues [87] [45] (see also [23]). The idea was later elaborated by Marton [101] and Gray, Neuhoff, and Omura [55].

Recalling that the $L^1$ ergodic theorem for information density holds when $\bar I_p = I_p$; that is, the two principal definitions of mutual information rate yield the same value, we also define the process DRF

$$D_s^*(R,\mu) = \inf_{p\in\mathcal R_s^*(R,\mu)} E_p\rho(X_0,Y_0),$$

where $\mathcal R_s^*(R,\mu)$ is the collection of all stationary processes $p$ having $\mu$ as an input distribution, having mutual information rate $\bar I_p \le R$, and having $\bar I_p = I_p$.

If $\mu$ is both stationary and ergodic, define the corresponding ergodic process DRF's by

$$D_e(R,\mu) = \inf_{p\in\mathcal R_e(R,\mu)} E_p\rho(X_0,Y_0),$$
$$D_e^*(R,\mu) = \inf_{p\in\mathcal R_e^*(R,\mu)} E_p\rho(X_0,Y_0),$$

where $\mathcal R_e(R,\mu)$ is the subset of $\mathcal R_s(R,\mu)$ containing only ergodic measures and $\mathcal R_e^*(R,\mu)$ is the subset of $\mathcal R_s^*(R,\mu)$ containing only ergodic measures.

CHAPTER 10. DISTORTION

 

Theorem 10.6.1: Given a stationary source $\mu$ which possesses a reference letter in the sense that there exists a letter $a^*\in\hat A$ such that

$$E\rho(X_0,a^*) \le \rho^* < \infty. \tag{10.15}$$

Fix $R > 0$. If $D(R,\mu) < \infty$, then

$$D(R,\mu) = D_s(R,\mu) = D_s^*(R,\mu).$$

If in addition $\mu$ is ergodic, then also

$$D(R,\mu) = D_e(R,\mu) = D_e^*(R,\mu).$$

The proof of the theorem depends strongly on the relations among distortion and mutual information for vectors and for SBM channels. These are stated and proved in the following lemma, the proof of which is straightforward but somewhat tedious. The theorem is proved after the lemma.

 

 

Lemma 10.6.3: Let $\mu$ be the process distribution of a stationary source $\{X_n\}$. Let $\rho_n$, $n = 1,2,\cdots$ be a subadditive (e.g., additive) fidelity criterion. Suppose that there is a reference letter $a^*\in\hat A$ for which (10.15) holds. Let $p^N$ be a measure on $(A^N\times\hat A^N, B_{A^N}\times B_{\hat A^N})$ having $\mu^N$ as input marginal; that is, $p^N(F\times\hat A^N) = \mu^N(F)$ for $F\in B_{A^N}$. Let $q$ denote the induced conditional probability; that is, $q_{x^N}(F)$, $x^N\in A^N$, $F\in B_{\hat A^N}$, is a regular conditional probability measure. (This exists because the spaces are standard.) We abbreviate this relationship as $p^N = \mu^N q$. Let $X^N, Y^N$ denote the coordinate functions on $A^N\times\hat A^N$ and suppose that

$$\frac{1}{N} E_{p^N}\rho_N(X^N,Y^N) \le D \tag{10.16}$$

and

$$\frac{1}{N} I_{p^N}(X^N;Y^N) \le R. \tag{10.17}$$

If $\nu$ is an $(N,\delta)$ SBM channel induced by $q$ as in Example 9.4.11 and if $p = \mu\nu$ is the resulting hookup and $\{X_n,Y_n\}$ the input/output pair process, then

$$\frac{1}{N} E_p\rho_N(X^N,Y^N) \le D + \rho^*\delta \tag{10.18}$$

and

$$\bar I_p(X;Y) = I_p(X;Y) \le R; \tag{10.19}$$

that is, the resulting mutual information rate of the induced stationary process satisfies the same inequality as the vector mutual information and the resulting distortion approximately satisfies the vector inequality provided $\delta$ is sufficiently small. Observe that if the fidelity criterion is additive, then (10.18) becomes

$$E_p\rho_1(X_0,Y_0) \le D + \rho^*\delta.$$


Proof: We first consider the distortion as it is easier to handle. Since the SBM channel is stationary and the source is stationary, the hookup $p$ is stationary and

$$\frac{1}{n} E_p\rho_n(X^n,Y^n) = \frac{1}{n}\int dm_Z(z)\,E_{p_z}\rho_n(X^n,Y^n),$$

where $p_z$ is the conditional distribution of $\{X_n,Y_n\}$ given $\{Z_n\}$. Note that the above formula reduces to $E_p\rho(X_0,Y_0)$ if the fidelity criterion is additive because of the stationarity. Given $z$, define $J_0^n(z)$ to be the collection of indices $i$ of $z^n$ for which $z_i$ is not in an $N$-cell. (See the discussion in Example 9.4.11.) Let $J_1^n(z)$ be the collection of indices for which $z_i$ begins an $N$-cell. If we define the event $G = \{z : z_0\text{ begins an }N\text{-cell}\}$, then $i\in J_1^n(z)$ if $T^iz\in G$. From Corollary 9.4.3 $m_Z(G)\le N^{-1}$. Since $\mu$ is stationary and $\{X_n\}$ and $\{Z_n\}$ are mutually independent,

$$E_{p_z}\rho_n(X^n,Y^n) \le \sum_{i\in J_0^n(z)} E_{p_z}\rho(X_i,a^*) + \sum_{i\in J_1^n(z)} E_{p_z}\rho_N(X_i^N,Y_i^N)$$
$$\le \rho^*\sum_{i=0}^{n-1} 1_{G_0}(T^iz) + E_{p^N}\rho_N(X^N,Y^N)\sum_{i=0}^{n-1} 1_G(T^iz),$$

where $G_0 = \{z : z_0\text{ is not in an }N\text{-cell}\}$ and $m_Z(G_0)\le\delta$. Since $m_Z$ is stationary, integrating the above we have that

$$E_p\rho_1(X_0,Y_0) \le \rho^* m_Z(G_0) + N m_Z(G)\,\frac{1}{N}E_{p^N}\rho_N(X^N,Y^N) \le \rho^*\delta + \frac{1}{N}E_{p^N}\rho_N(X^N,Y^N),$$

proving (10.18).

Let $r_m$ and $t_m$ denote asymptotically accurate quantizers on $A$ and $\hat A$; that is, as in Corollary 6.2.1 define

$$\hat X^n = r_m(X)^n = (r_m(X_0),\cdots,r_m(X_{n-1}))$$

and similarly define $\hat Y^n = t_m(Y)^n$. Then

$$I(r_m(X)^n;t_m(Y)^n)\ \mathop{\longrightarrow}_{m\to\infty}\ I(X^n;Y^n)$$

and

$$\bar I(r_m(X);t_m(Y))\ \mathop{\longrightarrow}_{m\to\infty}\ \bar I(X;Y).$$

 

 

 

We wish to prove that

$$\bar I(X;Y) = \lim_{n\to\infty}\lim_{m\to\infty}\frac{1}{n} I(r_m(X)^n;t_m(Y)^n) = \lim_{m\to\infty}\lim_{n\to\infty}\frac{1}{n} I(r_m(X)^n;t_m(Y)^n) = I(X;Y).$$

Since $\bar I \ge I$, we must show that

$$\lim_{n\to\infty}\lim_{m\to\infty}\frac{1}{n} I(r_m(X)^n;t_m(Y)^n) \le \lim_{m\to\infty}\lim_{n\to\infty}\frac{1}{n} I(r_m(X)^n;t_m(Y)^n).$$

We have that

$$I(\hat X^n;\hat Y^n) = I((\hat X^n,Z^n);\hat Y^n) - I(Z^n;\hat Y^n|\hat X^n)$$

and

$$I((\hat X^n,Z^n);\hat Y^n) = I(\hat X^n;\hat Y^n|Z^n) + I(Z^n;\hat Y^n).$$

Similarly,

$$I(Z^n;\hat Y^n|\hat X^n) = H(Z^n|\hat X^n) - H(Z^n|\hat X^n,\hat Y^n) = H(Z^n) - H(Z^n|(\hat X^n,\hat Y^n)) = I(Z^n;(\hat X^n,\hat Y^n)),$$

since $\hat X^n$ and $Z^n$ are independent. Thus we need to show that

$$\lim_{n\to\infty}\lim_{m\to\infty}\left(\frac{1}{n}I(r_m(X)^n;t_m(Y)^n|Z^n) - \frac{1}{n}I(Z^n;(r_m(X)^n,t_m(Y)^n))\right)$$
$$\le \lim_{m\to\infty}\lim_{n\to\infty}\left(\frac{1}{n}I(r_m(X)^n;t_m(Y)^n|Z^n) - \frac{1}{n}I(Z^n;(r_m(X)^n,t_m(Y)^n))\right).$$

Since $Z^n$ has a finite alphabet, the limits of $n^{-1}I(Z^n;(r_m(X)^n,t_m(Y)^n))$ are the same regardless of the order from Theorem 6.4.1. Thus $\bar I$ will equal $I$ if we can show that

$$\bar I(X;Y|Z) = \lim_{n\to\infty}\lim_{m\to\infty}\frac{1}{n}I(r_m(X)^n;t_m(Y)^n|Z^n)$$
$$\le \lim_{m\to\infty}\lim_{n\to\infty}\frac{1}{n}I(r_m(X)^n;t_m(Y)^n|Z^n) = I(X;Y|Z). \tag{10.20}$$

This we now proceed to do. From Lemma 5.5.7 we can write

$$I(r_m(X)^n;t_m(Y)^n|Z^n) = \int I(r_m(X)^n;t_m(Y)^n|Z^n = z^n)\,dP_{Z^n}(z^n).$$

Abbreviate $I(r_m(X)^n;t_m(Y)^n|Z^n = z^n)$ to $I_z(\hat X^n;\hat Y^n)$. This is simply the mutual information between $\hat X^n$ and $\hat Y^n$ under the distribution for $(\hat X^n,\hat Y^n)$ given a particular random blocking sequence $z$. We have that

 

 

 

 

 

 

 

 

 

 

 

 

 

$$I_z(\hat X^n;\hat Y^n) = H_z(\hat Y^n) - H_z(\hat Y^n|\hat X^n).$$

Given $z$, let $J_0^n(z)$ be as before. Let $J_2^n(z)$ denote the collection of all indices $i$ of $z^n$ for which $z_i$ begins an $N$-cell except for the final such index (which may begin an $N$-cell not completed within $z^n$). Thus $J_2^n(z)$ is the same as $J_1^n(z)$ except that the largest index in the latter collection may have been removed

if the resulting $N$-cell was not completed within the $n$-tuple. We have using standard entropy relations that

$$I_z(\hat X^n;\hat Y^n) \ge \sum_{i\in J_0^n(z)}\left(H_z(\hat Y_i|\hat Y^i) - H_z(\hat Y_i|\hat Y^i,\hat X^{i+1})\right) + \sum_{i\in J_2^n(z)}\left(H_z(\hat Y_i^N|\hat Y^i) - H_z(\hat Y_i^N|\hat Y^i,\hat X^{i+N})\right). \tag{10.21}$$

For $i\in J_0^n(z)$, however, $Y_i$ is $a^*$ with probability one and hence

$$H_z(\hat Y_i|\hat Y^i) \le H_z(\hat Y_i) = 0$$

and

$$H_z(\hat Y_i|\hat Y^i,\hat X^{i+1}) \le H_z(\hat Y_i) = 0.$$

Thus we have the bound

$$I_z(\hat X^n;\hat Y^n) \ge \sum_{i\in J_2^n(z)}\left(H_z(\hat Y_i^N|\hat Y^i) - H_z(\hat Y_i^N|\hat Y^i,\hat X^{i+N})\right)$$
$$= \sum_{i\in J_2^n(z)}\left(I_z(\hat Y_i^N;(\hat Y^i,\hat X^{i+N})) - I_z(\hat Y_i^N;\hat Y^i)\right)$$
$$\ge \sum_{i\in J_2^n(z)}\left(I_z(\hat Y_i^N;\hat X_i^N) - I_z(\hat Y_i^N;\hat Y^i)\right), \tag{10.22}$$

where the last inequality follows from the fact that $I(U;(V,W)) \ge I(U;V)$. For $i\in J_2^n(z)$ we have by construction and the stationarity of $\mu$ that

$$I_z(\hat X_i^N;\hat Y_i^N) = I_{p^N}(\hat X^N;\hat Y^N). \tag{10.23}$$

As before let $G = \{z : z_0\text{ begins an }N\text{-cell}\}$. Then $i\in J_2^n(z)$ if $T^iz\in G$ and $i < n-N$ and we can write

$$\frac{1}{n} I_z(\hat X^n;\hat Y^n) \ge I_{p^N}(\hat X^N;\hat Y^N)\,\frac{1}{n}\sum_{i=0}^{n-N-1} 1_G(T^iz) - \frac{1}{n}\sum_{i=0}^{n-N-1} I_z(\hat Y_i^N;\hat Y^i)1_G(T^iz).$$

All of the above terms are measurable functions of $z$ and are nonnegative. Hence they are integrable (although we do not yet know if the integral is finite) and we have that

$$\frac{1}{n} I(\hat X^n;\hat Y^n|Z^n) \ge I_{p^N}(\hat X^N;\hat Y^N)\,\frac{n-N}{n}m_Z(G) - \frac{1}{n}\sum_{i=0}^{n-N-1}\int dm_Z(z)\,I_z(\hat Y_i^N;\hat Y^i)1_G(T^iz).$$


To continue we use the fact that since the processes are stationary, we can consider it to be a two sided process (if it is one sided, we can imbed it in a two sided process with the same probabilities on rectangles). By construction

$$I_z(\hat Y_i^N;\hat Y^i) = I_{T^iz}(\hat Y_0^N;(\hat Y_{-i},\cdots,\hat Y_{-1}))$$

and hence since $m_Z$ is stationary we can change variables to obtain

 

 

 

 

 

$$\frac{1}{n} I(\hat X^n;\hat Y^n|Z^n) \ge I_{p^N}(\hat X^N;\hat Y^N)\,\frac{n-N}{n}m_Z(G) - \frac{1}{n}\sum_{i=0}^{n-N-1}\int dm_Z(z)\,I_z(\hat Y_0^N;(\hat Y_{-i},\cdots,\hat Y_{-1}))1_G(z).$$

We obtain a further bound from the inequalities

$$I_z(\hat Y_0^N;(\hat Y_{-i},\cdots,\hat Y_{-1})) \le I_z(Y_0^N;(Y_{-i},\cdots,Y_{-1})) \le I_z(Y_0^N;Y^-),$$

where $Y^- = (\cdots,Y_{-2},Y_{-1})$. Since $I_z(Y_0^N;Y^-)$ is measurable and nonnegative, its integral is defined and hence

$$\lim_{n\to\infty}\frac{1}{n} I(\hat X^n;\hat Y^n|Z^n) \ge I_{p^N}(\hat X^N;\hat Y^N)\,m_Z(G) - \int_G dm_Z(z)\,I_z(Y_0^N;Y^-).$$

We can now take the limit as $m\to\infty$ to obtain

$$I(X;Y|Z) \ge I_{p^N}(X^N;Y^N)\,m_Z(G) - \int_G dm_Z(z)\,I_z(Y_0^N;Y^-). \tag{10.24}$$

This provides half of what we need.

Analogous to (10.21) we have the upper bound

$$I_z(\hat X^n;\hat Y^n) \le \sum_{i\in J_1^n(z)}\left(I_z(\hat Y_i^N;(\hat Y^i,\hat X^{i+N})) - I_z(\hat Y_i^N;\hat Y^i)\right). \tag{10.25}$$

We note in passing that the use of $J_1$ here assumes that we are dealing with a one sided channel and hence there is no contribution to the information from any initial symbols not contained in the first $N$-cell. In the two sided case time 0 could occur in the middle of an $N$-cell and one could fix the upper bound by adding the first index less than 0 for which $z_i$ begins an $N$-cell to the above sum. This term has no effect on the limits. Taking the limits as $m\to\infty$ using Lemma 5.5.1 we have that

$$I_z(X^n;Y^n) \le \sum_{i\in J_1^n(z)}\left(I_z(Y_i^N;(Y^i,X^{i+N})) - I_z(Y_i^N;Y^i)\right).$$

Given $Z^n = z^n$ and $i\in J_1^n(z)$, $(X^i,Y^i)\to X_i^N\to Y_i^N$ forms a Markov chain because of the conditional independence and hence from Lemma 5.5.2 and Corollary 5.5.3

$$I_z(Y_i^N;(Y^i,X^{i+N})) = I_z(X_i^N;Y_i^N) = I_{p^N}(X^N;Y^N).$$