
10.6. THE DISTORTION-RATE FUNCTION

Thus we have the upper bound
$$\frac{1}{n} I_z(X^n;Y^n) \le \frac{1}{n} I_{p^N}(X^N;Y^N)\sum_{i=0}^{n-1} 1_G(T^iz) \;-\; \frac{1}{n}\sum_{i=0}^{n-1} I_z(Y_i^N;Y^i)\,1_G(T^iz).$$

Taking expectations and using stationarity as before we find that
$$\frac{1}{n} I(X^n;Y^n|Z^n) \le I_{p^N}(X^N;Y^N)\,m_Z(G) \;-\; \frac{1}{n}\sum_{i=0}^{n-1}\int_G dm_Z(z)\, I_z(Y_0^N;(Y_{-i},\cdots,Y_{-1})).$$

Taking the limit as $n \to \infty$ using Lemma 5.6.1 yields
$$\bar{I}(X;Y|Z) \le I_{p^N}(X^N;Y^N)\,m_Z(G) \;-\; \int_G dm_Z(z)\, I_z(Y_0^N;Y^-). \qquad (10.26)$$

Combining this with (10.24) proves that
$$\bar{I}(X;Y|Z) \le I(X;Y|Z)$$
and hence that
$$\bar{I}(X;Y) = I(X;Y).$$

It also proves that
$$\bar{I}(X;Y) = \bar{I}(X;Y|Z) - \bar{I}(Z;(X,Y)) \le \bar{I}(X;Y|Z) \le I_{p^N}(X^N;Y^N)\,m_Z(G) \le \frac{1}{N} I_{p^N}(X^N;Y^N),$$
using Corollary 9.4.3 to bound $m_Z(G)$. This proves (10.19). □

Proof of the theorem: We have immediately that
$$\bar{\mathcal{R}}_e(R,\mu) \subset \bar{\mathcal{R}}_s(R,\mu) \subset \mathcal{R}_s(R,\mu)$$
and
$$\bar{\mathcal{R}}_e(R,\mu) \subset \mathcal{R}_e(R,\mu) \subset \mathcal{R}_s(R,\mu),$$
and hence we have for stationary sources that
$$D_s(R,\mu) \le \bar{D}_s(R,\mu) \qquad (10.27)$$
and for ergodic sources that
$$D_s(R,\mu) \le \bar{D}_s(R,\mu) \le \bar{D}_e(R,\mu) \qquad (10.28)$$
and
$$D_s(R,\mu) \le D_e(R,\mu) \le \bar{D}_e(R,\mu). \qquad (10.29)$$


We next prove that
$$D_s(R,\mu) \ge D(R,\mu). \qquad (10.30)$$

If $D_s(R,\mu)$ is infinite, the inequality is obvious. Otherwise fix $\epsilon > 0$ and choose a $p \in \mathcal{R}_s(R,\mu)$ for which $E_p\rho_1(X_0,Y_0) \le D_s(R,\mu) + \epsilon$, and fix $\delta > 0$ and choose $m$ so large that for $n \ge m$ we have that
$$n^{-1}I_p(X^n;Y^n) \le I_p(X;Y) + \delta \le R + \delta.$$
For $n \ge m$ we therefore have that $p^n \in \mathcal{R}_n(R+\delta,\mu^n)$ and hence
$$D_s(R,\mu) + \epsilon \ge E_p\rho_1(X_0,Y_0) = n^{-1}E_p\rho_n(X^n,Y^n) \ge D_n(R+\delta,\mu^n) \ge D(R+\delta,\mu).$$
From Lemma 10.6.1 $D(R,\mu)$ is continuous in $R$ and hence (10.30) is proved.

Lastly, fix $\epsilon > 0$ and choose $N$ so large and $p^N \in \mathcal{R}_N(R,\mu^N)$ so that
$$N^{-1}E_{p^N}\rho_N \le D_N(R,\mu^N) + \frac{\epsilon}{3} \le D(R,\mu) + \frac{2\epsilon}{3}.$$
Construct the corresponding $(N,\delta)$-SBM channel as in Example 9.4.11 with $\delta$ small enough to ensure that $\delta\rho^* \le \epsilon/3$. Then from Lemma 10.6.2 we have that the resulting hookup $p$ is stationary and that $\bar{I}_p = I_p \le R$, and hence $p \in \bar{\mathcal{R}}_s(R,\mu) \subset \mathcal{R}_s(R,\mu)$. Furthermore, if $\mu$ is ergodic then so is $p$ and hence $p \in \bar{\mathcal{R}}_e(R,\mu) \subset \mathcal{R}_e(R,\mu)$. From Lemma 10.6.2 the resulting distortion is
$$E_p\rho_1(X_0,Y_0) \le N^{-1}E_{p^N}\rho_N + \delta\rho^* \le D(R,\mu) + \epsilon.$$
Since $\epsilon > 0$ is arbitrary, this implies the existence of a $p \in \bar{\mathcal{R}}_s(R,\mu)$ (a $p \in \bar{\mathcal{R}}_e(R,\mu)$ if $\mu$ is ergodic) yielding $E_p\rho_1(X_0,Y_0)$ arbitrarily close to $D(R,\mu)$. Thus for any stationary source
$$\bar{D}_s(R,\mu) \le D(R,\mu)$$
and for any ergodic source
$$\bar{D}_e(R,\mu) \le D(R,\mu).$$
With (10.27)-(10.30) this completes the proof. □
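In summary (our restatement of the bounds just proved, using the bar convention above, not a display from the original), the inequalities sandwich the stationary process definition between two copies of the Shannon distortion-rate function:
$$D(R,\mu) \stackrel{(10.30)}{\le} D_s(R,\mu) \stackrel{(10.27)}{\le} \bar{D}_s(R,\mu) \le D(R,\mu),$$
so all three quantities agree for stationary sources; for ergodic sources, (10.28) and (10.29) extend the same sandwich through $D_e(R,\mu)$ and $\bar{D}_e(R,\mu)$ as well.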

The previous lemma is technical but important. It permits the construction of a stationary and ergodic pair process having rate and distortion near those of a finite-dimensional vector described by the original source and a finite-dimensional conditional probability.


Chapter 11

Source Coding Theorems

11.1 Source Coding and Channel Coding

In this chapter and the next we develop the basic coding theorems of information theory. As is traditional, we consider two important special cases first and then later form the overall result by combining these special cases. In the first case we assume that the channel is noiseless, but it is constrained in the sense that it can only pass R bits per input symbol to the receiver. Since this is usually insufficient for the receiver to perfectly recover the source sequence, we attempt to code the source so that the receiver can recover it with as little distortion as possible. This leads to the theory of source coding, also called source coding subject to a fidelity criterion or data compression, where the latter name reflects the fact that sources with infinite or very large entropy are "compressed" to fit across the given communication link. In the next chapter we ignore the source and focus on a discrete alphabet channel: we construct codes that can communicate any of a finite number of messages with small probability of error, and we quantify how large the message set can be. This operation is called channel coding or error control coding. We then develop joint source and channel codes, which combine source coding and channel coding so as to code a given source for communication over a given channel with minimum average distortion. The ad hoc division into two forms of coding is convenient and will permit performance near that of the OPTA function for the codes considered.

11.2 Block Source Codes for AMS Sources

We first consider a particular class of codes: block codes. For the time being we also concentrate on additive distortion measures. Extensions to subadditive distortion measures will be considered later. Let $\{X_n\}$ be a source with a standard alphabet $A$. Recall that an $(N,K)$ block code of a source $\{X_n\}$ maps successive nonoverlapping input vectors $X_{nN}^N$ into successive channel vectors $U_{nK}^K = \alpha(X_{nN}^N)$, where $\alpha: A^N \to B^K$ is called the source encoder. We assume that the channel is noiseless, but that it is constrained in the sense that $N$ source time units corresponds to the same amount of physical time as $K$ channel time units and that
$$\frac{K}{N}\log\|B\| \le R,$$
where the inequality can be made arbitrarily close to equality by taking $N$ and $K$ large enough subject to the physical stationarity constraint. $R$ is called the source coding rate or resolution in bits or nats per input symbol. We may wish to change the values of $N$ and $K$, but the rate $R$ is fixed.

A reproduction or approximation of the original source is obtained by a source decoder, which we also assume to be a block code. The decoder is a mapping $\beta: B^K \to \hat{A}^N$ which forms the reproduction process $\{\hat{X}_n\}$ via
$$\hat{X}_{nN}^N = \beta(U_{nK}^K),\quad n = 0, 1, 2, \cdots.$$
In general we could have a reproduction dimension different from that of the input vectors provided they corresponded to the same amount of physical time and a suitable distortion measure was defined. We will make the simplifying assumption that they are the same, however.
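To make the bookkeeping concrete, here is a minimal Python sketch (our illustration, not from the text); the maps alpha and beta are hypothetical stand-ins for the encoder $\alpha: A^N \to B^K$ and decoder $\beta: B^K \to \hat{A}^N$, and the assert checks the rate constraint $(K/N)\log\|B\| \le R$.

```python
import math

# Hypothetical (N, K) block code: N source symbols -> K channel symbols.
N, K = 4, 8          # block lengths (source / channel)
B = (0, 1)           # channel alphabet; ||B|| = 2
R = 2.0              # target rate in bits per source symbol

# Rate constraint: (K/N) * log2 ||B|| <= R.
assert (K / N) * math.log2(len(B)) <= R

def alpha(block):
    """Toy source encoder alpha: A^N -> B^K (threshold, then repeat to K symbols)."""
    bits = [1 if x >= 0.5 else 0 for x in block]   # N binary channel symbols
    return tuple(bits + bits)                      # padded out to K symbols

def beta(channel_block):
    """Toy source decoder beta: B^K -> Ahat^N."""
    return tuple(float(b) for b in channel_block[:N])

def code_sequence(source):
    """Apply the block code to successive nonoverlapping N-blocks, producing
    the reproduction process {Xhat_n}; the resulting sequence coder is N-stationary."""
    xhat = []
    for n in range(0, len(source) - N + 1, N):
        xhat.extend(beta(alpha(source[n:n + N])))
    return xhat

print(code_sequence([0.1, 0.9, 0.4, 0.7, 0.2, 0.8, 0.6, 0.3]))
```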

Because $N$ source symbols are mapped into $N$ reproduction symbols, we will often refer to $N$ alone as the block length of the source code. Observe that the resulting sequence coder is $N$-stationary. Our immediate goal is now the following: Let $\mathcal{E}$ and $\mathcal{D}$ denote the collection of all block codes with rate no greater than $R$ and let $\nu$ be the given channel. What is the OPTA function $\Delta(\mu,\mathcal{E},\nu,\mathcal{D})$ for this system? Our first step toward evaluating the OPTA is to find a simpler and equivalent expression for the current special case.

Given a source code consisting of encoder $\alpha$ and decoder $\beta$, define the codebook to be
$$\mathcal{C} = \{\text{all } \beta(u^K);\ u^K \in B^K\},$$
that is, the collection of all possible reproduction vectors available to the receiver. For convenience we can index these words as
$$\mathcal{C} = \{y_i;\ i = 1, 2, \cdots, M\},$$
where $N^{-1}\log M \le R$ by construction. Observe that if we are given only a decoder or, equivalently, a codebook, and if our goal is to minimize the average distortion for the current block, then no encoder can do better than the encoder which maps an input word $x^N$ into the minimum distortion available reproduction word, that is, define $\alpha(x^N)$ to be the $u^K$ minimizing $\rho_N(x^N,\beta(u^K))$, an assignment we denote by
$$\alpha(x^N) = \min_{u^K}{}^{-1}\,\rho_N(x^N,\beta(u^K)).$$
Observe that by construction we therefore have that
$$\rho_N(x^N,\beta(\alpha(x^N))) = \min_{y\in\mathcal{C}}\rho_N(x^N,y)$$
and the overall mapping of $x^N$ into a reproduction is a minimum distortion or nearest neighbor mapping. Define
$$\rho_N(x^N,\mathcal{C}) = \min_{y\in\mathcal{C}}\rho_N(x^N,y).$$
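A minimal sketch of the nearest neighbor encoder and of $\rho_N(x^N,\mathcal{C})$, assuming squared-error per-letter distortion and a small explicit codebook (both our choices, not the text's):

```python
def rho_N(x, y):
    """Additive distortion rho_N(x^N, y^N): sum of per-letter squared errors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest_neighbor_encode(x, codebook):
    """The min^{-1} assignment: index of the codeword achieving
    rho_N(x^N, C) = min over y in C of rho_N(x^N, y)."""
    return min(range(len(codebook)), key=lambda i: rho_N(x, codebook[i]))

codebook = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]  # N = 2, M = 4
x = (0.2, 0.9)
i = nearest_neighbor_encode(x, codebook)
print(i, codebook[i], rho_N(x, codebook[i]))  # index, reproduction, rho_N(x, C)
```

Here $N = 2$ and $M = 4$, so the rate of this toy codebook is $N^{-1}\log M = 1$ bit per symbol.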


To formally prove that this is the best encoder, observe that if the source is AMS and $p$ is the joint distribution of the source and reproduction, then $p$ is also AMS. This follows since the channel induced by the block code is $N$-stationary and hence also AMS with respect to $T^N$. This means that $p$ is AMS with respect to $T^N$, which in turn implies that it is AMS with respect to $T$ (Theorem 7.3.1 of [50]). Letting $\bar{p}$ denote the stationary mean of $p$ and $p_N$ denote the $N$-stationary mean, we then have from (10.10) that for any block code with codebook $\mathcal{C}$
$$\Delta = \frac{1}{N}E_{p_N}\rho_N(X^N,Y^N) \ge \frac{1}{N}E_{p_N}\rho_N(X^N,\mathcal{C}),$$
with equality if the minimum distortion encoder is used. For this reason we can confine interest to block codes specified by a codebook: the encoder produces the index of the minimum distortion codeword for the observed vector and the decoder is a table lookup producing the codeword being indexed. A code of this type is also called a vector quantizer or block quantizer. Denote the performance of the block code with codebook $\mathcal{C}$ on the source $\mu$ by
$$\rho(\mathcal{C},\mu) = \Delta = E_{\bar{p}}\,\rho_1(X_0,Y_0).$$

Lemma 11.2.1: Given an AMS source $\mu$ and a block length $N$ codebook $\mathcal{C}$, let $\mu_N$ denote the $N$-stationary mean of $\mu$ (which exists from Corollary 7.3.1 of [50]), let $p$ denote the induced input/output distribution, and let $\bar{p}$ and $p_N$ denote its stationary mean and $N$-stationary mean, respectively. Then
$$\rho(\mathcal{C},\mu) = E_{\bar{p}}\,\rho_1(X_0,Y_0) = \frac{1}{N}E_{p_N}\rho_N(X^N,Y^N) = \frac{1}{N}E_{\mu_N}\rho_N(X^N,\mathcal{C}) = \rho(\mathcal{C},\mu_N).$$
Proof: The first two equalities follow from (10.10), the next from the use of the minimum distortion encoder, the last from the definition of the performance of a block code. □

It need not be true in general that $\rho(\mathcal{C},\mu)$ equals $\rho(\mathcal{C},\bar{\mu})$. For example, if $\mu$ produces a single periodic waveform with period $N$ and $\mathcal{C}$ consists of a single period, then $\rho(\mathcal{C},\mu) = 0$ and $\rho(\mathcal{C},\bar{\mu}) > 0$. It is the $N$-stationary mean and not the stationary mean that is most useful for studying an $N$-stationary code.

We now define the OPTA for block codes to be
$$\delta(R,\mu) = \Delta(\mu,\nu,\mathcal{E},\mathcal{D}) = \inf_N \delta_N(R,\mu),$$
$$\delta_N(R,\mu) = \inf_{\mathcal{C}\in\mathcal{K}(N,R)}\rho(\mathcal{C},\mu),$$
where $\nu$ is the noiseless channel as described previously, $\mathcal{E}$ and $\mathcal{D}$ are classes of block codes for the channel, and $\mathcal{K}(N,R)$ is the class of all block length $N$ codebooks $\mathcal{C}$ with
$$\frac{1}{N}\log\|\mathcal{C}\| \le R.$$
$\delta(R,\mu)$ is called the block source coding OPTA or the operational block coding distortion-rate function.
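The infimum defining $\delta_N(R,\mu)$ is rarely available in closed form; the following sketch estimates it by exhaustive search in a toy case (our assumptions, not the text's: i.i.d. Bernoulli source, Hamming distortion, $N = 2$, $R = 1/2$, so $\|\mathcal{C}\| \le 2^{NR} = 2$).

```python
from itertools import product, combinations

N, R = 2, 0.5
M_max = int(2 ** (N * R))        # codebook size constraint ||C|| <= 2^{NR}
p = 0.3                          # iid Bernoulli(p) source, a toy stand-in for mu

def prob(block):                 # mu^N for the iid source
    out = 1.0
    for x in block:
        out *= p if x == 1 else 1 - p
    return out

def rho_N(x, y):                 # Hamming distortion
    return sum(a != b for a, b in zip(x, y))

blocks = list(product([0, 1], repeat=N))
# Search every codebook in K(N, R) and keep the per-letter performance rho(C, mu).
best = min(
    (sum(prob(x) * min(rho_N(x, y) for y in C) for x in blocks) / N, C)
    for m in range(1, M_max + 1)
    for C in combinations(blocks, m)
)
print("delta_N estimate:", best[0], "achieved by codebook", best[1])
```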

Corollary 11.2.1: Given an AMS source $\mu$, then for any $N$ and $i = 0, 1, \cdots, N-1$
$$\delta_N(R,\mu T^{-i}) = \delta_N(R,\mu_N T^{-i}).$$
Proof: For $i = 0$ the result is immediate from the lemma. For $i \ne 0$ it follows from the lemma and the fact that the $N$-stationary mean of $\mu T^{-i}$ is $\mu_N T^{-i}$ (as is easily verified from the definitions). □

Reference Letters

Many of the source coding results will require a technical condition that is a generalization of the reference letter condition of Theorem 10.6.1 for stationary sources. An AMS source is said to have a reference letter $a^* \in \hat{A}$ with respect to a distortion measure $\rho = \rho_1$ on $A \times \hat{A}$ if
$$\sup_n E_{\mu T^{-n}}\rho(X_0,a^*) = \sup_n E_\mu\rho(X_n,a^*) = \rho^* < \infty, \qquad (11.1)$$
that is, there exists a letter for which $E\rho(X_n,a^*)$ is uniformly bounded above. If we define for any $k$ the vector $a^{*k} = (a^*,a^*,\cdots,a^*)$ consisting of $k$ $a^*$'s, then (11.1) implies that
$$\sup_n E_{\mu T^{-n}}\frac{1}{k}\rho_k(X^k,a^{*k}) \le \rho^* < \infty. \qquad (11.2)$$

We assume for convenience that any block code of length $N$ contains the reference vector $a^{*N}$. This ensures that $\rho_N(x^N,\mathcal{C}) \le \rho_N(x^N,a^{*N})$ and hence that $\rho_N(x^N,\mathcal{C})$ is bounded above by a $\mu$-integrable function and hence is itself $\mu$-integrable. This implies that
$$\delta(R,\mu) \le \delta_N(R,\mu) \le \rho^*. \qquad (11.3)$$
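As a concrete instance (our example, not from the text), squared-error distortion admits the reference letter $a^* = 0$ whenever the source has uniformly bounded second moments:
$$\rho(x,y) = (x-y)^2,\qquad \rho^* = \sup_n E_\mu\rho(X_n,0) = \sup_n E_\mu X_n^2 < \infty,$$
and (11.3) then bounds the OPTA by $\delta(R,\mu) \le \sup_n E_\mu X_n^2$ at every rate $R$.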

The reference letter also works for the stationary mean source $\bar{\mu}$ since
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\rho(x_i,a^*) = \rho_\infty(x,a^{*\infty}),$$
$\mu$-a.e. and $\bar{\mu}$-a.e., where $a^{*\infty}$ denotes an infinite sequence of $a^*$'s. Since $\rho_\infty$ is invariant we have from Lemma 6.3.1 of [50] and Fatou's lemma that
$$E_{\bar{\mu}}\rho(X_0,a^*) = E_{\bar{\mu}}\left(\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\rho(X_i,a^*)\right) \le \liminf_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}E_\mu\rho(X_i,a^*) \le \rho^*.$$


Performance and OPTA

We next develop several basic properties of the performance and OPTA functions for block coding AMS sources with additive fidelity criteria.

Lemma 11.2.2: Given two sources $\mu_1$ and $\mu_2$ and $\lambda \in (0,1)$, then for any block code $\mathcal{C}$
$$\rho(\mathcal{C},\lambda\mu_1+(1-\lambda)\mu_2) = \lambda\rho(\mathcal{C},\mu_1) + (1-\lambda)\rho(\mathcal{C},\mu_2)$$
and for any $N$
$$\delta_N(R,\lambda\mu_1+(1-\lambda)\mu_2) \ge \lambda\delta_N(R,\mu_1) + (1-\lambda)\delta_N(R,\mu_2)$$
and
$$\delta(R,\lambda\mu_1+(1-\lambda)\mu_2) \ge \lambda\delta(R,\mu_1) + (1-\lambda)\delta(R,\mu_2).$$
Thus performance is linear in the source and the OPTA functions are convex $\cap$ in the source. Lastly,
$$\delta_N(R+\frac{1}{N},\lambda\mu_1+(1-\lambda)\mu_2) \le \lambda\delta_N(R,\mu_1) + (1-\lambda)\delta_N(R,\mu_2).$$

Proof: The equality follows from the linearity of expectation since $\rho(\mathcal{C},\mu) = N^{-1}E_\mu\rho_N(X^N,\mathcal{C})$. The first inequality follows from the equality and the fact that the infimum of a sum is bounded below by the sum of the infima. The next inequality follows similarly. To get the final inequality, let $\mathcal{C}_i$ approximately yield $\delta_N(R,\mu_i)$; that is,
$$\rho(\mathcal{C}_i,\mu_i) \le \delta_N(R,\mu_i) + \epsilon.$$
Form the union code $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$ containing all of the words in both of the codes. Then the rate of the union code is
$$\frac{1}{N}\log\|\mathcal{C}\| = \frac{1}{N}\log(\|\mathcal{C}_1\| + \|\mathcal{C}_2\|) \le \frac{1}{N}\log(2^{NR} + 2^{NR}) = R + \frac{1}{N}.$$
This code yields performance
$$\rho(\mathcal{C},\lambda\mu_1+(1-\lambda)\mu_2) = \lambda\rho(\mathcal{C},\mu_1) + (1-\lambda)\rho(\mathcal{C},\mu_2) \le \lambda\rho(\mathcal{C}_1,\mu_1) + (1-\lambda)\rho(\mathcal{C}_2,\mu_2) \le \lambda\delta_N(R,\mu_1) + \lambda\epsilon + (1-\lambda)\delta_N(R,\mu_2) + (1-\lambda)\epsilon.$$
Since the leftmost term in the above equation can be no smaller than $\delta_N(R+1/N,\lambda\mu_1+(1-\lambda)\mu_2)$ and $\epsilon$ is arbitrary, the lemma is proved. □
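A small numerical check of the union code argument (hypothetical codebooks and sources of our choosing, Hamming distortion): merging two rate-$R$ codebooks costs at most $1/N$ in rate, while the union code on the mixture does at least as well as using each $\mathcal{C}_i$ on its own $\mu_i$.

```python
import math

def rho_N(x, y):
    return sum(a != b for a, b in zip(x, y))

def perf(C, blocks, prob):  # rho(C, mu) = E min_{y in C} rho_N(X^N, y) / N
    return sum(prob[x] * min(rho_N(x, y) for y in C) for x in blocks) / len(blocks[0])

N, R = 2, 0.5
C1, C2 = [(0, 0)], [(1, 1)]              # two rate-R codebooks, ||Ci|| <= 2^{NR} = 2
C = C1 + C2                              # union code
assert math.log2(len(C)) / N <= R + 1 / N   # (1/N) log(||C1|| + ||C2||) <= R + 1/N

blocks = [(0, 0), (0, 1), (1, 0), (1, 1)]
mu1 = {b: w for b, w in zip(blocks, [0.7, 0.1, 0.1, 0.1])}   # toy block distributions
mu2 = {b: w for b, w in zip(blocks, [0.1, 0.1, 0.1, 0.7])}
lam = 0.5
mix = {b: lam * mu1[b] + (1 - lam) * mu2[b] for b in blocks}

# The union code on the mixture performs at least as well as Ci on mu_i:
assert perf(C, blocks, mix) <= lam * perf(C1, blocks, mu1) + (1 - lam) * perf(C2, blocks, mu2)
print(perf(C, blocks, mix))
```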

The first and last inequalities in the lemma suggest that $\delta_N$ is very nearly an affine function of the source and hence perhaps $\delta$ is as well. We will later pursue this possibility, but we are not yet equipped to do so.


Before developing the connection between the OPTA functions of AMS sources and those of their stationary mean, we pause to develop some additional properties for OPTA in the special case of stationary sources. These results follow Kieffer [76].

Lemma 11.2.3: Suppose that $\mu$ is a stationary source. Then
$$\delta(R,\mu) = \lim_{N\to\infty}\delta_N(R,\mu).$$
Thus the infimum over block lengths is given by the limit, so that longer codes can do better.

Proof: Fix an $N$ and an $n < N$ and choose codes $\mathcal{C}_n \subset \hat{A}^n$ and $\mathcal{C}_{N-n} \subset \hat{A}^{N-n}$ for which
$$\rho(\mathcal{C}_n,\mu) \le \delta_n(R,\mu) + \frac{\epsilon}{2},$$
$$\rho(\mathcal{C}_{N-n},\mu) \le \delta_{N-n}(R,\mu) + \frac{\epsilon}{2}.$$
Form the block length $N$ code $\mathcal{C} = \mathcal{C}_n \times \mathcal{C}_{N-n}$. This code has rate no greater than $R$ and has distortion
$$N\rho(\mathcal{C},\mu) = E\min_{y\in\mathcal{C}}\rho_N(X^N,y) = E\min_{y^n\in\mathcal{C}_n}\rho_n(X^n,y^n) + E\min_{v^{N-n}\in\mathcal{C}_{N-n}}\rho_{N-n}(X_n^{N-n},v^{N-n})$$
$$= E\min_{y^n\in\mathcal{C}_n}\rho_n(X^n,y^n) + E\min_{v^{N-n}\in\mathcal{C}_{N-n}}\rho_{N-n}(X^{N-n},v^{N-n}) = n\rho(\mathcal{C}_n,\mu) + (N-n)\rho(\mathcal{C}_{N-n},\mu)$$
$$\le n\delta_n(R,\mu) + (N-n)\delta_{N-n}(R,\mu) + \frac{N\epsilon}{2}, \qquad (11.4)$$
where we have made essential use of the stationarity of the source. Since $\epsilon$ is arbitrary and since the leftmost term in the above equation can be no smaller than $N\delta_N(R,\mu)$, we have shown that
$$N\delta_N(R,\mu) \le n\delta_n(R,\mu) + (N-n)\delta_{N-n}(R,\mu)$$
and hence that the sequence $N\delta_N$ is subadditive. The result then follows immediately from Lemma 7.5.1 of [50]. □
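The product code step can be checked the same way (again with toy codebooks of our choosing and Hamming distortion): because the distortion is additive and $\mathcal{C} = \mathcal{C}_n \times \mathcal{C}_{N-n}$ is a full Cartesian product, the minimum over $\mathcal{C}$ splits into the sum of the sub-block minima.

```python
from itertools import product

def rho(x, y):
    return sum(a != b for a, b in zip(x, y))

n, N = 1, 3
Cn = [(0,), (1,)]                          # block length n codebook
CNn = [(0, 0), (1, 1)]                     # block length N - n codebook
C = [a + b for a, b in product(Cn, CNn)]   # product code, block length N

for x in product([0, 1], repeat=N):
    split = (min(rho(x[:n], y) for y in Cn)
             + min(rho(x[n:], v) for v in CNn))
    joint = min(rho(x, y) for y in C)
    assert joint == split                  # min over C = sum of sub-block minima
print("product code:", C)
```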

Corollary 11.2.2: If $\mu$ is a stationary source, then $\delta(R,\mu)$ is a convex function of $R$ and hence is continuous for $R > 0$.

Proof: Pick $R_1 > R_2$ and $\lambda \in (0,1)$. Define $R = \lambda R_1 + (1-\lambda)R_2$. For large $n$ define $n_1 = \lfloor\lambda n\rfloor$, the largest integer no greater than $\lambda n$, and let $n_2 = n - n_1$. Pick codebooks $\mathcal{C}_i \subset \hat{A}^{n_i}$ with rate $R_i$ and distortion
$$\rho(\mathcal{C}_i,\mu) \le \delta_{n_i}(R_i,\mu) + \epsilon.$$
Analogous to (11.4), for the product code $\mathcal{C} = \mathcal{C}_1 \times \mathcal{C}_2$ we have
$$n\rho(\mathcal{C},\mu) = n_1\rho(\mathcal{C}_1,\mu) + n_2\rho(\mathcal{C}_2,\mu) \le n_1\delta_{n_1}(R_1,\mu) + n_2\delta_{n_2}(R_2,\mu) + n\epsilon.$$
The rate of the product code is no greater than $R$ and hence the leftmost term above is bounded below by $n\delta_n(R,\mu)$. Dividing by $n$ we have, since $\epsilon$ is arbitrary, that
$$\delta_n(R,\mu) \le \frac{n_1}{n}\delta_{n_1}(R_1,\mu) + \frac{n_2}{n}\delta_{n_2}(R_2,\mu).$$
Taking $n \to \infty$ we have using the lemma and the choice of $n_i$ that
$$\delta(R,\mu) \le \lambda\delta(R_1,\mu) + (1-\lambda)\delta(R_2,\mu),$$
proving the claimed convexity. □

Corollary 11.2.3: If $\mu$ is stationary, then $\delta(R,\mu)$ is an affine function of $\mu$.
Proof: From Lemma 11.2.2 we need only prove that
$$\delta(R,\lambda\mu_1+(1-\lambda)\mu_2) \le \lambda\delta(R,\mu_1) + (1-\lambda)\delta(R,\mu_2).$$
From the same lemma we have that for any $N$
$$\delta_N(R+\frac{1}{N},\lambda\mu_1+(1-\lambda)\mu_2) \le \lambda\delta_N(R,\mu_1) + (1-\lambda)\delta_N(R,\mu_2).$$
For any $K \le N$ we have, since $\delta_N(R,\mu)$ is nonincreasing in $R$, that
$$\delta_N(R+\frac{1}{K},\lambda\mu_1+(1-\lambda)\mu_2) \le \lambda\delta_N(R,\mu_1) + (1-\lambda)\delta_N(R,\mu_2).$$
Taking the limit as $N \to \infty$ yields from Lemma 11.2.3 that
$$\delta(R+\frac{1}{K},\lambda\mu_1+(1-\lambda)\mu_2) \le \lambda\delta(R,\mu_1) + (1-\lambda)\delta(R,\mu_2).$$
From Corollary 11.2.2, however, $\delta$ is continuous in $R$ and the result follows by letting $K \to \infty$. □

The following lemma provides the principal tool necessary for relating the OPTA of an AMS source with that of its stationary mean. It shows that the OPTA of an AMS source is not changed by shifting or, equivalently, by redefining the time origin.

Lemma 11.2.4: Let $\mu$ be an AMS source with a reference letter. Then for any integer $i$, $\delta(R,\mu) = \delta(R,\mu T^{-i})$.
Proof: Fix $\epsilon > 0$ and let $\mathcal{C}_N$ be a rate $R$ block length $N$ codebook for which $\rho(\mathcal{C}_N,\mu) \le \delta(R,\mu) + \epsilon/2$. For $1 \le i \le N-1$ choose $J$ large and define the block length $K = JN$ code $\mathcal{C}_K(i)$ by
$$\mathcal{C}_K(i) = a^{*(N-i)} \times \left(\times_{j=0}^{J-2}\,\mathcal{C}_N\right) \times a^{*i},$$
where $a^{*l}$ is an $l$-tuple containing all $a^*$'s. $\mathcal{C}_K(i)$ can be considered to be a code consisting of the original code shifted by $i$ time units and repeated many times, with some filler at the beginning and end. Except for the edges of the long product code, the effect on the source is to use the original code with a delay. The code has at most $(2^{NR})^{J-1} = 2^{KR}2^{-NR}$ words; the rate is no greater than $R$.
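A sketch of the construction of $\mathcal{C}_K(i)$ (with a hypothetical reference letter $a^* = 0$ and a toy base codebook of our choosing): every word consists of $N-i$ filler letters, then $J-1$ words from $\mathcal{C}_N$, then $i$ closing filler letters, for $K = JN$ letters in all.

```python
from itertools import product

a_star = 0                              # reference letter
N, J, i = 2, 3, 1                       # base block length, repetitions, shift
K = J * N

CN = [(0, 0), (1, 1)]                   # base codebook C_N

def shifted_code(CN, N, J, i):
    """C_K(i) = a*(N-i) x (C_N x ... x C_N, J-1 times) x a*i."""
    head, tail = (a_star,) * (N - i), (a_star,) * i
    return [head + sum(ws, ()) + tail
            for ws in product(CN, repeat=J - 1)]

CK = shifted_code(CN, N, J, i)
assert all(len(w) == K for w in CK)     # every word has K = JN letters
assert len(CK) == len(CN) ** (J - 1)    # at most (2^{NR})^{J-1} words
print(CK)
```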

For any $K$-block $x^K$ the distortion resulting from using $\mathcal{C}_K(i)$ is given by
$$K\rho_K(x^K,\mathcal{C}_K(i)) \le (N-i)\rho_{N-i}(x^{N-i},a^{*(N-i)}) + \sum_{k=0}^{J-2}N\rho_N(x_{N-i+kN}^N,\mathcal{C}_N) + i\rho_i(x_{K-i}^i,a^{*i}). \qquad (11.5)$$

Let $\{\hat{x}_n\}$ denote the encoded process using the block code $\mathcal{C}_K(i)$. If $n$ is a multiple of $K$, then
$$n\rho_n(x^n,\hat{x}^n) \le \sum_{k=0}^{\lfloor n/K\rfloor-1}\left((N-i)\rho_{N-i}(x_{kK}^{N-i},a^{*(N-i)}) + i\rho_i(x_{(k+1)K-i}^i,a^{*i})\right) + \sum_{k=0}^{\lfloor n/K\rfloor J-1}N\rho_N(x_{N-i+kN}^N,\mathcal{C}_N).$$

If $n$ is not a multiple of $K$ we can further overbound the distortion by including the distortion contributed by enough future symbols to complete a $K$-block, that is,
$$n\rho_n(x^n,\hat{x}^n) \le n\gamma_n(x,\hat{x}) = \sum_{k=0}^{\lfloor n/K\rfloor}\left((N-i)\rho_{N-i}(x_{kK}^{N-i},a^{*(N-i)}) + i\rho_i(x_{(k+1)K-i}^i,a^{*i})\right) + \sum_{k=0}^{(\lfloor n/K\rfloor+1)J-1}N\rho_N(x_{N-i+kN}^N,\mathcal{C}_N).$$
Thus
$$\rho_n(x^n,\hat{x}^n) \le \frac{N-i}{K}\,\frac{1}{n/K}\sum_{k=0}^{\lfloor n/K\rfloor}\rho_{N-i}(X^{N-i}(T^{kK}x),a^{*(N-i)}) + \frac{i}{K}\,\frac{1}{n/K}\sum_{k=0}^{\lfloor n/K\rfloor}\rho_i(X^i(T^{(k+1)K-i}x),a^{*i}) + \frac{1}{n/N}\sum_{k=0}^{(\lfloor n/K\rfloor+1)J-1}\rho_N(X^N(T^{(N-i)+kN}x),\mathcal{C}_N).$$

Since $\mu$ is AMS these quantities all converge to invariant functions:
$$\lim_{n\to\infty}\rho_n(x^n,\hat{x}^n) \le \frac{N-i}{K}\lim_{m\to\infty}\frac{1}{m}\sum_{k=0}^{m-1}\rho_{N-i}(X^{N-i}(T^{kK}x),a^{*(N-i)}) + \frac{i}{K}\lim_{m\to\infty}\frac{1}{m}\sum_{k=0}^{m-1}\rho_i(X^i(T^{(k+1)K-i}x),a^{*i})$$