
An Introduction to Statistical Signal Processing
Chapter 4
Expectation and Averages
4.1 Averages
In engineering practice we are often interested in the average behavior of measurements on random processes. The goal of this chapter is to link the two distinct types of averages that are used — long-term time averages taken by calculations on an actual physical realization of a random process and averages calculated theoretically by probabilistic averages at some given instant of time, averages that are sometimes called expectations. As we shall see, both computations often (but by no means always) give the same answer. Such results are called laws of large numbers or ergodic theorems.
At first glance from a conceptual point of view, it seems unlikely that long-term time averages and instantaneous probabilistic averages would be the same. If we take a long-term time average of a particular realization of the random process, say {X(t, ω0); t ∈ T}, we are averaging for a particular ω0 — an ω which we cannot know or choose; we do not use probability in any way, and we are ignoring what happens with other values of ω. Here the averages are computed by summing the sequence or integrating the waveform over t while ω0 stays fixed. If, on the other hand, we take an instantaneous probabilistic average, say at the time t0, we are taking a probabilistic average and summing or integrating over ω for the random variable X(t0, ω). Thus we have two averages, one along the time axis with ω fixed, the other along the ω axis with time fixed. It seems that there should be no reason for the answers to agree. Taking a more practical point of view, however, it seems that the time and probabilistic averages must be the same in many situations. For example, suppose that you measure the percentage of time that a particular noise voltage exceeds 10 volts. If you make the measurement over a sufficiently long period of time,
the result should be a reasonably good estimate of the probability that the noise voltage exceeds 10 volts at any given instant of time — a probabilistic average value.
To proceed further, for simplicity we concentrate on a discrete alphabet, discrete time random process. Other cases are considered by converting appropriate sums into integrals. Let {Xn} be an arbitrary discrete alphabet, discrete time process. Since the process is random, we cannot predict accurately its instantaneous or short-term behavior — we can only make probabilistic statements. Based on experience with coins, dice, and roulette wheels, however, one expects that the long-term average behavior can be characterized with more accuracy. For example, if one flips a fair coin, short sequences of flips are unpredictable. However, if one flips long enough, one would expect to have an average of about 50% of the flips result in heads. This is a time average of an instantaneous function of a random process — a type of counting function that we will consider extensively. It is obvious that there are many functions that we can average, e.g., the average value, the average power, etc. We will proceed by defining one particular average, the sample average value of the random process, which is formulated as
Sn = (1/n) Σ_{i=0}^{n−1} Xi ;  n = 1, 2, 3, . . .
We will investigate the behavior of Sn for large n, i.e., for a long-term time average. Thus, for example, if the random process {Xn} is the coin-flipping model, the binary process with alphabet {0, 1}, then Sn is the number of 1’s divided by the total number of flips — the fraction of flips that produced a 1. As noted before, Sn should be close to 50% for large n if the coin is fair.
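This convergence is easy to simulate. The sketch below (plain Python; the function name, seed, and flip count are illustrative choices, not from the text) generates fair coin flips and prints the sample average Sn for increasing n:

```python
import random

# Simulate the fair-coin model with alphabet {0, 1} and compute the
# sample average S_n = (1/n) * sum_{i=0}^{n-1} X_i.
random.seed(0)  # fixed seed so the run is repeatable

def sample_average(flips):
    """Return S_n for the first n = len(flips) outcomes."""
    return sum(flips) / len(flips)

flips = [random.randint(0, 1) for _ in range(100_000)]
for n in (10, 1000, 100_000):
    print(n, sample_average(flips[:n]))   # drifts toward 0.5 as n grows
```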
Note that, as in example [3.7], for each n, Sn is a random variable that is defined on the same probability space as the random process {Xn}. This is made explicit by writing the ω dependence:
Sn(ω) = (1/n) Σ_{k=0}^{n−1} Xk(ω) .
In more direct analogy to example [3.7], we can consider the {Xn} as coordinate functions on a sequence space, say (A^Z, B(A)^Z, m), where m is the distribution of the process, in which case Sn is defined directly on the sequence space. The form of definition is simply a matter of semantics or convenience. Observe, however, that in any case {Sn; n = 1, 2, . . . } is itself a random process since it is an indexed family of random variables defined on a probability space.
For the discrete alphabet random process that we are considering, we can rewrite the sum in another form by grouping together all equal terms:
Sn(ω) = Σ_{a∈A} a ra(n)(ω)        (4.1)
where A is the range space of the discrete alphabet random variable Xn and ra(n)(ω) = (1/n) [number of occurrences of the letter a in {Xi(ω); i = 0, 1, 2, . . . , n − 1}]. The random variable ra(n) is called the nth-order relative frequency of the symbol a. Note that for the binary coin flipping example we have considered, A = {0, 1}, and Sn(ω) = r1(n)(ω), the average number of heads in the first n flips. In other words, for the binary coin-flipping example, the sample average and the relative frequency of heads are the same quantity. More generally, the reader should note that ra(n) can always be written as the sample average of the indicator function for a, 1a(x):
ra(n) = (1/n) Σ_{i=0}^{n−1} 1a(Xi) ,
where

1a(x) = 1 if x = a, and 0 otherwise.
Note that 1{a} is a more precise, but more clumsy, notation for the indicator function of the singleton set {a}. We shall use the shorter form here.
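The regrouping that leads to (4.1) can be checked on a short sample path. The sketch below (the names and the sample path are illustrative) computes each ra(n) as the sample average of the indicator 1a and confirms that the direct sample average equals the sum weighted by relative frequencies:

```python
# Relative frequency r_a^(n) as a sample average of the indicator 1_a,
# and the regrouped sum (4.1). The sample path xs is illustrative.
def indicator(a, x):
    return 1 if x == a else 0

def rel_freq(a, xs):
    """r_a^(n) = (1/n) * sum_i 1_a(X_i)."""
    return sum(indicator(a, x) for x in xs) / len(xs)

xs = [0, 1, 1, 2, 1, 0, 2, 2, 2, 1]       # sample path, alphabet {0, 1, 2}
alphabet = sorted(set(xs))

S_n = sum(xs) / len(xs)                                   # direct sample average
S_n_grouped = sum(a * rel_freq(a, xs) for a in alphabet)  # form (4.1)
print(S_n, S_n_grouped)                                   # both are 1.2
```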
Let us now assume that all of the marginal pmf’s of the given process are the same, say pX(x), x ∈ A. Based on intuition and gambling experience, one might suspect that as n goes to infinity, the relative frequency of a symbol a should go to its probability of occurrence, pX(a). To continue the example of binary coin flipping, the relative frequency of heads in n tosses of a fair coin should tend to 1/2 as n → ∞. If these statements are true, that is, if in some sense,
ra(n) → pX(a) as n → ∞ ,        (4.2)

then it follows that in a similar sense

Sn → Σ_{a∈A} a pX(a) as n → ∞ ,        (4.3)
the same expression as (4.1) with the relative frequency replaced by the pmf. The formula on the right is an example of an expectation of a random variable, a weighted average with respect to a probability measure. The

formula should be recognized as a special case of the definition of expectation of (2.34), where the pmf is pX and g(x) = x, the identity function. The previous plausibility argument motivates studying such weighted averages because they will characterize the limiting behavior of time averages in the same way that probabilities characterize the limiting behavior of relative frequencies.
Limiting statements of the form of (4.2) and (4.3) are called laws of large numbers or ergodic theorems. They relate long-run sample averages or time average behavior to probabilistic calculations made at any given instant of time. It is obvious that such laws or theorems do not always hold. If the coin we are flipping wears in a known fashion with time so that the probability of a head changes, then one could hardly expect that the relative frequency of heads would equal the probability of heads at time zero.
In order to make precise statements and to develop conditions under which the laws or theorems do hold, we first need to develop the properties of the quantities on the right-hand side of (4.2) and (4.3). In particular, we
cannot at this point make any sense out of a statement like “lim_{n→∞} Sn = Σ_{a∈A} a pX(a),” since we have no definition for such a limit of random variables
or functions of random variables. It is obvious, however, that the usual definition of a limit used in calculus will not do, because Sn is a random variable, albeit a random variable whose “randomness” decreases in some sense with increasing n. Thus the limit must be defined in some fashion that involves probability. Such limits are deferred to a later section, and we begin by looking at the definitions and calculus of expectations.
4.2 Expectation
Given a discrete alphabet random variable X specified by a pmf pX, define the expected value, probabilistic average, or mean of X by
E(X) = Σ_{x∈A} x pX(x) .        (4.4)
The expectation is also denoted by EX or E[X] or by an overbar, as X̄. The expectation is also sometimes called an ensemble average to denote averaging across the ensemble of sequences that is generated for different values of ω at a given instant of time.
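Definition (4.4) translates directly into a small helper. In the sketch below the pmf is stored as a dictionary; the fair-die pmf is an illustrative example, not one from the text:

```python
# E(X) = sum over x in A of x * p_X(x), as in (4.4).
def expectation(pmf):
    """pmf: dict mapping alphabet values x to probabilities p_X(x)."""
    assert abs(sum(pmf.values()) - 1.0) < 1e-9   # sanity check: pmf sums to 1
    return sum(x * p for x, p in pmf.items())

die = {x: 1/6 for x in range(1, 7)}   # fair six-sided die
print(expectation(die))               # close to 3.5
```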
The astute reader might note that we have really provided two definitions of the expectation of X. The definition of (4.4) has already been noted to be a special case of (2.34) with pmf pX and function g(x) = x.
Alternatively, we could use (2.34) in a more fundamental form and consider g(ω) = X(ω) as a function defined on an underlying probability space described by a pmf p or a pdf f, in which case (2.34) or (2.57) provides a different formula for finding the expectation in terms of the original probability function:
E(X) = Σ_ω X(ω) p(ω)        (4.5)

if the original space is discrete, or

E(X) = ∫ X(r) f(r) dr        (4.6)
if it is described by a pdf. Are these two versions consistent? The answer is yes, as will be proved soon by the fundamental theorem of expectation. The equivalence of these forms is essentially a change of variables formula.
The mean of a random variable is a weighted average of the possible values of the random variable with the pmf used as a weighting. Before continuing, observe that we can define an analogous quantity for a continuous random variable possessing a pdf: If the random variable X is described by a pdf fX, then we define the expectation of X by
EX = ∫ x fX(x) dx ,        (4.7)
where we have replaced the sum by an integral. Analogous to the discrete case, this formula is a special case of (2.57) with pdf f = fX and g being the identity function. We can also use (2.57) to express the expectation in terms of an underlying pdf, say f, with g = X by the formula
EX = ∫ X(r) f(r) dr .        (4.8)
The equivalence of these two formulas will be considered when the fundamental theorem of expectation is treated.
While the integral does not have the intuitive motivation involving a relative frequency converging to a pmf that the earlier sum did, we shall see that it plays the analogous role in the laws of large numbers. Roughly speaking, this is because continuous random variables can be approximated by discrete random variables arbitrarily closely by very fine quantization. Through this procedure, the integrals with pdfs are approximated by sums with pmf’s and the discrete alphabet results imply the continuous alphabet results by taking appropriate limits. Because of the direct analogy, we shall develop the properties of expectations for continuous random variables along with those for discrete alphabet random variables. Note in passing
192 |
CHAPTER 4. EXPECTATION AND AVERAGES |
that, analogous to using the Stieltjes integral as a unified notation for sums and integrals when computing probabilities, the same thing can be done for expectations. If FX is the cdf of a random variable X, define
EX = ∫ x dFX(x) =  Σ_x x pX(x)   if X is discrete,
                   ∫ x fX(x) dx  if X has a pdf.
In a similar manner, we can define the expectation of a mixture random variable having both continuous and discrete parts in a manner analogous to (3.36).
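The quantization argument mentioned above can also be seen numerically: partitioning the real line into fine bins replaces the integral against a pdf by a sum against a pmf. In the sketch below the uniform pdf on [0, 1] and the bin count are illustrative choices:

```python
# Approximate EX = integral of x f_X(x) dx by quantizing X to bin centers,
# so each bin carries pmf mass roughly f_X(center) * width.
def quantized_expectation(pdf, lo, hi, bins):
    width = (hi - lo) / bins
    total = 0.0
    for k in range(bins):
        center = lo + (k + 0.5) * width
        total += center * pdf(center) * width   # x * p_X(x)
    return total

uniform_pdf = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0
print(quantized_expectation(uniform_pdf, 0.0, 1.0, 1000))   # close to 1/2
```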
4.2.1 Examples: Expectation
The following examples provide some typical expectation computations.
[4.1] As a slight generalization of the fair coin flip, consider the more general binary pmf with parameter p; that is, pX(1) = p and pX(0) = 1 − p. In this case
EX = Σ_{x=0}^{1} x pX(x) = 0 · (1 − p) + 1 · p = p .
It is interesting to note that in this example, as is generally true for discrete random variables, EX is not necessarily in the alphabet of the random variable; i.e., EX ≠ 0 or 1 unless p = 0 or 1.
[4.2] A more complicated discrete example is a geometric random variable. In this case
EX = Σ_{k=1}^{∞} k pX(k) = Σ_{k=1}^{∞} k p (1 − p)^{k−1} ,

a sum evaluated in (2.48) as 1/p.
[4.3] As an example of a continuous random variable, assume that X is a uniform random variable on [0, 1], that is, that its density is one on [0, 1]. Here
EX = ∫_0^1 x fX(x) dx = ∫_0^1 x dx = 1/2 ,

an integral evaluated in (2.67).

[4.4] If X is an exponentially distributed random variable with parameter λ, then from (2.71)
EX = ∫_0^∞ r λ e^{−λr} dr = 1/λ .        (4.9)
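The means computed in examples [4.1], [4.2], and [4.4] can be checked numerically. In the sketch below the parameter values (p = 0.3, λ = 2) and the truncation points are illustrative choices:

```python
import math

p, lam = 0.3, 2.0

# [4.1] binary pmf: EX = 0*(1 - p) + 1*p = p
bernoulli_mean = 0 * (1 - p) + 1 * p

# [4.2] geometric pmf: sum_{k>=1} k p (1 - p)^(k-1), truncated; tends to 1/p
geometric_mean = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 500))

# [4.4] exponential pdf: integral of r lam e^{-lam r} dr, discretized; 1/lam
dr = 1e-4
exponential_mean = sum(i * dr * lam * math.exp(-lam * i * dr) * dr
                       for i in range(1, 200_000))

print(bernoulli_mean, geometric_mean, exponential_mean)   # near p, 1/p, 1/lam
```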
In some cases expectations can be found virtually by inspection. For example, if X has an even pdf fX — that is, if fX(−x) = fX(x) for all x — then if the integral exists, EX = 0, since xfX(x) is an odd function and hence has a zero integral. The assumption that the integral exists is necessary because not all even functions are integrable. For example, suppose that we have a pdf fX(x) = c/x² for all |x| ≥ 1, where c is a normalization constant. Then it is not true that EX is zero, even though the pdf is even, because the Riemann integral
∫_{x: |x|≥1} (x/x²) dx
does not exist. (The puzzled reader should review the definition of improper integrals. Their existence requires that the limit
lim_{T→∞} lim_{S→∞} ∫_{−T}^{S} x fX(x) dx
exists regardless of how T and S tend to infinity; in particular, the existence of the limit with the constraint T = S is not sufficient for the existence of the integral. These limits do not exist for the given example because 1/x is not integrable on [1, ∞).) Nonetheless, it is convenient to set EX to 0 in this example because of the obvious intuitive interpretation.
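The dependence on how T and S grow can be made concrete. For this pdf, c = 1/2 (so the density integrates to one), and the truncated integral of x fX(x) over [−T, −1] ∪ [1, S] has the closed form c ln S − c ln T. The sketch below (illustrative names) shows that the symmetric truncation is always zero while unbalanced truncations are not:

```python
import math

# For f_X(x) = c/x^2 on |x| >= 1 with c = 1/2, the truncated integral of
# x f_X(x) over [-T, -1] and [1, S] is c*(ln S - ln T).
c = 0.5

def truncated_EX(T, S):
    return c * (math.log(S) - math.log(T))

print(truncated_EX(1e6, 1e6))   # 0.0: the tails cancel when T = S
print(truncated_EX(10.0, 1e6))  # nonzero, and it grows as S outpaces T
```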
Sometimes the pdf is an even function about some nonzero value, that is, fX(x + m) = fX(x − m), where m is some constant. In this case, it is easily seen that if the expectation exists, then EX = m, as the reader can quickly verify by a change of variable in the integral defining the expectation. The most important example of this is the Gaussian pdf, which is even about the constant m.
The same conclusions also obviously hold for an even pmf.

4.3 Expectations of Functions of Random Variables

In addition to the expectation of a given random variable, we will often be interested in the expectations of other random variables formed as functions of the given one. In the beginning of the chapter we introduced the relative frequency function, ra(n), which counts the relative number of occurrences of the value a in a sequence of n terms. We are interested in its expected value and in the expected value of the indicator function that appears in the expression for
ra(n). More generally, given a random variable X and a function g : ℝ → ℝ, we might wish to find the expectation of the random variable Y = g(X). If X corresponds to a voltage measurement and g is a simple squaring operation, g(X) = X², then g(X) provides the instantaneous energy across a unit resistor. Its expected value, then, represents the probabilistic average energy. More generally than the square of a random variable, the moments of a random variable X are defined by E[X^k] for k = 1, 2, . . . . The mean is the first moment, the mean square is the second moment, and so on. Moments are often useful as general parameters of a distribution, providing information on its shape without requiring the complete pdf or pmf. Some distributions are completely characterized by a few moments. It is often useful to consider moments of a “centralized” random variable formed by removing its mean. The kth centralized moment is defined by E[(X − E(X))^k]. Of particular interest is the second centralized moment or variance σ² ≜ E[(X − E(X))²].
Other functions that are of interest are indicator functions of a set, 1F(x) = 1 if x ∈ F and 0 otherwise, so that 1F(X) is a binary random variable indicating whether or not the value of X lies in F, and complex exponentials e^{juX}.
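The moments and the variance defined above translate directly into code. The pmf in the sketch below is an illustrative example:

```python
# kth moment E[X^k] and variance E[(X - E(X))^2], computed from a pmf.
def moment(pmf, k):
    return sum((x ** k) * p for x, p in pmf.items())

def variance(pmf):
    m = moment(pmf, 1)   # the mean is the first moment
    return sum(((x - m) ** 2) * p for x, p in pmf.items())

pmf = {0: 0.25, 1: 0.5, 2: 0.25}
print(moment(pmf, 1), moment(pmf, 2), variance(pmf))   # 1.0 1.5 0.5
```

Note that the variance also equals the second moment minus the square of the mean, a standard identity the test below checks.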
Expectations of functions of random variables were defined in this chapter in terms of the derived distribution for the new random variable. In chapter 2, however, they were defined in terms of the original pmf or pdf in the underlying probability space, a formula not requiring that the new distribution be derived. We next show that the two formulas are consistent. First consider finding the expectation of Y by using derived distribution techniques to find the probability function for Y and then using the definition of expectation to evaluate EY. Specifically, if X is discrete, the pmf
for Y is found as before as

pY(y) = Σ_{x: g(x)=y} pX(x) ,   y ∈ AY .

EY is then found as

EY = Σ_{y∈AY} y pY(y) .
Although it is straightforward to find the probability function for Y , it can be a nuisance if it is being found only as a step in the evaluation of the expectation EY = Eg(X). A second and easier method of finding EY is normally used. Looking at the formula for EX, it seems intuitively obvious that E(g(X)) should result if x is replaced by g(x). This can be proved by the following simple procedure. Starting with the pmf for Y , then substituting for its expression in terms of the pmf of X and reordering the summation, the expectation of Y is found directly from the pmf for X
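The consistency of the two routes, deriving pY first versus summing over the pmf of X directly, can be checked on a small example; the pmf and the function g below are illustrative:

```python
from collections import defaultdict

# Route 1: derive p_Y(y) = sum of p_X(x) over {x : g(x) = y}, then sum y p_Y(y).
# Route 2: sum g(x) p_X(x) directly, with no derived distribution.
p_X = {-2: 0.2, -1: 0.3, 1: 0.3, 2: 0.2}
g = lambda x: x * x

p_Y = defaultdict(float)
for x, p in p_X.items():
    p_Y[g(x)] += p
EY_derived = sum(y * p for y, p in p_Y.items())

EY_direct = sum(g(x) * p for x, p in p_X.items())
print(EY_derived, EY_direct)   # the two agree
```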