

Note that the sequence of probabilities on the right-hand side of (2.28) is increasing with increasing n. Thus, for example, probabilities of semi-infinite intervals can be found as a limit, $P((-\infty, a]) = \lim_{n\to\infty} P((-n, a])$. A similar argument can be used to show that one can also interchange the limit with the probability measure given a sequence of decreasing events; that is,

If $F_n$ is a sequence of decreasing events, then
$$P\left(\lim_{n\to\infty} F_n\right) = \lim_{n\to\infty} P(F_n)\,; \tag{2.29}$$

that is, the probability of the limit of a sequence of decreasing events is the limit of the probabilities.

Note that the sequence of probabilities on the right-hand side of (2.29) is decreasing with increasing n. Thus, for example, the probabilities of points can be found as a limit of probabilities of intervals, $P(\{a\}) = \lim_{n\to\infty} P((a - 1/n,\, a + 1/n))$.

It can be shown (see problem 2.20) that, given (2.22) through (2.24), the three conditions (2.25), (2.28), and (2.29) are equivalent; that is, any of the three could serve as the fourth axiom of probability.

Property (2.28) is called continuity from below, and (2.29) is called continuity from above. The designations “from below” and “from above” relate to the direction from which the respective sequences of probabilities approach their limit. These continuity results are the basis for using integral calculus to compute probabilities, since integrals can be expressed as limits of sums.

2.4 Discrete Probability Spaces

We now provide several examples of probability measures on our examples of sample spaces and sigma-fields and thereby give some complete examples of probability spaces.

The first example formalizes the description of a probability measure as a sum over a pmf, as introduced in the introductory section.

[2.12] Let Ω be a finite set and let F be the power set of Ω. Suppose that we have a function p(ω) that assigns a real number to each sample point ω in such a way that

$$p(\omega) \ge 0\,, \quad \text{all } \omega \in \Omega \tag{2.30}$$


 

and
$$\sum_{\omega \in \Omega} p(\omega) = 1\,. \tag{2.31}$$

Define the set function P by

 

 

 

$$P(F) = \sum_{\omega \in F} p(\omega) = \sum_{\omega \in \Omega} 1_F(\omega)\,p(\omega)\,, \quad \text{all } F \in \mathcal{F}\,, \tag{2.32}$$

where $1_F(\omega)$ is the indicator function of the set $F$: 1 if $\omega \in F$ and 0 otherwise.

For simplicity we drop the $\omega \in \Omega$ underneath the sum; that is, when no range of summation is explicit, it should be assumed that the sum is over all possible values. Thus we can abbreviate (2.32) to

 

 

$$P(F) = \sum 1_F(\omega)\,p(\omega)\,, \quad \text{all } F \in \mathcal{F}\,. \tag{2.33}$$

P is easily verified to be a probability measure: it obviously satisfies axioms 2.1 and 2.2. It is finitely and countably additive from the properties of sums. In particular, given a sequence of disjoint events, only a finite number can be distinct (since the power set of a finite space has only a finite number of members); to be disjoint, the balance of the sequence must equal $\emptyset$. The probability of the union of these sets will be the finite sum of the $p(\omega)$ over the points in the union, which equals the sum of the probabilities of the sets in the sequence. Example [2.1] is a special case of example [2.12], as is the coin flip example of the introductory section.
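To make the construction concrete, the following is a minimal Python sketch of example [2.12], checking the probability axioms for the set function (2.32) on a small finite space. The particular pmf (a loaded four-sided die) and the numerical tolerance are illustrative assumptions, not from the text.

```python
# A sketch of example [2.12]: a probability measure built from a pmf
# on a finite sample space. The pmf below (a loaded four-sided die)
# is a hypothetical example.
from itertools import chain, combinations

pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}  # p(omega) >= 0 and sums to 1

def P(F):
    """The set function of (2.32): P(F) = sum of p(omega) over omega in F."""
    return sum(pmf[w] for w in F)

# The event space is the power set of Omega; enumerate it explicitly.
omega = list(pmf)
events = list(chain.from_iterable(
    combinations(omega, r) for r in range(len(omega) + 1)))

assert all(P(F) >= 0 for F in events)           # nonnegativity (axiom 2.1)
assert abs(P(omega) - 1.0) < 1e-12              # normalization (axiom 2.2)
assert abs(P({0, 1}) + P({2}) - P({0, 1, 2})) < 1e-12  # additivity on disjoint events
```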

The summation (2.33) used to define probability measures for a discrete space is a special case of a more general weighted sum, which we pause to define and consider. Suppose that g is a real-valued function defined on Ω, i.e., $g : \Omega \to \mathbb{R}$ assigns a real number $g(\omega)$ to every $\omega \in \Omega$. We could consider more general complex-valued functions, but for the moment it is simpler to stick to real-valued functions. Also, we could consider subsets of $\mathbb{R}$, but we leave it more general at this time. Recall that in the introductory section we considered such a function to be an example of signal processing and called it a random variable. Given a pmf p, define the expectation² of g (with respect to p) as

 

$$E(g) = \sum g(\omega)\,p(\omega)\,. \tag{2.34}$$

²This is not in fact the fundamental definition of expectation that will be introduced in chapter 4, but it will be seen to be equivalent.


With this definition, (2.33) with $g(\omega) = 1_F(\omega)$ yields
$$P(F) = E(1_F)\,, \tag{2.35}$$

showing that the probability of an event is the expectation of the indicator function of the event. Mathematically, we can think of expectation as a generalization of the idea of probability since probability is the special case of expectation that results when the only functions allowed are indicator functions.

Expectations are also called probabilistic averages or statistical averages. For the time being, probabilities are the most important examples of expectation. We shall see many examples, however, so it is worthwhile to mention a few of the most important. Suppose that the sample space is a subset of the real line, e.g., $\mathbb{Z}$ or $\mathbb{Z}_n$. One of the most commonly encountered expectations is the mean or first moment

 

$$m = \sum \omega\,p(\omega)\,, \tag{2.36}$$

where g(ω) = ω, the identity function. A more general idea is the kth moment defined by

 

$$m^{(k)} = \sum |\omega|^k\,p(\omega)\,, \tag{2.37}$$

so that $m = m^{(1)}$. After the mean, the most commonly encountered moment in practice is the second moment,

 

$$m^{(2)} = \sum |\omega|^2\,p(\omega)\,. \tag{2.38}$$

Moments can be thought of as parameters describing a pmf, and some computations involving signal processing will turn out to depend only on certain moments.

A slight variation on kth-order moments is the so-called centralized moments, formed by subtracting the mean before taking the power:

$$\sum |\omega - m|^k\,p(\omega)\,, \tag{2.39}$$

but the only such moment commonly encountered in practice is the variance

 

$$\sigma^2 = \sum (\omega - m)^2\,p(\omega)\,. \tag{2.40}$$


The variance and the second moment are easily related as

$$
\begin{aligned}
\sigma^2 &= \sum (\omega - m)^2\,p(\omega) \\
&= \sum (\omega^2 - 2\omega m + m^2)\,p(\omega) \\
&= \sum \omega^2 p(\omega) - 2m \sum \omega\,p(\omega) + m^2 \sum p(\omega) \\
&= m^{(2)} - 2m^2 + m^2 \\
&= m^{(2)} - m^2\,.
\end{aligned} \tag{2.41}
$$
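The moment definitions and the identity (2.41) are easy to sanity-check numerically. The sketch below reuses the hypothetical four-point pmf from the earlier example; it is an arbitrary choice, not from the text.

```python
# Numerical check of the expectation (2.34), the moments (2.36)-(2.38),
# and the variance identity (2.41) for a hypothetical pmf.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def E(g):
    """Expectation of (2.34): E(g) = sum of g(omega) p(omega)."""
    return sum(g(w) * p for w, p in pmf.items())

m = E(lambda w: w)                # mean (2.36)
m2 = E(lambda w: abs(w) ** 2)     # second moment (2.38)
var = E(lambda w: (w - m) ** 2)   # variance (2.40)

assert abs(var - (m2 - m ** 2)) < 1e-12   # the identity (2.41)
```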

Probability Mass Functions

A function p(ω) satisfying (2.30) and (2.31) is called a probability mass function or pmf. It is important to observe that the pmf is defined only for points in the sample space, while a probability measure is defined for events, sets that belong to an event space. Intuitively, the probability of a set is given by the sum of the probabilities of the points in the set as given by the pmf. Obviously it is much easier to describe the pmf than the probability measure, since it need only be specified for points. The axioms of probability then guarantee that the pmf can be used to compute the probability measure. Note that given one, we can always determine the other. In particular, given the pmf p, we can construct P using (2.32). Given P, we can find the corresponding pmf p from the formula

$$p(\omega) = P(\{\omega\})\,.$$

We list below several of the most common examples of pmf’s. The reader should verify that they are all indeed valid pmf’s, that is, that they satisfy (2.30) and (2.31).

The binary pmf. Ω = {0, 1}; p(0) = 1 − p, p(1) = p, where p is a parameter in (0, 1).

A uniform pmf. $\Omega = \mathbb{Z}_n = \{0, 1, \ldots, n-1\}$ and $p(k) = 1/n$; $k \in \mathbb{Z}_n$.

The binomial pmf. $\Omega = \mathbb{Z}_{n+1} = \{0, 1, \ldots, n\}$ and
$$p(k) = \binom{n}{k} p^k (1-p)^{n-k}\,; \quad k \in \mathbb{Z}_{n+1}\,,$$
where
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
is the binomial coefficient.

The binary pmf is a probability model for coin flipping with a biased coin or for a single sample of a binary data stream. A uniform pmf on Z6 can model the roll of a fair die. Observe that it would not be a good model for ASCII data since, for example, the letters t and e and the symbol for space have a higher probability than other letters. The binomial pmf is a probability model for the number of heads in n successive independent flips of a biased coin, as will later be seen.

The same construction provides a probability measure on countably infinite spaces such as Z and Z+. It is no longer as simple to prove countable additivity, but it should be fairly obvious that it holds and, at any rate, it follows from standard results in elementary analysis for convergent series. Hence we shall only state the following example without proving countable additivity, but bear in mind that it follows from the properties of infinite summations.

[2.13] Let Ω be a space with a countably infinite number of elements and let F be the power set of Ω. Then if $p(\omega)$; $\omega \in \Omega$ satisfies (2.30) and (2.31), the set function P defined by (2.32) is a probability measure.

Two common examples of pmf’s on countably infinite sample spaces follow. The reader should test their validity.

The geometric pmf. $\Omega = \{1, 2, 3, \ldots\}$ and $p(k) = (1-p)^{k-1} p$; $k = 1, 2, \ldots$, where $p \in (0, 1)$ is a parameter.

The Poisson pmf. $\Omega = \mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ and $p(k) = \lambda^k e^{-\lambda}/k!$,

where λ is a parameter in (0, ∞). (Keep in mind that 0! = 1.)
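As a sketch of the validity check suggested above, the following verifies numerically that each named pmf is nonnegative and sums to 1. The two infinite sums are truncated at a point where the tails are negligible, and the parameter values are arbitrary choices, not from the text.

```python
# Check that each named pmf satisfies (2.30) and (2.31), truncating
# the geometric and Poisson sums; parameters are arbitrary choices.
from math import comb, exp, factorial

p, n, lam = 0.3, 10, 2.5

pmfs = {
    "binary":    [1 - p, p],
    "uniform":   [1 / n for _ in range(n)],
    "binomial":  [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)],
    "geometric": [(1 - p)**(k - 1) * p for k in range(1, 200)],        # truncated
    "poisson":   [lam**k * exp(-lam) / factorial(k) for k in range(150)],  # truncated
}

for name, q in pmfs.items():
    assert all(x >= 0 for x in q), name     # (2.30)
    assert abs(sum(q) - 1) < 1e-9, name     # (2.31), up to truncation error
```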

We will later see the origins of several of these pmf’s and their applications. For example, both the binomial and the geometric pmf will be derived from the simple binary pmf model for flipping a single coin. For the moment they should be considered as common important examples. Various properties of these pmf’s and a variety of calculations involving them are explored in the problems at the end of the chapter.

Computational Examples

The various named pmf’s provide examples for computing probabilities and other expectations. Although much of this is prerequisite material, it does


not hurt to collect several of the more useful tricks that arise in evaluating sums. The binary pmf is too simple on its own to provide much interest, so first consider the uniform pmf on $\mathbb{Z}_n$. This is trivially a valid pmf since it is nonnegative and sums to 1. The probability of any set is simply

$$P(F) = \sum_{\omega} \frac{1}{n}\,1_F(\omega) = \frac{\#(F)}{n}\,,$$

where #(F ) denotes the number of elements or points in the set F . The mean is given by

$$m = \frac{1}{n}\sum_{k=0}^{n-1} k = \frac{1}{n}\,\frac{(n-1)n}{2} = \frac{n-1}{2}\,, \tag{2.42}$$

a standard formula easily verified by induction, as detailed in appendix B. The second moment is given by

$$m^{(2)} = \frac{1}{n}\sum_{k=0}^{n-1} k^2 = \frac{1}{n}\,\frac{(n-1)n(2n-1)}{6} = \frac{(n-1)(2n-1)}{6}\,, \tag{2.43}$$

as can also be verified by induction. The variance can be found by combining (2.43), (2.42), and (2.41).
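A quick numerical check of (2.42) and (2.43), and of the variance obtained from them via (2.41), may be helpful; the choice n = 7 is arbitrary.

```python
# Moments of the uniform pmf on {0, 1, ..., n-1}; n is an arbitrary choice.
n = 7

m = sum(k / n for k in range(n))         # mean, cf. (2.42)
m2 = sum(k**2 / n for k in range(n))     # second moment, cf. (2.43)

assert abs(m - (n - 1) / 2) < 1e-12
assert abs(m2 - (n - 1) * (2 * n - 1) / 6) < 1e-12
print("variance:", m2 - m**2)            # via (2.41)
```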

The binomial pmf is more complicated. The first issue is to prove that it sums to one and hence is a valid pmf (it is obviously nonnegative). This is accomplished by recalling the binomial theorem from high school algebra:

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} \tag{2.44}$$

and setting a = p and b = 1 − p to write

$$\sum_{k=0}^{n} p(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = (p + 1 - p)^n = 1\,.$$

Finding moments is trickier here, and we shall later develop a much easier way to do this using exponential transforms. Nonetheless, it provides some useful practice to compute an example sum, if only to demonstrate later how much work can be avoided! Finding the mean requires evaluation


of the sum
$$
\begin{aligned}
m &= \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} \\
&= \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!}\, p^k (1-p)^{n-k}\,,
\end{aligned}
$$
since the $k = 0$ term vanishes and $k/k! = 1/(k-1)!$.

The trick here is to recognize that the sum looks very much like the terms in the binomial theorem, but a change of variables is needed to get the binomial theorem to simplify things. Changing variables by defining l = k − 1, the sum becomes

$$m = \sum_{l=0}^{n-1} \frac{n!}{(n-l-1)!\,l!}\, p^{l+1} (1-p)^{n-l-1}\,,$$

which will very much resemble the binomial theorem with n − 1 replacing n if we factor out a p and an n:

$$
\begin{aligned}
m &= np \sum_{l=0}^{n-1} \frac{(n-1)!}{(n-1-l)!\,l!}\, p^l (1-p)^{n-1-l} \\
&= np\,(p + 1 - p)^{n-1} \\
&= np\,.
\end{aligned} \tag{2.45}
$$

The second moment is messier, so its evaluation is postponed until simpler means are developed.
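The conclusion m = np of (2.45) can be confirmed by brute-force summation, which also hints at how much work the algebra saves for large n; the parameter values below are arbitrary.

```python
# Brute-force check of the binomial mean (2.45): m = np.
from math import comb

n, p = 12, 0.35
m = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
assert abs(m - n * p) < 1e-12
```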

The geometric pmf is handled using the geometric progression, usually treated in high school algebra and summarized in appendix B. From (B.4) in appendix B we have for any real a with |a| < 1

 

 

 

 

 

 

 

$$\sum_{k=0}^{\infty} a^k = \frac{1}{1-a}\,, \tag{2.46}$$

which proves that the geometric pmf indeed sums to 1: $\sum_{k=1}^{\infty} (1-p)^{k-1} p = p \sum_{j=0}^{\infty} (1-p)^j = p\,\frac{1}{1-(1-p)} = 1$.

Evaluation of the mean of the geometric pmf requires evaluation of the sum

 

 

$$m = \sum_{k=1}^{\infty} k\,p(k) = \sum_{k=1}^{\infty} k\,p(1-p)^{k-1}\,.$$


One may have access to a book of tables including this sum, but a useful trick can be used to evaluate the sum from the well-known result for summing a geometric series. The trick involves differentiating the usual geometric progression sum, as detailed in appendix B, where it is shown for any $q \in (0, 1)$ that

$$\sum_{k=0}^{\infty} k\,q^{k-1} = \frac{1}{(1-q)^2}\,. \tag{2.47}$$

Setting q = 1 − p yields

$$m = \frac{1}{p}\,. \tag{2.48}$$

A similar idea works for the second moment. From (B.7) of appendix B the second moment is given by

 

$$m^{(2)} = \sum_{k=1}^{\infty} k^2\,p(1-p)^{k-1} = p\left(\frac{2}{p^3} - \frac{1}{p^2}\right) = \frac{2-p}{p^2}\,, \tag{2.49}$$

and hence from (2.41) the variance is

 

 

 

 

 

 

 

 

$$\sigma^2 = \frac{1-p}{p^2}\,. \tag{2.50}$$
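A numerical check of the geometric mean (2.48), second moment (2.49), and variance (2.50), truncating the infinite sums where the tail is negligible; p = 0.4 is an arbitrary choice.

```python
# Truncated-sum check of the geometric moments (2.48)-(2.50).
p, N = 0.4, 500   # arbitrary parameter and truncation point

m = sum(k * p * (1 - p)**(k - 1) for k in range(1, N))
m2 = sum(k**2 * p * (1 - p)**(k - 1) for k in range(1, N))

assert abs(m - 1 / p) < 1e-9                       # (2.48)
assert abs(m2 - (2 - p) / p**2) < 1e-9             # (2.49)
assert abs((m2 - m**2) - (1 - p) / p**2) < 1e-9    # (2.50) via (2.41)
```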

 

 

 

 

As an example of a probability computation using a geometric pmf, suppose that (Ω, F, P ) is a discrete probability space with Ω = Z+, F the power set of Ω, and P the probability measure induced by the geometric pmf with parameter p. Find the probabilities of the events F = {k : k ≥ 10} and G = {k : k is odd }. Alternatively note that F = {10, 11, 12, . . . } and G = {1, 3, 5, 7, . . . } (we consider only odd numbers in the sample space,


that is, only positive odd numbers). We have that

$$
\begin{aligned}
P(F) &= \sum_{k \in F} p(k) \\
&= \sum_{k=10}^{\infty} p(1-p)^{k-1} \\
&= \frac{p}{1-p} \sum_{k=10}^{\infty} (1-p)^{k} \\
&= p(1-p)^{9} \sum_{k=10}^{\infty} (1-p)^{k-10} \\
&= p(1-p)^{9} \sum_{k=0}^{\infty} (1-p)^{k} \\
&= (1-p)^{9}\,,
\end{aligned}
$$

 

where the suitable form of the geometric progression has been derived from the basic form (B.4). While we have concentrated on the calculus, this problem could be interpreted as the solution to a word problem. For example, suppose you arrive at the Stanford Post Office and you know that the number of people in line has a geometric distribution with p = 1/2. What is the probability that there are at least ten people in line? From the solution just obtained the answer is $(1 - 0.5)^9 = 2^{-9}$.

To find the probability of an odd outcome, we proceed in the same general fashion to write

$$
\begin{aligned}
P(G) &= \sum_{k \in G} p(k) \\
&= \sum_{k=1,3,5,\ldots} p(1-p)^{k-1} \\
&= p \sum_{k=0,2,4,\ldots} (1-p)^{k} \\
&= p \sum_{k=0}^{\infty} \left[(1-p)^2\right]^{k} \\
&= p\,\frac{1}{1-(1-p)^2} = \frac{1}{2-p}\,.
\end{aligned}
$$

Thus in the English-language example of the post office lines, the probability of finding an odd number of people in line is $1/(2 - 1/2) = 2/3$.
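Both event probabilities computed above can be confirmed by direct summation. The sketch below uses p = 1/2, as in the post office example, and truncates the pmf where the tail is negligible.

```python
# Direct-summation check of P(F) for F = {k >= 10} and P(G) for G = odd outcomes.
p, N = 0.5, 200   # p as in the example; N is a truncation point

pmf = {k: p * (1 - p)**(k - 1) for k in range(1, N)}

PF = sum(q for k, q in pmf.items() if k >= 10)
PG = sum(q for k, q in pmf.items() if k % 2 == 1)

assert abs(PF - (1 - p)**9) < 1e-12     # = 2**-9 for p = 1/2
assert abs(PG - 1 / (2 - p)) < 1e-12    # = 2/3 for p = 1/2
```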


Lastly we consider the Poisson pmf, again beginning with a verification that it is indeed a pmf. Consider the sum

$$\sum_{k=0}^{\infty} p(k) = \sum_{k=0}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\,.$$

Here the trick is to recognize the sum as the Taylor series expansion for an

exponential, that is,

$$e^{\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\,,$$

whence

$$\sum_{k=0}^{\infty} p(k) = e^{-\lambda} e^{\lambda} = 1\,,$$

proving the claim.

To evaluate the mean of the Poisson pmf, begin with

 

$$
\begin{aligned}
\sum_{k=0}^{\infty} k\,p(k) &= \sum_{k=1}^{\infty} k\,\frac{\lambda^k e^{-\lambda}}{k!} \\
&= e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!}\,.
\end{aligned}
$$

Change variables l = k − 1 and pull a λ out of the sum to write

 

 

$$\sum_{k=0}^{\infty} k\,p(k) = \lambda e^{-\lambda} \sum_{l=0}^{\infty} \frac{\lambda^l}{l!}\,.$$

Recognizing the sum as $e^{\lambda}$, this yields

 

 

 

 

 

 

 

 

 

$$m = \lambda\,. \tag{2.51}$$

The second moment is found similarly, but with more bookkeeping. Analogous to the mean computation,

 

 

$$
\begin{aligned}
m^{(2)} &= \sum_{k=1}^{\infty} k^2\,\frac{\lambda^k e^{-\lambda}}{k!} \\
&= \sum_{k=2}^{\infty} k(k-1)\,\frac{\lambda^k e^{-\lambda}}{k!} + m \\
&= \sum_{k=2}^{\infty} \frac{\lambda^k e^{-\lambda}}{(k-2)!} + m\,.
\end{aligned}
$$
Changing variables l = k − 2 and again recognizing the sum as $e^{\lambda}$ gives $m^{(2)} = \lambda^2 + \lambda$, and hence from (2.41) the variance is $\sigma^2 = \lambda$.
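Finally, a numerical check of the Poisson mean (2.51) and of the second moment and variance just derived; λ = 3 and the truncation point are arbitrary choices.

```python
# Truncated-sum check of the Poisson moments: m = lam, m2 = lam**2 + lam.
from math import exp, factorial

lam, N = 3.0, 100   # arbitrary parameter and truncation point
pmf = [lam**k * exp(-lam) / factorial(k) for k in range(N)]

m = sum(k * q for k, q in enumerate(pmf))
m2 = sum(k**2 * q for k, q in enumerate(pmf))

assert abs(m - lam) < 1e-9                # (2.51)
assert abs(m2 - (lam**2 + lam)) < 1e-9    # hence sigma**2 = lam, by (2.41)
```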
