
A.2 Mathematical Concepts


concave: a function f is concave on an interval [a, b] if

∀x, y ∈ [a, b] : ∀t ∈ [0, 1] : f(tx + (1 − t)y) ≥ t·f(x) + (1 − t)·f(y).

convex: a function f is convex on an interval [a, b] if

∀x, y ∈ [a, b] : ∀t ∈ [0, 1] : f(tx + (1 − t)y) ≤ t·f(x) + (1 − t)·f(y).

equivalence relation: a transitive, reflexive, symmetric relation.

field: a set of elements that support addition, subtraction, multiplication, and division by nonzero elements. Addition and multiplication are associative and commutative, and have neutral elements analogous to zero and one for the real numbers. The prime examples are R, the real numbers; Q, the rational numbers; and Zp, the integers modulo a prime p.

iff: abbreviation for “if and only if”.

lexicographic order: the canonical way of extending a total order on a set of elements to tuples, strings, or sequences over that set. We have ⟨a1, a2, . . . , ak⟩ < ⟨b1, b2, . . . , bk⟩ if and only if a1 < b1, or a1 = b1 and ⟨a2, . . . , ak⟩ < ⟨b2, . . . , bk⟩.

linear order: a reflexive, transitive, weakly antisymmetric relation.

median: an element with rank ⌈n/2⌉ among n elements.

multiplicative inverse: if an object x is multiplied by a multiplicative inverse x⁻¹ of x, we obtain x · x⁻¹ = 1 – the neutral element of multiplication. In particular, in a field, every element except zero (the neutral element of addition) has a unique multiplicative inverse.

prime number: an integer n, n ≥ 2, is a prime iff there are no integers a, b > 1 such that n = a · b.

rank: a one-to-one mapping r : S → 1..n is a ranking function for the elements of a set S = {e1, . . . , en} if r(x) < r(y) whenever x < y.

reflexive: a relation R ⊆ A × A is reflexive if ∀a ∈ A : (a, a) ∈ R.

relation: a set of pairs R. Often, we write relations as operators; for example, if ∼ is a relation, a ∼ b means (a, b) ∈ ∼.

symmetric relation: a relation ∼ is symmetric if for all a and b, a ∼ b implies b ∼ a.

total order: a reflexive, transitive, antisymmetric relation.

transitive: a relation ∼ is transitive if for all a, b, c, a ∼ b and b ∼ c imply a ∼ c.

weakly antisymmetric: a relation ≤ is weakly antisymmetric if for all a, b, a ≤ b or b ≤ a. If a ≤ b and b ≤ a, we write a ≡ b. Otherwise, we write a < b or b < a.


A.3 Basic Probability Theory

Probability theory rests on the concept of a sample space S. For example, to describe the rolls of two dice, we would use the 36-element sample space {1, . . . , 6} × {1, . . . , 6}, i.e., the elements of the sample space are the pairs (x, y) with 1 ≤ x, y ≤ 6 and x, y ∈ ℕ. Generally, a sample space is any set. In this book, all sample spaces are finite. In a random experiment, any element s ∈ S is chosen with some elementary probability p_s, where ∑_{s∈S} p_s = 1. A sample space together with a probability distribution is called a probability space. In this book, we use uniform probabilities almost exclusively; in this case p_s = p = 1/|S|. Subsets E of the sample space are called events. The probability of an event E ⊆ S is the sum of the probabilities of its elements, i.e., prob(E) = |E|/|S| in the uniform case. So the probability of the event {(x, y) : x + y = 7} = {(1, 6), (2, 5), . . . , (6, 1)} is equal to 6/36 = 1/6, and the probability of the event {(x, y) : x + y ≥ 8} is equal to 15/36 = 5/12.
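These probabilities are easy to confirm by brute-force enumeration. The following small Python sketch (an illustration only, using nothing beyond the standard library; the helper names are ours) builds the 36-element sample space and evaluates the two events above.

```python
from fractions import Fraction
from itertools import product

# The 36-element sample space for two dice: all pairs (x, y) with 1 <= x, y <= 6.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    # Probability of an event (a subset of the sample space) under the uniform distribution.
    return Fraction(len(event), len(space))

sum_is_7 = [(x, y) for (x, y) in space if x + y == 7]
sum_at_least_8 = [(x, y) for (x, y) in space if x + y >= 8]

print(prob(sum_is_7))        # 1/6
print(prob(sum_at_least_8))  # 5/12
```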

A random variable is a mapping from the sample space to the real numbers. Random variables are usually denoted by capital letters to distinguish them from plain values. For example, the random variable X could give the number shown by the first die, the random variable Y could give the number shown by the second die, and the random variable S could give the sum of the two numbers. Formally, if (x, y) ∈ S, then X((x, y)) = x, Y((x, y)) = y, and S((x, y)) = x + y = X((x, y)) + Y((x, y)).

We can define new random variables as expressions involving other random variables and ordinary values. For example, if V and W are random variables, then

(V + W)(s) = V(s) + W(s), (V · W)(s) = V(s) · W(s), and (V + 3)(s) = V(s) + 3.

Events are often specified by predicates involving random variables. For example, X ≤ 2 denotes the event {(1, y), (2, y) : 1 ≤ y ≤ 6}, and hence prob(X ≤ 2) = 1/3. Similarly, prob(X + Y = 11) = prob({(5, 6), (6, 5)}) = 1/18.

Indicator random variables are random variables that take only the values zero and one. Indicator variables are an extremely useful tool for the probabilistic analysis of algorithms because they allow us to encode the behavior of complex algorithms into simple mathematical objects. We frequently use the letters I and J for indicator variables.

The expected value of a random variable Z : S → ℝ is

$$E[Z] = \sum_{s \in S} p_s \cdot Z(s) = \sum_{z \in \mathbb{R}} z \cdot \mathrm{prob}(Z = z), \qquad (A.1)$$

i.e., every sample s contributes the value of Z at s times its probability. Alternatively, we can group all s with Z(s) = z into the event Z = z and then sum over the z ∈ ℝ.

In our example, E[X] = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5, i.e., the expected value of the first die is 3.5. Of course, the expected value of the second die is also 3.5. For an indicator random variable I, we have

E[I] = 0 · prob(I = 0) + 1 · prob(I = 1) = prob(I = 1) .
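The defining sum in (A.1) can be evaluated directly for the dice example. The sketch below is illustrative only; the indicator chosen ("first die shows a six") is just an example, not taken from the text.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # uniform two-dice sample space
p = Fraction(1, len(space))                   # elementary probability p_s = 1/|S|

def expectation(Z):
    # E[Z] = sum over all sample points s of p_s * Z(s), as in (A.1).
    return sum(p * Z(s) for s in space)

X = lambda s: s[0]                    # number shown by the first die
I = lambda s: 1 if s[0] == 6 else 0   # an example indicator variable: first die shows a six

print(expectation(X))  # 7/2, i.e., 3.5
print(expectation(I))  # 1/6, which equals prob(I = 1)
```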

Often, we are interested in the expectation of a random variable that is defined in terms of other random variables. This is easy for sums owing to the linearity of expectations of random variables: for any two random variables V and W,

$$E[V + W] = E[V] + E[W]. \qquad (A.2)$$

This equation is easy to prove and extremely useful. Let us prove it. It amounts essentially to an application of the distributive law of arithmetic. We have

$$E[V + W] = \sum_{s \in S} p_s \cdot (V(s) + W(s)) = \sum_{s \in S} p_s \cdot V(s) + \sum_{s \in S} p_s \cdot W(s) = E[V] + E[W].$$

As our first application, let us compute the expected sum of two dice. We have

E[S] = E[X + Y ] = E[X] + E[Y ] = 3.5 + 3.5 = 7 .

Observe that we obtain the result with almost no computation. Without knowing about the linearity of expectations, we would have to go through a tedious calculation:

$$E[S] = 2 \cdot \tfrac{1}{36} + 3 \cdot \tfrac{2}{36} + 4 \cdot \tfrac{3}{36} + 5 \cdot \tfrac{4}{36} + 6 \cdot \tfrac{5}{36} + 7 \cdot \tfrac{6}{36} + 8 \cdot \tfrac{5}{36} + 9 \cdot \tfrac{4}{36} + \ldots + 12 \cdot \tfrac{1}{36}$$
$$= \frac{2 \cdot 1 + 3 \cdot 2 + 4 \cdot 3 + 5 \cdot 4 + 6 \cdot 5 + 7 \cdot 6 + 8 \cdot 5 + \ldots + 12 \cdot 1}{36} = 7.$$
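Both routes to E[S] are easy to verify mechanically. The sketch below (illustrative only) computes E[X] + E[Y] and also the direct sum over the distribution of S, and both give 7.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(space))

def expectation(Z):
    return sum(p * Z(s) for s in space)

# The short route: linearity of expectations.
E_X = expectation(lambda s: s[0])
E_Y = expectation(lambda s: s[1])

# The tedious route: group sample points by their sum and compute the sum over z of z * prob(S = z).
counts = Counter(x + y for (x, y) in space)
E_S = sum(z * Fraction(c, len(space)) for z, c in counts.items())

print(E_X + E_Y, E_S)  # 7 and 7
```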

Exercise A.1. What is the expected sum of three dice?

We shall now give another example with a more complex sample space. We consider the experiment of throwing n balls into m bins. The balls are thrown at random and distinct balls do not influence each other. Formally, our sample space is the set of all functions f from 1..n to 1..m. This sample space has size m^n, and f(i), 1 ≤ i ≤ n, indicates the bin into which ball i is thrown. All elements of the sample space are equally likely. How many balls should we expect in bin 1? We use I to denote the number of balls in bin 1. To determine E[I], we introduce indicator variables I_i, 1 ≤ i ≤ n. The variable I_i is 1 if ball i is thrown into bin 1, and is 0 otherwise. Formally, I_i(f) = 1 iff f(i) = 1. Then I = ∑_i I_i. We have

$$E[I] = E\Bigl[\sum_i I_i\Bigr] = \sum_i E[I_i] = \sum_i \mathrm{prob}(I_i = 1),$$

where the first equality is the linearity of expectations and the second equality follows from the I_i's being indicator variables. It remains to determine the probability that I_i = 1. Since the balls are thrown at random, ball i ends up in any bin with the same probability.¹ Thus prob(I_i = 1) = 1/m, and hence

$$E[I] = \sum_i \mathrm{prob}(I_i = 1) = \sum_i \frac{1}{m} = \frac{n}{m}.$$

¹ Formally, there are exactly m^{n−1} functions f with f(i) = 1.
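For small n and m, the value E[I] = n/m can be confirmed by enumerating all m^n functions. The sketch below is illustrative only; n = 3 and m = 4 are arbitrary example values.

```python
from fractions import Fraction
from itertools import product

n, m = 3, 4  # example: 3 balls, 4 bins

# Sample space: all functions f from 1..n to 1..m, encoded as tuples (f(1), ..., f(n)).
space = list(product(range(1, m + 1), repeat=n))
p = Fraction(1, len(space))  # each of the m**n functions is equally likely

# I(f) = number of balls that land in bin 1.
expected_balls_in_bin_1 = sum(p * sum(1 for b in f if b == 1) for f in space)

print(expected_balls_in_bin_1, Fraction(n, m))  # both are 3/4
```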


Products of random variables behave differently. In general, we have E[X · Y] ≠ E[X] · E[Y]. There is one important exception: if X and Y are independent, equality holds. Random variables X1, . . . , Xk are independent if and only if

$$\forall x_1, \ldots, x_k : \ \mathrm{prob}(X_1 = x_1 \wedge \cdots \wedge X_k = x_k) = \prod_{1 \le i \le k} \mathrm{prob}(X_i = x_i). \qquad (A.3)$$

As an example, when we roll two dice, the value of the first die and the value of the second die are independent random variables. However, the value of the first die and the sum of the two dice are not independent random variables.

Exercise A.2. Let I and J be independent indicator variables and let X = (I + J) mod 2, i.e., X is one iff I and J are different. Show that I and X are independent, but that I, J, and X are dependent.

Assume now that X and Y are independent. Then

$$\begin{aligned}
E[X] \cdot E[Y] &= \Bigl(\sum_x x \cdot \mathrm{prob}(X = x)\Bigr) \cdot \Bigl(\sum_y y \cdot \mathrm{prob}(Y = y)\Bigr) \\
&= \sum_{x,y} x \cdot y \cdot \mathrm{prob}(X = x) \cdot \mathrm{prob}(Y = y) \\
&= \sum_{x,y} x \cdot y \cdot \mathrm{prob}(X = x \wedge Y = y) \\
&= \sum_z \ \sum_{x,y \text{ with } z = x \cdot y} z \cdot \mathrm{prob}(X = x \wedge Y = y) \\
&= \sum_z z \cdot \sum_{x,y \text{ with } z = x \cdot y} \mathrm{prob}(X = x \wedge Y = y) \\
&= \sum_z z \cdot \mathrm{prob}(X \cdot Y = z) \\
&= E[X \cdot Y].
\end{aligned}$$
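The two-dice experiment again provides a concrete check. The sketch below (illustrative only) confirms the product rule for the independent pair X, Y and shows that it fails for the dependent pair X and S = X + Y.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(space))

def expectation(Z):
    return sum(p * Z(s) for s in space)

X = lambda s: s[0]
Y = lambda s: s[1]
S = lambda s: s[0] + s[1]

# Independent random variables: E[X * Y] equals E[X] * E[Y].
print(expectation(lambda s: X(s) * Y(s)), expectation(X) * expectation(Y))  # 49/4 and 49/4

# Dependent random variables: X and S violate the product rule.
print(expectation(lambda s: X(s) * S(s)), expectation(X) * expectation(S))  # 329/12 versus 49/2
```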

 

 

 

How likely is it that a random variable will deviate substantially from its expected value? Markov’s inequality gives a useful bound. Let X be a nonnegative random

variable and let c be any constant. Then

$$\mathrm{prob}(X \ge c \cdot E[X]) \le \frac{1}{c}. \qquad (A.4)$$

The proof is simple. We have

$$\begin{aligned}
E[X] &= \sum_{z \in \mathbb{R}} z \cdot \mathrm{prob}(X = z) \\
&\ge \sum_{z \ge c \cdot E[X]} z \cdot \mathrm{prob}(X = z) \\
&\ge c \cdot E[X] \cdot \mathrm{prob}(X \ge c \cdot E[X]),
\end{aligned}$$


where the first inequality follows from the fact that we sum over a subset of the possible values and X is nonnegative, and the second inequality follows from the fact that the sum in the second line ranges only over z with z ≥ c · E[X].
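As a small numerical illustration (the choice of random variable and of c = 3/2 is ours, not from the text), the sketch below evaluates both sides of Markov's inequality for the sum of two dice.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))
values = [x + y for (x, y) in space]    # the nonnegative random variable X = sum of two dice
E = Fraction(sum(values), len(values))  # E[X] = 7

c = Fraction(3, 2)
lhs = Fraction(sum(1 for v in values if v >= c * E), len(values))  # prob(X >= c * E[X])

print(lhs, "<=", 1 / c)  # 1/12 <= 2/3
```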

Much tighter bounds are possible for some special cases of random variables. The following situation arises several times in this book. We have a sum X = X1 +

··· + Xn of n independent indicator random variables X1,. . . , Xn and want to bound the probability that X deviates substantially from its expected value. In this situation,

the following variant of the Chernoff bound is useful. For any ε > 0, we have

 

$$\mathrm{prob}(X < (1 - \varepsilon)E[X]) \le e^{-\varepsilon^2 E[X]/2}, \qquad (A.5)$$
$$\mathrm{prob}(X > (1 + \varepsilon)E[X]) \le \left(\frac{e^{\varepsilon}}{(1 + \varepsilon)^{(1 + \varepsilon)}}\right)^{E[X]}. \qquad (A.6)$$

A bound of the form above is called a tail bound because it estimates the “tail” of the probability distribution, i.e., the part for which X deviates considerably from its expected value.

Let us see an example. If we throw n coins and let Xi be the indicator variable

for the i-th coin coming up heads, X = X1 + ··· + Xn is the total number of heads. Clearly, E[X] = n/2. The bound above tells us that prob(X ≤ (1 − ε)n/2) ≤ e^{−ε²n/4}. In particular, for ε = 0.1, we have prob(X ≤ 0.9 · n/2) ≤ e^{−0.01·n/4}. So, for n = 10 000, the expected number of heads is 5 000 and the probability that the sum is less than 4 500 is smaller than e^{−25}, a very small number.
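The sketch below (illustrative only) evaluates the right-hand side of (A.5) for this coin example and compares it with the exact binomial tail probability, which is smaller still.

```python
import math

n, eps = 10_000, 0.1
mean = n / 2  # E[X] for n fair coins

# Chernoff bound (A.5): prob(X < (1 - eps) * E[X]) <= exp(-eps**2 * E[X] / 2)
bound = math.exp(-(eps ** 2) * mean / 2)
print(bound)  # exp(-25), about 1.4e-11

# Exact tail probability prob(X < 4500) for X ~ Binomial(n, 1/2), using exact integers (may take a moment).
tail = sum(math.comb(n, k) for k in range(int((1 - eps) * mean))) / 2 ** n
print(tail)  # on the order of 1e-23, far below the bound
```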

Exercise A.3. Estimate the probability that X in the above example is larger than 5 050.

If the indicator random variables are independent and identically distributed with prob(Xi = 1) = p, X is binomially distributed, i.e.,

$$\mathrm{prob}(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}. \qquad (A.7)$$
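Equation (A.7) translates directly into code; the sketch below (illustrative only; n = 4 and p = 1/2 are example values) reproduces the familiar probabilities 1/16, 1/4, 3/8, 1/4, 1/16.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    # prob(X = k) for X = X_1 + ... + X_n with independent indicators, prob(X_i = 1) = p; see (A.7).
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

print([binomial_pmf(4, k, Fraction(1, 2)) for k in range(5)])
# [Fraction(1, 16), Fraction(1, 4), Fraction(3, 8), Fraction(1, 4), Fraction(1, 16)]
```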

 

Exercise A.4 (balls and bins continued). Let, as above, I denote the number of balls in bin 1. Show that

$$\mathrm{prob}(I = k) = \binom{n}{k} \left(\frac{1}{m}\right)^k \left(1 - \frac{1}{m}\right)^{n-k},$$

and then attempt to compute E[I] as ∑_k prob(I = k) · k.

A.4 Useful Formulae

We shall first list some useful formulae and then prove some of them.


A simple approximation to the factorial:

$$\left(\frac{n}{e}\right)^n \le n! \le n^n. \qquad (A.8)$$

Stirling's approximation to the factorial:

$$n! = \left(1 + O\!\left(\frac{1}{n}\right)\right) \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^n. \qquad (A.9)$$

An approximation to the binomial coefficients:

$$\binom{n}{k} \le \left(\frac{n \cdot e}{k}\right)^k. \qquad (A.10)$$

The sum of the first n integers:

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}. \qquad (A.11)$$

The harmonic numbers:

$$\ln n \le H_n = \sum_{i=1}^{n} \frac{1}{i} \le \ln n + 1. \qquad (A.12)$$

The geometric series:

$$\sum_{i=0}^{n-1} q^i = \frac{1 - q^n}{1 - q} \ \text{ for } q \ne 1 \qquad\text{and}\qquad \sum_{i \ge 0} q^i = \frac{1}{1 - q} \ \text{ for } 0 \le q < 1. \qquad (A.13)$$

$$\sum_{i \ge 0} 2^{-i} = 2 \qquad\text{and}\qquad \sum_{i \ge 1} i \cdot 2^{-i} = \sum_{i \ge 0} i \cdot 2^{-i} = 2. \qquad (A.14)$$

Jensen's inequality:

$$\sum_{i=1}^{n} f(x_i) \le n \cdot f\!\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) \qquad (A.15)$$

for any concave function f. Similarly, for any convex function f,

$$\sum_{i=1}^{n} f(x_i) \ge n \cdot f\!\left(\frac{\sum_{i=1}^{n} x_i}{n}\right). \qquad (A.16)$$

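The following small Python sketch (a plausibility check only, not a proof; the values n = 20, k = 5, q = 1/2 are arbitrary) spot-checks several of these formulae numerically.

```python
import math

n, k, q = 20, 5, 0.5

# (A.8): (n/e)^n <= n! <= n^n
assert (n / math.e) ** n <= math.factorial(n) <= n ** n

# (A.10): binomial coefficient bound
assert math.comb(n, k) <= (n * math.e / k) ** k

# (A.11): sum of the first n integers
assert sum(range(1, n + 1)) == n * (n + 1) // 2

# (A.12): ln n <= H_n <= ln n + 1
H_n = sum(1 / i for i in range(1, n + 1))
assert math.log(n) <= H_n <= math.log(n) + 1

# (A.13): finite geometric series
assert abs(sum(q ** i for i in range(n)) - (1 - q ** n) / (1 - q)) < 1e-12

# (A.15): Jensen's inequality for the concave function ln
xs = [1.0, 2.0, 7.0, 10.0]
assert sum(math.log(x) for x in xs) <= len(xs) * math.log(sum(xs) / len(xs))

print("all checks passed")
```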

A.4.1 Proofs

For (A.8), we first observe that n! = n(n − 1)···1 ≤ n^n. Also, for all i ≥ 2, ln i ≥ ∫_{i−1}^{i} ln x dx, and therefore

$$\ln n! = \sum_{2 \le i \le n} \ln i \ \ge\ \int_{1}^{n} \ln x \, dx \ =\ \bigl[x(\ln x - 1)\bigr]_{x=1}^{x=n} \ \ge\ n(\ln n - 1).$$

Thus

$$n! \ge e^{n(\ln n - 1)} = \bigl(e^{\ln n - 1}\bigr)^n = \left(\frac{n}{e}\right)^n.$$

Equation (A.10) follows almost immediately from (A.8). We have

$$\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} \le \frac{n^k}{(k/e)^k} = \left(\frac{n \cdot e}{k}\right)^k.$$

Equation (A.11) can be computed by a simple trick:

$$\begin{aligned}
1 + 2 + \ldots + n &= \tfrac{1}{2}\bigl((1 + 2 + \ldots + (n-1) + n) + (n + (n-1) + \ldots + 2 + 1)\bigr) \\
&= \tfrac{1}{2}\bigl((1 + n) + (2 + (n-1)) + \ldots + ((n-1) + 2) + (n + 1)\bigr) \\
&= \frac{n(n+1)}{2}.
\end{aligned}$$

The sums of higher powers are estimated easily; exact summation formulae are also available. For example, ∫_{i−1}^{i} x² dx ≤ i² ≤ ∫_{i}^{i+1} x² dx, and hence

$$\sum_{1 \le i \le n} i^2 \le \int_{1}^{n+1} x^2 \, dx = \left[\frac{x^3}{3}\right]_{x=1}^{x=n+1} = \frac{(n+1)^3 - 1}{3}$$

and

$$\sum_{1 \le i \le n} i^2 \ge \int_{0}^{n} x^2 \, dx = \left[\frac{x^3}{3}\right]_{x=0}^{x=n} = \frac{n^3}{3}.$$
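These integral bounds are also easy to check numerically, for example with the short sketch below (n = 100 is an arbitrary choice).

```python
n = 100
s = sum(i * i for i in range(1, n + 1))           # exact sum of the first n squares
print(n ** 3 / 3 <= s <= ((n + 1) ** 3 - 1) / 3)  # True: both integral bounds hold
```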

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

For (A.12), we also use estimation by integral. We have ∫_{i}^{i+1} (1/x) dx ≤ 1/i ≤ ∫_{i−1}^{i} (1/x) dx, and hence

$$\ln n = \int_{1}^{n} \frac{1}{x} \, dx \ \le\ \sum_{1 \le i \le n} \frac{1}{i} \ \le\ 1 + \int_{1}^{n} \frac{1}{x} \, dx = 1 + \ln n.$$

Equation (A.13) follows from

$$(1 - q) \cdot \sum_{0 \le i \le n-1} q^i = \sum_{0 \le i \le n-1} q^i - \sum_{1 \le i \le n} q^i = 1 - q^n.$$


Letting n pass to infinity yields ∑_{i≥0} q^i = 1/(1 − q) for 0 ≤ q < 1. For q = 1/2, we obtain ∑_{i≥0} 2^{−i} = 2. Also,

$$\begin{aligned}
\sum_{i \ge 1} i \cdot 2^{-i} &= \sum_{i \ge 1} 2^{-i} + \sum_{i \ge 2} 2^{-i} + \sum_{i \ge 3} 2^{-i} + \ldots \\
&= \Bigl(1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots\Bigr) \cdot \sum_{i \ge 1} 2^{-i} \\
&= 2 \cdot 1 = 2.
\end{aligned}$$

For the first equality, observe that the term 2^{−i} occurs in exactly the first i sums of the right-hand side.

Equation (A.15) can be shown by induction on n. For n = 1, there is nothing

to show. So assume n ≥ 2. Let x = ∑_{1≤i≤n} x_i / n and x̄ = ∑_{1≤i≤n−1} x_i / (n − 1). Then x = ((n − 1)x̄ + x_n)/n, and hence

$$\begin{aligned}
\sum_{1 \le i \le n} f(x_i) &= f(x_n) + \sum_{1 \le i \le n-1} f(x_i) \\
&\le f(x_n) + (n - 1) \cdot f(\bar{x}) \\
&= n \cdot \Bigl(\frac{1}{n} \cdot f(x_n) + \frac{n-1}{n} \cdot f(\bar{x})\Bigr) \\
&\le n \cdot f\Bigl(\frac{1}{n} \cdot x_n + \frac{n-1}{n} \cdot \bar{x}\Bigr) \\
&= n \cdot f(x),
\end{aligned}$$

where the first inequality uses the induction hypothesis and the second inequality uses the definition of concavity with x = x_n, y = x̄, and t = 1/n. The extension to convex functions is immediate, since convexity of f implies concavity of −f.
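A quick numerical illustration of (A.15) and (A.16) is given below; the data and the choices f = ln (concave) and f(x) = x² (convex) are example values, not taken from the text.

```python
import math

xs = [0.5, 1.0, 3.0, 4.5, 9.0]
n = len(xs)
mean = sum(xs) / n

# (A.15) with the concave function f = ln
assert sum(math.log(x) for x in xs) <= n * math.log(mean)

# (A.16) with the convex function f(x) = x**2
assert sum(x * x for x in xs) >= n * mean * mean

print("Jensen checks passed")
```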
