CHAPTER 9

Algorithm Analysis Techniques

What is a good algorithm? There is no easy answer to this question. Many of the criteria for a good algorithm involve subjective issues such as simplicity, clarity, and appropriateness for the expected data. A more objective, but not necessarily more important, issue is run-time efficiency. Section 1.5 covered the basic techniques for establishing the running time of simple programs. However, in more complex cases, such as when programs are recursive, some new techniques are needed. This short chapter presents some general techniques for solving recurrence equations that arise in the analysis of the running times of recursive algorithms.

9.1 Efficiency of Algorithms

One way to determine the run-time efficiency of an algorithm is to program it and measure the execution time of the particular implementation on a specific computer for a selected set of inputs. Although popular and useful, this approach has some inherent problems. The running times depend not only on the underlying algorithm, but also on the instruction set of the computer, the quality of the compiler, and the skill of the programmer. The implementation may also be tuned to work well on the particular set of test inputs chosen. These dependencies may become strikingly evident with a different computer, a different compiler, a different programmer, or a different set of test inputs. To overcome these objections, computer scientists have adopted asymptotic time complexity as a fundamental measure of the performance of an algorithm. The term efficiency will refer to this measure, and particularly to the worst-case (as opposed to average) time complexity.
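To make these objections concrete, here is a minimal timing sketch in Python (ours, not the book's; the helper name measure, the trial count, and the choice of uniformly random inputs are illustrative assumptions). The numbers it prints differ from machine to machine, interpreter to interpreter, and run to run, which is exactly the difficulty described above:

import random
import time

def measure(sort_fn, n, trials=5):
    # Best-of-`trials` wall-clock time of sort_fn on a random list of length n.
    best = float("inf")
    for _ in range(trials):
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        sort_fn(data)
        best = min(best, time.perf_counter() - start)
    return best

for n in (1000, 2000, 4000, 8000):
    print(n, measure(sorted, n))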

The reader should recall from Chapter 1 the definitions of O(f(n)) and Ω(f(n)). The efficiency, i.e., worst-case complexity, of an algorithm is said to be O(f(n)), or just f(n) for short, if the function of n that gives the maximum, over all inputs of length n, of the number of steps taken by the algorithm on that input, is O(f(n)). Put another way, there is some constant c such that for sufficiently large n, cf(n) is an upper bound on the number of steps taken by the algorithm on any input of length n.
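For example (an illustration of ours, not from the text), an algorithm that takes at most 3n² + 5n steps on inputs of length n is O(n²): the constant c = 4 suffices for all n ≥ 5, since 5n ≤ n² there. A one-line Python spot check of the inequality over a finite range:

# Finite spot check (illustrative): 3n^2 + 5n <= 4n^2 holds for 5 <= n < 10000.
assert all(3*n*n + 5*n <= 4*n*n for n in range(5, 10000))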

There is the implication in the assertion that "the efficiency of a given algorithm is f(n)" that the efficiency is also Ω(f(n)), so that f(n) is the slowest growing function of n that bounds the worst-case running time from above. However, this latter requirement is not part of the definition of O(f(n)), and sometimes it is not possible to be sure that we have the slowest growing upper bound.


Our definition of efficiency ignores constant factors in running time, and there are several pragmatic reasons for doing so. First, since most algorithms are written in a high level language, we must describe them in terms of "steps," which each take a constant amount of time when translated into the machine language of any computer. However, exactly how much time a step requires depends not only on the step itself, but on the translation process and the instruction repertoire of the machine. Thus to attempt to be more precise than to say that the running time of an algorithm is "on the order of f(n)", i.e., O(f(n)), would bog us down in the details of specific machines and would apply only to those machines.

A second important reason why we deal with asymptotic complexity and ignore constant factors is that the asymptotic complexity, far more than constant factors, determines for what size inputs the algorithm may be used to provide solutions on a computer. Chapter 1 discussed this viewpoint in detail. The reader should be alert, however, to the possibility that for some very important problems, like sorting, we may find it worthwhile to analyze algorithms in such detail that statements like "algorithm A should run twice as fast as algorithm B on a typical computer" are possible.

A second situation in which it may pay to deviate from our worst-case notion of efficiency occurs when we know the expected distribution of inputs to an algorithm. In such situations, average-case analysis can be much more meaningful than worst-case analysis. For example, in the previous chapter we analyzed the average running time of quicksort under the assumption that all permutations of the correct sorted order are equally likely to occur as inputs.

9.2 Analysis of Recursive Programs

In Chapter 1 we showed how to analyze the running time of a program that does not call itself recursively. The analysis for a recursive program is rather different, normally involving the solution of a difference equation. The techniques for solving difference equations are sometimes subtle, and bear considerable resemblance to the methods for solving differential equations, some of whose terminology we borrow.

Consider the sorting program sketched in Fig. 9.1. There the procedure mergesort takes a list of length n as input, and returns a sorted list as its output. The procedure merge(L1, L2) takes as input two sorted lists L1 and L2 and scans them each, element by element, from the front. At each step, the smaller of the two front elements is deleted from its list and emitted as output. The result is a single sorted list containing the elements of L1 and L2. The details of merge are not important here, although we discuss this sorting algorithm in detail in Chapter 11. What is important is that the time taken by merge on lists of length n/2 is O(n).

function mergesort ( L: LIST; n: integer ): LIST;
    { L is a list of length n. A sorted version of L
      is returned. We assume n is a power of 2. }
    var
        L1, L2: LIST;
    begin
        if n = 1 then
            return (L)
        else begin
            break L into two halves, L1 and L2, each of length n/2;
            return (merge (mergesort(L1, n/2), mergesort(L2, n/2)))
        end
    end; { mergesort }

Fig. 9.1. Recursive procedure mergesort.
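For concreteness, here is a direct transcription of Fig. 9.1 into runnable Python (ours, not the book's; the merge helper below is one standard realization of the merge procedure described above, and like the figure it assumes nothing beyond lists of comparable elements):

def mergesort(L):
    n = len(L)
    if n == 1:
        return L
    L1, L2 = L[:n // 2], L[n // 2:]   # break L into two halves
    return merge(mergesort(L1), mergesort(L2))

def merge(L1, L2):
    # Repeatedly emit the smaller front element; O(n) on lists totalling n elements.
    out = []
    i = j = 0
    while i < len(L1) and j < len(L2):
        if L1[i] <= L2[j]:
            out.append(L1[i]); i += 1
        else:
            out.append(L2[j]); j += 1
    return out + L1[i:] + L2[j:]

print(mergesort([5, 2, 7, 1, 9, 3, 8, 4]))   # prints [1, 2, 3, 4, 5, 7, 8, 9]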

Let T(n) be the worst-case running time of the procedure mergesort of Fig. 9.1. We can write a recurrence (or difference) equation that upper bounds T(n), as follows:

    T(n) ≤ c1                     for n = 1
    T(n) ≤ 2T(n/2) + c2n          for n > 1        (9.1)

The term c1 in (9.1) represents the constant number of steps taken when L has length 1. In the case that n > 1, the time taken by mergesort can be divided into two parts. The recursive calls to mergesort on lists of length n/2 each take time T(n/2), hence the term 2T(n/2). The second part consists of the test to discover that n ≠ 1, the breaking of list L into two equal parts, and the procedure merge. These three operations take time that is either a constant, in the case of the test, or proportional to n for the split and the merge. Thus the constant c2 can be chosen so that the term c2n is an upper bound on the time taken by mergesort to do everything except the recursive calls. We now have equation (9.1).

Observe that (9.1) applies only when n is even, and hence it will provide an upper bound in closed form (that is, as a formula for T(n) not involving any T(m) for m < n) only if n is a power of 2. However, even if we only know T(n) when n is a power of 2, we have a good idea of T(n) for all n. In particular, for essentially all algorithms, we may suppose that T(n) lies between T(2^i) and T(2^(i+1)) if n lies between 2^i and 2^(i+1). Moreover, if we devote a little more effort to finding the solution, we could replace the term 2T(n/2) in (9.1) by T((n+1)/2) + T((n-1)/2) for odd n > 1. Then we could solve the revised difference equation to get a closed form solution for all n.
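To see this sandwiching concretely, the following small Python sketch (ours; the constants c1 = c2 = 1 are illustrative, since the text leaves them unspecified) computes T(n) directly from the recurrence, using the odd-n variant just described, and shows that the values for 8 < n < 16 fall between T(8) and T(16):

from functools import lru_cache

C1, C2 = 1, 1   # illustrative constants, not fixed by the text

@lru_cache(maxsize=None)
def T(n):
    # Recurrence (9.1) taken with equality, extended to odd n as suggested above.
    if n == 1:
        return C1
    if n % 2 == 0:
        return 2 * T(n // 2) + C2 * n
    return T((n + 1) // 2) + T((n - 1) // 2) + C2 * n

print(T(8), [T(n) for n in range(9, 16)], T(16))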

9.3 Solving Recurrence Equations

There are three different approaches we might take to solving a recurrence equation.

1. Guess a solution f(n) and use the recurrence to show that T(n) ≤ f(n). Sometimes we guess only the form of f(n), leaving some parameters unspecified (e.g., guess f(n) = an² for some a) and deduce suitable values for the parameters as we try to prove T(n) ≤ f(n) for all n.

2. Use the recurrence itself to substitute for any T(m), m < n, on the right until all terms T(m) for m > 1 have been replaced by a formula involving only T(1). Since T(1) is always a constant, we have a formula for T(n) in terms of n and constants. This formula is what we have referred to as a "closed form" for T(n).

3. Use the general solutions to certain recurrence equations of common types found in this section or elsewhere (see the bibliographic notes).

This section examines the first two methods.
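As a concrete illustration of method (2), the following Python sketch (ours; the constants c1 = 3 and c2 = 2 are arbitrary) takes recurrence (9.1) with equality and verifies that substituting all the way down to T(1), which is what the recursive evaluation performs, yields the closed form c1n + c2n log n for n a power of 2:

import math

C1, C2 = 3.0, 2.0   # arbitrary illustrative constants

def T(n):
    # Recurrence (9.1) taken with equality; evaluating it recursively is
    # exactly the repeated substitution of method (2).
    return C1 if n == 1 else 2 * T(n // 2) + C2 * n

def closed_form(n):
    # The result of full substitution: c1*n + c2*n*log2(n).
    return C1 * n + C2 * n * math.log2(n)

for k in range(1, 11):
    n = 2 ** k
    assert T(n) == closed_form(n)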

Guessing a Solution

Example 9.1. Consider method (1) applied to Equation (9.1). Suppose we guess that for some a, T(n) = an log n. Substituting n = 1, we see that this guess will not work, because an log n has value 0 at n = 1, independent of the value of a. Thus, we might next try T(n) = an log n + b. Now n = 1 requires that b ≥ c1.

For the induction, we assume that

 

    T(k) ≤ ak log k + b        (9.2)


for all k < n and try to establish that

    T(n) ≤ an log n + b

To begin our proof, assume n ≥ 2. From (9.1), we have

    T(n) ≤ 2T(n/2) + c2n

From (9.2), with k = n/2, we obtain

    T(n) ≤ 2[a(n/2)log(n/2) + b] + c2n
         = an log n - an + 2b + c2n        (9.3)
         ≤ an log n + b

provided a ≥ c2 + b.

We thus see that T(n) ≤ an log n + b holds provided two constraints are satisfied: b ≥ c1 and a ≥ c2 + b. Fortunately, there are values we can choose for a and b that satisfy these two constraints. For example, choose b = c1 and a = c1 + c2. Then, by induction on n, we conclude that for all n ≥ 1

    T(n) ≤ (c1 + c2)n log n + c1        (9.4)

In other words, T(n) is O(n log n).
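As a numerical sanity check (ours; the constants c1 = 3 and c2 = 2 are arbitrary), the following Python fragment evaluates the recurrence with equality, the largest value consistent with (9.1), and confirms that bound (9.4) holds on powers of 2:

import math

C1, C2 = 3.0, 2.0   # arbitrary illustrative constants

def T(n):
    # Recurrence (9.1) taken with equality.
    return C1 if n == 1 else 2 * T(n // 2) + C2 * n

for k in range(0, 12):
    n = 2 ** k
    # Bound (9.4): T(n) <= (c1 + c2) * n * log2(n) + c1.
    assert T(n) <= (C1 + C2) * n * math.log2(n) + C1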

Two observations about Example 9.1 are in order. If we assume that T(n) is O(f(n)), and our attempt to prove T(n) ≤ cf(n) by induction fails, it does not follow that T(n) is not O(f(n)). In fact, an inductive hypothesis of the form T(n) ≤ cf(n) - 1 may succeed!
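For instance (an illustration of ours, not from the text), let T(1) = 1 and T(n) = 2T(n/2) + 1. An attempt to prove T(n) ≤ cn by induction yields T(n) ≤ 2(cn/2) + 1 = cn + 1, which exceeds cn no matter how large we choose c. The slightly stronger hypothesis T(n) ≤ cn - 1 succeeds, however: T(n) ≤ 2(cn/2 - 1) + 1 = cn - 1, and the basis T(1) = 1 ≤ c - 1 holds for any c ≥ 2. Thus T(n) is O(n), even though the first inductive attempt failed.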
