Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Beginning Algorithms (2006)

.pdf
Скачиваний:
256
Добавлен:
17.08.2013
Размер:
9.67 Mб
Скачать

Chapter 7

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

U

R

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

 

 

U

 

R

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

 

 

C

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R E

Figure 7-36: The final level of recursion in the mergesort example.

You have finally reached a level at which you have two single-item lists. This is the base case in the recursive algorithm, so you can now merge these two single-item sorted lists together into a single twoitem sorted sublist, as shown in Figure 7-37.

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

U

R

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

 

 

U

 

R

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

E R C

Figure 7-37: The first merge operation is complete.

In Figure 7-37, the two sublists of the sublist (R, E, C) are already sorted. One is the two-element sublist you just merged, and the other is a single-item sublist containing the letter C. These two sublists can thus be merged to produce a three-element sorted sublist, as shown in Figure 7-38.

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

U

R

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

C

E

R

 

 

U

 

R

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-38: The second merge operation is complete.

164

Advanced Sorting

The sublist (R, E, C, U, R) now has one sublist that is sorted and one sublist that is not yet sorted. The next step is to sort the second sublist (U, R). As you would expect, this involves recursively mergesorting this two-element sublist, as shown in Figure 7-39.

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

U

R

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

C

E

R

 

 

U

 

R

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

U R

Figure 7-39: Recursively sorting the (U, R) sublist.

The two single-item sublists can now be merged as shown in Figure 7-40.

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

R

E

C

U

R

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

C

E

R

 

 

R

 

U

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-40: Merging the two single-item sublists.

Both sublists of the (R, E, C, U, R) sublist are sorted independently, so the recursion can unwind by merging these two sublists together, as shown in Figure 7-41.

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

C

E

R

R

U

 

 

S

I

V

 

E

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-41: Unwinding the recursion by merging two sublists.

The algorithm continues with the (S, I, V, E) sublist until it too is sorted, as shown in Figure 7-42. (We have skipped a number of steps between the preceding and following figures because they are very similar to the steps for sorting the first sublist.)

165

Chapter 7

R

E

C

U

R

S

I

V

E

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

C

E

R

R

U

 

 

E

I

S

 

V

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-42: Ready to merge the first half of the original list.

The final step in sorting the first half of the original list is to merge its two sorted sublists, giving the result shown in Figure 7-43.

C

E

E

I

R

R

S

U

V

 

M

E

R

G

E

S

O

R

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-43: The first half of the list is now sorted.

The algorithm continues in exactly the same way until the right half of the original list is also sorted, as shown in Figure 7-44.

C

E

E

I

R

R

S

U

V

 

E

E

G

M

O

R

R

S

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 7-44: The right half of the original list is sorted.

Finally, you can merge the two sorted halves of the original list to form the final sorted list, as shown in Figure 7-45.

C

E

E

I

R

R

S

U

V

 

E

E

G

M

O

R

R

S

T

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

C E E E E G I M O R R R R S S T U V

Figure 7-45: The final result.

Mergesort is an elegant algorithm that is relatively simple to implement, as you will see in the next Try It Out.

Try It Out

Testing the Mergesort Algorithm

The test for the mergesort algorithm is the same as each of the other tests in this chapter, with the only difference being that you instantiate the appropriate implementation class:

package com.wrox.algorithms.sorting;

public class MergesortListSorterTest extends AbstractListSorterTest { protected ListSorter createListSorter(Comparator comparator) {

return new MergesortListSorter(comparator);

}

}

166

Advanced Sorting

In the next Try It Out, you implement the mergesort.

Try It Out

Implementing Mergesort

Once again, the implementation follows the usual pattern: You implement the ListSorter interface and accept a comparator to impose order on the items to be sorted:

package com.wrox.algorithms.sorting;

import com.wrox.algorithms.lists.List; import com.wrox.algorithms.lists.ArrayList;

import com.wrox.algorithms.iteration.Iterator;

public class MergesortListSorter implements ListSorter { private final Comparator _comparator;

public MergesortListSorter(Comparator comparator) {

assert comparator != null : “comparator cannot be null”; _comparator = comparator;

}

...

}

You use the sort() method from the ListSorter interface to call the mergesort() method, passing in the lowest and highest item indexes so that the entire list is sorted. Subsequent recursive calls pass in index ranges that restrict the sorting to smaller sublists:

public List sort(List list) {

assert list != null : “list cannot be null”;

return mergesort(list, 0, list.size() - 1);

}

Create the mergesort() method that follows to deal with situations in which it has been called to sort a sublist containing a single item. In this case, it creates a new result list and adds the single item to it, ending the recursion.

If there is more than one item in the sublist to be sorted, the code simply splits the list, recursively sorts each half, and merges the result:

private List mergesort(List list, int startIndex, int endIndex) { if (startIndex == endIndex) {

List result = new ArrayList(); result.add(list.get(startIndex)); return result;

}

int splitIndex = startIndex + (endIndex - startIndex) / 2;

List left = mergesort(list, startIndex, splitIndex);

List right = mergesort(list, splitIndex + 1, endIndex);

return merge(left, right);

}

167

Chapter 7

The following merge() method is a little more complicated than you might at first expect, mainly because it has to deal with cases in which one of the lists is drained before the other, and cases in which either of the lists supplies the next item:

private List merge(List left, List right) { List result = new ArrayList();

Iterator l = left.iterator();

Iterator r = right.iterator();

l.first();

r.first();

while (!(l.isDone() && r.isDone())) { if (l.isDone()) {

result.add(r.current());

r.next();

}else if (r.isDone()) { result.add(l.current()); l.next();

}else if (_comparator.compare(l.current(), r.current()) <= 0) { result.add(l.current());

l.next();

}else {

result.add(r.current());

r.next();

}

}

return result;

}

How It Works

As with all recursive algorithms, the key part of the implementation is to cater to both the base case and the general case in the recursive method. In the preceding code, the mergesort() method separates these cases very clearly, dealing with the base case first and returning from the method immediately if there is only a single element in the sublist being considered. If there is more than one item, it splits the list in half, sorts each recursively, and merges the results using the merge() method.

The merge() method obtains an iterator on each sublist, as all that is required is a simple sequential traversal of the items in each sublist. The code is complicated slightly by the fact that the algorithm needs to continue when one of the sublists runs out of items, in which case all of the items from the other sublist need to be added to the end of the output list. In cases where both sublists still have items to be considered, the current items from the two sublists are compared, and the smaller of the two is placed on the output list.

That completes our coverage of the three advanced sorting algorithms in this chapter. The next section compares these three algorithms so that you can choose the right one for the problem at hand.

168

Advanced Sorting

Comparing the Advanced Sor ting Algorithms

As in Chapter 6, we compare the three algorithms from this chapter using a practical, rather than a theoretical or mathematical, approach. If you are interested in the math behind the algorithms, see the excellent coverage in Algorithms in Java [Sedgwick, 2002]. The idea behind this approach is to inspire your creativity when evaluating code that you or others have written, and to encourage you to rely on empirical evidence in preference to theoretical benefits as a general rule.

Refer to Chapter 6 for details on the ListSorterCallCountingTest class that was used to compare the algorithms from that chapter. Here you create code that extends this driver program to support the three advanced algorithms. The code for the worst-case tests is shown here:

public void testWorstCaseShellsort() {

new ShellsortListSorter(_comparator).sort(_reverseArrayList); reportCalls(_comparator.getCallCount());

}

public void testWorstCaseQuicksort() {

new QuicksortListSorter(_comparator).sort(_reverseArrayList);

reportCalls(_comparator.getCallCount());

}

public void testWorstCaseMergesort() {

new MergesortListSorter(_comparator).sort(_reverseArrayList); reportCalls(_comparator.getCallCount());

}

The code for the best cases is, of course, very similar:

public void testBestCaseShellsort() {

new ShellsortListSorter(_comparator).sort(_sortedArrayList); reportCalls(_comparator.getCallCount());

}

public void testBestCaseQuicksort() {

new QuicksortListSorter(_comparator).sort(_sortedArrayList);

reportCalls(_comparator.getCallCount());

}

public void testBestCaseMergesort() {

new MergesortListSorter(_comparator).sort(_sortedArrayList); reportCalls(_comparator.getCallCount());

}

169

Chapter 7

Finally, look over the code for the average cases:

public void testAverageCaseShellsort() {

new ShellsortListSorter(_comparator).sort(_randomArrayList); reportCalls(_comparator.getCallCount());

}

public void testAverageCaseQuicksort() {

new QuicksortListSorter(_comparator).sort(_randomArrayList);

reportCalls(_comparator.getCallCount());

}

public void testAverageCaseMergeSort() {

new MergesortListSorter(_comparator).sort(_randomArrayList); reportCalls(_comparator.getCallCount());

}

This evaluation measures only the number of comparisons performed during the algorithm execution. It ignores very important issues such as the number of list item movements, which can have just as much of an impact on the suitability of an algorithm for a particular purpose. You should take this investigation further in your own efforts to put these algorithms to use.

Take a look at the worst-case results for all six sorting algorithms. We have included the results from the basic algorithms from Chapter 6 to save you the trouble of referring back to them:

testWorstCaseBubblesort: 499500 calls testWorstCaseSelectionSort: 499500 calls testWorstCaseInsertionSort: 499500 calls

testWorstCaseShellsort: 9894 calls testWorstCaseQuicksort: 749000 calls testWorstCaseMergesort: 4932 calls

Wow! What’s up with quicksort? It has performed 50 percent more comparisons than even the basic algorithms! Shellsort and mergesort clearly require very few comparison operations, but quicksort is the worst of all (for this one simple measurement). Recall that the worst case is a list that is completely in reverse order, meaning that the smallest item is at the far right (the highest index) of the list when the algorithm begins. Recall also that the quicksort implementation you created always chooses the item at the far right of the list and attempts to divide the list into two parts, with items that are smaller than this item on one side, and those that are larger on the other. Therefore, in this worst case, the partitioning item is always the smallest item, so no partitioning happens. In fact, no swapping occurs except for the partitioning item itself, so an exhaustive comparison of every object with the partitioning value is required on every pass, with very little to show for it.

As you can imagine, this wasteful behavior has inspired smarter strategies for choosing the partitioning item. One way that would help a lot in this particular situation is to choose three partitioning items (such as from the far left, the far right, and the middle of the list) and choose the median value as the partitioning item on each pass. This type of approach stands a better chance of actually achieving some partitioning in the first place in the worst case.

170

Advanced Sorting

Here are the results from the best-case tests:

testBestCaseBubblesort: 498501 calls testBestCaseSelectionSort: 498501 calls

testBestCaseInsertionSort:

998

calls

testBestCaseShellsort:

4816

calls

testBestCaseQuicksort:

498501

calls

testBestCaseMergesort:

5041

calls

The excellent result for insertion sort will be no surprise to you, and again quicksort seems to be the odd one out among the advanced algorithms. Perhaps you are wondering why we bothered to show it to you if it is no improvement over the basic algorithms, but remember that the choice of partitioning item could be vastly improved. Once again, this case stumps quicksort’s attempt to partition the data, as on each pass it finds the largest item in the far right position of the list, thereby wasting a lot of time trying to separate the data using the partitioning item as the pivot point.

As you can see from the results, insertion sort is the best performer in terms of comparison effort for already sorted data. You saw how shellsort eventually reduces to an insertion sort, having first put the data into a nearly sorted state. It is also common to use insertion sort as a final pass in quicksort implementations. For example, when the sublists to be sorted get down to a certain threshold (say, five or ten items), quicksort can stop recursing and simply use an in-place insertion sort to complete the job.

Now look at the average-case results, which tend to be a more realistic reflection of the type of results you might achieve in a production system:

testAverageCaseBubblesort: 498501 calls testAverageCaseSelectionSort: 498501 calls

testAverageCaseInsertionSort: 251096

calls

testAverageCaseShellsort:

13717

calls

testAverageCaseQuicksort:

19727

calls

testAverageCaseMergeSort:

8668

calls

Finally, there is a clear distinction between the three basic algorithms and the three advanced algorithms! The three algorithms from this chapter are all taking about 1/20 the number of comparisons to sort the average case data as the basic algorithms. This is a huge reduction in the amount of effort required to sort large data sets. The good news is that as the data sets get larger, the gap between the algorithms gets wider.

Before you decide to start using mergesort for every problem based on its excellent performance in all the cases shown, remember that mergesort creates a copy of every list (and sublist) that it sorts, and so requires significantly more memory (and perhaps time) to do its job than the other algorithms. Be very careful not to draw conclusions that are not justified by the evidence you have, in this and in all your programming endeavors.

171

Chapter 7

Summar y

This chapter covered three advanced sorting algorithms. While these algorithms are more complex and more subtle than the algorithms covered in Chapter 6, they are far more likely to be of use in solving large, practical problems you might come across in your programming career. Each of the advanced sorting algorithms — shellsort, quicksort, and mergesort — were thoroughly explained before you implemented and tested them.

You also learned about the lack of stability inherent in shellsort and quicksort by implementing a compound comparator to compensate for this shortcoming. Finally, you looked at a simple comparison of the six algorithms covered in this and the previous chapter, which should enable you to understand the strengths and weaknesses of the various options.

In the next chapter, you learn about a sophisticated data structure that builds on what you have learned about queues and incorporates some of the techniques from the sorting algorithms.

Exercises

1.Implement mergesort iteratively rather than recursively.

2.Implement quicksort iteratively rather than recursively.

3.Count the number of list manipulations (for example, set(), add(), insert()) during quicksort and shellsort.

4.Implement an in-place version of insertion sort.

5.Create a version of quicksort that uses insertion sort for sublists containing fewer than five items.

172

8

Priority Queues

After studying a broad selection of sorting algorithms in the previous two chapters, you return to investigating data structures in this chapter. A priority queue is a special type of queue (see Chapter 4) that provides access to the largest element contained within it. This has many interesting applications, some of which you will see later in this book. We waited until we covered the sorting algorithms before discussing priority queues because the more complex priority queue implementations require you to understand the issues regarding sorting.

As an example of when you might use a priority queue, imagine a role-playing game in which you are making your way through hostile territory, with threatening characters all around you. Some of these characters are more lethal than others. Being able to quickly identify the largest threat to your health would be a good survival strategy! Notice that it is not necessary to maintain the list of threatening characters in full sorted order. Given that you can have only one fight at a time, all you need to know at any one time is which threat is the largest. By the time you’ve dealt with the biggest one, others may have arrived on the scene, so the sort would have been of little use.

This chapter covers the following topics:

Understanding priority queues

Creating an unordered list priority queue

Creating a sorted list priority queue

Understanding a heap and how it works

Creating a heap-ordered list implementation of a priority queue

Comparing the different priority queue implementations