
Beginning Algorithms (2006)


Chapter 18

How It Works

As with many test cases, the preceding code works by considering a number of unusual cases, such as an empty set of points, a set of points with only one item in it, a set of points with only two items in it, and items that have exactly the same distance between them. Sometimes the number of test cases can be higher than you might expect, but that is an indication of the complexity of the problem you’re trying to solve. Each individual test method is quite simple on its own.

In the next Try It Out, you implement the algorithm, and get all these tests to pass.

Try It Out

Creating the ClosestPairFinder Interface

The interface that defines your algorithm is very simple indeed. It has a single method that accepts a Set of Point objects, and returns another Set containing the two Point objects that make up the closest pair in the original set of points. It can also return null if it is not possible to determine the closest pair (for example, if there is only one Point provided).

    package com.wrox.algorithms.geometry;

    import com.wrox.algorithms.sets.Set;

    public interface ClosestPairFinder {
        public Set findClosestPair(Set points);
    }

Try It Out

Implementing the Plane Sweep Algorithm

Create the declaration of the class, including a binary inserter that will enable you to turn the Set of points you receive into a sorted List:

    package com.wrox.algorithms.geometry;

    import com.wrox.algorithms.bsearch.IterativeBinaryListSearcher;
    import com.wrox.algorithms.bsearch.ListInserter;
    import com.wrox.algorithms.iteration.Iterator;
    import com.wrox.algorithms.lists.ArrayList;
    import com.wrox.algorithms.lists.List;
    import com.wrox.algorithms.sets.ListSet;
    import com.wrox.algorithms.sets.Set;

    public final class PlaneSweepClosestPairFinder implements ClosestPairFinder {
        public static final PlaneSweepClosestPairFinder INSTANCE = new PlaneSweepClosestPairFinder();

        private static final ListInserter INSERTER = new ListInserter(
                new IterativeBinaryListSearcher(XYPointComparator.INSTANCE));

        private PlaneSweepClosestPairFinder() {
        }

        ...
    }

Computational Geometry

The algorithm to find the closest pair is shown in the following code:

    public Set findClosestPair(Set points) {
        assert points != null : "points can't be null";

        if (points.size() < 2) {
            return null;
        }

        List sortedPoints = sortPoints(points);

        Point p = (Point) sortedPoints.get(0);
        Point q = (Point) sortedPoints.get(1);

        return findClosestPair(p, q, sortedPoints);
    }

Create the following method (explained in more detail below):

    private Set findClosestPair(Point p, Point q, List sortedPoints) {
        Set result = createPointPair(p, q);
        double distance = p.distance(q);
        int dragPoint = 0;

        for (int i = 2; i < sortedPoints.size(); ++i) {
            Point r = (Point) sortedPoints.get(i);
            double sweepX = r.getX();
            double dragX = sweepX - distance;

            while (((Point) sortedPoints.get(dragPoint)).getX() < dragX) {
                ++dragPoint;
            }

            for (int j = dragPoint; j < i; ++j) {
                Point test = (Point) sortedPoints.get(j);
                double checkDistance = r.distance(test);
                if (checkDistance < distance) {
                    distance = checkDistance;
                    result = createPointPair(r, test);
                }
            }
        }

        return result;
    }

The preceding code relies on the following method to arrange the points according to their x and y coordinates, using the comparator you defined earlier in this chapter:

    private static List sortPoints(Set points) {
        assert points != null : "points can't be null";

        List list = new ArrayList(points.size());

        Iterator i = points.iterator();
        for (i.first(); !i.isDone(); i.next()) {
            INSERTER.insert(list, i.current());
        }

        return list;
    }
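If you want to try the insertion-sort idea without the book's collection classes, the following is a minimal sketch using java.util instead. It assumes points are represented as {x, y} arrays, and the names BinaryInsertSketch, XY_ORDER, and insert are hypothetical; it illustrates the technique rather than reproducing ListInserter itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Sketch only: java.util stands in for the book's ListInserter,
// IterativeBinaryListSearcher, and XYPointComparator.
final class BinaryInsertSketch {
    // Orders {x, y} arrays by x, then by y (the array representation is an assumption).
    static final Comparator<double[]> XY_ORDER =
            Comparator.comparingDouble((double[] p) -> p[0])
                    .thenComparingDouble(p -> p[1]);

    // Inserts point at the index that keeps the list sorted.
    static void insert(List<double[]> sorted, double[] point) {
        int index = Collections.binarySearch(sorted, point, XY_ORDER);
        if (index < 0) {
            index = -(index + 1); // a negative result encodes the insertion point
        }
        sorted.add(index, point);
    }

    // Builds a sorted list by inserting the points one at a time.
    static List<double[]> sortPoints(Iterable<double[]> points) {
        List<double[]> sorted = new ArrayList<double[]>();
        for (double[] point : points) {
            insert(sorted, point);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {3, 1}, new double[] {1, 2}, new double[] {1, 1});
        for (double[] point : sortPoints(points)) {
            System.out.println(Arrays.toString(point));
        }
    }
}
```

The binary search finds each insertion index in O(log N) comparisons, although adding to the middle of an array-backed list still shifts the elements that follow.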

The last utility method is a simple one to create a Set to represent the closest pair, given two Point objects:

    private Set createPointPair(Point p, Point q) {
        Set result = new ListSet();
        result.add(p);
        result.add(q);
        return result;
    }

How It Works

This plane sweep algorithm implements the ClosestPairFinder interface defined in the preceding section. It is also implemented as a singleton with a private constructor, as it has no state of its own.

An early exit is taken if there are not enough points to comprise even a single pair. You sort the points according to their coordinates. After you have a sorted list, you can extract the first two Point objects and assume they are the closest pair to begin with. You then delegate to another method that sweeps through the remaining points to determine whether any pairs are closer than this initial pair.

The following method is the heart of the plane sweep algorithm. It’s a little more complex than the other methods in this class, so you might want to examine it carefully. Refer to Figure 18-16 and Figure 18-17, which illustrate the algorithm earlier in this chapter, if you need to confirm your understanding of how it works in principle:

    private Set findClosestPair(Point p, Point q, List sortedPoints) {
        Set result = createPointPair(p, q);
        double distance = p.distance(q);
        int dragPoint = 0;

        for (int i = 2; i < sortedPoints.size(); ++i) {
            Point r = (Point) sortedPoints.get(i);
            double sweepX = r.getX();
            double dragX = sweepX - distance;

            while (((Point) sortedPoints.get(dragPoint)).getX() < dragX) {
                ++dragPoint;
            }

            for (int j = dragPoint; j < i; ++j) {
                Point test = (Point) sortedPoints.get(j);
                double checkDistance = r.distance(test);
                if (checkDistance < distance) {
                    distance = checkDistance;
                    result = createPointPair(r, test);
                }
            }
        }

        return result;
    }

Note the following key points when looking at the code:

result contains the Point objects that make up the closest pair.

distance represents the currently identified distance between the closest pair. Of course, this is also the width of the drag net.

dragPoint is the index of the leftmost Point within the drag net.

sweepX is the x coordinate of the Point under the sweep line.

dragX is the x coordinate representing the left edge of the drag net.

This algorithm ignores the first two points in the sorted list, starting the sweep line at the third point, as the first two have already been assumed to make the closest pair for now. It then ignores points that have slipped behind the drag net by advancing the dragPoint variable. Finally, it checks the distance from the point under the sweep line to each of the points in the drag net, updating the resulting closest pair and the distance between them if a closer pair is found than that currently identified.

That wraps up your implementation of the plane sweep algorithm. If you now run all the tests defined for this algorithm, you’ll see that they all pass.
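To experiment with the sweep outside the book's class library, here is a self-contained sketch of the same loop written against java.util. The {x, y} array representation and the PlaneSweepSketch name are assumptions made for illustration; the book's implementation uses its own Point, Set, and List types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch only: mirrors the sweep described above using plain {x, y} arrays.
final class PlaneSweepSketch {
    // Returns the closest pair as {p, q}, or null when fewer than two points are given.
    static double[][] findClosestPair(List<double[]> points) {
        if (points.size() < 2) {
            return null;
        }
        List<double[]> sorted = new ArrayList<double[]>(points);
        sorted.sort(Comparator.comparingDouble((double[] p) -> p[0])
                .thenComparingDouble(p -> p[1]));

        // Assume the first two points are the closest pair to begin with.
        double[][] result = {sorted.get(0), sorted.get(1)};
        double distance = distance(result[0], result[1]);
        int dragPoint = 0;

        for (int i = 2; i < sorted.size(); ++i) {
            double[] r = sorted.get(i);     // point under the sweep line
            double dragX = r[0] - distance; // left edge of the drag net

            while (sorted.get(dragPoint)[0] < dragX) {
                ++dragPoint; // this point has slipped behind the net
            }
            for (int j = dragPoint; j < i; ++j) {
                double check = distance(r, sorted.get(j));
                if (check < distance) {
                    distance = check;
                    result = new double[][] {r, sorted.get(j)};
                }
            }
        }
        return result;
    }

    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {0, 0}, new double[] {5, 5},
                new double[] {5.5, 5}, new double[] {9, 1});
        double[][] pair = findClosestPair(points);
        System.out.println(Arrays.toString(pair[0]) + " " + Arrays.toString(pair[1]));
    }
}
```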

Summary

This chapter covered some of the theory behind two-dimensional geometry, including the coordinate system, points, lines, and triangles.

We covered two geometric problems in detail: finding the intersection point of two straight lines, and finding the closest pair among a set of points.

You implemented solutions to these problems with fully tested Java code.

We barely had time to scratch the surface of the subject of computational geometry. It is a fascinating field that covers areas including trilateration (the mechanism behind the Global Positioning System), 3D graphics, and computer-aided design. We hope that we have stimulated an interest you will pursue in the future.

Exercises

1. Implement a brute-force solution to the closest pair problem.

2. Optimize the plane sweep algorithm so that points too distant in the vertical direction are ignored.


19

Pragmatic Optimization

You might be wondering what the chapter about optimization is doing way at the back of the book. Its placement here reflects our philosophy that optimization does not belong at the forefront of your mind when building your applications. This chapter describes the role of optimization, including when and how to apply it, and demonstrates some very practical techniques to get great performance improvements in the software you build. You’ll be encouraged to keep your design clean and clear as the first order of business, and use hard facts to drive the optimization process. Armed with this knowledge, you will be able to measure your progress and identify when the optimization effort has stopped adding value.

In this chapter, you learn the following:

How optimization fits into the development process

What profiling is and how it works

How to profile an application using the standard Java virtual machine mechanism

How to profile an application using the free Java Memory Profiler tool

How to identify performance issues related to both the CPU and memory usage

How to achieve huge performance increases with small and strategic alterations to code

Where Optimization Fits In

Optimization is an important part of software development, but not as important as you might think. We recommend that you take time to accumulate an awareness of the types of issues that affect performance in your particular programming environment, and keep them in mind as you code. For example, using a StringBuffer to build a long character sequence is preferable to multiple concatenations of String objects in Java, so you should do that as a matter of course.
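As a small illustration of the StringBuffer advice, the sketch below builds the same text both ways. The class and method names are hypothetical. On Java 5 and later, StringBuilder offers the same append() style without synchronization; StringBuffer is used here only to match the text.

```java
// Sketch only: two ways to build one long string from many pieces.
final class StringBuildingSketch {
    // Repeated concatenation: each += copies the text built so far into a new String.
    static String joinWithConcatenation(String[] words) {
        String result = "";
        for (String word : words) {
            result += word;
        }
        return result;
    }

    // A StringBuffer appends into one growable buffer instead.
    static String joinWithBuffer(String[] words) {
        StringBuffer result = new StringBuffer();
        for (String word : words) {
            result.append(word);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String[] words = {"make ", "it ", "work"};
        System.out.println(joinWithConcatenation(words));
        System.out.println(joinWithBuffer(words));
    }
}
```

Both methods return the same text; they differ only in how much intermediate copying they do as the number of pieces grows.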

However, you will stray into dangerous territory if you let this awareness cause you to change your design. This is called premature optimization, and we strongly encourage you to resist the temptation to build an implementation that is faster but harder to understand. In our experience, we are always surprised at the performance bottlenecks in our applications. It is only by measuring the behavior of your system and locating the real issue that you can make changes that will have the greatest benefit. It is simply not necessary to have optimized code throughout your system. You only need to worry about the code that is on the critical path, and you might be surprised to find out which code that is, even when you’ve written it yourself! This is where profiling comes in, which is the topic of the next section.

The key thing to remember is that a clean and simple design is much, much easier to optimize than one that the original developers thoughtfully optimized while writing it. It is also very important to choose the right algorithm initially. You should always be aware of the performance profile of your chosen implementation before trying to optimize it — that is, be conscious of whether your algorithm is O(N), O(log N), and so on. Choosing the wrong class of algorithm for the problem at hand will put a hard limit on the benefits that optimization is able to provide. That’s another reason why this chapter is at the back of the book!
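To make the point about algorithm classes concrete, the following sketch counts how many elements a linear scan and a binary search examine when looking for the same value in a sorted array. The class and method names are hypothetical, and counting probes is a simplification of real running time.

```java
// Sketch only: compares the work done by an O(N) and an O(log N) search.
final class SearchStepsSketch {
    // Counts the elements examined by a linear scan for target.
    static int linearSteps(int[] sorted, int target) {
        int steps = 0;
        for (int value : sorted) {
            ++steps;
            if (value == target) {
                break;
            }
        }
        return steps;
    }

    // Counts the probes a binary search makes for target.
    static int binarySteps(int[] sorted, int target) {
        int steps = 0;
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            ++steps;
            int mid = (low + high) >>> 1;
            if (sorted[mid] < target) {
                low = mid + 1;
            } else if (sorted[mid] > target) {
                high = mid - 1;
            } else {
                break;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1_000_000];
        for (int i = 0; i < sorted.length; ++i) {
            sorted[i] = i;
        }
        int target = sorted.length - 1; // worst case for the linear scan
        System.out.println("linear: " + linearSteps(sorted, target));
        System.out.println("binary: " + binarySteps(sorted, target));
    }
}
```

No amount of tuning inside the linear loop closes a gap of that kind, which is why the choice of algorithm comes before optimization.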

Experience shows that the first cut of a program is very unlikely to be the best performing. Unfortunately, experience also shows that it is unlikely that you can guess the exact reason why performance is suffering in any nontrivial application. When first writing a given program, you should make it work before making it fast. In fact, it is a good idea to separate these two activities very clearly in your development projects. By now, you know that we suggest using test-driven development to ensure the correct functioning of your programs. This is the “make it work” part. We then recommend the approach outlined in this chapter to “make it fast.” The tests will keep your code on track while you alter its implementation to get better performance out of it. Just as unit tests take the guesswork out of the functional success of your program, the techniques you learn in this chapter take the guesswork out of the performance success of your program.

The good news is that most programs have a very small number of significant bottlenecks that can be identified and addressed. The areas of your code you need to change are typically not large in number. The techniques described here enable you to quickly find them, fix them, and prove that you have achieved the benefits you want. We recommend that you avoid guessing how to make your code faster and relying on subjective opinions about the code’s performance. In the same way that you should avoid refactoring without first creating automated unit tests, you should avoid optimizing without automated performance measurements.
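One very simple form of automated performance measurement is a repeatable timing harness. The sketch below is an assumption about what such a harness might look like, not a technique taken from this chapter; TimingSketch and bestNanos are hypothetical names. Wall-clock timings are noisy and are affected by JIT warm-up, so treat the numbers as a rough guide and use a profiler to find out where the time actually goes.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch only: a crude, repeatable timing harness.
final class TimingSketch {
    // Runs the task the given number of times and returns the best time in nanoseconds.
    static long bestNanos(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; ++i) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(200_000).toArray();
        // Sort a fresh copy each run so every run does the same work.
        long nanos = bestNanos(() -> Arrays.sort(data.clone()), 5);
        System.out.println("best of 5 runs: " + nanos / 1_000_000.0 + " ms");
    }
}
```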

Every program that takes a nontrivial amount of time to run has a bottleneck constraining its performance. You need to remember that this will still be true even when the program is running acceptably fast. You should begin the optimization process with a goal to meet some objective performance criteria, not to remove all performance bottlenecks. Be careful to avoid setting yourself impossible performance targets, of course. Nothing you do will enable a 2MB photo to squeeze down a 56K modem line in 3 seconds! Think of optimization as part of your performance toolkit, but not the only skill you’ll need to make your applications really fast. Good design skills and a knowledge of the trade-offs you make when choosing particular data structures and algorithms are much more important skills to have at your disposal.

Understanding Profiling

Profiling is the process of measuring the behavior of a program. Java lends itself to profiling because support for it is built right into the virtual machine, as you’ll see later in this chapter. Profiling other languages varies in its difficulty but is still a very popular technique. Three major areas are measured when profiling a program: CPU usage, memory usage, and concurrency, or threading behavior.


Concurrency issues are beyond the scope of this chapter, so be sure to check Appendix A for further reading if you need more information on the topic.

A profiler measures CPU usage by determining how long the program spends in each method while the program is running. It is important to be aware that this information is typically gathered by sampling the execution stack of each thread in the virtual machine at regular intervals to determine which methods are active at any given moment. Better results are obtained from longer-running programs. If your program is very fast, then the results you get might not be accurate. Then again, if your program is already that fast, you probably don’t need to optimize it too much!
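The sampling idea can be sketched in a few lines with Thread.getStackTrace(). This is only an illustration of the principle under simplifying assumptions (one worker thread, only the top stack frame recorded); it is not how the virtual machine's built-in profiler is implemented, and all names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a toy sampling profiler for a single thread.
final class SamplingSketch {
    // Samples the target's stack at regular intervals and counts the method on top.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<String, Integer>();
        for (int i = 0; i < samples && target.isAlive(); ++i) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String method = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(method, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    // A deliberately slow method so the sampler has something to observe.
    static double busyWork(long millis) {
        long end = System.currentTimeMillis() + millis;
        double total = 0;
        while (System.currentTimeMillis() < end) {
            total += Math.sqrt(total + 1);
        }
        return total;
    }

    // Runs busyWork on a worker thread and samples it until it finishes.
    static Map<String, Integer> profileBusyWork(long workMillis, int samples, long intervalMillis) {
        Thread worker = new Thread(() -> busyWork(workMillis));
        worker.start();
        Map<String, Integer> hits = sample(worker, samples, intervalMillis);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(profileBusyWork(500, 50, 5));
    }
}
```

Because the counts come from samples, a short run yields only a handful of observations, which is why longer-running programs give more trustworthy results.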

A profiler will report statistics such as the following:

How many times a method was called

How much CPU time was consumed in a given method

How much CPU time was consumed by a method and all methods called by it

What proportion of the running time was spent in a particular method

These statistics enable you to identify which parts of your code will benefit from some optimization. Similarly for memory usage, a profiler will gather statistics on overall memory usage, object creation, and garbage collection, and provide you with information such as the following:

How many objects of each class were instantiated

How many instances of each class were garbage collected

How much memory was allocated to the virtual machine by the operating system at any given time (the heap size)

How much of the heap was free and how much was in use at a given time

This kind of information will give you a deeper insight into the runtime behavior of your code, and is often the source of many informative surprises, as you’ll see later in this chapter when we optimize an example program. Again, the profiler gives you a lot of evidence on which to base your optimization efforts.
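You can get a rough view of the last two memory figures from inside a program by using the standard Runtime class. The sketch below is a minimal example; HeapStatsSketch and usedHeapBytes are hypothetical names, and the values are a snapshot that changes as the garbage collector runs.

```java
// Sketch only: reads heap size and free heap from the running virtual machine.
final class HeapStatsSketch {
    // Bytes of heap currently in use: the heap size minus the free portion.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        System.out.println("heap size: " + runtime.totalMemory() + " bytes");
        System.out.println("free:      " + runtime.freeMemory() + " bytes");
        System.out.println("in use:    " + usedHeapBytes() + " bytes");
    }
}
```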

The following section shows you how to profile a Java program using two different techniques. The first uses the profiling features built into the Java virtual machine itself. These features are simple in nature, but readily available for you to try. The second technique involves an open-source tool known as the Java Memory Profiler (JMP). This provides much more helpful information in a nice graphical interface, but requires you to download and install the software before you can get started. The next section explains the sample program used in the profiling exercises.

The FileSortingHelper Example Program

You will use a contrived example program for the purposes of profiling and optimization. This program is a simple Unix-style filter that reads words from standard input, one per line, sorts them, and prints them to standard output in sorted order. To put a twist on things, the comparator orders the words as they would sort if each were spelled backwards. For example, the word “ant” sorts after the word “pie” because, spelled backwards, “tna” (“ant” backwards) sorts after “eip” (“pie” backwards). This is simply to make the program work a little harder and make the profiling more interesting, so don’t worry if it seems pointless; it probably is!

If you used this sample program to sort the following list of words:

    test
    driven
    development
    is
    one
    small
    step
    for
    programmers
    but
    one
    giant
    leap
    for
    programming

then you’d get the following output:

    one
    one
    programming
    small
    driven
    leap
    step
    for
    for
    is
    programmers
    giant
    development
    test
    but

Here is the code for the comparator:

    package com.wrox.algorithms.sorting;

    public final class ReverseStringComparator implements Comparator {
        public static final ReverseStringComparator INSTANCE = new ReverseStringComparator();

        private ReverseStringComparator() {
        }

        public int compare(Object left, Object right) throws ClassCastException {
            assert left != null : "left can't be null";
            assert right != null : "right can't be null";

            return reverse((String) left).compareTo(reverse((String) right));
        }

        private String reverse(String s) {
            StringBuffer result = new StringBuffer();

            for (int i = 0; i < s.length(); i++) {
                result.append(s.charAt(s.length() - 1 - i));
            }

            return result.toString();
        }
    }

There’s no need to go into great detail about how this code works, as you won’t be using it in any of your programs. It implements the standard Comparator interface, assumes both its arguments are String objects, and compares them after first creating a reversed version of each.
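If you would like to check the sample output shown earlier without the book's classes, the following sketch reproduces the ordering with java.util. The names ReverseOrderSketch and BY_REVERSED_TEXT are hypothetical, and StringBuilder.reverse() is swapped in for the character-by-character loop; for plain words like these the two give the same result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch only: sorts words by their reversed spelling using java.util.
final class ReverseOrderSketch {
    // Compares two strings by the text you get when each is spelled backwards.
    static final Comparator<String> BY_REVERSED_TEXT =
            Comparator.comparing((String s) -> new StringBuilder(s).reverse().toString());

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sortByReversedText(List<String> words) {
        List<String> sorted = new ArrayList<String>(words);
        sorted.sort(BY_REVERSED_TEXT);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "test", "driven", "development", "is", "one", "small", "step", "for",
                "programmers", "but", "one", "giant", "leap", "for", "programming");
        for (String word : sortByReversedText(words)) {
            System.out.println(word);
        }
    }
}
```

Note that this version reverses each word again on every comparison, just as the book's comparator does, so it shares the same inefficiency.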

Try It Out

Implementing the FileSortingHelper Class

The FileSortingHelper class is shown here:

    package com.wrox.algorithms.sorting;

    import com.wrox.algorithms.iteration.Iterator;
    import com.wrox.algorithms.lists.LinkedList;
    import com.wrox.algorithms.lists.List;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public final class FileSortingHelper {
        private FileSortingHelper() {
        }

        public static void main(String[] args) throws Exception {
            sort(loadWords());

            System.err.println("Finished...press CTRL-C to exit");

            Thread.sleep(100000);
        }

        ...
    }

How It Works

As you can see, this class has a private constructor to prevent instantiation by other code, and a main() method that delegates most of the work to two other methods, loadWords() and sort(). It then does an apparently strange thing — it prints a message advising you to kill the program and puts itself to sleep for a while using the Thread.sleep() call. This is simply to give you more time to look at the results of the profiling in JMP when the program finishes, so don’t worry about it.
