AhmadLang / Java, How To Program, 2004
16.1. Introduction
Searching data involves determining whether a value (referred to as the search key) is present in the data and, if so, finding the value's location. Two popular search algorithms are the simple linear search and the faster but more complex binary search. Sorting places data in order, typically ascending or descending, based on one or more sort keys. A list of names could be sorted alphabetically, bank accounts could be sorted by account number, employee payroll records could be sorted by social security number, and so on. This chapter introduces two simple sorting algorithms, the selection sort and the insertion sort, along with the more efficient but more complex merge sort. Figure 16.1 summarizes the searching and sorting algorithms discussed in this book.
Figure 16.1. Searching and sorting algorithms in this text.
Chapter | Algorithm                                | Location
        | Searching Algorithms:                    |
16      | Linear Search                            | Section 16.2.1
        | Binary Search                            | Section 16.2.2
        | Recursive Linear Search                  | Exercise 16.8
        | Recursive Binary Search                  | Exercise 16.9
17      | Linear search of a List                  | Exercise 17.21
        | Binary tree search                       | Exercise 17.23
19      | binarySearch method of class Collections | Fig. 19.14
        | Sorting Algorithms:                      |
16      | Selection Sort                           | Section 16.3.1
        | Insertion Sort                           | Section 16.3.2
        | Recursive Merge Sort                     | Section 16.3.3
        | Bubble Sort                              | Exercises 16.3 and 16.4
        | Bucket Sort                              | Exercise 16.7
        | Recursive Quicksort                      | Exercise 16.10
17      | Binary tree sort                         | Section 17.9
19      | sort method of class Collections         | Fig. 19.8-Fig. 19.11
        | SortedSet collection                     | Fig. 19.19
16.2. Searching Algorithms
Looking up a phone number, accessing a Web site and checking the definition of a word in a dictionary all involve searching large amounts of data. The next two sections discuss two common search algorithms: one that is easy to program yet relatively inefficient and one that is relatively efficient but more complex and difficult to program.
16.2.1. Linear Search
The linear search algorithm searches each element in an array sequentially. If the search key does not match an element in the array, the algorithm tests each element, and when the end of the array is reached, informs the user that the search key is not present. If the search key is in the array, the algorithm tests each element until it finds one that matches the search key and returns the index of that element.
As an example, consider an array containing the following values
34  56  2  10  77  51  93  30  5  52
and a program that is searching for 51. Using the linear search algorithm, the program first checks whether 34 matches the search key. It does not, so the algorithm checks whether 56 matches the search key. The program continues moving through the array sequentially, testing 2, then 10, then 77. When the program tests 51, which matches the search key, the program returns the index 5, which is the location of 51 in the array. If, after checking every array element, the program determines that the search key does not match any element in the array, the program returns a sentinel value (e.g. -1).
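The walk-through above can be reproduced with a few lines of Java. This is only a minimal sketch: the class name LinearSearchSketch and the choice to pass the array as a parameter are assumptions made for illustration, not part of the book's figures.

```java
// Minimal sketch of the linear search walk-through.
// Class and method names are illustrative, not taken from the book's listings.
class LinearSearchSketch
{
   // Returns the index of the first element equal to searchKey,
   // or the sentinel value -1 if no element matches.
   static int linearSearch( int[] data, int searchKey )
   {
      for ( int index = 0; index < data.length; index++ )
         if ( data[ index ] == searchKey )
            return index; // stop at the first match

      return -1; // every element was tested without a match
   }

   public static void main( String[] args )
   {
      int[] data = { 34, 56, 2, 10, 77, 51, 93, 30, 5, 52 };

      System.out.println( linearSearch( data, 51 ) ); // prints 5
      System.out.println( linearSearch( data, 99 ) ); // prints -1
   }
}
```

Searching for 51 tests 34, 56, 2, 10 and 77 before matching at index 5; searching for a value that is absent tests all ten elements.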
Figure 16.2 declares the LinearArray class. This class has two private variables: an array of ints named data, and a static Random object used to fill the array with randomly generated ints. When an object of class LinearArray is instantiated, the constructor (lines 12-19) creates and initializes the array data with random ints in the range 10-99. If there are duplicate values in the array, linear search returns the index of the first element in the array that matches the search key.
Figure 16.2. LinearArray class.
 1  // Fig 16.2: LinearArray.java
 2  // Class that contains an array of random integers and a method
 3  // that will search that array sequentially
 4  import java.util.Random;
 5
 6  public class LinearArray
 7  {
 8     private int[] data; // array of values
 9     private static Random generator = new Random();
10
11     // create array of given size and fill with random numbers
12     public LinearArray( int size )
13     {
14        data = new int[ size ]; // create space for array
15
16        // fill array with random ints in range 10-99
17        for ( int i = 0; i < size; i++ )
18           data[ i ] = 10 + generator.nextInt( 90 );
19     } // end LinearArray constructor
20
21     // perform a linear search on the data
22     public int linearSearch( int searchKey )
23     {
24        // loop through array sequentially
25        for ( int index = 0; index < data.length; index++ )
26           if ( data[ index ] == searchKey )
27              return index; // return index of integer
28
29        return -1; // integer was not found
30     } // end method linearSearch
31
32     // method to output values in array
33     public String toString()
34     {
35        StringBuffer temporary = new StringBuffer();
36
37        // iterate through array
38        for ( int element : data )
39           temporary.append( element + " " );
40
41        temporary.append( "\n" ); // add endline character
42        return temporary.toString();
43     } // end method toString
44  } // end class LinearArray
Lines 22-30 perform the linear search. The search key is passed to parameter searchKey. Lines 25-27 loop through the elements in the array. Line 26 compares each element in the array with searchKey. If the values are equal, line 27 returns the index of the element. If the loop ends without finding the value, line 29 returns -1. Lines 33-43 declare method toString, which returns a String representation of the array for printing.
Figure 16.3 creates a LinearArray object containing an array of 10 ints (line 16) and allows the user to search the array for specific elements. Lines 20-22 prompt the user for the search key and store it in searchInt. Lines 25-41 then loop until searchInt is equal to -1. The array holds ints from 10-99 (line 18 of Fig. 16.2), so -1 is never a valid element and can safely signal the end of input. Line 28 calls method linearSearch to determine whether searchInt is in the array. If searchInt is not in the array, linearSearch returns -1 and the program notifies the user (lines 31-32). If searchInt is in the array, linearSearch returns the position of the element, which the program outputs in lines 34-35. Lines 38-40 retrieve the next integer from the user.
Figure 16.3. LinearSearchTest class.
 1  // Fig 16.3: LinearSearchTest.java
 2  // Sequentially search an array for an item.
 3  import java.util.Scanner;
 4
 5  public class LinearSearchTest
 6  {
 7     public static void main( String args[] )
 8     {
 9        // create Scanner object to input data
10        Scanner input = new Scanner( System.in );
11
12        int searchInt; // search key
13        int position; // location of search key in array
14
15        // create array and output it
16        LinearArray searchArray = new LinearArray( 10 );
17        System.out.println( searchArray ); // print array
18
19        // get input from user
20        System.out.print(
21           "Please enter an integer value (-1 to quit): " );
16.2.2. Binary Search
The binary search algorithm is more efficient than the linear search algorithm, but it requires that the array be sorted. The first iteration of this algorithm tests the middle element in the array. If this matches the search key, the algorithm ends. Assuming the array is sorted in ascending order, then if the search key is less than the middle element, the search key cannot match any element in the second half of the array and the algorithm continues with only the first half of the array (i.e., the first element up to, but not including the middle element). If the search key is greater than the middle element, the search key cannot match any element in the first half of the array and the algorithm continues with only the second half of the array (i.e., the element after the middle element through the last element). Each iteration tests the middle value of the remaining portion of the array. If the search key does not match the element, the algorithm eliminates half of the remaining elements. The algorithm ends either by finding an element that matches the search key or reducing the sub-array to zero size.
As an example, consider the sorted 15-element array
2  3  5  10  27  30  34  51  56  65  77  81  82  93  99
and a search key of 65. A program implementing the binary search algorithm would first check whether 51 is the search key (because 51 is the middle element of the array). The search key (65) is larger than 51, so 51 is discarded along with the first half of the array (all elements smaller than 51). Next, the algorithm checks whether 81 (the middle element of the remainder of the array) matches the search key. The search key (65) is smaller than 81, so 81 is discarded along with the elements larger than 81. After just two tests, the algorithm has narrowed the number of values to check to three (56, 65 and 77). The algorithm then checks 65 (which indeed matches the search key), and returns the index of the array element containing 65. This algorithm required just three comparisons to determine whether the search key matched an element of the array. Using a linear search algorithm would have required 10 comparisons. [Note: In this example, we have chosen to use an array with 15 elements so that there will always be an obvious middle element in the array. With an even number of elements, the middle of the array lies between two elements. The implementation in Fig. 16.4 computes the middle index as (low + high + 1) / 2, which selects the higher of those two elements.]
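To make the comparison count concrete, the following sketch runs a binary search over the 15-element array above and records how many elements it tests. It is an illustration only: the class name BinarySearchSketch and the probes counter are assumptions of this sketch, and the middle index is computed the same way as in Fig. 16.4.

```java
// Minimal sketch: binary search that counts how many elements it tests.
// The class name and the probes counter are illustrative additions.
class BinarySearchSketch
{
   // number of elements tested by the most recent call to binarySearch
   static int probes;

   // Returns the index of key in the ascending array sorted, or -1 if absent.
   static int binarySearch( int[] sorted, int key )
   {
      int low = 0; // low end of the search area
      int high = sorted.length - 1; // high end of the search area
      probes = 0;

      while ( low <= high )
      {
         int middle = ( low + high + 1 ) / 2; // same formula as Fig. 16.4
         probes++;

         if ( key == sorted[ middle ] )
            return middle;
         else if ( key < sorted[ middle ] )
            high = middle - 1; // eliminate the higher half
         else
            low = middle + 1; // eliminate the lower half
      }

      return -1; // search area shrank to zero size
   }

   public static void main( String[] args )
   {
      int[] sorted = { 2, 3, 5, 10, 27, 30, 34, 51, 56, 65, 77, 81, 82, 93, 99 };

      System.out.println( binarySearch( sorted, 65 ) ); // prints 9
      System.out.println( probes ); // prints 3
   }
}
```

The three tested values are 51, 81 and 65, matching the walk-through; a linear search for the same key would test the ten elements from index 0 through index 9.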
Figure 16.4 declares class BinaryArray. This class is similar to LinearArray: it has two private variables, a constructor, a search method (binarySearch), a remainingElements method and a toString method. Lines 13-22 declare the constructor. After initializing the array with random ints from 10-99 (lines 18-19), line 21 calls the Arrays.sort method on the array data. Method sort is a static method of class Arrays that sorts the elements in an array in ascending order. Recall that the binary search algorithm will work only on a sorted array.
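The sort-before-search requirement can be seen in isolation with the standard library. The sketch below is not the chapter's approach: it uses the static binarySearch method that class java.util.Arrays also provides, and the class name SortThenSearchSketch and helper sortedIndexOf are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

// Minimal sketch: sort an array with Arrays.sort, then search the sorted copy.
// SortThenSearchSketch and sortedIndexOf are illustrative names.
class SortThenSearchSketch
{
   // Returns the index of key within a sorted copy of values;
   // Arrays.binarySearch returns a negative value when key is absent.
   static int sortedIndexOf( int[] values, int key )
   {
      int[] sorted = Arrays.copyOf( values, values.length ); // leave the input untouched
      Arrays.sort( sorted ); // ascending order, as binary search requires
      return Arrays.binarySearch( sorted, key );
   }

   public static void main( String[] args )
   {
      int[] values = { 34, 56, 2, 10, 77, 51, 93, 30, 5, 52 };

      // sorted copy is 2 5 10 30 34 51 52 56 77 93
      System.out.println( sortedIndexOf( values, 51 ) ); // prints 5
      System.out.println( sortedIndexOf( values, 50 ) < 0 ); // prints true
   }
}
```

Note that the returned index refers to the sorted copy, not to the element's position in the original, unsorted array.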
Figure 16.4. BinaryArray class.
 1  // Fig 16.4: BinaryArray.java
 2  // Class that contains an array of random integers and a method
 3  // that uses binary search to find an integer.
 4  import java.util.Random;
 5  import java.util.Arrays;
 6
 7  public class BinaryArray
 8  {
 9     private int[] data; // array of values
10     private static Random generator = new Random();
11
12     // create array of given size and fill with random integers
13     public BinaryArray( int size )
14     {
15        data = new int[ size ]; // create space for array
16
17        // fill array with random ints in range 10-99
18        for ( int i = 0; i < size; i++ )
19           data[ i ] = 10 + generator.nextInt( 90 );
20
21        Arrays.sort( data );
22     } // end BinaryArray constructor
23
24     // perform a binary search on the data
25     public int binarySearch( int searchElement )
26     {
27        int low = 0; // low end of the search area
28        int high = data.length - 1; // high end of the search area
29        int middle = ( low + high + 1 ) / 2; // middle element
30        int location = -1; // return value; -1 if not found
31
32        do // loop to search for element
33        {
34           // print remaining elements of array
35           System.out.print( remainingElements( low, high ) );
36
37           // output spaces for alignment
38           for ( int i = 0; i < middle; i++ )
39              System.out.print( "   " );
40           System.out.println( " * " ); // indicate current middle
41
42           // if the element is found at the middle
43           if ( searchElement == data[ middle ] )
44              location = middle; // location is the current middle
45
46           // middle element is too high
47           else if ( searchElement < data[ middle ] )
48              high = middle - 1; // eliminate the higher half
49           else // middle element is too low
50              low = middle + 1; // eliminate the lower half
51
52           middle = ( low + high + 1 ) / 2; // recalculate the middle
53        } while ( ( low <= high ) && ( location == -1 ) );
54
55        return location; // return location of search key
56     } // end method binarySearch
57
58     // method to output certain values in array
59     public String remainingElements( int low, int high )
60     {
61        StringBuffer temporary = new StringBuffer();
62
63        // output spaces for alignment
64        for ( int i = 0; i < low; i++ )
65           temporary.append( "   " );
66
67        // output elements left in array
68        for ( int i = low; i <= high; i++ )
69           temporary.append( data[ i ] + " " );
70
71        temporary.append( "\n" );
72        return temporary.toString();
73     } // end method remainingElements
74
75     // method to output values in array
76     public String toString()
77     {
78        return remainingElements( 0, data.length - 1 );
79     } // end method toString
80  } // end class BinaryArray
Lines 25-56 declare method binarySearch. The search key is passed into parameter searchElement (line 25). Lines 27-29 calculate the low end index, high end index and middle index of the portion of the array that the program is currently searching. At the beginning of the method, the low end is 0, the high end is the length of the array minus 1 and the middle is the average of these two values. Line 30 initializes the location of the element to -1, the value that will be returned if the element is not found. Lines 32-53 loop until low is greater than high (this occurs when the element is not found) or location does not equal -1 (indicating that the search key was found). Line 43 tests whether the value in the middle element is equal to searchElement. If this is true, line 44 assigns middle to location. Then the loop terminates and location is returned to the caller. Each iteration of the loop tests a single value (line 43) and eliminates half of the remaining values in the array (line 48 or 50).
Lines 26-44 of Fig. 16.5 loop until the user enters -1. For each other number the user enters, the program performs a binary search on the data to determine whether it matches an element in the array. The first line of output from this program is the array of ints, in increasing order. When the user instructs the program to search for 23, the program first tests the middle element, which is 42 (as indicated by *). The search key is less than 42, so the program eliminates the second half of the array and tests the middle element from the first half of the array. The search key is smaller than 34, so the program eliminates the second half of the array, leaving only three elements. Finally, the program checks 23 (which matches the search key) and returns the index 1.
Figure 16.5. BinarySearchTest class.
 1  // Fig 16.5: BinarySearchTest.java
 2  // Use binary search to locate an item in an array.
 3  import java.util.Scanner;
 4
 5  public class BinarySearchTest
 6  {
 7     public static void main( String args[] )
 8     {
 9        // create Scanner object to input data
10        Scanner input = new Scanner( System.in );
11
12        int searchInt; // search key
13        int position; // location of search key in array
14
15        // create array and output it
16        BinaryArray searchArray = new BinaryArray( 15 );
17        System.out.println( searchArray );
18
19        // get input from user
20        System.out.print(
21           "Please enter an integer value (-1 to quit): " );
22        searchInt = input.nextInt(); // read an int from user
23        System.out.println();
24
25        // repeatedly input an integer; -1 terminates the program
26        while ( searchInt != -1 )
27        {
28           // use binary search to try to find integer
29           position = searchArray.binarySearch( searchInt );
30
31           // return value of -1 indicates integer was not found
32           if ( position == -1 )
33              System.out.println( "The integer " + searchInt +
34                 " was not found.\n" );
35           else
36              System.out.println( "The integer " + searchInt +
37                 " was found in position " + position + ".\n" );
38
39           // get input from user
40           System.out.print(
41              "Please enter an integer value (-1 to quit): " );
42           searchInt = input.nextInt(); // read an int from user
43           System.out.println();
44        } // end while
45     } // end main
46  } // end class BinarySearchTest
13 23 24 34 35 36 38 42 47 51 68 74 75 85 97

Please enter an integer value (-1 to quit): 23

13 23 24 34 35 36 38 42 47 51 68 74 75 85 97
                      *
13 23 24 34 35 36 38
          *
13 23 24
    *
The integer 23 was found in position 1.

Please enter an integer value (-1 to quit): 75

13 23 24 34 35 36 38 42 47 51 68 74 75 85 97
                      *
                        47 51 68 74 75 85 97
                                  *
                                    75 85 97
                                        *
                                    75
                                     *
The integer 75 was found in position 12.

Please enter an integer value (-1 to quit): 52

13 23 24 34 35 36 38 42 47 51 68 74 75 85 97
                      *
                        47 51 68 74 75 85 97
                                  *
                        47 51 68
                            *
                              68
                               *
The integer 52 was not found.

Please enter an integer value (-1 to quit): -1
Efficiency of Binary Search
In the worst-case scenario, searching a sorted array of 1,023 elements will take only 10 comparisons when using a binary search. Repeatedly dividing 1,023 by 2 (because after each comparison, we are able to eliminate half of the array) and rounding down (because we also remove the middle element) yields the values 511, 255, 127, 63, 31, 15, 7, 3, 1 and 0. The number 1,023 (2^10 - 1) is divided by 2 only 10 times to get the value 0, which indicates that there are no more elements to test. Dividing by 2 is equivalent to one comparison in the binary-search algorithm. Thus, an array of 1,048,575 (2^20 - 1) elements takes a maximum of 20 comparisons to find the key, and an array of over one billion elements takes a maximum of 30 comparisons to find the key. This is a tremendous improvement in performance over the linear search. For a one-billion-element array, this is a difference between an average of 500 million comparisons for the linear search and a maximum of only 30 comparisons for the binary search! The maximum number of comparisons needed for the binary search of any sorted array is the exponent of the first power of 2 greater than the number of elements in the array, which is represented as log_2 n.
All logarithms grow at roughly the same rate, so in big O notation the base can be omitted. This results in a big O of O(log n) for a binary search, which is also known as logarithmic run time.
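The halving argument can be checked mechanically. The sketch below counts how many times an element count can be halved, with rounding down, before it reaches 0; the class name BinarySearchCost and the method maxComparisons are hypothetical names used only for this illustration.

```java
// Minimal sketch: worst-case comparison count for binary search,
// obtained by repeated halving. Names are illustrative.
class BinarySearchCost
{
   // Returns how many integer divisions by 2 reduce n to 0.
   static int maxComparisons( long n )
   {
      int comparisons = 0;

      while ( n > 0 )
      {
         n /= 2; // one comparison eliminates half of the remaining elements
         comparisons++;
      }

      return comparisons;
   }

   public static void main( String[] args )
   {
      System.out.println( maxComparisons( 1023L ) ); // prints 10
      System.out.println( maxComparisons( 1048575L ) ); // prints 20
      System.out.println( maxComparisons( 1000000000L ) ); // prints 30
   }
}
```

For 1,023 elements the loop passes through 511, 255, 127, 63, 31, 15, 7, 3, 1 and 0, which is the sequence of ten halvings described above.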
16.3. Sorting Algorithms
Sorting data (i.e., placing the data into some particular order, such as ascending or descending) is one of the most important computing applications. A bank sorts all checks by account number so that it can prepare individual bank statements at the end of each month. Telephone companies sort their lists of accounts by last name and, further, by first name to make it easy to find phone numbers. Virtually every organization must sort some data, and often, massive amounts of it. Sorting data is an intriguing, computer-intensive problem that has attracted intense research efforts.
An important item to understand about sorting is that the end result (the sorted array) will be the same no matter which algorithm you use to sort the array. The choice of algorithm affects only the run time and memory use of the program. The rest of the chapter introduces three common sorting algorithms. The first two, selection sort and insertion sort, are simple algorithms to program, but are inefficient. The last algorithm, merge sort, is a much faster algorithm than selection sort and insertion sort, but is harder to program. We focus on sorting arrays of primitive type data, namely ints. It is possible to sort arrays of objects of classes as well. We discuss this in Section 19.6.1.
16.3.1. Selection Sort
Selection sort is a simple, but inefficient, sorting algorithm. The first iteration of the algorithm selects the smallest element in the array and swaps it with the first element. The second iteration selects the second-smallest item (which is the smallest item of the remaining elements) and swaps it with the second element. The algorithm continues until the last iteration selects the second-largest element and swaps it with the element at the second-to-last index, leaving the largest element in the last index. After the ith iteration, the smallest i items of the array will be sorted into increasing order in the first i elements of the array.
As an example, consider the array
34  56  4  10  77  51  93  30  5  52
A program that implements selection sort first determines the smallest element (4) of this array, which is contained in index 2. The program swaps 4 with 34, resulting in
4  56  34  10  77  51  93  30  5  52
The program then determines the smallest value of the remaining elements (all elements except 4), which is 5, contained in index 8. The program swaps 5 with 56, resulting in
4  5  34  10  77  51  93  30  56  52
On the third iteration, the program determines the next smallest value (10) and swaps it with 34.
4  5  10  34  77  51  93  30  56  52
The process continues until the array is fully sorted.
4  5  10  30  34  51  52  56  77  93
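The passes shown above can be reproduced with a short program. This is a minimal sketch of selection sort as described in this section, not the book's own listing; the class name SelectionSortSketch and the per-pass printing are assumptions added for illustration.

```java
import java.util.Arrays;

// Minimal sketch of selection sort that prints the array after each pass.
// The class name and the printing are illustrative additions.
class SelectionSortSketch
{
   // Sorts data in place into ascending order.
   static void selectionSort( int[] data )
   {
      for ( int i = 0; i < data.length - 1; i++ )
      {
         int smallest = i; // index of smallest element found so far

         // find the smallest element among positions i..end
         for ( int index = i + 1; index < data.length; index++ )
            if ( data[ index ] < data[ smallest ] )
               smallest = index;

         // swap the smallest remaining element into position i
         int temporary = data[ i ];
         data[ i ] = data[ smallest ];
         data[ smallest ] = temporary;

         System.out.println( "after pass " + ( i + 1 ) + ": " +
            Arrays.toString( data ) );
      }
   }

   public static void main( String[] args )
   {
      int[] data = { 34, 56, 4, 10, 77, 51, 93, 30, 5, 52 };
      selectionSort( data );
   }
}
```

With the example array, the first three lines of output show 4, then 5, then 10 moving into the first three positions, as in the walk-through; a pass in which the smallest remaining element is already in place swaps it with itself and leaves the array unchanged.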
Note that after the first iteration, the smallest element is in the first position. After the second iteration, the two smallest elements are in order in the first two positions. After the third iteration, the three smallest
