In Section 8.10, we discussed one technique for searching an array and we wrote a function that returned the index of a target key in an array if the target was present. To do this, it was necessary to compare array element keys to the target key, starting with the first array element. The comparison process was terminated when the target key was found or the end of the array was reached. We must make N comparisons to determine that a target key is not in an array of N elements. On the average, we must make N/2 comparisons to locate a target key that is in the array. The number of comparisons is directly proportional to the number of elements in the array, so we say that this search method is O(N).
Often we want to search an array whose elements are arranged in order by key field. We can take advantage of the fact that the array keys are in ascending order and terminate the search when an array key greater than or equal to the target key is reached. There is no need to look any further in the array; all other keys will also be larger than the target key.
Both these search techniques are called sequential search because we examine the array elements in sequence. The modified algorithm discussed above is a sequential search of an ordered array. On the average, a sequential search of an ordered array requires N/2 comparisons to locate the target key or determine that it is not in the array; so we still have an O(N) process.
The array searches described above are considered linear searches because their execution time increases linearly (in direct proportion) with the number of array elements. This can be a problem when searching very large arrays (for example, N > 1000). Consequently, we often use the binary search algorithm described below for large sorted arrays.
PROBLEM SPECIFICATION
Your employer has a directory of customers that she keeps in alphabetical order. Since business has been very good, this list has become too large to search efficiently using a linear search. Write an improved search algorithm that takes advantage of the fact that the array is sorted.
ANALYSIS
The binary search algorithm takes advantage of the fact that the array is ordered to eliminate half of the array elements with each probe into the array. Consequently, if the array has 1000 elements, it will either locate the target value or eliminate 500 elements with its first probe, 250 elements with its second probe, 125 elements with its third probe, and so on. Only 10 probes are necessary to completely search an array of 1000 elements. (Why?) You can use the binary search algorithm to find a name in a large metropolitan telephone directory using 30 or less probes, so this algorithm should be suitable for your employer.
The number of probes to completely search an N-element array obviously varies with the number of elements N. Can we find a formula for this variation? Because each probe eliminates half the elements, the maximum number of probes is determined by the number of times we can "cut the array in half" before we are left with only one element.
Let's consider some values of N corresponding to powers of 2. If N is 8 (23), for example, we first search an eight-element array, then a four-element array, then a two-element array, and finally a one-element array. We cut the array three times. If N is 32 (25), we cut the array five times; if N is 256 (28), we cut the array 8 times; if N is 1024 (210), 10 times. Indeed, we make a maximum of only 16 cuts even if N is 32,768 (216)! If N is not an exact power of 2, the number of probes is determined by the next higher power of 2: If N is 1000, 1024, or 210, is the determining power of 2.
An equivalent way of saying "1024 is 210" is "10 is the logarithm, to the base 2, of 1024," or "log2 1024 = 10." The formula we are looking for is that the number of binary search probes into an array of N elements is log2N. Another way of saying this is that binary search is an O(log2N) algorithm. This is much faster than sequential search, isn't it?
Now let's develop the binary search algorithm. Because the array is ordered, all we have to do is compare the target key with the middle element of the subarray we are searching. If their keys are the same, we are done. If the middle value is larger than the target, we should search the left half of the array next; otherwise, we should search the right half of the array.
The subarray to be searched, Slice
, has subscripts
Slice'First..Slice'Last
. The variable Middle
is the
subscript of the middle element in this range. The right half of the array
(subscripts Middle..Slice'Last
) is eliminated by the first probe
as shown in
Figure
13.10. The new subarray to be searched is
Slice(Slice'First..Middle-1)
, as shown in
Figure
13.11. The target value, 35, would be found on this probe.
Figure 13.10
Figure 13.11
Second Probe of Binary Search
The binary search algorithm can be stated clearly using recursion. The stopping cases are:
Slice'First>Slice'Last
or Slice'Length=0
).
Middle
. The recursive step is to search the appropriate subarray.
Data Requirements
Problem Inputs
array to be searched (Slice : SearchArray
)
target being searched for (Target : KeyType
)
Problem Outputs
the location of Target
or 0 if not found
DESIGN
Algorithm
1. Compute the subscript of the middle element
2. IF
the slice has zero length THEN
3. Return a result of 0
ELSIF
the target is the middle value THEN
4. Return the subscript of the middle element
ELSIF
the target is less than the middle value THEN
5. Search the subarray with subscripts Slice'First..Middle-1
ELSE
6. Search the subarray with subscripts Middle+1..Slice'Last
END IF;
In each of the recursive steps (steps 5 and 6), the bounds of the slice are listed as a part of the actual table parameter in the recursive call. The actual parameters define the search limits for the next probe into the array.
TEST PLAN
You should test the binary search function very carefully. Besides verifying that it locates target values that are present in the array, verify that they also determine when a target value is missing. Use target values that are within the range of values stored in the array, a target value that is less than the smallest value in the array, and a target value that is greater than the largest value in the array. Make sure that the binary search function terminates regardless of whether the target is missing or where it is located if it is not missing.
IMPLEMENTATION
In the initial call to the recursive procedure, the entire array is normally given. For example, given the following declarations:
SUBTYPE KeyType IS Integer; TYPE SearchArray IS ARRAY(Positive RANGE <>) OF KeyType; Test: SearchArray(1..9); Location: Natural;the procedure call statement
Location := BinarySearch (Test, 35);could be used to search an array
Test
for the
target key 35
. Function BinarySearch
is shown in
Program
13.11.
The assignment statement
Middle := (Slice'First + Slice'Last) / 2;computes the subscript of the middle element by finding the average of
Slice'First
and Slice'Last
.
Program 13.11
WITH Ada.Text_IO; WITH Ada.Integer_Text_IO; PROCEDURE Test_Binary_Search IS ------------------------------------------------------------------------ --| Test program for Recursive Binary Search --| Author: Michael B. Feldman, The George Washington University --| Last Modified: July 1995 ------------------------------------------------------------------------ TYPE SearchArray IS ARRAY(Positive RANGE <>) OF Integer; Test: SearchArray(1..9); FUNCTION BinarySearch (Slice: SearchArray; Target: Integer) RETURN Natural IS -- Performs a recursive binary search of an ordered array of -- keys with bounds Slice'First..Slice'Last. -- Pre : Target and Slice are defined. -- 0 < Slice'First <= Slice'Last -- Post: Returns the subscript of Target if found in array Slice; -- otherwise, returns 0 Middle : Natural; -- the subscript of the middle element BEGIN -- BinarySearch Middle := (Slice'First + Slice'Last) / 2; -- define Middle -- Determine if Target is found or missing or redefine subarray. IF Slice'Length = 0 THEN RETURN 0; -- stopping case: Target missing ELSIF Slice(Middle) = Target THEN RETURN Middle; -- stopping case: Target found ELSIF Slice(Middle) > Target THEN -- search lower subarray RETURN BinarySearch (Slice(Slice'First..Middle-1),Target); ELSE -- search upper subarray RETURN BinarySearch (Slice(Middle+1..Slice'Last),Target); END IF; END BinarySearch; BEGIN -- Test_Binary_Search Test := (20,35,37,40,45,50,51,55,67); Ada.Text_IO.Put(Item => "BinarySearch(Test,35) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,35)); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "BinarySearch(Test,19) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,19)); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "BinarySearch(Test,75) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,75)); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "BinarySearch(Test,20) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,20)); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "BinarySearch(Test,67) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,67)); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "BinarySearch(Test,54) is"); Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,54)); Ada.Text_IO.New_Line; END Test_Binary_Search;Sample Run
BinarySearch(Test,35) is 2 BinarySearch(Test,19) is 0 BinarySearch(Test,75) is 0 BinarySearch(Test,20) is 1 BinarySearch(Test,67) is 9 BinarySearch(Test,54) is 0The binary search function is written to return 0 if the target is not found. This works only because we have required that the array bounds be positive, because otherwise 0 could be a valid subscript. Binary search would be more general if it could accept arrays with arbitrary integer bounds; in that case, it would be better to convert the binary search function to a procedure with two
OUT
parameters: the index of the target if found and a program
flag indicating whether the target was found. This modification is left as an
exercise.
Copyright © 1996 by Addison-Wesley Publishing Company, Inc.