13.6 Problem Solving: Recursive Searching

In Section 8.10, we discussed one technique for searching an array and we wrote a function that returned the index of a target key in an array if the target was present. To do this, it was necessary to compare array element keys to the target key, starting with the first array element. The comparison process was terminated when the target key was found or the end of the array was reached. We must make N comparisons to determine that a target key is not in an array of N elements. On the average, we must make N/2 comparisons to locate a target key that is in the array. The number of comparisons is directly proportional to the number of elements in the array, so we say that this search method is O(N).

Often we want to search an array whose elements are arranged in order by key field. We can take advantage of the fact that the array keys are in ascending order and terminate the search when an array key greater than or equal to the target key is reached. There is no need to look any further in the array; all other keys will also be larger than the target key.

Both these search techniques are called sequential search because we examine the array elements in sequence. The modified algorithm discussed above is a sequential search of an ordered array. On the average, a sequential search of an ordered array requires N/2 comparisons to locate the target key or determine that it is not in the array; so we still have an O(N) process.

The array searches described above are considered linear searches because their execution time increases linearly (in direct proportion) with the number of array elements. This can be a problem when searching very large arrays (for example, N > 1000). Consequently, we often use the binary search algorithm described below for large sorted arrays.

PROBLEM SPECIFICATION

Your employer has a directory of customers that she keeps in alphabetical order. Since business has been very good, this list has become too large to search efficiently using a linear search. Write an improved search algorithm that takes advantage of the fact that the array is sorted.

ANALYSIS

The binary search algorithm takes advantage of the fact that the array is ordered to eliminate half of the array elements with each probe into the array. Consequently, if the array has 1000 elements, it will either locate the target value or eliminate 500 elements with its first probe, 250 elements with its second probe, 125 elements with its third probe, and so on. Only 10 probes are necessary to completely search an array of 1000 elements. (Why?) You can use the binary search algorithm to find a name in a large metropolitan telephone directory using 30 or less probes, so this algorithm should be suitable for your employer.

The number of probes to completely search an N-element array obviously varies with the number of elements N. Can we find a formula for this variation? Because each probe eliminates half the elements, the maximum number of probes is determined by the number of times we can "cut the array in half" before we are left with only one element.

Let's consider some values of N corresponding to powers of 2. If N is 8 (2³), for example, we first search an eight-element array, then a four-element array, then a two-element array, and finally a one-element array. We cut the array three times. If N is 32 (2⁵), we cut the array five times; if N is 256 (2⁸), we cut the array 8 times; if N is 1024 (2¹⁰), 10 times. Indeed, we make a maximum of only 16 cuts even if N is 32,768 (2¹⁶)! If N is not an exact power of 2, the number of probes is determined by the next higher power of 2: If N is 1000, 1024, or 2^10, is the determining power of 2.

An equivalent way of saying "1024 is 2¹⁰" is "10 is the logarithm, to the base 2, of 1024," or "log₂ 1024 = 10." The formula we are looking for is that the number of binary search probes into an array of N elements is log₂N. Another way of saying this is that binary search is an O(log₂N) algorithm. This is much faster than sequential search, isn't it?

Now let's develop the binary search algorithm. Because the array is ordered, all we have to do is compare the target key with the middle element of the subarray we are searching. If their keys are the same, we are done. If the middle value is larger than the target, we should search the left half of the array next; otherwise, we should search the right half of the array.

The subarray to be searched, Slice, has subscripts Slice'First..Slice'Last. The variable Middle is the subscript of the middle element in this range. The right half of the array (subscripts Middle..Slice'Last) is eliminated by the first probe as shown in Figure 13.10. The new subarray to be searched is Slice(Slice'First..Middle-1), as shown in Figure 13.11. The target value, 35, would be found on this probe.

Figure 13.10
First Probe of Binary Search

Figure 13.10

Figure 13.11
Second Probe of Binary Search

Figure 13.11

The binary search algorithm can be stated clearly using recursion. The stopping cases are:

The array would have no elements (Slice'First>Slice'Last or Slice'Length=0).
The middle value is the target value.

In the first case, the function result is 0 (we'll require that arrays submitted to this algorithm have positive subscript ranges); in the second case, the function result is Middle. The recursive step is to search the appropriate subarray.

Data Requirements

Problem Inputs

array to be searched (Slice : SearchArray)

target being searched for (Target : KeyType)

Problem Outputs

the location of Target or 0 if not found

DESIGN

Algorithm

1. Compute the subscript of the middle element

2. IF the slice has zero length THEN

3. Return a result of 0

ELSIF the target is the middle value THEN

4. Return the subscript of the middle element

ELSIF the target is less than the middle value THEN

5. Search the subarray with subscripts Slice'First..Middle-1

ELSE

6. Search the subarray with subscripts Middle+1..Slice'Last

END IF;

In each of the recursive steps (steps 5 and 6), the bounds of the slice are listed as a part of the actual table parameter in the recursive call. The actual parameters define the search limits for the next probe into the array.

TEST PLAN

You should test the binary search function very carefully. Besides verifying that it locates target values that are present in the array, verify that they also determine when a target value is missing. Use target values that are within the range of values stored in the array, a target value that is less than the smallest value in the array, and a target value that is greater than the largest value in the array. Make sure that the binary search function terminates regardless of whether the target is missing or where it is located if it is not missing.

IMPLEMENTATION

In the initial call to the recursive procedure, the entire array is normally given. For example, given the following declarations:

    SUBTYPE KeyType IS Integer;
    TYPE SearchArray IS ARRAY(Positive RANGE <>) OF KeyType;

    Test: SearchArray(1..9);
    Location: Natural;

the procedure call statement

    Location := BinarySearch (Test, 35);

could be used to search an array Test for the target key 35. Function BinarySearch is shown in Program 13.11.

The assignment statement

    Middle := (Slice'First + Slice'Last) / 2;

computes the subscript of the middle element by finding the average of Slice'First and Slice'Last.

Program 13.11
A Test of Recursive Binary Search

WITH Ada.Text_IO;
WITH Ada.Integer_Text_IO;
PROCEDURE Test_Binary_Search IS
------------------------------------------------------------------------
--| Test program for Recursive Binary Search
--| Author: Michael B. Feldman, The George Washington University 
--| Last Modified: July 1995                                     
------------------------------------------------------------------------

  TYPE SearchArray IS ARRAY(Positive RANGE <>) OF Integer;
  Test: SearchArray(1..9);

  FUNCTION BinarySearch 
    (Slice: SearchArray; Target: Integer) RETURN Natural IS
  -- Performs a recursive binary search of an ordered array of 
  -- keys with bounds Slice'First..Slice'Last.
  -- Pre : Target and Slice are defined.
  --     0 < Slice'First <= Slice'Last
  -- Post: Returns the subscript of Target if found in array Slice;
  --     otherwise, returns 0
  
    Middle : Natural;      -- the subscript of the middle element   

  BEGIN -- BinarySearch   

    Middle := (Slice'First + Slice'Last) / 2; -- define Middle   

    -- Determine if Target is found or missing or redefine subarray.   

    IF Slice'Length = 0 THEN
      RETURN 0;                         -- stopping case: Target missing
    ELSIF Slice(Middle) = Target THEN
      RETURN Middle;                    -- stopping case: Target found
    ELSIF Slice(Middle) > Target THEN   -- search lower subarray   
      RETURN BinarySearch (Slice(Slice'First..Middle-1),Target);
    ELSE                                  -- search upper subarray
      RETURN BinarySearch (Slice(Middle+1..Slice'Last),Target);  
    END IF;

  END BinarySearch;

BEGIN -- Test_Binary_Search

  Test := (20,35,37,40,45,50,51,55,67);

  Ada.Text_IO.Put(Item => "BinarySearch(Test,35) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,35));
  Ada.Text_IO.New_Line;
  Ada.Text_IO.Put(Item => "BinarySearch(Test,19) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,19));
  Ada.Text_IO.New_Line;
  Ada.Text_IO.Put(Item => "BinarySearch(Test,75) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,75));
  Ada.Text_IO.New_Line;
  Ada.Text_IO.Put(Item => "BinarySearch(Test,20) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,20));
  Ada.Text_IO.New_Line;
  Ada.Text_IO.Put(Item => "BinarySearch(Test,67) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,67));
  Ada.Text_IO.New_Line;
  Ada.Text_IO.Put(Item => "BinarySearch(Test,54) is");
  Ada.Integer_Text_IO.Put(Item => BinarySearch(Test,54));
  Ada.Text_IO.New_Line;

END Test_Binary_Search;

Sample Run

BinarySearch(Test,35) is          2
BinarySearch(Test,19) is          0
BinarySearch(Test,75) is          0
BinarySearch(Test,20) is          1
BinarySearch(Test,67) is          9
BinarySearch(Test,54) is          0

The binary search function is written to return 0 if the target is not found. This works only because we have required that the array bounds be positive, because otherwise 0 could be a valid subscript. Binary search would be more general if it could accept arrays with arbitrary integer bounds; in that case, it would be better to convert the binary search function to a procedure with two OUT parameters: the index of the target if found and a program flag indicating whether the target was found. This modification is left as an exercise.