In this section, we will discuss two common problems in processing arrays:
searching an array to determine the location of a desired value and sorting an
array to rearrange the array elements in sequence. As an example of an array
search, we might wish to search the array of exam scores read in by procedure
ReadScores
(see Section 8.8,
Program
8.12) to determine which student, if any, got a particular score. An
example of an array sort would be rearranging the array elements so that they
are in increasing (or decreasing) order by score.
MaxSize : CONSTANT Positive := 250; MaxScore : CONSTANT Positive := 100; SUBTYPE ClassIndex IS Positive RANGE 1..MaxSize; SUBTYPE ClassRange IS Natural RANGE 0..MaxSize; SUBTYPE ScoreRange IS Natural RANGE 0..MaxScore; TYPE ScoreArray IS ARRAY (ClassIndex) OF ScoreRange; Scores : ScoreArray; ClassSize : ClassRange;We can search an array for a particular score (called the search target) by examining each array element and testing to see whether it matches the target score. If a match occurs, we have found the target and can return its subscript as the search result. If a match does not occur, we should continue searching until either we get a match or we test all array elements. The input data requirements for a search function are:
Scores : ScoreArray -- the array to be searched ClassSize : ClassRange -- the number of elements in Scores Target : ScoreRange -- the score being searched forThe function result is the subscript of the first element containing
Target
, or zero if Target
was not found. The
algorithm for searching the array is:
Algorithm for Search
1. Start with the first array element.
2. WHILE
the current element does not match the target
AND
the current element is not the last element
LOOP
3. Advance to the next element.
END LOOP;
4. IF
the current element matches the target THEN
5. Return its subscript.
ELSE
6. Return zero.
END IF;
The WHILE
loop in step 2 compares each array element to the
target. Loop exit occurs if there is a match or if the element being tested is
the last element. After loop exit, the IF
statement defines the
function result by repeating the last comparison.
Program
8.14 shows function Search
. The function uses a local variable
CurrentScore
to control the WHILE
loop.
Program 8.14
FUNCTION Search (Scores: ScoreArray; ClassSize: ClassRange; Target: ScoreRange) RETURN ClassRange IS -- Searches for Target in array Scores -- Pre : ClassSize and subarray Scores(1..ClassSize) are defined -- Post: Returns the subscript of Target if found; -- otherwise, returns 0 CurrentScore: ClassIndex; -- array subscript BEGIN -- Search -- Compare each value in Scores to Target until done CurrentScore := 1; -- Start with the first record WHILE (Scores(CurrentScore) /= Target) AND (CurrentScore <= ClassSize) LOOP -- invariant: -- CurrentScore <= ClassSize + 1 and -- no prior array element was Target CurrentScore := CurrentScore + 1; -- advance to next score END LOOP; -- assertion: Target is found or last element is reached. -- Define the function result. IF Scores(CurrentScore) = Target THEN RETURN CurrentScore; ELSE RETURN 0; END IF; END Search;The
WHILE
loop condition compares the array element selected by
CurrentScore
to the target. If they are equal, loop exit occurs.
If they are unequal and the last element has not been reached,
CurrentScore
advances to the next array element.
Loop exit occurs when the target is found or the last element is reached.
After loop exit, the IF
statement returns the subscript
(CurrentScore
) of the current element if Target
was
found; otherwise, it returns zero.
Sorting an Array
In Section 6.6 we discussed a simple sort operation involving three numbers ( Program 6.9). We performed the sort by examining pairs of numbers and exchanging them if they were out of order. There are many times when we would like to sort the elements in an array, for example, to display a grade report in alphabetical order or in order by score.
This section discusses a fairly intuitive (but not very fast) algorithm
called the selection sort. To perform a selection sort of an array with
N
elements (subscripts 1..N
), we locate the largest
element in the array and then switch the largest element with the element at
subscript N
, thereby placing the largest element at position
N
. Then we locate the largest element remaining in the subarray
with subscripts 1..N-1
, and switch it with the element at
subscript N-1
, thereby placing the second largest element at
position N-1
. Then we locate the largest element remaining in
subarray 1..N-2
and switch it with the element at subscript
N-2
, and so on.
Figure
8.11 traces the operation of the selection sort algorithm. The diagram on
the left shows the original array. Each subsequent diagram shows the array
after the next largest element is moved to its final position in the array. The
subarray in the darker color represents the portion of the array that is sorted
after each exchange occurs. Note that it will require, at most,
N-1
exchanges to place the largest element of an array with
N
elements. The algorithm follows.
Figure 8.11 Trace of Selection Sort
Selection Sort Algorithm
1. FOR PositionToFill IN 1..N-1 LOOP
2. Find the largest element in subarray PositionToFill
+1..N.
3. IF
the largest element
is not at subscript PositionToFill
THEN
Switch the largest element with the one at subscript PositionToFill
.
END IF;
END LOOP;
The refinement of step 2 also contains a FOR
loop and is shown
next.
Step 2 Refinement
2.1. Save PositionToFill
as the position of the largest so far in
the subarray
2.2. FOR ItemToCompare IN PositionToFill+1..N LOOP
2.3. IF
the element at
ItemToCompare
is bigger than largest so far
THEN
Save ItemToCompare
as the position of the largest so far.
END IF;
END LOOP;
Procedure SelectSort
in
Program
8.15 implements the selection sort algorithm.
Program 8.15
PROCEDURE SelectSort (Scores: IN OUT ScoreArray; ClassSize: IN ClassRange) IS -- Pre: Scores and ClassSize are defined -- Post: The first ClassSize elements of Scores -- are sorted in descending order IndexOfMax: ClassRange; BEGIN FOR PositionToFill IN 1..ClassSize - 1 LOOP -- Find the element in subarray PositionToFill..ClassSize -- with largest Score IndexOfMax := PositionToFill; FOR ItemToCompare IN PositionToFill .. ClassSize LOOP IF Scores(ItemToCompare) > Scores(IndexOfMax) THEN IndexOfMax := ItemToCompare; END IF; END LOOP; -- assert: element at IndexOfMax is largest in subarray IF IndexOfMax /= PositionToFill THEN Exchange(Scores(PositionToFill),Scores(IndexOfMax)); END IF; END LOOP; END SelectSort;Local variable
IndexOfMax
holds the location of the largest exam score
found so far in the current subarray. After each execution of the inner
FOR
loop, procedure Switch
is called to exchange the
elements with subscripts IndexOfMax
and
PositionToFill
, provided that the element at
PositionToFill
is not the next largest element. After the
execution of SelectSort
, the elements of the array
Scores
are in decreasing order.
There are many algorithms for searching and sorting arrays. Because arrays can have a very large number of elements, the time required to process all the elements of an array can become significant, so it is important to have some idea of the relative efficiency of different algorithms. It is difficult to get a precise measure of the performance of an algorithm or program. For this reason, we normally try to approximate the effect on an algorithm of a change in the number of items, N, that it processes. In this way, we can see how an algorithm's execution time increases with N, so we can compare two algorithms by examining their growth rates.
For example, if we determine that the expression
2N2 + N - 5
expresses the relationship between processing time and N, we say that the algorithm is an O(N2) algorithm where O is an abbreviation for Order of Magnitude. (This notation is called Big-O Notation.) The reason that this is an O(N2) algorithm instead of an O(2N2) algorithm or an O(N2 + N - 5) is that we are interested in only the fastest-growing term (the one with the largest exponent) and we ignore constants.
To search an array of N elements for a target, we have to examine all N elements when the target is not present in the array. If the target is in the array, we only have to search until we find it. However, it could be anywhere in the array, and it is equally likely to be at the beginning of the array as at the end. So on average, we have to examine N/2 array elements to locate a target value that is in an array. This means that an array search is an O(N) process, so the growth rate is linear.
To determine the growth rate of a sorting algorithm, we normally focus on the number of array element comparisons that it requires. To perform a selection sort on an array with N elements requires N - 1 comparisons during the first pass through the array, N - 2 comparisons during the second pass, and so on. Therefore, the total number of comparisons is represented by the series
1 + 2 + 3 + ... + (N - 2) + (N - 1)
The value of this series is expressed in closed form as
N × ( N - 1) N2 N ------------- = --- - - 2 2 2
Therefore, selection sort is an O(N2) process and the growth rate is quadratic (proportional to the square of the number of elements).
What difference does it make whether an algorithm is an O(N) process
or an O(N2) process? Table 8.5 evaluates N and
N2 for different values of N. A doubling of N causes
N2 to increase by a factor of 4. Since N increases much more
slowly with N, the performance of an O(N) algorithm is not as
adversely affected by an increase in N as is an O(N2)
algorithm.
Table 8.5
Table of Values of N and N2
N N2
2 4 4 16 8 64 16 256 32 1024 64 4096 128 16384 256 65536 512 262144
Since in this book we will be using relatively small arrays in our programming, algorithm performance is not a major concern. Analyzing the performance of algorithms is, however, an important subject about which you will study a great deal as you progress in your education, because knowing how to compute the expected Big O of an algorithm can, for large arrays and other data structures, make the difference between writing a program whose running time is acceptable and one that may run for weeks or months.
We begin with a set of declarations similar to the ones in the
previous section. The new declarations define a subtype
StudentName
as a string of (exactly) 20 characters, a type
ScoreRecord
as a record containing a student's name and a test
score, and a type ScoreArray
as an array of these records.
MaxSize : CONSTANT Positive := 250; MaxScore : CONSTANT Positive := 100; SUBTYPE StudentName IS String(1..20); SUBTYPE ClassIndex IS Positive RANGE 1..MaxSize; SUBTYPE ClassRange IS Natural RANGE 0..MaxSize; SUBTYPE ScoreRange IS Natural RANGE 0..MaxScore; TYPE ScoreRecord IS RECORD Name: StudentName; Score: ScoreRange; END RECORD; TYPE ScoreArray IS ARRAY (ClassIndex) OF ScoreRecord; Scores : ScoreArray; ClassSize : ClassRange;
These declarations mean that each element of
Scores
is a record with two fields. We can store values in an
element of the array by combining subscripting with field selection:
Scores(27).Name := "Jones, Mary "; Scores(27).Score := 79;
Note that the string representing the name must be exactly 20 characters long, and therefore we have included the extra blanks as required. Figure 8.12 shows a diagram of this array structure, with the first three records occupied.
Figure 8.12
An Array of Records
In fact, arrays of records are more often filled by reading fields
from an external file.
Program
8.16 is a revised version of
Program
8.15. The procedure GetRecords
assumes that a file
F
of type Ada.Text_IO.File_Type
has been created,
either by another Ada program or by a human using a text editor. In this file,
each line consists of a student name and a score. In creating such a file, one
must be careful to provide exactly 20 characters in the student name. Assuming
that the person who created this file was careful and that all the data in the
file is valid, the two fields in each student record are read from the file by
Ada.Text_IO.Get(File => F, Item => Scores(WhichStudent).Name); Ada.Integer_Text_IO.Get(File => F, Item => Scores(WhichStudent).Score);
This procedure is similar to
Program
5.5 in its use of file operations from Ada.Text_IO
.
Given the SelectSort
procedure from
Program
8.15, sorting the file of records is easy. Instead of using the "whole"
array element as the key, or basis of comparison, the score field of each array
element is used. In the sort procedure itself, only a single line needs to be
modified: the line which compares two array elements needs to be changed to
IF Scores(ItemToCompare).Score > Scores(IndexOfMax).Score THEN
After the execution of procedure SelectSort
, the
student records will be ordered by exam score (record with smallest score
first).
The revised version of SelectSort
appears as a procedure in
Program
8.16, with the necessary changes made to accommodate the fact that we are
using a file of score records instead of an array of scores.
Program 8.16
WITH Ada.Text_IO; WITH Ada.Integer_Text_IO; WITH Simple_Dates; PROCEDURE Sort_Score_File IS ------------------------------------------------------------------------ --| Sorts and displays an array of test score records --| The records are read from a file scorfile.dat --| Author: Michael B. Feldman, The George Washington University --| Last Modified: July 1995 ------------------------------------------------------------------------ MaxSize : CONSTANT Positive := 250; MaxScore : CONSTANT Positive := 100; SUBTYPE StudentName IS String(1..20); SUBTYPE ClassIndex IS Positive RANGE 1..MaxSize; SUBTYPE ClassRange IS Natural RANGE 0..MaxSize; SUBTYPE ScoreRange IS Natural RANGE 0..MaxScore; TYPE ScoreRecord IS RECORD Name: StudentName; Score: ScoreRange; END RECORD; TYPE ScoreArray IS ARRAY (ClassIndex) OF ScoreRecord; Scores : ScoreArray; ClassSize : ClassRange; PROCEDURE GetRecords (Scores: OUT ScoreArray; ClassSize: OUT ClassRange) IS -- Pre: None -- Post: Scores contains records read from file; ClassSize -- indicates the number of records read TestScores: Ada.Text_IO.File_Type; -- variable naming the input file TempSize: ClassRange; TempRecord: ScoreRecord; BEGIN -- GetRecords -- Open the file and associate it with the file variable name Ada.Text_IO.Open (File => TestScores, Mode => Ada.Text_IO.In_File, Name => "scorfile.dat"); -- Read each data item -- and store it in the appropriate element of Scores TempSize := 0; -- initial class size -- Read each array element until done. WHILE (NOT Ada.Text_IO.End_Of_File(TestScores)) AND (TempSize < MaxSize) LOOP -- invariant: -- Records remain in the file and -- TempSize <= MaxSize Ada.Text_IO.Get(File => TestScores, Item => TempRecord.Name); Ada.Integer_Text_IO.Get (File => TestScores, Item => TempRecord.Score); TempSize := TempSize + 1; Scores(TempSize) := TempRecord; -- Save the score END LOOP; -- assert: -- End of file reached or -- TempSize is MaxSize IF TempSize = MaxSize THEN Ada.Text_IO.Put(Item => "Array is filled."); Ada.Text_IO.New_Line; END IF; ClassSize := TempSize; END GetRecords; PROCEDURE Exchange(Student1, Student2: IN OUT ScoreRecord) IS -- Pre: Student1 and Student2 are defined -- Post: Student1 and Student2 are interchanged TempRecord: ScoreRecord; BEGIN TempRecord := Student1; Student1 := Student2; Student2 := TempRecord; END Exchange; PROCEDURE SelectSort (Scores: IN OUT ScoreArray; ClassSize: IN ClassRange) IS -- Pre: Scores and ClassSize are defined -- Post: The records in the first ClassSize elements of Scores -- are sorted in descending order by score IndexOfMax: ClassRange; BEGIN FOR PositionToFill IN 1..ClassSize - 1 LOOP -- Find the element in subarray 1..PositionToFill -- with largest Score IndexOfMax := PositionToFill; FOR ItemToCompare IN PositionToFill + 1 .. ClassSize LOOP IF Scores(ItemToCompare).Score > Scores(IndexOfMax).Score THEN IndexOfMax := ItemToCompare; END IF; END LOOP; -- assert: element at IndexOfMax is largest in subarry IF IndexOfMax /= PositionToFill THEN Exchange(Scores(PositionToFill),Scores(IndexOfMax)); END IF; END LOOP; END SelectSort; PROCEDURE DisplayScores (Scores: IN ScoreArray; ClassSize: IN ClassRange) IS -- Pre: Scores and ClassSize are defined -- Post: dislays the first ClassSize records in Scores BEGIN FOR I IN 1..ClassSize LOOP Ada.Integer_Text_IO.Put(Item => I, Width => 3); Ada.Text_IO.Put(Item => " "); Ada.Text_IO.Put(Item => Scores(I).Name); Ada.Integer_Text_IO.Put(Item => Scores(I).Score, Width => 4); Ada.Text_IO.New_Line; END LOOP; END DisplayScores; BEGIN -- Sort_Score_File Ada.Text_IO.Put(Item => "Today is "); Simple_Dates.Put(Item => Simple_Dates.Today); Ada.Text_IO.New_Line; GetRecords(Scores => Scores, ClassSize => ClassSize); Ada.Text_IO.Put(Item => "Original Test File:"); Ada.Text_IO.New_Line; Ada.Text_IO.New_Line; DisplayScores(Scores => Scores, ClassSize => ClassSize); SelectSort(Scores => Scores, ClassSize => ClassSize); Ada.Text_IO.New_Line; Ada.Text_IO.Put(Item => "Sorted Test File:"); Ada.Text_IO.New_Line; Ada.Text_IO.New_Line; DisplayScores(Scores => Scores, ClassSize => ClassSize); Ada.Text_IO.New_Line; END Sort_Score_File;Sample Run
Today is AUG 24 1995 Original Test File: 1 Jones, Mary 75 2 Hubbard, Kathy 99 3 Andersen, Lars 80 4 Gribben, George 21 5 Rogers, Roy 34 6 Evans, Dale 76 7 VanDoren, Charles 100 Sorted Test File: 1 VanDoren, Charles 100 2 Hubbard, Kathy 99 3 Andersen, Lars 80 4 Evans, Dale 76 5 Jones, Mary 75 6 Rogers, Roy 34 7 Gribben, George 21
Search
if the last student score
matches the target? Why can't we use the condition CurScore <=
ClassSize
in the WHILE
condition?
Found
, that is initially False
and is set to
True
inside a search loop if the target value is found. Loop
repetition continues as long as Found
is still False
and all elements have not been tested. After loop exit, the value of
Found
determines whether the current subscript or zero is returned
as the function result. Write the procedure body.
10 55 34 56 76 5
Class
by student name instead of score?
Scores(1..i-1)
, explain why the program will be a bit faster if we
start the search with the last element (i.e., IndexOfMax := i-1
)
rather than the first element.
SelectSort
and ReadScores
,
and function Search
, into a program to read a set of scores from
the terminal, sort the array, and then determine whether any student got a
score of 75 on the test.
Copyright © 1996 by Addison-Wesley Publishing Company, Inc.