Data Management: Sorts, Lists, and Indexes |
As you can see in Figure 10.2, the program begins by opening the two input files and the output file. If no errors occur while the files are being opened, the program reads a record from each input file, provided that a new record is needed from that file and the file has not reached its end.
After the records are read, the program compares them (taking into consideration possible end-of-file conditions), then writes the record that comes first in sort order. When the program has the same record from both files, it discards the second file’s record, sets the flag indicating that it needs a new record from the second file, and keeps the first file’s record.
When the program reaches the end of both input files, it closes all the files and ends. It is a simple program that works quickly.
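The merge logic described above can be sketched roughly as follows. This is a minimal illustration, not the program shown in Figure 10.2: it assumes that each record is one line of text no longer than MAX_RECORD characters, that both inputs are already sorted in strcmp() order, and that the caller has opened the files. The names merge_purge(), read_record(), and MAX_RECORD are hypothetical and were chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORD 256   /* assumed maximum record (line) length */

/* Read one record (line) into buf. Returns 1 on success, 0 at end of file. */
static int read_record(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';      /* strip the trailing newline */
    return 1;
}

/* Merge two sorted inputs into out. A record found in both inputs is
   written once. Returns the number of records written. */
long merge_purge(FILE *in1, FILE *in2, FILE *out)
{
    char rec1[MAX_RECORD], rec2[MAX_RECORD];
    int need1 = 1, need2 = 1;   /* flags: a new record is needed */
    int eof1 = 0, eof2 = 0;     /* flags: end of file was reached */
    long written = 0;

    assert(in1 != NULL && in2 != NULL && out != NULL);

    for (;;) {
        /* Read only when a record is needed and the file has not ended. */
        if (need1 && !eof1) {
            eof1 = !read_record(in1, rec1, sizeof rec1);
            need1 = 0;
        }
        if (need2 && !eof2) {
            eof2 = !read_record(in2, rec2, sizeof rec2);
            need2 = 0;
        }
        if (eof1 && eof2)
            break;

        if (eof2) {                       /* only the first file remains */
            fprintf(out, "%s\n", rec1);
            written++;
            need1 = 1;
        } else if (eof1) {                /* only the second file remains */
            fprintf(out, "%s\n", rec2);
            written++;
            need2 = 1;
        } else {
            int cmp = strcmp(rec1, rec2);
            if (cmp < 0) {                /* first file's record sorts first */
                fprintf(out, "%s\n", rec1);
                written++;
                need1 = 1;
            } else if (cmp > 0) {         /* second file's record sorts first */
                fprintf(out, "%s\n", rec2);
                written++;
                need2 = 1;
            } else {
                /* Same record in both files: discard the second file's
                   copy and keep the first file's record for later. */
                need2 = 1;
            }
        }
    }
    return written;
}
```

A caller would open the three files with fopen(), check each result for NULL, call merge_purge(), and then close all three files. Note that this sketch removes only records that appear in both files; a record repeated within the first file alone is written each time it appears.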
When you write a purge function, remember that a record might be repeated many times. When your program finds a duplicate and therefore reads a new record, it still must test to be sure that it has read a unique record. The program might be reading a third duplicate, for example, that must also be discarded.
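One way to handle repeated duplicates is to test every record against the last record written, rather than skipping a single record when a duplicate is found. The following is a minimal sketch under some assumptions: the input is already sorted, each record is one line of text no longer than MAX_RECORD characters, and the name purge_duplicates() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORD 256   /* assumed maximum record (line) length */

/* Copy a sorted input to out, writing each distinct record once.
   Returns the number of records written. */
long purge_duplicates(FILE *in, FILE *out)
{
    char current[MAX_RECORD];
    char previous[MAX_RECORD] = "";
    int have_previous = 0;      /* nothing has been written yet */
    long written = 0;

    assert(in != NULL && out != NULL);

    while (fgets(current, (int)sizeof current, in) != NULL) {
        current[strcspn(current, "\n")] = '\0';   /* strip the newline */

        /* Every record read is tested, so a third or fourth copy
           of the same record is discarded as well. */
        if (have_previous && strcmp(current, previous) == 0)
            continue;

        fprintf(out, "%s\n", current);
        strcpy(previous, current);   /* remember the last record written */
        have_previous = 1;
        written++;
    }
    return written;
}
```

Because the comparison sits inside the read loop, the function keeps discarding records until it reads one that differs from the last record written, no matter how many copies there are.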
Sorting, Merging, and Purging All in One
Usually, a single utility offers sort, merge, and purge functions. This type of utility accepts one or two input filenames, sorts the files, purges the duplicates, and writes a single output file.
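For records that fit in memory, the sort and purge steps can be combined in a few lines. The sketch below is only an illustration and assumes that the records have already been read into an array of string pointers (records from a second input file would simply be added to the same array before the call). It uses the standard qsort() function; the names sort_and_purge() and compare_records() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparison callback for an array of string pointers. */
static int compare_records(const void *a, const void *b)
{
    const char *const *ra = a;
    const char *const *rb = b;
    return strcmp(*ra, *rb);
}

/* Sort the array, then purge duplicates in place.
   Returns the number of distinct records left at the front of the array. */
size_t sort_and_purge(const char *records[], size_t count)
{
    size_t kept;
    size_t i;

    assert(records != NULL);

    if (count == 0)
        return 0;

    qsort(records, count, sizeof records[0], compare_records);

    /* After sorting, duplicates are adjacent, so each record needs to be
       compared only with the last record that was kept. */
    kept = 1;
    for (i = 1; i < count; i++) {
        if (strcmp(records[i], records[kept - 1]) != 0)
            records[kept++] = records[i];
    }
    return kept;
}
```

The caller would then write the first kept records to the output file. Sorting first is what makes the purge simple: once the records are in order, all copies of a record sit next to each other.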
A variation of a sort program is a sort that works on a file of any size. The process to create the ultimate sort follows:
1. Read the file, stopping at the end of the file or when there is no more free memory.
2. Sort this part of the file. Write the result of the sort to a temporary work file.
3. If the program has reached the end of the file and there are no more records to read in, the program renames step 2’s work file to the output file’s name and ends the program.
4. Again read the file, stopping when there is no more free memory or when the end of the file is reached.