Radix Sort

Problem Description

The Radix Sort is a special-purpose sort that operates in linear time, O(N), faster than any of the advanced general purpose sorts such as QuickSort which are O(NlogN). The restriction is that the key field for the Radix Sort must be of a discrete type such as Integer or String. It won't work on real numbers. (Consider most keys are Strings or Integers, this restriction is not overly constraining).

Two attributes of the key are of importance to the Radix Sort. First, the is the key length, or the number of positions in the key. A string is composed of discrete character positions, so the key length is simply the string length. An integer is considered as composed of discrete digit positions, so the key length is the number of digits in the key. The second attribute is the radix, which is the number of possible values that can be assigned to a single position in the key.

For example, if the key is a decimal integer, each position is a digit and has ten possibilities: 0..9. If the key is a string of letters and case is not important, then each position has 26 possibilities: 'a'..'z'.

In a nutshell, radix sort divides the items to be sorted into as many subgroups as there are possible values for each position in the key (radix subgroups). Then the subgroups are combined back into a single list and the process is repeated. Beginning with the least-significant position in the key, regroup the values in order, and repeat the process as many times as there are positions in the key, moving one position to the left each time.

Example

Lets look at an example of Radix Sort where we use 3-digit integers for keys. Here is our initial unsorted list:
813  516  223  462  723  561  416  834  616  323
Since the key length is three, Radix Sort is going to make 3 passes through this list. One pass for each of the 3 digits.

On the first pass through the list, the third (or least significant) digit is selected for sorting.

Taking keys one at a time from the list, they are distributed into radix subgroups, or "bins."

bin0
bin1 561
bin2 462
bin3 813 223 723 323 
bin4 834
bin5
bin6 516 416 616
bin7
bin8
bin9
Next the items are taken out of the bins in orderfrom 0 to 9 and recombined into a single list.
561  462  813  223  723  323  834  516  416  616
On the second pass, the middle digit is selected, and the keys are distributed into bins based on the value of the middle digit.
bin0
bin1 813 516 416 616 
bin2 223 723 323
bin3 834
bin4 
bin5
bin6 561 462 
bin7
bin8
bin9
The items are then combined back into a single list:
813  516  416  616  223  723  323  834  561  462
On the third and final pass, the first digit (most significant) is selected for sorting, and the keys are distributed based on the value of their first digit.
bin0
bin1
bin2 223
bin3 323
bin4 416 462
bin5 516 561
bin6 616
bin7 723
bin8 813 834
bin9
After reassembling the bins into the final list, we see the items are sorted.
223  323  416  462  516  561  616  723  813  834
 
Historical note: This algorithm was implemented in a sorting machine by IBM in the 1920's. (See photo).

Problem Requirements

Write a reusable sort procedure that implements the Radix Sort. The parameters to the sort procedure are You may assume the items in the array are strings and that they are all the same length.  You may assume that the parameters are valid.

"Reusable" means that it is a stand-alone procedure that can be invoked from a separate module.

Create a test driver that reads a list of character strings, one per line, from standard input. The command line should contain the key length and the radix lower and upper bounds, in that order.

For example (in Java):
java RadixDriver 3 0 9
The driver should ignore lines that do not contain the number of characters specified in key length. The driver should ignore lines that contain characters outside the radix range. Have the data sorted by the Radix Sort module, then output the results, one per line, to standard output.

Implementation Guidelines

An array of queues (or stacks) is suggested. There are MANY possible ways to implement this. Notice that this sort makes NO comparisons. What is required, however, is an "extraction" operation, which can extract the value at a given position from the key in order to determine the bin number. The extraction must return a value that is within the radix range.

Note: Please design your own solution. While this algorithm is easy to find on the Web or in textbooks, please write your own algorithm from scratch.

Testing

You need to provide four test cases that demonstrate that your program can
  1. Sort strings of length three containing uppercase alphabetic characters.  (Be sure to include 'A' and 'Z'.)
  2. Sort whole numbers containing 5 digits. (Be sure to include '0' and '9'.)
  3. Sort this test data file provided by the instructor.
  4. Ignore lines that do not contain the number of characters specified in key length (too short and too long), or that contain characters outside the radix range.
Submit printouts of your source code, sample input data and resulting output.




Home