
Section 26.14 Non-Comparison Sorts

Subsection 26.14.1 Comparison Sorts

Is it a coincidence that quick sort, merge sort, and other efficient sorts all have a time complexity of \(O(n \cdot \log n)\text{?}\) It is not. We will not attempt to prove it here, but it can be proven that any sorting algorithm that compares the values of elements to decide “which comes first?” must have a time complexity of at least \(O(n \cdot \log n)\) in the general case.
Those algorithms are known as comparison sorts. Comparison sorts are the general purpose sorting algorithms. If we need to take \(n\) possibly unique values that we do not know in advance and put them in order, we know that a comparison sort can do so.
However, we may be able to put items in some kind of order without comparing them directly. Algorithms that do this are called non-comparison sorts. Non-comparison sorts can achieve better than \(O(n \cdot \log n)\) time complexity. However, they do so by using additional assumptions or requirements about the input data. Thus they are not as general purpose as comparison sorts.

Subsection 26.14.2 Bucket Sort

One example of a non-comparison sort is bucket sort, which distributes elements into a number of buckets. In a bucket sort, we know that there are \(k\) different β€œbuckets” and that each item belongs in one of the buckets.
For example, say we have a list of students. Among other things, each student has a name and a class (Freshman, Sophomore, Junior, or Senior). We want to sort the list by class, putting Freshmen first, Sophomores second, Juniors third, and Seniors last. Within each class we don’t care about the order (or we just want to leave those items in their existing order).
An efficient implementation of the process would:
  • Create four counters, one for each class.
  • Walk through the list of students, incrementing the counter for the appropriate class for each student.
  • Use the counters to determine the starting index for each class in the sorted list. If there are 3 students in the first class, we know that the second class starts at index 3. If there are 2 more students in the second class, the third class starts at index 5 (3 + 2), and so on.
  • Place each student into the correct position in the sorted list based on their class. As we place each student, we increment the starting index for that class so the next student of the same class goes to the next position.
That is what this interactive does:

Activity 26.14.1. Bucket Sort Demonstration.

Note 26.14.1.

We also could simply make 4 new lists, and copy the students into the appropriate list for each class. Then, make a new empty list (or clear the original one) and copy the individual lists one at a time back into the new full list. This would likely be less efficient as it needs to copy each item multiple times.
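Before analyzing the running time, here is a minimal Python sketch of the counting approach described above. The student names are made up for illustration, and it assumes each student is represented as a (name, class) pair.

CLASSES = ["Freshman", "Sophomore", "Junior", "Senior"]

def bucket_sort_by_class(students):
    # One counter per class, initialized to 0.
    counts = {c: 0 for c in CLASSES}

    # Count how many students fall into each class.
    for name, class_level in students:
        counts[class_level] += 1

    # Turn the counts into starting indexes: the first class starts at 0,
    # and each later class starts where the previous one ends.
    start = {}
    index = 0
    for c in CLASSES:
        start[c] = index
        index += counts[c]

    # Place each student at the next open slot for its class, then
    # advance that class's starting index by one.
    result = [None] * len(students)
    for student in students:
        class_level = student[1]
        result[start[class_level]] = student
        start[class_level] += 1
    return result

students = [("Avery", "Junior"), ("Blake", "Freshman"),
            ("Casey", "Senior"), ("Drew", "Freshman"),
            ("Emery", "Sophomore")]
print(bucket_sort_by_class(students))
# [('Blake', 'Freshman'), ('Drew', 'Freshman'), ('Emery', 'Sophomore'),
#  ('Avery', 'Junior'), ('Casey', 'Senior')]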
How long does this take? Let’s analyze it step by step. \(n\) will be the number of students and \(k\) will be the number of class levels (which we know is 4, but we will keep it general).
  • Create four counters, one for each class. We need to make and initialize a list of \(k\) counters, which takes \(O(k)\) time.
  • Walk through the list of students, incrementing the counter for the appropriate class for each student. For \(n\) items, we do some simple constant work. \(O(n)\)
  • Use the counters to determine the starting index for each class in the sorted list. This involves a loop over the \(k\) counters. \(O(k)\)
  • Place each student into the correct position in the sorted list based on their class. \(n\) times, we need to copy a student and update a counter. Those operations each take constant time, so the total is \(O(n)\text{.}\)
Adding these up we get:
\begin{equation*} O(k) + O(n) + O(k) + O(n) = O(n + k) \end{equation*}
When \(k\) is a known constant and is smaller than \(n\) (like 4 in this example), the time complexity simplifies to \(O(n)\text{.}\) For example, placing 1,000 students into 4 classes takes work proportional to \(1000 + 4\text{,}\) which is essentially just \(n\text{.}\)
However, now imagine we want to perfectly sort 1,000 32-bit integers using bucket sort. There are \(2^{32}\) different possible values, so we would need that many buckets to sort perfectly in one pass. Now, \(O(n + k)\) becomes \(O(1000 + 2^{32})\text{,}\) which is effectively \(O(2^{32})\text{,}\) or about 4 billion units of work. This is far worse than \(O(n \cdot \log n)\text{.}\)
Thus, non-comparison sorts like bucket sort can be very efficient when the number of buckets is small relative to the number of items being sorted. But they are not general purpose tools.

Subsection 26.14.3 Radix Sort

Another non-comparison sort is radix sort, which can be used to sort data that can be represented as sequences of digits or characters.
The general strategy of radix sort is to sort the data one digit or character at a time using something like a bucket sort. Given this list of strings:
CAT
BAT
RAG
ACE
CAB
BAD
RAD
We could sort them using radix sort by first sorting on the last character, then the middle character, then the first character. After sorting on the last character we would have:
CAB
BAD
RAD
ACE
RAG
CAT
BAT
Note that the last letters are in alphabetical order. Then we do a stable sort on the middle character. Because the sort is stable, items that tie on the middle character keep their existing order by last character. This means that after this pass, the strings are in order by their last two letters:
CAB
BAD
RAD
RAG
CAT
BAT
ACE
Finally, we do a stable sort on the first character to get the fully sorted list:
ACE
BAD
BAT
CAB
CAT
RAD
RAG
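Here is a minimal Python sketch of this process, assuming all strings have the same length and use only the uppercase letters A through Z (both are true of the example above).

def radix_sort(strings, length):
    # Sort on the last character first, then work toward the first.
    for pos in range(length - 1, -1, -1):
        # One bucket per possible letter. Appending in traversal order
        # makes each pass stable, so the order produced by earlier
        # (rightmost) passes is preserved among ties.
        buckets = {chr(c): [] for c in range(ord("A"), ord("Z") + 1)}
        for s in strings:
            buckets[s[pos]].append(s)
        # Concatenate the buckets back into one list, in letter order.
        strings = [s for letter in sorted(buckets) for s in buckets[letter]]
    return strings

words = ["CAT", "BAT", "RAG", "ACE", "CAB", "BAD", "RAD"]
print(radix_sort(words, 3))
# ['ACE', 'BAD', 'BAT', 'CAB', 'CAT', 'RAD', 'RAG']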
Radix sorting takes \(O(d \times (n + k))\) where \(d\) is the number of digits, \(n\) is the number of items, and \(k\) is the range of each digit. If the number of digits is small relative to the number of items and the range of each digit is small, radix sort can be very efficient.
In general, the number of digits will only be small relative to the number of values when there are a large number of duplicate values. The number of digits needed to represent \(n\) unique values, where each digit can have \(k\) possible values, is \(\log_k n\text{.}\) So, for \(n\) unique values, \(O(d \times (n + k))\) is potentially \(O(\log_k n \cdot (n + k))\text{,}\) which is equivalent to \(O(n \cdot \log n)\text{.}\)
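For instance, sorting one million unique decimal keys (\(k = 10\)) requires \(d = \log_{10} 10^6 = 6\) digit passes, so the total work is roughly
\begin{equation*} 6 \times (10^6 + 10) \approx 6 \times 10^6 = n \cdot \log_{10} n\text{.} \end{equation*}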
Thus, in situations where we don’t have a large number of duplicate items, radix sort ends up being equivalent to efficient comparison sorts. It is a special purpose tool useful only when sorting items that can be thought of as a sequence of digits and we expect to have lots of duplicate values.

Checkpoint 26.14.1.

Checkpoint 26.14.2.

Which statements are true about non-comparison sorts?
  • Non-comparison sorts can achieve worst-case time complexity better than \(O(n \cdot \log n)\text{.}\)
  • Non-comparison sorts are good general purpose sorting algorithms.
  • Non-comparison sorts do not rely on comparing elements to determine their order.
  • Bucket sort is an example of a non-comparison sort.
  • Radix sort is an example of a non-comparison sort.
  • Insertion sort is an example of a non-comparison sort.