
Section 26.13 Quick Sort Efficiency

When everything goes well, quick sort runs in \(O(n \cdot \log n)\) time. The analysis of its efficiency follows the same logic as merge sort:
  • Informal.
Each “level” needs to partition \(n\) total items. Initially that is 1 group of \(n\text{,}\) then 2 groups of \(n/2\text{,}\) then 4 groups of \(n/4\text{,}\) and so on.
    There will be \(\log n\) levels as we divide the number of items in each group in half at each level.
    So the total work is \(O(n)\) work per level times \(O(\log n)\) levels, for a total of \(O(n \cdot \log n)\text{.}\)
  • Formal.
    The recurrence relation for quick sort is the same as for merge sort: \(T(n) = 2T(n/2) + c_1 \cdot n + c_2\text{.}\) It evaluates to \(O(n \cdot \log n)\text{.}\)
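To make the analysis concrete, here is a minimal Python sketch of the algorithm being analyzed, using the first element as the pivot (the same choice the interactive below makes). It may differ in detail from the version shown in the previous section; the names quick_sort and partition are just illustrative.

    def partition(items, low, high):
        """Partition items[low..high] around the pivot items[low]
        and return the pivot's final index."""
        pivot = items[low]
        boundary = low  # last index known to hold a value less than the pivot
        for i in range(low + 1, high + 1):
            if items[i] < pivot:
                boundary += 1
                items[i], items[boundary] = items[boundary], items[i]
        # Drop the pivot between the two partitions
        items[low], items[boundary] = items[boundary], items[low]
        return boundary

    def quick_sort(items, low=0, high=None):
        if high is None:
            high = len(items) - 1
        if low < high:
            pivot_index = partition(items, low, high)
            quick_sort(items, low, pivot_index - 1)   # sort left partition
            quick_sort(items, pivot_index + 1, high)  # sort right partition

Each call to partition examines every item in its range once, which is why each level of the recursion does \(O(n)\) work in total.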
However, that analysis depends on the idea that the number of items in each partition is roughly equal and thus the list gets divided evenly at each step. If the partitions are very unbalanced, the efficiency can degrade significantly. Try this version of the interactive. In it, quick sort is used to sort an already sorted list:

Activity 26.13.1. QuickSort Efficiency.

Instructions.

Click "Step" to advance once step or "Play" to run continuously.
At the top, you can see a description of the current step. On the right, you can see the current call stack.
Because the list is already sorted, and we always pick the first element as the pivot, the partitions end up being extremely unbalanced. The left partition ends up empty, and the right partition always holds all of the remaining elements other than the pivot. This means that instead of dividing the list in half at each step, we only remove one element (the pivot) at each step.
Thus, instead of the number of items in the largest partition following a pattern of:
32 → 16 → 8 → 4 → 2 → 1
it follows a pattern of:
32 → 31 → 30 → 29 → 28 → ... → 2 → 1
This pattern has a length of \(n\) instead of \(\log n\text{,}\) so the recursion depth becomes \(n\) instead of \(\log n\text{.}\) This means we have \(n\) levels of recursion, each needing \(O(n)\) work to partition the list, and we end up with a total time complexity of \(O(n^2)\) instead of \(O(n \cdot \log n)\text{.}\) (The recurrence relation would be \(T(n) = T(n - 1) + c_1 \cdot n + c_2\text{,}\) which solves to \(O(n^2)\text{.}\))
Even worse, this worst-case scenario happens if the list is already sorted or mostly sorted! Usually we would hope a sorting algorithm would perform better on such inputs.
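We can observe this degradation directly by counting how many items get examined during partitioning. The sketch below is a hypothetical model of first-element-pivot quick sort (counting one examination per non-pivot item in each partitioning pass), not the interactive's actual code.

    import random

    def partition_work(items):
        """Model the total partitioning work done by quick sort
        with a first-element pivot: each partitioning pass examines
        every non-pivot item in its group once."""
        total = 0
        groups = [list(items)]        # sublists still waiting to be partitioned
        while groups:
            group = groups.pop()
            if len(group) <= 1:
                continue
            pivot = group[0]
            total += len(group) - 1   # one examination per non-pivot item
            groups.append([x for x in group[1:] if x < pivot])
            groups.append([x for x in group[1:] if x >= pivot])
        return total

    n = 512
    data = random.sample(range(n), n)
    print(partition_work(data))          # typically ~5,000: grows like n log n
    print(partition_work(sorted(data)))  # always 512*511/2 = 130,816: grows like n^2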
However, there are various strategies that can be used to mitigate this issue and ensure that sorting sorted data does not degrade to worst-case performance. All of the strategies involve changing how the pivot is selected. Some common strategies include:
  • Using the middle element as the pivot
  • Choosing a random element as the pivot (Randomized QuickSort)
  • Looking at the first, middle and last elements and choosing the median of those three as the pivot (Median-of-three)
In all of these cases, the goal is to avoid consistently poor pivot choices that lead to unbalanced partitions. After selecting the pivot value, we can start by swapping it with the first element to reuse the same partitioning logic as before.
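For example, here are hedged sketches of two of these strategies in Python. Each one swaps its chosen pivot into the first position so that partition logic which expects the pivot to be first (like the sketch above) can be reused unchanged; the helper names are just illustrative.

    import random

    def median_of_three_to_front(items, low, high):
        """Swap the median of the first, middle, and last values
        into position low, where the partition logic expects the pivot."""
        mid = (low + high) // 2
        # Order the three candidate indices by the values they hold;
        # the middle one holds the median value.
        _, median, _ = sorted((low, mid, high), key=lambda i: items[i])
        items[low], items[median] = items[median], items[low]

    def random_pivot_to_front(items, low, high):
        """Randomized QuickSort: every element is equally likely to be the pivot."""
        chosen = random.randrange(low, high + 1)
        items[low], items[chosen] = items[chosen], items[low]

Calling one of these at the top of quick_sort, just before partitioning, is the only change needed.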
Given any of these approaches, it can be demonstrated that there is only a small probability of consistently poor pivot choices and that the average-case time complexity thus remains \(O(n \cdot \log n)\text{.}\) However, there is always a chance that we get extraordinarily unlucky and end up with poor pivot choices repeatedly, leading to worst-case performance.
While normally we focus on the worst-case behavior when categorizing algorithms, a good pivot selection strategy makes the worst case for quick sort extraordinarily unlikely. With a random pivot, the chance of starting with \(n\) items and picking the smallest remaining element as the pivot every single time is \(1/n!\text{.}\) For even 10 items, that is only a ~0.00003% chance. Thus it is common to still consider quick sort to be an \(O(n \cdot \log n)\) algorithm despite the theoretical worst case.
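That figure is easy to sanity-check in Python:

    import math

    print(1 / math.factorial(10))    # 2.755731922398589e-07
    print(100 / math.factorial(10))  # ~2.76e-05 percent, i.e. about 0.00003%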
Normally we do not concern ourselves too much with constant factors when doing Big-O analysis. \(O(n \cdot \log n)\) is so much smaller than \(O(n^2)\) that any constants in play almost certainly do not matter. However, when comparing two algorithms with the same asymptotic complexity, constant factors can become important. Quick sort gets its name from the fact that it is often faster in practice than other \(O(n \cdot \log n)\) algorithms like merge sort. Constant factors in the algorithm, the lack of need for additional memory, and cache efficiency all contribute to this practical speed advantage.
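Exact numbers depend on the language, the implementation, and the machine, but a rough benchmark like the sketch below lets you compare the constant factors yourself. It reuses the quick_sort sketch from above and pairs it with a simple merge sort; both are teaching implementations, not tuned library code, so treat the results as illustrative only.

    import random
    import time

    def merge_sort(items):
        """A straightforward top-down merge sort that allocates new lists."""
        if len(items) <= 1:
            return items
        mid = len(items) // 2
        left = merge_sort(items[:mid])
        right = merge_sort(items[mid:])
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    data = random.sample(range(100_000), 100_000)

    start = time.perf_counter()
    merge_sort(list(data))
    print("merge sort:", time.perf_counter() - start)

    start = time.perf_counter()
    quick_sort(list(data))  # the in-place version sketched earlier
    print("quick sort:", time.perf_counter() - start)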

Quick Sort Key Facts.

Quick sort has an average-case time complexity of \(O(n \cdot \log n)\) when using good pivot selection strategies. However, poor pivot choices can lead to \(O(n^2)\) behavior. A good pivot selection strategy makes the chance of consistently poor pivot choices very small, but not zero.
In practice, quick sort tends to perform better than other \(O(n \cdot \log n)\) algorithms like merge sort.
Quick sort is not stable. If we care about stability, we should use merge sort or other stable sorting algorithms.

Checkpoint 26.13.1.
