
Section 26.9 Merge Sort Efficiency

Now that we understand how Merge Sort works, let us analyze its efficiency. We will first do so informally and then more formally using recurrence relations.
For both versions, we will analyze the abstract version of the code that looks like this:
mergesort(list)
  split list into leftHalf and rightHalf
  mergesort(leftHalf)
  mergesort(rightHalf)
  list = merge(leftHalf, rightHalf)
That level of abstraction will help us focus on the key operations that determine the efficiency of merge sort.
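The pseudocode above omits details such as the base case and how the merge works. One way the whole algorithm might look as runnable Python (an illustrative sketch, not part of the book's code) is:

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list -- O(n) work."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # <= keeps equal items in order (stable)
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])            # one of these two is already empty
    result.extend(right[j:])
    return result

def mergesort(items):
    """Recursively sort a list, returning a new sorted list."""
    if len(items) <= 1:                # base case: 0 or 1 items are sorted
        return items
    mid = len(items) // 2              # find the midpoint with arithmetic
    left = mergesort(items[:mid])      # sort the left half
    right = mergesort(items[mid:])     # sort the right half
    return merge(left, right)          # merge the sorted halves
```

Note that this sketch returns a new list rather than rearranging the original in place; the slicing used to split the halves is the \(O(n)\) "create two new lists" variant mentioned below, which does not change the overall analysis.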
We have already established that merging n items takes \(O(n)\) work. In other words, merging 5 items takes ~5 units of work, merging 20 items takes ~20 units of work, and so on.
Splitting a list using the algorithm we have developed takes \(O(1)\) work, since it just involves doing some simple arithmetic to find the midpoint. A version of the algorithm that actually created two new lists would take more work - likely \(O(n)\text{.}\) Either way, the overall analysis will come out the same.

Subsection 26.9.1 Informal Analysis

To understand how much work merge sort does, we can do a rough analysis of how long it takes to do a merge sort on a list of length 1024. Since our list is 1024 items, we will think of \(O(n)\) as representing ~1024 units of work.
All the hard work takes place as we merge the lists back together:
  • The first merge for a list of 1024 things would be to merge 1024 single items into 512 lists of length two. Each merge would take ~2 units of work since there are two items in the new lists. 512 lists times 2 units of work = ~1024 units of work.
  • The next level would be to merge those 512 lists of size 2 into 256 lists of size 4. Each merge in that level would take ~4 units of work. 256 lists times 4 units of work = 1024 units of work again.
The table below shows this and the pattern for the rest of the levels:
Level | Description | Number of Lists | Size of Each List | Amount of Work
1 | Merge 1024 lists of size 1 into 512 lists of size 2 | 512 | 2 | 512 × 2 = 1024
2 | Merge 512 lists of size 2 into 256 lists of size 4 | 256 | 4 | 256 × 4 = 1024
3 | Merge 256 lists of size 4 into 128 lists of size 8 | 128 | 8 | 128 × 8 = 1024
… | … | … | … | …
??? | Merge 2 lists of size 512 into 1 list of size 1024 | 1 | 1024 | 1 × 1024 = 1024
Note that at each level the work is ~1024 units of time - exactly the number of items in the full list. Thus we can say each level takes \(O(n)\) work.
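We can check this pattern with a short Python loop (purely illustrative): at every level, the number of merged lists times the size of each one comes out to the full length of the original list.

```python
n = 1024
levels = []                    # (number of lists, size of each, work) per level
size = 2                       # level 1 produces lists of size 2
while size <= n:
    num_lists = n // size      # how many merged lists this level produces
    levels.append((num_lists, size, num_lists * size))
    size *= 2                  # each level doubles the list size

for num_lists, size, work in levels:
    print(num_lists, size, work)   # the work column is always 1024
```

The loop reproduces every row of the table, and the work column is 1024 at all ten levels.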
The only other thing we need to figure out is β€œHow many levels are required?” The table above skips a few steps in the middle. We could go back and add them in - starting with 1024 items the levels would look like this:
1024 β†’ 512 β†’ 256 β†’ 128 β†’ 64 β†’ 32 β†’ 16 β†’ 8 β†’ 4 β†’ 2 β†’ 1
The length of that sequence - dividing by 2 repeatedly until we reach 1 - is given by the mathematical function \(\log_2(n)\text{.}\) \(\log_2(1024) = 10\text{.}\)
So we can use \(\log_2(n)\) to calculate the number of levels of merges required in a merge sort of \(n\) items. For example, if we do merge sort on a list with 100,000 items, it will take \(\log_2(100{,}000) \approx 16.61\) levels. Since we can’t do 16.61 merge levels, we round that up to 17.
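That calculation is easy to sketch with Python's `math` module (an illustrative helper, not part of the book's code):

```python
import math

def merge_levels(n):
    """Number of merge levels needed to sort n items, rounded up."""
    return math.ceil(math.log2(n))

print(merge_levels(1024))     # 10
print(merge_levels(100_000))  # 17
```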
We can use that idea to develop a general formula for the total work in merge sort:
\begin{gather*} \textrm{total work} = \textrm{work per level} \cdot \textrm{number of levels} \end{gather*}
We know that sorting \(n\) items will require \(\log_2(n)\) levels and each level will take \(n\) work:
\begin{gather*} \textrm{total work} = n \cdot \log_2(n) \end{gather*}
Thus we conclude merge sort requires \(O(n \cdot \log_2(n))\) work to sort a list of size \(n\text{.}\)

Subsection 26.9.2 Formal Analysis Using Recurrence Relations

We can also analyze merge sort more formally using recurrence relations. We start by defining \(T(n)\) as the amount of work required to sort a list of size \(n\) using merge sort. We can then analyze the steps of merge sort, using \(T(n/2)\) for the time it takes to sort a list of size \(n/2\text{:}\)
mergesort(list)                                  //T(n)
  split list into leftHalf and rightHalf         //   O(1) or O(n)
  mergesort(leftHalf)                            //   T(n/2)
  mergesort(rightHalf)                           //   T(n/2)
  list = merge(leftHalf, rightHalf)              //   O(n)
To do a merge sort on a list of size \(n\text{,}\) we first split the list into two halves of size \(n/2\text{.}\) This takes \(O(1)\) or \(O(n)\) work. We then do a merge sort on each half. Each of those recursive calls is on a list of size \(n/2\) and will therefore take \(T(n/2)\) work. Finally, we merge the two sorted halves back together, which takes \(O(n)\) work. Putting that all together, we can express \(T(n)\) as:
\begin{gather*} T(n) = 2 \cdot T(n/2) + O(n) + O(1) \end{gather*}
or
\begin{gather*} T(n) = 2 \cdot T(n/2) + c_1 \cdot n + c_2 \end{gather*}
where \(c_1\) and \(c_2\) are some constants.
Note that it does not matter whether we assume the split is done in constant time or linear time. Nor does it matter if we add in other constant or linear terms (like the β€œif size is 1, return” check). We will still end up with the same recurrence relation.
This recurrence relation solves to \(T(n) = O(n \cdot \log n)\text{.}\)
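As an informal check (not a proof), we can evaluate the recurrence directly and compare it to \(n \cdot \log_2(n)\text{.}\) The constants \(c_1 = c_2 = 1\) below are arbitrary choices for illustration:

```python
import math

def T(n, c1=1, c2=1):
    """Directly evaluate T(n) = 2*T(n/2) + c1*n + c2 for powers of 2."""
    if n <= 1:
        return c2                        # base case: constant work
    return 2 * T(n // 2, c1, c2) + c1 * n + c2

for n in [2**10, 2**14, 2**18]:
    ratio = T(n) / (n * math.log2(n))
    print(n, ratio)                      # ratio shrinks toward 1 as n grows
```

As \(n\) grows, the ratio settles toward a constant, showing that the \(n \cdot \log_2(n)\) term dominates the recurrence.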
Merge sort has a time complexity of \(O(n \cdot \log_2(n))\text{.}\)
It has a space complexity of \(O(n)\) due to the additional space required for the temporary array used during the merge process.
Merge sort is a stable sorting algorithm.