Merge Sort for Linked Lists

Section 28.7 Merge Sort for Linked Lists

While quicksort is challenging to implement efficiently on linked lists, merge sort is ideally suited to them. The core logic of the merge sort algorithm is repeated below:

🔗

MergeSort(list):
  if length of list <= 1:
    return list  // size 0/1 already sorted
  else:
    Split(list, leftHalf, rightHalf)  // split list into two halves
    MergeSort(leftHalf)
    MergeSort(rightHalf)
    list = Merge(leftHalf, rightHalf)

In quicksort, we need to worry about being able to partition (and possibly reassemble) the list in \(O(n)\) time, and we also need to worry about our partition algorithm producing reasonably balanced partitions to ensure \(O(\log n)\) recursive steps.

🔗

For merge sort, we always split the list into equal halves. So, we should always have \(O(\log n)\) recursive steps before we get down to size 0/1 lists. Thus our only concern is whether we can do the splitting and merging steps in \(O(n)\) time.

🔗

As we saw in Section 28.5, we can split a linked list into two halves in \(O(n)\) time by walking through the list to find the midpoint and then adjusting a few pointers to break the list into two.

🔗

To reassemble the sublists once they are sorted, we can’t do a simple splice, so we can’t just connect the two lists together with a few pointer modifications. Instead, we need to merge the individual items together in sorted order. We can do this in \(O(n)\) time with something like:

🔗

Merge(leftHalf, rightHalf):
  mergedList = new empty list
  while leftHalf is not empty and rightHalf is not empty:
    if leftHalf.front() <= rightHalf.front():
      mergedList.push_back(leftHalf.front())
      leftHalf.pop_front()
    else:
      mergedList.push_back(rightHalf.front())
      rightHalf.pop_front()
  // gather up any remaining items
  while leftHalf is not empty:
    mergedList.push_back(leftHalf.front())
    leftHalf.pop_front()
  while rightHalf is not empty:
    mergedList.push_back(rightHalf.front())
    rightHalf.pop_front()
  return mergedList

Although pop_front and push_back are both \(O(1)\) operations, they involve either creating a node with new or deleting a node with delete. These are expensive operations, especially if the items being sorted are large or complex objects. If we wanted to minimize constant factors, we could rewrite the code to do low level pointer manipulation to move items from one list to the other without needing to create or delete any nodes. That approach would still be \(O(n)\) but would have better constant factors. It also would be more complex and error prone, so it is the kind of thing that would be worth doing in a library we expected to be used heavily, but maybe not worth doing for our simple implementation.

🔗

Insight 28.7.1.

A node in one list can be detached and moved to another list by adjusting a few pointers, without needing to delete and reallocate the node. This approach is more efficient than creating and deleting nodes but only by a constant factor.

🔗

For these reasons, merge sort is often the preferred sorting algorithm for linked lists. The std::list class in the C++ Standard Library provides a sort member function that implements merge sort specifically optimized for linked lists. Because it can sort by reordering nodes by changing pointers, but without changing where in memory the nodes are located, it is efficient even when the elements being sorted are large or complex objects. It even preserves iterator validity, meaning that any iterators pointing to elements in the list before sorting will still point to the same elements after sorting. No array based sort can provide that guarantee.

🔗

Checkpoint 28.7.1.

Which of the following are reasons merge sort is an even better choice for linked lists than for arrays?

🔗

A linked list can be merged without requiring extra space.
A linked list can be split into two halves in \(O(1)\) time.
Making the split itself is \(O(1)\text{,}\) but finding the midpoint to know where to split takes \(O(n)\text{.}\)
Two sorted linked lists can be merged in \(O(1)\) time.
We can connect two lists together in \(O(1)\) time, but merging two sorted lists requires walking through both lists to put items in the correct order, which takes \(O(n)\text{.}\)
Iterators to elements can be preserved during sorting.

🔗

You have attempted of activities on this page.

🔗

Prev Top Next