Section 28.7 Merge Sort for Linked Lists
While quicksort is challenging to implement efficiently on linked lists, merge sort is ideally suited to them. The core logic of the merge sort algorithm is repeated below:
MergeSort(list):
if length of list <= 1:
return list // size 0/1 already sorted
else:
Split(list, leftHalf, rightHalf) // split list into two halves
MergeSort(leftHalf)
MergeSort(rightHalf)
list = Merge(leftHalf, rightHalf)
In quicksort, we need to worry about being able to partition (and possibly reassemble) the list in
\(O(n)\) time, and we also need to worry about our partition algorithm producing reasonably balanced partitions to ensure
\(O(\log n)\) recursive steps.
For merge sort, we always split the list into equal halves. So, we should always have
\(O(\log n)\) recursive steps before we get down to size 0/1 lists. Thus our only concern is whether we can do the splitting and merging steps in
\(O(n)\) time.
As we saw in
Sectionย 28.5, we can split a linked list into two halves in
\(O(n)\) time by walking through the list to find the midpoint and then adjusting a few pointers to break the list into two.
To reassemble the sublists once they are sorted, we canโt do a simple splice, so we canโt just connect the two lists together with a few pointer modifications. Instead, we need to merge the individual items together in sorted order. We can do this in
\(O(n)\) time with something like:
Merge(leftHalf, rightHalf):
mergedList = new empty list
while leftHalf is not empty and rightHalf is not empty:
if leftHalf.front() <= rightHalf.front():
mergedList.push_back(leftHalf.front())
leftHalf.pop_front()
else:
mergedList.push_back(rightHalf.front())
rightHalf.pop_front()
// gather up any remaining items
while leftHalf is not empty:
mergedList.push_back(leftHalf.front())
leftHalf.pop_front()
while rightHalf is not empty:
mergedList.push_back(rightHalf.front())
rightHalf.pop_front()
return mergedList
Although
pop_front and
push_back are both
\(O(1)\) operations, they involve either creating a node with
new or deleting a node with
delete. These are expensive operations, especially if the items being sorted are large or complex objects. If we wanted to minimize constant factors, we could rewrite the code to do low level pointer manipulation to move items from one list to the other without needing to create or delete any nodes. That approach would still be
\(O(n)\) but would have better constant factors. It also would be more complex and error prone, so it is the kind of thing that would be worth doing in a library we expected to be used heavily, but maybe not worth doing for our simple implementation.
For these reasons, merge sort is often the preferred sorting algorithm for linked lists. The
std::list class in the C++ Standard Library provides a
sort member function that implements merge sort specifically optimized for linked lists. Because it can sort by reordering nodes by changing pointers, but without changing where in memory the nodes are located, it is efficient even when the elements being sorted are large or complex objects. It even preserves iterator validity, meaning that any iterators pointing to elements in the list before sorting will still point to the same elements after sorting. No array based sort can provide that guarantee.
Checkpoint 28.7.1.
Which of the following are reasons merge sort is an even better choice for linked lists than for arrays?
A linked list can be merged without requiring extra space.
-
A linked list can be split into two halves in \(O(1)\) time.
Making the split itself is \(O(1)\text{,}\) but finding the midpoint to know where to split takes \(O(n)\text{.}\)
Two sorted linked lists can be merged in \(O(1)\) time.
We can connect two lists together in \(O(1)\) time, but merging two sorted lists requires walking through both lists to put items in the correct order, which takes \(O(n)\text{.}\)
Iterators to elements can be preserved during sorting.
-
You have attempted
of
activities on this page.