Section 26.7 Merging
The merge sort algorithm is deceptively simple. In pseudocode, it looks like this:
MergeSort(list):
if length of list <= 1:
return list // size 0/1 already sorted
else:
Split list into leftHalf and rightHalf
MergeSort(leftHalf)
MergeSort(rightHalf)
// Merge the two sorted halves into one sorted list
return Merge(leftHalf, rightHalf)
To sort a list, it divides it into two halves and sorts each half. Then it merges the two sorted halves. All the interesting work in this algorithm happens during the merge step, where two sorted lists are combined into a single sorted list. This section focuses on that merging process.
An efficient vector implementation of merge sort does not actually use two separate lists for the left and right halves. Instead, it uses indexes to track the left and right halves within a single vector.
Consider the activity below. It shows a list of items that consists of two sorted halves.
nextLeft and nextRight represent the indices of the first unmerged items in the left and right halves, respectively.
Note that the two halves are each sorted. This is what the vector will look like after we call
MergeSort(leftHalf) and MergeSort(rightHalf). The only thing left to do is to merge the two halves into one long sorted list.
If the left half is sorted, and the right half is sorted, then the smallest overall item must be either the item at
nextLeft or the item at nextRight. So our task is simply to pick the smaller of the two and add it to the merged list, then move the corresponding index forward. We will use mergeIndex to track the location in the merged list where the next item should be placed.
Use the animation below to complete a merge step by step.
Try completing the process a few times. As you do so, note that:
-
We are merging the range 0-7. That means the midpoint is
floor((0 + 7) / 2)which is 3. -
The left side ends at the midpoint index. When
nextLeftgoes beyond the midpoint, it means all items from the left half have been merged. -
The right side ends at the end index (7). When
nextRightgoes beyond the end index, it means all items from the right half have been merged.
Also important to note is that we need a temporary vector to hold the merged result as we build it. Once the merge is complete, we can copy the items from the temporary vector back into the original vector.
Insight 26.7.1.
Efficient merging of a vector of size \(n\) requires \(O(n)\) additional space for the temporary vector.
We can use this same merging process to merge two sorted portions of a vector that do not encompass the entire vector. For example, consider the vector shown in the figure below. In it, we are focusing on the region from index 5 to index 9. The midpoint is index \(floor((5 + 9) /
2) = 7\text{.}\) The items from index 5 to index 7 are sorted, and the items from index 8 to index 9 are sorted. We can merge these two sorted portions into a single sorted portion from index 5 to index 9 by using the same process as before.
In the merge we would start with the following initial values:
-
start = 5(the first index of the portion being merged) -
end = 9(the last index of the portion being merged)
From those, we can compute:
-
mid = 7(fromfloor((start + end) / 2)) -
nextLeft = 5(same asstart) -
nextRight = 8(one more thanmid) -
mergeIndex = 5(same asstart)
Note that in this logic the destination vector has the same size as the source vector and that the merge process only modifies the portion from index 5 to index 9. The other items remain unchanged.
Although we could create a temporary vector that is only as large as the portion being merged (in this case, size 5), it is often simpler to create a temporary vector that is the same size as the original vector. This way, we can use the same indices for both the source and destination vectors without needing to adjust them. It also will allow us to reuse the same temporary vector for multiple merge operations once we implement the full merge sort algorithm.
Implemented to sort a vector in C++, it looks like this:
Note 26.7.2.
Most implementations will reduce the selection logic to something like:
if (nextLeft <= mid && (nextRight > end || vec.at(nextLeft) <= vec.at(nextRight))) {
// left is not empty and (right is empty or left <= right)
// take from left
} else {
// take from right
}
Our implementation uses separate boolean variables to make the logic clearer and easier to follow.
Another strategy uses three loops:
while (nextLeft <= mid && nextRight <= end) {
// compare and take from left or right
}
while (nextLeft <= mid) {
// take remaining from left
}
while (nextRight <= end) {
// take remaining from right
}
All of these strategies are equivalent in terms of their time complexity and overall behavior.
How long does this process take? Let us consider a simplified pseudocode version of the algorithm. In it we donβt worry about indexes, we just focus on the key operations and how many times loops run:
merge(list, start, end)
n = end - start + 1
repeat n times:
compare nextLeft and nextRight
copy smaller to mergeVector
repeat n times:
copy item from mergeVector back to list
The size of the job is determined by the distance between
start and end. That is what defines \(n\) for this algorithm.
\(n\) times we have to compare the nextLeft and nextRight items and then copy one of them to the merge vector. That is \(n\) comparisons and \(n\) copy operations, or \(2n\) operations. Then, the items from the merge vector must be copied back to the source vector. That is another \(n\) copy operations. Thus, the total number of operations is proportional to \(3n\text{,}\) which simplifies to \(O(n)\text{.}\)
Checkpoint 26.7.1.
Checkpoint 26.7.2.
Checkpoint 26.7.3.
Given the following list of numbers: [2, 7, 14, 1, 8, 19, 6, 18] which are valid ranges to merge using our algorithm?
- Indexes 0-3 ([2, 7, 14, 1 ...])
- The second half ([14, 1]) is not sorted, so it is not a valid range to merge.
- Indexes 4-7 ([... 8, 19, 6, 18])
- That divides into two sorted ranges [8, 19] and [6, 18].
- Indexes 1-4 ([... 7, 14, 1, 8 ...])
- That divides into two sorted ranges [7, 14] and [1, 8].
- Indexes 0-8 (the entire list)
- That does not divide in the middle into two sorted ranges. It divides into [2, 7, 14, 1] and [8, 19, 6, 18], neither of which is sorted.
You have attempted of activities on this page.
