Merging

Section 26.7 Merging

The merge sort algorithm is deceptively simple. In pseudocode, it looks like this:

MergeSort(list):
  if length of list <= 1:
    return list  // size 0/1 already sorted
  else:
    Split list into leftHalf and rightHalf
    MergeSort(leftHalf)
    MergeSort(rightHalf)
    // Merge the two sorted halves into one sorted list
    return Merge(leftHalf, rightHalf)

To sort a list, it divides it into two halves and sorts each half. Then it merges the two sorted halves. All the interesting work in this algorithm happens during the merge step, where two sorted lists are combined into a single sorted list. This section focuses on that merging process.

🔗

An efficient vector implementation of merge sort does not actually use two separate lists for the left and right halves. Instead, it uses indexes to track the left and right halves within a single vector.

🔗

Consider the activity below. It shows a list of items that consists of two sorted halves. nextLeft and nextRight represent the indices of the first unmerged items in the left and right halves, respectively.

🔗

Note that the two halves are each sorted. This is what the vector will look like after we call MergeSort(leftHalf) and MergeSort(rightHalf). The only thing left to do is to merge the two halves into one long sorted list.

🔗

If the left half is sorted, and the right half is sorted, then the smallest overall item must be either the item at nextLeft or the item at nextRight. So our task is simply to pick the smaller of the two and add it to the merged list, then move the corresponding index forward. We will use mergeIndex to track the location in the merged list where the next item should be placed.

🔗

Use the animation below to complete a merge step by step.

🔗

Activity 26.7.1. Merging.

Instructions.

Decide if the next item to merge should come from the left half or the right half. Click the matching button to perform the merge step. Continue until all items have been merged.

🔗

Try completing the process a few times. As you do so, note that:

We are merging the range 0-7. That means the midpoint is floor((0 + 7) / 2) which is 3.

🔗
The left side ends at the midpoint index. When nextLeft goes beyond the midpoint, it means all items from the left half have been merged.

🔗
The right side ends at the end index (7). When nextRight goes beyond the end index, it means all items from the right half have been merged.

🔗

🔗

Also important to note is that we need a temporary vector to hold the merged result as we build it. Once the merge is complete, we can copy the items from the temporary vector back into the original vector.

🔗

Insight 26.7.1.

Efficient merging of a vector of size \(n\) requires \(O(n)\) additional space for the temporary vector.

🔗

We can use this same merging process to merge two sorted portions of a vector that do not encompass the entire vector. For example, consider the vector shown in the figure below. In it, we are focusing on the region from index 5 to index 9. The midpoint is index \(floor((5 + 9) / 2) = 7\text{.}\) The items from index 5 to index 7 are sorted, and the items from index 8 to index 9 are sorted. We can merge these two sorted portions into a single sorted portion from index 5 to index 9 by using the same process as before.

🔗

Figure 26.7.1. Merging sorted portions of a vector that are not the entire vector.

🔗

In the merge we would start with the following initial values:

start = 5 (the first index of the portion being merged)

🔗
end = 9 (the last index of the portion being merged)

🔗

From those, we can compute:

mid = 7 (from floor((start + end) / 2))

🔗
nextLeft = 5 (same as start)

🔗
nextRight = 8 (one more than mid)

🔗
mergeIndex = 5 (same as start)

🔗

🔗

Note that in this logic the destination vector has the same size as the source vector and that the merge process only modifies the portion from index 5 to index 9. The other items remain unchanged.

🔗

Although we could create a temporary vector that is only as large as the portion being merged (in this case, size 5), it is often simpler to create a temporary vector that is the same size as the original vector. This way, we can use the same indices for both the source and destination vectors without needing to adjust them. It also will allow us to reuse the same temporary vector for multiple merge operations once we implement the full merge sort algorithm.

🔗

Implemented to sort a vector in C++, it looks like this:

🔗

Listing 26.7.2.

🔗

#include <iostream>
#include <vector>
using namespace std;

#include "vector-helpers.h"

// Assumes start/end consists of two sorted halves:
//   [start..mid] and [mid+1..end]
// Temp should be same size as vec. It is a parameter so that later we
//   can reuse the same temp vector across multiple calls to merge
template<typename T>
void merge(vector<T>& vec, int start, int end, vector<T>& temp) {
    int mid = (start + end) / 2;
    int nextLeft = start;
    int nextRight = mid + 1;
    int mergeIndex = start;

int totalItems = end - start + 1;
    for(int i = 0; i < totalItems; ++i) {
        // determine if taking from left or right
        bool takeLeft;
        bool leftEmpty = (nextLeft > mid);
        bool rightEmpty = (nextRight > end);
        if (rightEmpty) {
            takeLeft = true;
        } else if (leftEmpty) {
            takeLeft = false;
        } else if (vec.at(nextLeft) <= vec.at(nextRight)) {
            // items in both, left is smaller
            takeLeft = true;
        } else {
            // items in both, right is smaller
            takeLeft = false;
        }

// take from the chosen side
        if (takeLeft) {
            cout << "Merging left item (" << vec.at(nextLeft) << ")" << endl;
            temp.at(mergeIndex) = vec.at(nextLeft);
            nextLeft++;
        } else {
            cout << "Merging right item (" << vec.at(nextRight) << ")" << endl;
            temp.at(mergeIndex) = vec.at(nextRight);
            nextRight++;
        }
        mergeIndex++;

// Debug output to show the array after each outer loop iteration
        cout << "Temp: " << vectorToString(temp) << endl;
    }

// Copy the merged elements back into the original vector
    for (int i = start; i <= end; ++i) {
        vec.at(i) = temp.at(i);
    }
}

int main() {
    vector<int> numbers = {0, 0, 1, 4, 5, 2, 6, 7, 0, 0};
    vector<int> temp(numbers.size());
    // We will merge the portion from index 2 to index 7
    // which has two sorted halves: 1,4,5 and 2,6,7
    int start = 2;
    int end = 7;
    merge(numbers, start, end, temp);
    cout << "Merged array: " << endl;
    cout << vectorToString(numbers) << endl;
}

Note 26.7.2.

Most implementations will reduce the selection logic to something like:

🔗

if (nextLeft <= mid && (nextRight > end || vec.at(nextLeft) <= vec.at(nextRight))) {
    // left is not empty and (right is empty or left <= right)
    // take from left
} else {
    // take from right
}

Our implementation uses separate boolean variables to make the logic clearer and easier to follow.

🔗

Another strategy uses three loops:

🔗

while (nextLeft <= mid && nextRight <= end) {
    // compare and take from left or right
}
while (nextLeft <= mid) {
    // take remaining from left
}
while (nextRight <= end) {
    // take remaining from right
}

All of these strategies are equivalent in terms of their time complexity and overall behavior.

🔗

How long does this process take? Let us consider a simplified pseudocode version of the algorithm. In it we don’t worry about indexes, we just focus on the key operations and how many times loops run:

🔗

merge(list, start, end)
  n = end - start + 1
  repeat n times:
    compare nextLeft and nextRight
    copy smaller to mergeVector
  repeat n times:
    copy item from mergeVector back to list

The size of the job is determined by the distance between start and end. That is what defines \(n\) for this algorithm.

🔗

\(n\) times we have to compare the nextLeft and nextRight items and then copy one of them to the merge vector. That is \(n\) comparisons and \(n\) copy operations, or \(2n\) operations. Then, the items from the merge vector must be copied back to the source vector. That is another \(n\) copy operations. Thus, the total number of operations is proportional to \(3n\text{,}\) which simplifies to \(O(n)\text{.}\)

🔗

Merging requires \(O(n)\) time, where \(n\) is the number of items being merged.

🔗

Checkpoint 26.7.1.

🔗

Checkpoint 26.7.2.

We do a merge on the list ??, ??, 1, 4, 8, 2, 3, 6, ??, ??. Focusing on the range 2-7 (the six numbers shown).

🔗

Place the blocks in the order to show the order of the sides merged from.

🔗

Left
---
Right
---
Right
---
Left
---
Right
---
Left

🔗

Checkpoint 26.7.3.

Given the following list of numbers: [2, 7, 14, 1, 8, 19, 6, 18] which are valid ranges to merge using our algorithm?

🔗

Indexes 0-3 ([2, 7, 14, 1 ...])
The second half ([14, 1]) is not sorted, so it is not a valid range to merge.
Indexes 4-7 ([... 8, 19, 6, 18])
That divides into two sorted ranges [8, 19] and [6, 18].
Indexes 1-4 ([... 7, 14, 1, 8 ...])
That divides into two sorted ranges [7, 14] and [1, 8].
Indexes 0-8 (the entire list)
That does not divide in the middle into two sorted ranges. It divides into [2, 7, 14, 1] and [8, 19, 6, 18], neither of which is sorted.

🔗

You have attempted of activities on this page.

🔗

Prev Top Next