
Section 26.5 Selection Sort

Selection sort sorts a list of data by repeatedly finding the smallest (or largest) remaining unsorted item and moving it into its final position.
A selection sort does \(n - 1\) passes to sort \(n\) items. During each pass, the algorithm selects the smallest remaining unsorted item and places it at the front of the unsorted portion of the list.
On the first pass, the algorithm looks at all \(n\) items to find the smallest one. Once it finds that item, it swaps it with the item at index 0. On the second pass, it starts from index 1, looks at the remaining \(n - 1\) items to find the smallest among them, and swaps that item with the item at index 1. For the third pass, it starts at index 2, examines \(n - 2\) items to find the smallest among them, and swaps that item with the item at index 2. This process continues, with each pass looking at one fewer item than the previous pass, until only one item remains. There is no need to sort that last item because it must be the largest.

Activity 26.5.1. Selection Sort Demonstration.

Implemented in C++ to sort a vector, it looks like this:
Listing 26.5.1.
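A typical implementation follows the recipe described above; the details of Listing 26.5.1 may differ, and the function and variable names here are illustrative:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: each pass finds the smallest remaining item and swaps
// it to the front of the unsorted portion of the vector.
void selectionSort(std::vector<int>& data) {
    // n - 1 passes; the last item ends up in place automatically
    for (std::size_t start = 0; start + 1 < data.size(); ++start) {
        std::size_t minIndex = start;
        // scan the unsorted portion for the smallest item
        for (std::size_t i = start + 1; i < data.size(); ++i) {
            if (data[i] < data[minIndex]) {
                minIndex = i;
            }
        }
        // at most one swap per pass; skip it if already in place
        if (minIndex != start) {
            std::swap(data[start], data[minIndex]);
        }
    }
}
```

Note that the inner loop only records the index of the smallest item seen so far; the single swap happens after the scan is complete.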
We can do variations on this recipe:
  • Instead of finding the smallest item during each pass, we could find the largest item and place it at the end of the unsorted portion of the list.
    In this version, each pass would always start at index 0. The first pass would traverse the whole list and end at index \(n - 1\text{.}\) The second pass would stop before the last element and end at index \(n - 2\text{.}\) The third pass would end at index \(n - 3\text{,}\) and so on, until the last pass which would only look at the first two elements.
  • We could sort in descending order instead of ascending order by finding the largest item and moving it to the front (or finding the smallest item and moving it to the end).
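The first variation above (find the largest item and place it at the end of the unsorted portion) can be sketched as follows; the function name is illustrative:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Variant: each pass finds the LARGEST remaining item and swaps it to
// the END of the unsorted portion. Every pass starts scanning at index 0.
void selectionSortLargestToEnd(std::vector<int>& data) {
    // 'last' is one past the end of the unsorted portion; it shrinks
    // by one each pass until only the first element remains
    for (std::size_t last = data.size(); last > 1; --last) {
        std::size_t maxIndex = 0;
        for (std::size_t i = 1; i < last; ++i) {
            if (data[i] > data[maxIndex]) {
                maxIndex = i;
            }
        }
        std::swap(data[maxIndex], data[last - 1]);
    }
}
```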
The efficiency of selection sort is easy to analyze. Regardless of the initial order of the items, selection sort always does the same number of comparisons. On the first pass, it does \(n - 1\) comparisons to find the smallest item. On the second pass, it does \(n - 2\) comparisons, and so on down to 1 comparison on the last pass. The total number of comparisons is therefore:
\begin{equation*} (n - 1) + (n - 2) + ... + 2 + 1 \end{equation*}
That sum can be simplified as follows:
\begin{equation*} n(n - 1) / 2 = (n^2 - n) / 2 = O(n^2) \end{equation*}
The number of swaps is at most \(n - 1\text{,}\) since at most one swap is done at the end of each pass. (We can skip a swap entirely when minIndex turns out to be the same as the starting index of the pass.) This is \(O(n)\text{,}\) which is smaller than the number of comparisons, so the overall time complexity is dominated by the number of comparisons (\(O(n^2)\)).
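These counts can be checked empirically. The following sketch (the names are my own) instruments selection sort to tally comparisons and swaps; the comparison count comes out to \(n(n-1)/2\) for any input order, while the swap count never exceeds \(n - 1\):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Tallies of the work done by one run of selection sort.
struct SortCounts {
    long comparisons = 0;
    long swaps = 0;
};

// Sorts a copy of the input and reports how many comparisons and swaps
// were performed.
SortCounts countedSelectionSort(std::vector<int> data) {
    SortCounts counts;
    for (std::size_t start = 0; start + 1 < data.size(); ++start) {
        std::size_t minIndex = start;
        for (std::size_t i = start + 1; i < data.size(); ++i) {
            ++counts.comparisons;
            if (data[i] < data[minIndex]) {
                minIndex = i;
            }
        }
        if (minIndex != start) {  // skip the swap when already in place
            std::swap(data[start], data[minIndex]);
            ++counts.swaps;
        }
    }
    return counts;
}
```

Running this on a scrambled five-item vector and an already-sorted one yields 10 comparisons in both cases (\(5 \cdot 4 / 2\)), while only the swap count differs.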
So selection sort has a time complexity of \(O(n^2)\text{,}\) making it inefficient on large lists. It does have the advantage of making the minimum possible number of swaps, which can be useful when write operations are significantly more expensive than read operations. Imagine a list of very large records: swapping two records might involve copying a lot of data, so minimizing swaps can be beneficial.
The complexity of the algorithm does not depend on the initial order of the items. Try the Reset (Mostly Sorted) button in the simulation above to test this out. It requires just as many steps to sort a list where most of the items are in order as it does for a list in random order.
Selection sort offers an interesting guarantee if we stop it after \(k\) passes: the first \(k\) items in the list are guaranteed to be the smallest \(k\) items in sorted order. This can be useful if we only need the top \(k\) items from a larger list. Try running the simulation above and stopping it after three passes. No matter what the initial order of the items was, the smallest three items will be in the first three positions.
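This early-stopping idea can be sketched directly; the function name partialSelectionSort is illustrative:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Run only the first k passes of selection sort. Afterward the first k
// positions hold the k smallest items, in sorted order; the rest of the
// vector is left partially scrambled.
void partialSelectionSort(std::vector<int>& data, std::size_t k) {
    for (std::size_t start = 0; start < k && start + 1 < data.size(); ++start) {
        std::size_t minIndex = start;
        for (std::size_t i = start + 1; i < data.size(); ++i) {
            if (data[i] < data[minIndex]) {
                minIndex = i;
            }
        }
        std::swap(data[start], data[minIndex]);
    }
}
```

For example, stopping after three passes on [11, 7, 12, 14, 19, 1, 6, 18, 8, 20] leaves 1, 6, 7 in the first three positions.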
Selection sort is not stable. Consider the list {B(1), B(2), A}, where the two B's are distinct elements with the same value. On the first pass, the algorithm identifies A as the smallest value and swaps it to the front, trading places with B(1) and changing the relative order of the two B's.
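We can demonstrate the instability concretely by sorting pairs on their first member only, using the second member as a tag recording each element's original position (a hypothetical sketch):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort on pairs, comparing only the .first member (the key).
// The .second member tags each element's original position so we can
// observe whether equal keys keep their relative order.
void selectionSortByKey(std::vector<std::pair<int, int>>& data) {
    for (std::size_t start = 0; start + 1 < data.size(); ++start) {
        std::size_t minIndex = start;
        for (std::size_t i = start + 1; i < data.size(); ++i) {
            if (data[i].first < data[minIndex].first) {
                minIndex = i;
            }
        }
        std::swap(data[start], data[minIndex]);
    }
}
```

Sorting {B(1), B(2), A} encoded as {{2, 1}, {2, 2}, {1, 3}} swaps A to the front on the first pass, leaving the two equal keys in the order B(2), B(1): their original order has been reversed.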

Insight 26.5.1.

Any algorithm that swaps items across a large distance is likely to be unstable.
Selection sort is always \(O(n^2)\text{.}\)
It is NOT stable.
It makes the minimum possible number of swaps, which can be useful when write operations are significantly more expensive than read operations.

Checkpoint 26.5.1.

We do a selection sort (using the min-value-to-front strategy) on the vector {3, 2, 4, 5, 1}. Arrange the blocks below to show the state of the vector after each pass of the algorithm.
Note that you may use the same block twice in a row if the current pass does not change the vector.
You will not use all of the blocks.

Checkpoint 26.5.2.

Suppose you have the following vector of numbers to sort: [11, 7, 12, 14, 19, 1, 6, 18, 8, 20]. Which vector represents the partially sorted vector after three passes of selection sort using the swap-smallest-to-front approach?
  • [11, 18, 12, 14, 19, 20, 8, 7, 6, 1]
    Feedback: that is moving the smallest to the end.
  • [7, 11, 12, 14, 19, 1, 6, 18, 8, 20]
    Feedback: this looks like an insertion sort.
  • [1, 6, 7, 11, 12, 14, 19, 18, 8, 20]
    Feedback: this one looks similar to the correct answer, however, it is not how selection sort works. With this answer, instead of swapping values, the values have been shifted to the left to make room for the correct numbers.
  • [1, 6, 7, 14, 19, 11, 12, 18, 8, 20]
    Correct: each of the first three passes swaps the smallest remaining value to the front of the unsorted portion.