Skip to main content

Section 31.5 Inserting in Binary Search Trees

When inserting into a binary search tree, we will always add the new value as a leaf node. For any value we might want to add, there will always be a single location where that leaf can go that maintains the search tree property. Consider the following tree:
A binary search tree with question marks indicating possible insertion points for new values. The tree has root node 10, with left child 5 and right child 15. The 5 node has children 3 and 7. The 15 node has right child 20. The 20 node has right child 22.πŸ”—
Figure 31.5.1. Where would you put 4? 12? 25?
If we want to insert 12, it needs to be to the right of 10 and the left of 15. If we want to insert 6, it would need to be to the left of 10, the right of 5, and the left of 7. Each of those series of comparisons results in a unique locationβ€”the same location where we would expect to find it using the find operation. So, the insertion algorithm is very similar to the find algorithm, only instead of stopping when we find the value, we won’t stop until we reach a null pointer where the new node can be added as a leaf.

Activity 31.5.1. Binary Search Trees.

The binary search tree for the values \(10, 5, 15, 3, 7, 20, 22\) is shown below. We are prepared to insert \(4\text{.}\) Use the > button to step through the insertion process.
Then try inserting other values. Try to predict where they will go before inserting them.

Instructions.

When the > button is highlighted, an animation is prepared to run. Use the Step button to step through the animation one step at a time.
When no animation is in progress, you can use the data controls to start a new animation.
You can click and drag on the animation area to pan around. Use the Ctrl + mouse wheel (or touchpad scroll gesture) while over the animation area to zoom in and out.
  • Core Controls.
    • << : Skip to the beginning of the animation.
    • < : Step backwards one step in the animation.
    • > : Step forward one step in the animation.
    • >> : Skip to the end of the animation.
    • Auto Step Speed Set to anything other than off to automatically step through the animation at the selected speed.
    • Zoom Set the zoom level. You can also use Ctrl + Mouse Wheel to zoom in and out.
  • Data Controls.
    • Value Use this area to enter a value when inserting, finding, deleting, etc...
Because nodes always go in as leaves, the order we insert values can have a big impact on the shape of the tree. For instance, if we insert the values \(1, 2, 3, 4, 5, 6, 7\) in that order, we end up with a tree that is essentially a linked list. If we instead insert them in the order \(4, 2, 6, 1, 3, 5, 7\text{,}\) we end up with a much more balanced tree.
A binary search tree with root node 1. The right child of 1 is 2, whose right child is 3, whose right child is 4, whose right child is 5, whose right child is 6, whose right child is 7. This tree is unbalanced and resembles a linked list.πŸ”—
Figure 31.5.2. A tree built by inserting the values 1, 2, 3, 4, 5, 6, 7 in order.
A binary tree with root node 4. The left child of the root is 2, which has children 1 and 3. The right child of the root is 6, which has children 5 and 7. This tree is balanced.πŸ”—
Figure 31.5.3. A tree built by inserting the values 4, 2, 6, 1, 3, 5, 7 in order.
Because many algorithms on a binary search tree depend on the height of the tree, this balance can have a big impact on performance. For now we will just note this fact and return to it later when we discuss tree balancing algorithms.

Insight 31.5.1.

Inserting data in order into a BST just results in a linked list if we do not have a way to balance the tree.
If you clear the BST interactive and try inserting a value, you will see one special case that needs to be handled. When the tree is empty, there is no root node to start the search from. In that case, we simply create a new node and set it as the root of the tree.
In every other case, we need to find the right location to insert the new node. However, once we find the right location, we need to update a pointer in the parent node to point to the new node. This means we have a few choices:
  • Peek ahead to the child pointer before moving down the tree. If the child pointer is null, insert the new node there. Otherwise, move down to the child and continue.
  • Keep track of the parent node as we search for the insertion point. Our recursive function could take the parent node as an additional parameter.
  • Use the logic we first saw while implementing recursive functions for linked lists. We can return the new node back to the function that was responsible for its parent.
Here is an implementation of the third option. It is a recursive function that returns the (possibly new) subtree that results from inserting the value at the current location. If the current node is null, we create a new node and return it. Otherwise, we recursively call the insertion function on either the left or right child, depending on the value being inserted. Finally, we return the current node.
Listing 31.5.4.
Node* insertRecursive(Node* node, int value) {
    if (node == nullptr) {
        // found location
        Node* newNode = new Node(value);
        return newNode;  // return the new node
    }
    else if (value < node->data) {
        // belongs in left subtree, hand it off
        // set left child to result of insertion
        node->left = insertRecursive(node->left, value);
    }
    else {
        // belongs in right subtree, hand it off
        // set right child to result of insertion
        node->right = insertRecursive(node->right, value);
    }
    return node; // return the node we started with
}
There are a couple of key things to note. First, we always return a Node*. It is either the new node we created (if we found the location, line 5) or it is the address of the node that we started with (β€œI’m still here!”, line 17). Second, when we make the recursive call, we know that we will get back the memory address that should be assigned to the appropriate child pointer. So we assign the result of the recursive call to either node->left or node->right (lines 10, 15). Most of the time, that assignment will not make a change. But, if the recursive call creates a new node, this is how we will update the parent.
This recursive helper will be called by a public BST::insert method that starts the process at the root of the tree. It will call the recursive function on the root. If the root is null, the new node will be created and returned, and we will set the tree’s root pointer to that new node. Otherwise, the existing root pointer will be returned and reassigned to itself (no change).
class IntBST {
private:
    Node* root = nullptr;
public:
    void insert(int value);
};

// Non-recursive member function that starts the recursive search
void IntBST::insert(int value) {
    // Start the recursive insertion at the root (possibly nullptr)
    // The address of the old or new root node will come back
    root = insertRecursive(root, value);
}

Checkpoint 31.5.1.

Construct the logic for an insert function that β€œpeeks ahead” to determine where to insert the new value. (We would need to update the logic of IntBST::insert to match it.)

Checkpoint 31.5.2.

We can also build an iterative version. To do so, we first find the correct parent node by walking the tree until we reach a null pointer. As we do so, a parent pointer will keep track of the last non-null node we visited by remaining one step behind the current pointer. Once we find the correct parent node, we can insert the new node as its left or right child.
You have attempted of activities on this page.