The size of a binary search tree is the number of values (nodes) it contains. As with a linked list, we can either maintain a size variable that gets updated on insertions and deletions, or we can compute the size on demand by traversing the tree and counting nodes. For now we will not maintain a size variable (there are not many algorithms that need to know the size of a BST), so we will compute it on demand.
The size of any binary tree is naturally recursive. It is 0 if the tree is empty, and otherwise it is 1 plus the size of the left subtree plus the size of the right subtree:
int sizeRecursive(Node* node) {
    if (node == nullptr) {
        return 0; // Base case: an empty subtree has size 0
    }
    int leftSize = sizeRecursive(node->left);
    int rightSize = sizeRecursive(node->right);
    return 1 + leftSize + rightSize;
}

// Public member function to get the size
int IntBST::size() {
    return sizeRecursive(root);
}
The depth or height of a binary search tree is the length of the longest path from the root to a leaf. A tree with only a single node has depth 0.
We can also talk about the depth or height of individual nodes. The height of a node is the length of the longest path from that node to a leaf. A leaf node has height 0. The depth of a node is the length of the path from the root to that node. The root has depth 0.
Like size, depth can be computed recursively. The depth of an empty tree is considered to be -1, and otherwise it is 1 plus the maximum of the depths of the left and right subtrees.
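Following the same pattern as sizeRecursive, this computation could be sketched as below (assuming the same Node struct with left and right pointers; std::max comes from &lt;algorithm&gt;):

```cpp
#include <algorithm> // for std::max

// Assumed Node struct, matching the one used by sizeRecursive.
struct Node {
    int data;
    Node* left;
    Node* right;
};

int depthRecursive(Node* node) {
    if (node == nullptr) {
        return -1; // Base case: an empty subtree has depth -1
    }
    int leftDepth = depthRecursive(node->left);
    int rightDepth = depthRecursive(node->right);
    return 1 + std::max(leftDepth, rightDepth);
}
```

Note that the -1 base case makes the arithmetic work out: a leaf node gets 1 + max(-1, -1) = 0, matching the definition above.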
The size and depth are related. The depth can't possibly be greater than the size. But a binary search tree with a given set of values can have many different shapes, each with a different depth, as we saw in the previous section. The worst-case depth for a tree with size \(n\) is \(n-1\) (a degenerate tree that is really just a linked list). The best-case depth, for a perfectly balanced binary tree, is \(O(\log n)\) (as established in Section 30.3).
Some of our algorithms on a binary search tree, like size, need to visit every node in the tree. Intuitively, we can see that this will take time proportional to the number of nodes in the tree, or \(O(n)\) time. (Visiting \(n\) things can't take less than \(O(n)\) time.)
doThing(node):
    if node is null:
        return
    doThing(left child)
    doThing(right child)
    return
We make two recursive calls on the children of the current node and otherwise just do some constant amount of work. If we assume that the left and right subtrees are roughly the same size, then the time complexity of this algorithm can be expressed as \(T(n) = 2T(n/2) + O(1)\text{,}\) which solves to \(O(n)\text{.}\) Other algorithms, like find, only follow one path down the tree instead of visiting every node:
find(node):
    if node is null:
        return false
    if node.data == target:
        return true
    else if target < node.data:
        return find(left child)
    else:
        return find(right child)
If we assume that the tree is balanced, then each time we make a recursive call, we halve the size of the remaining tree to search. Each recursive call does constant work and then recurses on half of the remaining tree. This means that the time to find a value can be expressed as \(T(n) = T(n/2) + O(1)\text{,}\) which solves to \(O(\log n)\text{.}\)
Note that this time is the same as the expression to get the depth of the tree. This is not a coincidence: depth is what determines how many times we need to recurse before we are guaranteed to reach a base case (either finding the value or reaching a null pointer).
While we hope that the depth of a binary search tree is \(O(\log n)\text{,}\) in the worst case it can be \(O(n)\text{.}\) In that case, algorithms like find will also take \(O(n)\) time.
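The find pseudocode above translates directly into C++ in the same style as sizeRecursive. A sketch, again assuming a Node struct with data, left, and right members:

```cpp
// Assumed Node struct, matching the one used by sizeRecursive.
struct Node {
    int data;
    Node* left;
    Node* right;
};

bool findRecursive(Node* node, int target) {
    if (node == nullptr) {
        return false; // Reached a null pointer: target is not in the tree
    }
    if (node->data == target) {
        return true;
    } else if (target < node->data) {
        return findRecursive(node->left, target);  // Target can only be in the left subtree
    } else {
        return findRecursive(node->right, target); // Target can only be in the right subtree
    }
}
```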
For now, we will try to insert data in random order and assume that our binary search trees are relatively balanced. But it is worth remembering that inserting data in sorted order will produce a very unbalanced tree. Later we will discuss techniques for keeping a binary search tree balanced to ensure that we get \(O(\log n)\) time.
An algorithm that only visits one path from the root to a leaf in a binary search tree will take \(O(\log n)\) time, where \(n\) is the number of nodes in the tree, as long as the tree is relatively balanced. In the worst case it could take up to \(O(n)\) time.