The size of a binary search tree is the number of values (nodes) it contains. As with a linked list, we can either maintain a size variable that gets updated on insertions and deletions, or we can compute the size on demand by traversing the tree and counting nodes. For now we will not maintain a size variable (there are not many algorithms that need to know the size of a BST), so we will compute it on demand.
The size of any binary tree is naturally recursive. It is 0 if the tree is empty, and otherwise it is 1 plus the size of the left subtree plus the size of the right subtree:
int sizeRecursive(Node* node) {
    if (node == nullptr) {
        return 0; // Base case: an empty subtree has size 0
    }
    int leftSize = sizeRecursive(node->left);
    int rightSize = sizeRecursive(node->right);
    return 1 + leftSize + rightSize;
}

// Public member function to get the size
int IntBST::size() {
    return sizeRecursive(root);
}
The depth or height of a binary search tree is the length of the longest path from the root to a leaf. A tree with only a single node has depth 0.
We can also talk about the depth or height of individual nodes. The height of a node is the length of the longest path from that node to a leaf. A leaf node has height 0. The depth of a node is the length of the path from the root to that node. The root has depth 0.
Like size, depth can be computed recursively. The depth of an empty tree is considered to be -1, and otherwise it is 1 plus the maximum of the depths of the left and right subtrees.
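Following the same pattern as sizeRecursive, this computation could be sketched as below (assuming the same Node struct with left and right pointers; std::max comes from &lt;algorithm&gt;):

```cpp
#include <algorithm> // for std::max

// Assumed Node struct, matching the one used by sizeRecursive.
struct Node {
    int data;
    Node* left;
    Node* right;
};

int depthRecursive(Node* node) {
    if (node == nullptr) {
        return -1; // Base case: an empty subtree has depth -1
    }
    int leftDepth = depthRecursive(node->left);
    int rightDepth = depthRecursive(node->right);
    return 1 + std::max(leftDepth, rightDepth);
}
```

Note that the -1 base case makes the arithmetic work out: a leaf node gets 1 + max(-1, -1) = 0, matching the definition above.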
The size and depth are related. The depth can't possibly be greater than the size. But a binary search tree with a given set of values can have many different shapes, each with a different depth, as we saw in the previous section. The worst-case depth for a tree with size \(n\) is \(n-1\) (a degenerate tree that is really just a linked list). The best-case depth, for a perfectly balanced binary tree, is \(O(\log n)\) (as established in Section 30.3).
Some of our algorithms on a binary search tree, like size, need to visit every node in the tree. Intuitively, we can see that this will take time proportional to the number of nodes in the tree, or \(O(n)\) time. (Visiting \(n\) things can't take less than \(O(n)\) time.)
doThing(node):
    if node is null:
        return
    doThing(left child)
    doThing(right child)
    return
We make two recursive calls on the children of the current node and otherwise just do some constant amount of work. If we assume that the left and right subtrees are roughly the same size, then the time complexity of this algorithm can be expressed as \(T(n) = 2T(n/2) + O(1)\text{,}\) which solves to \(O(n)\text{.}\) Other algorithms, like find, only follow one path down the tree instead of visiting every node:
find(node):
    if node is null:
        return false
    if node.data == target:
        return true
    else if target < node.data:
        return find(left child)
    else:
        return find(right child)
If we assume that the tree is balanced, then each time we make a recursive call, we halve the size of the remaining tree to search. Each recursive call does constant work and then recurses on half of the remaining tree. This means that the time to find a value can be expressed as \(T(n) = T(n/2) + O(1)\text{,}\) which solves to \(O(\log n)\text{.}\)
Note that this time is the same as the expression to get the depth of the tree. This is not a coincidence: depth is what determines how many times we need to recurse before we are guaranteed to reach a base case (either finding the value or reaching a null pointer).
While we hope that the depth of a binary search tree is \(O(\log n)\text{,}\) in the worst case it can be \(O(n)\text{.}\) In that case, algorithms like find will also take \(O(n)\) time.
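The find pseudocode above translates directly into C++ in the same style as sizeRecursive. A sketch, again assuming a Node struct with data, left, and right members:

```cpp
// Assumed Node struct, matching the one used by sizeRecursive.
struct Node {
    int data;
    Node* left;
    Node* right;
};

bool findRecursive(Node* node, int target) {
    if (node == nullptr) {
        return false; // Reached a null pointer: target is not in the tree
    }
    if (node->data == target) {
        return true;
    } else if (target < node->data) {
        return findRecursive(node->left, target);  // Target can only be in the left subtree
    } else {
        return findRecursive(node->right, target); // Target can only be in the right subtree
    }
}
```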
For now, we will try to insert data in random order and assume that our binary search trees are relatively balanced. But it is worth remembering that inserting data in sorted order will produce a very unbalanced tree. Later we will discuss techniques for keeping a binary search tree balanced to ensure that we get \(O(\log n)\) time.
An algorithm that only visits one path from the root to a leaf in a binary search tree will take \(O(\log n)\) time, where \(n\) is the number of nodes in the tree, as long as the tree is relatively balanced. In the worst case it could take up to \(O(n)\) time.