The Math of Complete Binary Trees

Section 30.3 The Math of Complete Binary Trees

The shape property of heaps is essential because a complete binary tree provides some important mathematical relationships between the number of nodes in the tree and its height.

🔗

We refer to the height of a tree as the number of edges on the longest path from the root to a leaf. A tree with just a single node has a height of 0, since there are no edges. A tree with a single layer of children has a height of 1. If there are children of those children, the height is 2, and so on.

🔗

A single node with two children.🔗 — Figure 30.3.2. A height 1 tree.
🔗

A single node with two children. Each of those children has two children.🔗 — Figure 30.3.3. A height 2 tree.
🔗

Insight 30.3.1.

Height is number of steps you can take from the root to the deepest node.

🔗

Note that in a complete binary tree, every level has twice as many nodes as the level above it. The first level (the root) has 1 node. The second level has 2 nodes. The third level has 4 nodes. The fourth level has 8 nodes. In general, level \(h\) has \(2^h\) nodes.

🔗

If we sum up all the nodes from level 0 to level \(h\text{,}\) we can calculate the total number of nodes in a complete binary tree of height \(h\text{.}\) This sum is a geometric series:

🔗

\begin{equation*} 1 + 2 + 4 + \cdots + 2^h = \sum_{i=0}^h 2^i \end{equation*}

That sum can be calculated using the formula for the sum of a geometric series:

\begin{equation*} \sum_{i=0}^h r^i = \frac{r^{h+1} - 1}{r - 1} \end{equation*}

🔗

Which means that for our series where \(r = 2\text{,}\) we have:

\begin{equation*} \sum_{i=0}^h 2^i = \frac{2^{h+1} - 1}{2 - 1} = 2^{h+1} - 1 \end{equation*}

🔗

We can confirm this using a height 3 tree. A height 3 three would have levels of 1, 2, 4, and 8 nodes respectively, summing to 15 nodes in total. The formula gives us \(2^{3+1} - 1 = 2^4 - 1 = 15\text{,}\) a perfect match.

🔗

Insight 30.3.2.

A complete binary tree of height \(h\) has \(2^{h+1} - 1\) nodes.

🔗

The more useful formula for heaps is the inverse of this. Given \(n\) nodes, what is the height of the complete binary tree that contains them? We can solve the formula for \(h\text{:}\)

🔗

\begin{align*} n & = 2^{h+1} - 1\\ n + 1 & = 2^{h+1}\\ \log_2(n + 1) & = \log_2(2^{h+1})\\ \log_2(n + 1) &= h + 1\\ h &= \log_2(n + 1) - 1 \end{align*}

Let’s test this with our previous example of a height 3 tree with 15 nodes. Plugging in 15 for \(n\) gives us:

\begin{equation*} h = \log_2(15 + 1) - 1 = \log_2(16) - 1 = 4 - 1 = 3\text{.} \end{equation*}

What if a tree has 100 nodes? Plugging in 100 for \(n\) gives us:

\begin{equation*} h = \log_2(100 + 1) - 1 = \log_2(101) - 1 \approx 6.658 - 1 \approx 5.658\text{.} \end{equation*}

Since height must be an integer, we round up (ceil) to get a height of 6.

🔗

Insight 30.3.3.

A complete binary tree with \(n\) nodes has a height of \(ceil(\log_2(n + 1) - 1)\text{.}\)

🔗

The precise formula for height is not as important as the fact that height grows logarithmically with the number of nodes. Given a number of nodes \(n\text{,}\) the height \(h\) is given by \(h\) and\(h \approx \log_2(n)\text{.}\) The height of a tree also tells us the maximum number of steps it takes to get from the root to any node. In the case of our binary heap, this means that any operation that involves traversing from the root to any node (or vice versa) will take at most \(O(\log n)\) steps.

🔗

We can see this fact intuitively if we focus on the number of nodes in the current row as we move from the bottom of the tree to the top of the tree. Imagine we start on a level with 32 nodes. When we move up to the next level, there will only be 16 nodes. The level above that will have 8 nodes. Then 4, then 2, then 1. Each time we move up a level, the number of nodes is halved. This pattern:

🔗

32, 16, 8, 4, 2, 1

Is a classic \(log_2\) pattern. The starting value we used is the number of nodes at the bottom level. That is different than the number of nodes in the entire tree, but the two are directly related. In a complete tree, the bottom level will always have about half the nodes in the entire tree. (Confirm it yourself by drawing a few complete binary trees and counting the nodes.)

🔗

So, if we know there are 100 nodes in a tree, we can expect the bottom level to have about 50 of them. And, from there, it is approximately \(log_2(50)\) steps to the top. Those are approximations, but even so, they give an answer of ~5.64, which is essentially what we calculated with the precise formula.

🔗

Insight 30.3.4.

The max number of levels in a heap is \(O(\log n)\text{.}\)

🔗

Checkpoint 30.3.1.

What is the height of a heap with 4 “rows” of nodes?

🔗

Checkpoint 30.3.2.

Hint.

Remember to round up your answer to the nearest greater integer.

🔗

You have attempted of activities on this page.

🔗

Prev Top Next