Subsection Section Preview
Investigate!
Consider the graph drawn below.
Find a subgraph with the smallest number of edges that is still connected and contains all the vertices.
Find a subgraph with the largest number of edges that doesnβt contain any cycles.
What do you notice about the number of edges in your examples above? Is this a coincidence?
One very useful and common approach to studying graph theory is to restrict your focus to graphs of a particular kind. For example, you could try to really understand just complete graphs or just bipartite graphs, instead of trying to understand all graphs in general. That is what we are going to do now, looking at
trees. Hopefully by the end of this section we will have a better understanding of this class of graph, and also understand why it is important enough to warrant its own section.
Definition 2.2.1. Trees and Forests.
A
tree is a connected graph containing no cycles.
A
forest is a graph containing no cycles. Note that this means that a connected forest is a tree.
Does the definition above agree with your intuition for what graphs we should call trees? Try thinking of examples of trees, and make sure they satisfy the definition. One thing to keep in mind is that while the trees we study in graph theory are related to trees you might see in other subjects, the correspondence is not exact. For instance, in anthropology, you might study family trees, like the one below,
So far so good, but while your grandparents are (probably) not blood relatives, if we go back far enough, it is likely that they did have
some common ancestor. If you trace the tree back from you to that common ancestor, then down through your other grandparent, you would have a cycle, and thus the graph would not be a tree.
You might also have seen something called a
decision tree (such as the algorithm for deciding whether a series converges or diverges). Sometimes these too contain cycles, as the decision for one node might lead you back to a previous step.
Both the examples of trees above also have another feature worth mentioning: There is a clear order to the vertices in the tree. The definition of a tree does not include this added structure, although we can impose such a structure by considering
rooted trees, where we simply designate one vertex as the
root. We will consider such trees in more detail later in this section.
In this section, we will explore some basic properties of trees, which will serve as an excellent introduction to writing proofs about graphs. We will also consider a special kind of tree, called a
spanning tree, which is a tree that includes all the vertices of a connected graph. Finally, we will briefly consider rooted trees.
Worksheet Preview Activity
1.
Subsection Properties of Trees
We wish to really understand trees. This means we should discover properties of trees: what makes them special and what is special about them.
A tree is a connected graph with no cycles. Is there anything else we can say? It would be nice to have other equivalent conditions for a graph to be a tree. That is, we would like to know whether there are any graph theoretic properties that all trees have, and perhaps even that
only trees have.
To get a feel for the sorts of things we can say, we will consider three
propositions about trees. These will also illustrate important proof techniques that apply to graphs in general, and happen to be a little easier for trees.
Our first proposition gives an alternate definition for a tree. That is, it gives necessary and sufficient conditions for a graph to be a tree.
Proposition 2.2.2.
A graph
is a tree if and only if between every pair of distinct vertices of
there is a unique path.
Proof.
This is an βif and only ifβ statement, so we must prove two implications. We start by proving that if
is a tree, then between every pair of distinct vertices there is a unique path.
Assume
is a tree, and let
and
be distinct vertices (if
only has one vertex, then the conclusion is satisfied automatically). We must show two things to show that there is a unique path between
and
that there is a path, and that there is not more than one path. The first of these is automatic; since
is a tree, it is connected, so there is a path between any pair of vertices.
To show the path is unique, we suppose there are two paths between
and
and get a contradiction. The two paths might start out the same, but since they are different, there is some first vertex
after which the two paths diverge. However, since the two paths both end at
there is some first vertex after
that they have in common, call it
Now consider the two paths from
to
Taken together, these form a cycle, which contradicts our assumption that
is a tree.
Now we consider the converse: If between every pair of distinct vertices of
there is a unique path, then
is a tree. So assume the hypothesis: Between every pair of distinct vertices of
there is a unique path. To prove that
is a tree, we must show it is connected and contains no cycles.
The first half of this is easy:
is connected, because there is a path between every pair of vertices. To show that
has no cycles, we assume it does, for the sake of contradiction. Let
and
be two distinct vertices in a cycle of
Since we can get from
to
by going clockwise or counterclockwise around the cycle, there are two paths from
and
contradicting our assumption.
We have established both directions so we have completed the proof.
Read the proof above very carefully. Notice that both directions had two parts: the existence of paths, and the uniqueness of paths (which related to the fact that there were no cycles). In this case, these two parts were really separate. In fact, if we just considered graphs with no cycles (a forest), then we could still do the parts of the proof that explore the uniqueness of paths between vertices, even if there might not
exist paths between vertices.
This observation allows us to state the following
corollary.
Corollary 2.2.3.
A graph
is a forest if and only if between any pair of vertices in
there is at most one path.
We do not give a proof of the corollary (it is, after all, supposed to follow directly from the proposition), but for practice, you are asked to give a careful proof in the exercises. When you do so, try to use proof by contrapositive instead of proof by contradiction.
Our second proposition tells us that all trees have
leaves: vertices of degree one.
Proposition 2.2.4.
Any tree with at least two vertices has at least two vertices of degree one.
Proof.
We give a proof by contradiction. Let
be a tree with at least two vertices, and suppose, contrary to stipulation, that there are not two vertices of degree one.
Let
be a path in
of longest possible length. Let
and
be the endpoints of the path. Since
does not have two vertices of degree one, at least one of these must have degree two or higher. Say that it is
We know that
is adjacent to a vertex in the path
but now it must also be adjacent to another vertex, call it
Where is
It cannot be a vertex of
because if it was, there would be two distinct paths from
to
the edge between them, and the first part of
(up to
). But
also cannot be outside of
for if it was, there would be a path from
to
that was longer than
which has the longest possible length.
This contradiction proves that there must be at least two vertices of degree one. In fact, we can say a little more:
and
must
both have degree one.
The proposition is quite useful when proving statements about trees, because we often prove statements about trees by
induction. This is a proof technique that we investigate fully in
Section 4.5, but for now, all we need to understand is that it is useful to compare a given tree to smaller trees. To show something is true of a tree with
vertices, we will assume the thing is true of all trees with
vertices. By removing a vertex of degree 1, we get this smaller tree, and then we just need to show that putting the vertex back doesnβt mess us up.
Is there a tree with exactly 7 vertices and 7 edges? Try to draw one. Could a tree with 7 vertices have only 5 edges? There is a good reason that these seem impossible to draw.
Proposition 2.2.5.
Let
be a tree with
vertices and
edges. Then
We will prove that this proposition is true for all possible values of
Note that if
then the tree must have
edges, so yes,
We could then look at trees with
vertices (there is only one, and it has
edges) and then all trees with
vertices, and then
and so on, but that would take forever. Literally!
Instead, we will do a version of
proof by contradiction called the
minimal criminal. We will assume the proposition is not true (just like we would start any proof by contradiction). This means that there is some
smallest tree for which it isnβt true. If this smallest counterexample (i.e., minimal criminal) has
vertices, then we are guaranteed that
all trees with
vertices are
not counterexamples. Letβs see how this works out.
Proof.
Suppose, for the sake of contradiction, that the proposition is not true for all trees. Let
be a tree for which the proposition is not true, with the smallest number of vertices among all the counterexamples. Let
be the number of vertices of
Since the only tree with one vertex has zero edges, that cannot be our tree
so we can assume
In particular, we know by
Proposition 2.2.4 that
must contain at least one vertex of degree 1, call it
Let
be the tree resulting from removing
from
(together with its incident edge). Since we removed a leaf,
is still a tree (the unique paths between pairs of vertices in
are the same as the unique paths between them in
).
Now
has
vertices. Since
was the smallest tree for which the proposition isnβt true, we know that the proposition is true for
So
must have one fewer edges than vertices; that is,
has
edges. But
has one more edge than
so it has
edges. This contradicts our assumption that
does not satisfy the proposition, which completes our proof.
Subsection Spanning Trees
One of the advantages of trees is that they give us a few simple ways to travel through the vertices. If a connected graph is not a tree, then we can still use these traversal algorithms if we identify a subgraph that
is a tree.
First we should consider if this even makes sense. Given any connected graph
will there always be a subgraph that is a tree? Well, that is actually too easy: You could just take a single vertex of
If we want to use this subgraph to tell us how to visit all vertices, then we want our subgraph to include all of the vertices. We call such a tree a
spanning tree.
Definition 2.2.6.
Given a connected graph
a
spanning tree of
is a subgraph of
which is a tree and includes all the vertices of
It turns out that every connected graph has one (and usually many).
Theorem 2.2.7.
Every connected graph has a spanning tree.
How do we know? We can give an algorithm for
finding a spanning tree! Start with a connected graph
If there is no cycle, then
is already a tree and we are done. If there is a cycle, let
be any edge in that cycle and consider the new graph
(i.e., the graph you get by deleting
). This tree is still connected: Since
belonged to a cycle, there were at least two paths between its incident vertices. Now repeat: If
has no cycles, we are done; otherwise define
to be
where
is an edge in a cycle in
Keep going. This process must eventually stop, since there are only a finite number of edges to remove. The result will be a tree, and since we never removed any vertex, a
spanning tree.
This is by no means the only algorithm for finding a spanning tree. You could have started with the empty graph and added edges that belong to
as long as adding them would not create a cycle. You have some choices as to which edges you add first: You could always add an edge adjacent to edges you have already added (after the first one, of course), or add them using some other order. Which spanning tree you end up with depends on these choices.
Example 2.2.8.
Find two different spanning trees of the graph,
Solution.
Here are two spanning trees.
Although we will not consider this in detail, these algorithms are usually applied to
weighted graphs. Here every edge has some weight or cost assigned to it. The goal is to find a spanning tree that has the smallest possible combined weight. Such a tree is called a
minimum spanning tree. Finding the minimum spanning tree uses basically the same algorithms as we described above, but when picking an edge to add, you always pick the smallest (or when removing an edge, you always remove the largest).
Subsection Rooted Trees
So far, we have thought of trees only as a particular kind of graph. However, it is often useful to add additional structure to trees to help solve problems. Data is often structured like a tree. This book, for example, has a tree structure: Draw a vertex for the book itself. Then draw vertices for each chapter, connected to the book vertex. Under each chapter, draw a vertex for each section, connecting it to the chapter it belongs to. The graph will not have any cycles; it will be a tree, but a tree with a clear hierarchy which is not present if we donβt identify the βbook vertexβ as the βtopβ.
As soon as one vertex of a tree is designated as the
root, then every other vertex on the tree can be characterized by its position relative to the root. This works because there is a unique path between any two vertices in a tree. So from any vertex, we can travel back to the root in exactly one way. This also allows us to describe how distinct vertices in a rooted tree are related.
If two vertices are adjacent, then we say one of them is the
parent of the other, which is called the
child of the parent. Of the two, the parent is the vertex that is closer to the root. Thus the root of a tree is a parent, but is not the child of any vertex (and is unique in this respect: All non-root vertices have
exactly one parent).
Not surprisingly, the child of a child of a vertex is called the
grandchild of the vertex (and it is the
grandparent). More generally, we say that a vertex
is a
descendent of a vertex
provided
is a vertex on the path from
to the root. Then we would call
an
ancestor of
For most trees (in fact, all except paths with one end the root), there will be pairs of vertices neither of which is a descendant of the other. We might call these cousins or siblings. In fact, vertices
and
are called
siblings provided they have the same parent. Note that siblings are never adjacent (do you see why?).
Example 2.2.9.
If we designate vertex
as the root, then
and
are the children of
and are siblings of each other. Among the other things we cay say are that
is a child of
and a descendant of
The vertex
is a descendant of
in fact, is a grandchild of
Vertices
and
are siblings, since they have the common parent
Notice how this changes if we pick a different vertex for the root. If
is the root, then its lone child is
which also has only one child, namely
We would then have
the child of
(instead of the other way around), and
is the descendant of
instead of the ancestor.
and
are now siblings.
All of this flowery language helps us describe how to
navigate through a tree. Traversing a tree, visiting each vertex in some order, is a key step in many algorithms. Even if the tree is not rooted, we can always form a rooted tree by picking any vertex as the root. Here is an example of why doing so can be helpful.
Example 2.2.10.
Explain why every tree is a bipartite graph.
Solution.
To show that a graph is bipartite, we must divide the vertices into two sets, and so that no two vertices in the same set are adjacent. Here is an algorithm that does just this.
Designate any vertex as the root. Put this vertex in set Now put all of the children of the root in set None of these children are adjacent (they are siblings), so we are good so far. Now put into every child of every vertex in (i.e., every grandchild of the root). Keep going until all vertices have been assigned one of the sets, alternating between and every βgeneration.β That is, a vertex is in set if and only if it is the child of a vertex in set
The key to how we partitioned the tree in the example was to know which vertex to assign to a set next. We chose to visit all vertices in the same generation before any vertices of the next generation. This is usually called a
breadth-first search (we say βsearchβ because you often traverse a tree looking for vertices with certain properties).
In contrast, we could also have partitioned the tree in a different order. Start with the root; put it in
Then look for one child of the root to put in
Then find a child of that vertex, into
and then find its child, into
and so on. When you get to a vertex with no children, retreat to its parent, and see if the parent has any other children. So we travel as far from the root as fast as possible, then backtrack until we can move forward again. This is called
depth-first search.
These algorithmic explanations can serve as a proof that every tree is bipartite, although care needs to be spent to prove that the algorithms are
correct. Another approach to prove that all trees are bipartite, using induction, is requested in the exercises.