Trees

Exercise 6.1. A graph $G$ on $2n$ vertices is said to have “doubled degrees” if it has exactly 2 vertices with degree $k$ , for every $k \in \{1,2,\ldots n\}$ .

How many trees exist with “doubled degrees?”

Exercise 6.2. You’re a curator of a large modern art museum! Because your museum is particularly “edgy,” the room in which you’re displaying your artwork is a very strange-looking polygon.

You want to install $360^\circ$ cameras in the corners of your gallery, in such a way that your cameras see the entire room. What is the smallest number of cameras you need to install?

In general, what’s the largest number of cameras you could need for an art gallery that is an $n$ -sided polygon?

Trees

In our last section, we saw that the language of graph theory could be used to describe tons of real-life objects: the internet, transportation, social networks, tasks, and many other things! Even your computer (a particularly relevant thing to consider in a computer science class) can be described as a graph:

Vertices: all of the files and folders in your computer.
Edges: Draw an edge from a file or folder to every object it contains.

If you draw this out, you’d get something similar to the drawing below:

This graph represents the file system for your computer, and is extremely useful for organizing files: imagine trying to find a document if literally every file on your computer had to live on your desktop, for instance!

This graph has a particularly useful structure: starting from $\fbox{C:}$ , there’s always exactly one way to get to any other file or folder if you don’t allow backtracking. That is: there are no files you can’t get to by starting from your root and working your way down, and also there are no files that you can get to in multiple different ways! This is a very nice property for a file system to have: you want to be able to navigate to every file in some way, and it’s very nice to know that files in different places are different (imagine deleting a file from your desktop and having all copies of it disappear in other places!)

We call graphs with the structural property described above trees. Trees come up all the time in real life:

PDF documents (like the one you’re reading right now!) are tree-based formats. Every PDF has a root node, followed by various sections, each of which contains various subsections.
In genealogy and genetics, people study family trees: i.e. take your great-grandmother, all of her children, all of her children’s children, and so on/so forth until you’re out of relatives. This is a tree, as starting from your great-grandmother there should only be one way to get to any relative.
Given any game (e.g. chess, or tic-tac-toe, or Starcraft), you can build a decision tree to model possible outcomes as the game progresses. To do so, make a vertex for the starting state. Then make a vertex for every possible move player 1 could make, and connect the starting state to all of these. For each of those states, make a vertex for every response player 2 could make, and connect those states up as well; doing this for all possible moves generates a decision tree, which you can use to win!

In short: they’re useful!

To define what a tree is, we first need to introduce a useful concept from graph theory that we didn’t have time to discuss last chapter:

Definition 6.1. A graph $G$ has another graph $H$ as a subgraph if $H$ is “contained within” $G$ . In other words, if you can take $G$ and remove vertices and/or edges from it until you get the graph $H$ , then $H$ is a subgraph of $G$ .

For example, the graph below has $C_5$ as a subgraph, because we can delete the “inside” vertices and edges to have just a $C_5$ left over.

Note that any graph $G$ is “trivially” a subgraph of itself, as we can just delete “nothing” from $G$ and have $G$ left over.

With this stated, we can define a tree as follows:

Definition 6.2. A tree is a graph $T$ that is connected and has no cycle graph $C_n$ as a subgraph.

For example, the three graphs in the margins are not trees, as each of them has a cycle graph of some length as a subgraph.

However, the three graphs below are all trees:

We call vertices of degree 1 in a tree the leaves of the tree. For example, the leaves of the trees above are colored green.
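If you'd like to experiment with these definitions on a computer, here is a minimal Python sketch. It assumes a graph is handed to us as a list of vertices and a list of edges (pairs of vertices); the function names `build_adjacency`, `is_tree` and `leaves` are made up for this illustration and are not from any library.

```python
def build_adjacency(vertices, edges):
    """Return a dict mapping each vertex to the set of its neighbors."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_tree(vertices, edges):
    """Check Definition 6.2 directly: connected, and no cycle."""
    adj = build_adjacency(vertices, edges)
    if not adj:
        return False  # convention for this sketch: no vertices, no tree
    start = next(iter(adj))
    parent = {start: None}  # doubles as the set of vertices reached so far
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w == parent[u]:
                continue  # this is just the edge we arrived by
            if w in parent:
                return False  # reached an old vertex a second way: a cycle
            parent[w] = u
            stack.append(w)
    # No cycle was found; the graph is a tree if every vertex was reached.
    return len(parent) == len(adj)


def leaves(vertices, edges):
    """Return the vertices of degree 1."""
    adj = build_adjacency(vertices, edges)
    return [v for v in adj if len(adj[v]) == 1]
```

For instance, the path with edges `(1, 2), (2, 3), (3, 4)` passes `is_tree`, and its leaves are the vertices 1 and 4; a triangle fails because of its cycle, and two separate edges fail because the graph is not connected.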

To get a bit of practice with these ideas, let’s prove a straightforward claim about trees:

Theorem 6.1. If $T$ is a tree containing at least one edge, then $T$ has at least two leaves.

Proof. Consider the following process for generating a path in $T$ :

Algorithm 6.11.

Choose any edge $e = \{x_0, y_0\}$ in $T$ .
Starting from $i=0$ , repeatedly do the following: if $x_i$ has degree $\geq 2$ , then pick a new edge $\{x_i, x_{i+1}\}$ leaving $x_i$ . Because $T$ is a tree, $x_{i+1}$ is not equal to any of our previously-chosen vertices (if it was, then we’d have created a cycle.) Stop when $x_i$ eventually has degree 1.
Starting from $i=0$ , do the same thing for $y_i$ .

Notice that this process must eventually stop: on a tree with $n$ vertices, we can only put $n$ vertices in our path because the “no cycle” property stops us from repeating vertices. When it stops, the endpoints of the path generated are both leaves because this is the only way we stop this process. Therefore, this process eventually finds two leaves in any tree!

\square
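The walking process in Algorithm 6.11 can be sketched in Python. This is only an illustration under a few assumptions: the tree is given as a dictionary `adj` mapping each vertex to the set of its neighbors, the name `find_two_leaves` is invented here, and "pick a new edge" is implemented as "leave by any edge other than the one we arrived on" (which is enough in a tree).

```python
def find_two_leaves(adj):
    """Follow Algorithm 6.11 on a tree given as {vertex: set_of_neighbors}.

    Assumes the input really is a tree with at least one edge; on a graph
    with a cycle the walk below might never stop.
    """
    # Choose any edge {x0, y0}.
    x0 = next(v for v in adj if adj[v])
    y0 = next(iter(adj[x0]))

    def walk(prev, cur):
        # While the current vertex has degree >= 2, leave it along an edge
        # other than the one we arrived on.
        while len(adj[cur]) >= 2:
            nxt = next(w for w in adj[cur] if w != prev)
            prev, cur = cur, nxt
        return cur  # degree 1: a leaf

    # Walk away from the edge in both directions.
    return walk(y0, x0), walk(x0, y0)
```

On a tree, the two walks head away from the starting edge in opposite directions, so the two leaves returned are different vertices.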

As a bit of extra practice, let’s try to use our tree language to sketch a solution to our second exercise:

Answer to Exercise 6.2. It turns out that you can guard any $n$ -sided polygon (without any holes, and where all sides are straight) with at most $\left \lfloor \frac{n}{3} \right \rfloor$ cameras! To do so, use the following process:

Take your $n$ -sided polygon. By drawing diagonals between non-adjacent corners that stay inside the polygon and do not cross each other, divide it up into triangles.

Turn this into a graph: think of each triangle as a vertex, and connect two triangles with an edge when they share a side.

This graph is a tree! (Why? Justify this to yourself.)
Use this tree structure to do the following:
- Take any triangle. Color its 3 vertices red, blue, and green.
- Now, go to any triangle that shares a boundary with that colored triangle. It will have 2 of its three vertices given colors. Give its third vertex the color it’s currently missing.
- Repeat this process! It never runs into conflicts, because our graph is a tree (and so we don’t have cycles.)

Result of the above: every triangle has one red vertex, one blue vertex, and one green vertex.
Put a camera on the least-used color! This needs at most $n/3$ rounded down cameras, as we’re using the least popular of three colors. It also guards everything, as a camera sees everything in each triangle it’s in!
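Here is a rough Python sketch of the coloring step. It assumes the hard part (cutting the polygon into triangles) has already been done, and that the result is given as a list of triples of corner labels; that input format and the name `place_cameras` are assumptions made for this illustration. Colors are stored as 0, 1, 2 in place of red, blue, green.

```python
from collections import Counter, deque


def place_cameras(triangles):
    """3-color the corners triangle by triangle, then pick the rarest color.

    `triangles` is a list of 3-tuples of corner labels coming from a
    triangulation of a polygon without holes (assumed, not checked).
    """
    def share_side(s, t):
        # Two triangles share a side exactly when they share two corners.
        return len(set(s) & set(t)) == 2

    color = dict(zip(triangles[0], (0, 1, 2)))  # color the first triangle
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j, t in enumerate(triangles):
            if j in seen or not share_side(triangles[i], t):
                continue
            # Two corners of t are already colored; the third gets the
            # color that t is still missing.
            (new_corner,) = [c for c in t if c not in color]
            used = {color[c] for c in t if c in color}
            (missing,) = {0, 1, 2} - used
            color[new_corner] = missing
            seen.add(j)
            queue.append(j)

    counts = Counter(color.values())
    rarest = min((0, 1, 2), key=lambda c: counts[c])
    return sorted(v for v, c in color.items() if c == rarest)
```

For a hexagon with corners 0 through 5 cut into the "fan" of triangles `(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5)`, this returns `[0]`: a single camera at corner 0, which is within the bound of $\left \lfloor \frac{6}{3} \right \rfloor = 2$ cameras.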

Useful Results on Trees

One particularly useful thing about trees is that they can be defined in many different ways! In the section above, we defined a tree as a connected graph with no cycles. However, we have two other properties that also characterize when a graph is a tree:

Theorem 6.2.

\boxed{\begin{array}{c} \textit{T is a tree} \\ \end{array}}

is equivalent to

\boxed{\begin{array}{c} \textit{there is exactly one} \\ \textit{path between any two} \\ \textit{vertices in T} \end{array}}

Theorem 6.3. For a graph $T$ on $n$ vertices,

\boxed{\begin{array}{c} \textit{T is a tree} \\ \end{array}}

is equivalent to

\boxed{\begin{array}{c} \textit{T is connected} \\ \textit{and has n-1 edges} \\ \end{array}}

Recall from our earlier claim that two statements are equivalent if they hold in precisely the same situations: i.e. whenever one is true, the other is true, and vice-versa.
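A computer can't prove Theorem 6.3 for us, but it can at least test it on every small graph. The Python sketch below does this by brute force: it lists every graph on the vertex set $\{0, \ldots, n-1\}$ and compares "connected with no cycle" against "connected with $n-1$ edges". The function names are invented for this illustration, and the union-find bookkeeping is just one convenient way to count components and spot cycles.

```python
from itertools import combinations


def components_and_cycle(n, edges):
    """Return (number of connected components, whether a cycle exists)."""
    root = list(range(n))

    def find(v):
        # Follow pointers up to the representative of v's component.
        while root[v] != v:
            root[v] = root[root[v]]
            v = root[v]
        return v

    components, has_cycle = n, False
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            has_cycle = True  # u and v were already joined by a path
        else:
            root[ru] = rv  # merge the two components
            components -= 1
    return components, has_cycle


def check_theorem_6_3(n):
    """Compare both descriptions on every graph with vertices 0..n-1.

    Returns how many of those graphs are trees.
    """
    all_edges = list(combinations(range(n), 2))
    tree_count = 0
    for k in range(len(all_edges) + 1):
        for edges in combinations(all_edges, k):
            components, has_cycle = components_and_cycle(n, edges)
            by_definition = components == 1 and not has_cycle
            by_edge_count = components == 1 and len(edges) == n - 1
            assert by_definition == by_edge_count
            tree_count += by_definition
    return tree_count
```

Calling `check_theorem_6_3(4)` runs through all 64 graphs on four labelled vertices without tripping the assertion. This is evidence rather than a proof, of course: it says nothing about larger graphs.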

We prove the first of these theorems here:

Proof of Theorem 6.2. Because we’re proving that these two statements are equivalent, we need to show that if either of them is true, then the other statement follows. That is, if you wanted to show that “attending office hours” and “getting an $A+$ in Compsci 120” were equivalent things, you wouldn’t be satisfied if I said “everyone who got an $A+$ in Compsci 120 attended office hours:” you’d also want to know whether “everyone who attended office hours got an $A+$ ”!

As such, this proof needs to go in 2 steps:

First, we need to show that if $T$ is a tree, then there’s a unique path between any two vertices in $T$ .
Then, we need to show that if there’s a unique path between any two vertices in $T$ , then $T$ is a tree.

We do each of these one-by-one:

Because $T$ is a tree, by definition we know that $T$ is connected. By the definition of connected, we know that for any two vertices $x,y$ there is at least one path that goes from $x$ to $y$ . To complete our proof, then, we just need to show that there aren’t multiple paths between any two vertices.

To see why two distinct paths is impossible, we proceed by contradiction: i.e. we suppose that we’re wrong, and that it is somehow possible for us to have two different paths linking a pair of vertices.

Let’s give those vertices and paths names: that is, let’s assume that there are vertices $x,y$ linked by two different paths $P_1 = \{v_1 = x, v_2\}, \{v_2, v_3\}, \ldots \{v_{n-1}, y\}$ and $P_2 = \{w_1 = x, w_2\}, \{w_2, w_3\}, \ldots \{w_{m-1}, y\}$ . Because these two paths are different, there must be some value $i$ such that $v_i \neq w_i$ . Let $i$ be the smallest such value, so that these paths agree at $v_{i-1} = w_{i-1}$ and diverge immediately afterwards. (Note that $i \geq 2$ , as both paths start at $v_1 = w_1 = x$ .)

These two paths must eventually meet back up, as they end at the same vertex $y$ . Let $k \geq i$ be the smallest index such that $v_k$ also appears on $P_2$ at or after $w_i$ , and let $l \geq i$ be the index with $v_k = w_l$ . Notice that this means that all of the vertices $v_{i-1}, v_i, v_{i+1}, \ldots, v_{k-1}, v_k, w_{l-1}, w_{l-2}, \ldots, w_{i}$ are distinct (paths never repeat vertices, and if one of $v_i, \ldots, v_{k-1}$ matched one of these $w$ 's, we could have picked an even smaller value of $k$ at which these paths met back up.)

Now, look at the walk formed by starting $P_1$ at $v_{i-1}$ , proceeding until $v_k$ , and then taking $P_2$ backwards from $w_l$ until $w_{i-1}$ . This walk repeats no vertices other than the starting and ending one, by construction. Therefore it is a cycle!

But we are in a tree, and trees do not contain cycles. Therefore this is a contradiction to our assumption that we had two distinct paths. In other words, our assumption that two paths could exist was false, and we must have exactly one path, as claimed!

For the second step, we can more-or-less just reverse the argument above!

That is: if $T$ has our unique path property, then every two vertices in $T$ are connected by a path, and so $T$ is connected.

To see why $T$ cannot have any cycles: simply notice if $T$ did contain a cycle, then it would give us two different paths between two vertices: you could go one way or the other around the cycle! Therefore, our unique path property stops us from having cycles, and thus means that $T$ is a tree.

\square

This result should help us understand how our “trees don’t have cycles” property connects to the “there’s exactly one path from $\fbox{C:}$ to any other file” property that made our filesystem example so useful!
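To see Theorem 6.2 in action, we can count paths directly. The short Python sketch below assumes a graph is stored as a dictionary `adj` from each vertex to the set of its neighbors; `count_paths` is a name made up for this example. It tries every way of extending a path without repeating a vertex, which is fine for small graphs but far too slow for large ones.

```python
def count_paths(adj, start, goal):
    """Count the paths from start to goal that never repeat a vertex."""
    def extend(v, visited):
        if v == goal:
            return 1  # we have completed one path
        # Otherwise, try every neighbor we have not used yet.
        return sum(extend(w, visited | {w}) for w in adj[v] if w not in visited)

    return extend(start, {start})
```

On the three-vertex path `{1: {2}, 2: {1, 3}, 3: {2}}`, the call `count_paths(adj, 1, 3)` gives 1. On the cycle $C_4$, opposite vertices are joined by 2 paths (one each way around the cycle), which is exactly the behavior that the proof above rules out in a tree.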

The second of these results requires induction, and so we’ll delay its proof until Section 7.9. It’s quite useful, though, and worth knowing even if we can’t prove it yet! For example, it lets us solve one of our exercises:

Answer to Exercise 6.1. Let $T$ be a tree on $2n$ vertices with the “doubled degrees” property. Because $T$ has exactly two vertices of each degree $1, 2, \ldots, n$ , the sum of the degrees in $T$ is $1+1+\ldots + n + n = 2(1+2+\ldots + n) = 2\cdot\dfrac{n^2+n}{2} = n^2+n$ .

As well, the degree-sum formula from our graph theory chapter tells us that the sum of the degrees in $T$ is twice the number of edges. Therefore, if we write $E$ for the number of edges in $T$ , we have that $n^2+n = 2E$ .

Finally, because $T$ is a tree, we know that it has one less edge than it has vertices (i.e. it has $2n-1$ edges, because $T$ is on $2n$ vertices.)

Combining this all together tells us that $n^2+n = 2(2n-1) = 4n-2$ ; i.e. $n^2-3n + 2 = 0$ ; i.e. $(n-1)(n-2) = 0$ , i.e. $n=1$ or $n=2$ . In other words, if $T$ is a tree with the doubled degrees property, then $T$ is either a tree on $2$ or $4$ vertices.

For $n=1$ , the “doubled degrees” property would tell us that $T$ should be a two-vertex graph with two vertices of degree 1. There is exactly one tree of this form, namely a single edge joining two vertices.

For $n=2$ , the “doubled degrees” property would tell us that $T$ should be a four-vertex graph with two vertices of degree 1 and two of degree 2. There is also exactly one tree of this form, namely the path on four vertices. So there are exactly two trees with “doubled degrees.”
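As a sanity check on the algebra in this answer, here is a tiny Python sketch. It simply tests the equation $n^2 + n = 2(2n-1)$ for many values of $n$, and lists the degrees of a path so we can compare them with the "doubled degrees" requirement. Both function names are invented for this illustration.

```python
def doubled_degree_tree_sizes(limit):
    """Values of n up to limit where the degree sum n^2 + n equals 2(2n - 1)."""
    return [n for n in range(1, limit + 1) if n * n + n == 2 * (2 * n - 1)]


def path_degrees(k):
    """Sorted list of degrees in the path on k vertices (for k >= 2)."""
    # The two endpoints have degree 1; every other vertex has degree 2.
    return sorted([1, 1] + [2] * (k - 2))
```

Running `doubled_degree_tree_sizes(1000)` gives `[1, 2]`, matching the factorization above, and `path_degrees(4)` gives `[1, 1, 2, 2]`: two vertices of degree 1 and two of degree 2.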

Rooted Trees

In many of the examples we looked at earlier (file systems, genealogy) it is natural to think of one vertex in our tree as the start, or root of our tree, and of all of our other vertices as “descending” from that root vertex. We formalize this idea with the concept of a rooted tree, defined here:

Definition 6.3. We say that a rooted tree is any tree in which we designate one vertex $r$ to be the “root” of the tree.

Given a rooted tree, we can draw it as follows:

Algorithm 6.12.

0: At the top of the page, draw the root vertex $r$ . We think of this as “level 0” of the tree.
1: Below this vertex, draw all vertices adjacent to $r$ along with those edges. Call this “level 1.”
2: Below these vertices, draw all vertices that are adjacent to vertices in level 1 that have not already been drawn. Call this “level 2.” …
k: In general, if level $k-1$ has been drawn, draw all vertices that are adjacent to vertices in level $k-1$ that have not already been drawn. Call this “level $k$ .”

Keep doing this until you run out of vertices to draw!
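Algorithm 6.12 translates almost line-for-line into code. The Python sketch below assumes the tree is stored as a dictionary `adj` from each vertex to the set of its neighbors, and that vertex labels can be sorted (this only fixes the left-to-right order within a level). The name `levels` is our own choice.

```python
def levels(adj, root):
    """Return [level 0, level 1, ...] as lists of vertices."""
    drawn = {root}
    result = [[root]]  # level 0 is just the root
    while True:
        next_level = []
        for v in result[-1]:
            # Draw every neighbor of v that has not been drawn yet.
            for w in sorted(adj[v]):
                if w not in drawn:
                    drawn.add(w)
                    next_level.append(w)
        if not next_level:
            return result  # we have run out of vertices to draw
        result.append(next_level)
```

For the tree with edges $\{A,B\}, \{A,C\}, \{B,D\}$, rooting at $A$ gives the levels `[['A'], ['B', 'C'], ['D']]`, while rooting at $B$ gives `[['B'], ['A', 'D'], ['C']]`: same tree, different drawing.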

Example 6.1. Three trees are given below.

We draw each of them with vertex $A$ chosen as the root:

We do this again, but now with vertex $B$ chosen as the root:

The following terms are useful when discussing rooted trees:

Definition 6.4. We say that the children of a vertex $v$ are all of the neighbors of $v$ at the level directly below $v$ , and the parent of $v$ is the neighbor of $v$ at the level directly above $v$ . The height of a rooted tree is the largest level index created when drawing the graph as above.
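The terms in Definition 6.4 are easy to compute once a root is chosen. Here is a minimal Python sketch, again assuming the tree is a dictionary `adj` from vertices to sets of neighbors; the name `root_tree` and the choice to return three values at once are assumptions made for this illustration.

```python
from collections import deque


def root_tree(adj, root):
    """Return (parent, children, height) for the tree adj rooted at root."""
    parent = {root: None}  # the root has no parent
    children = {v: [] for v in adj}
    level = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in level:  # w sits on the level directly below v
                parent[w] = v
                children[v].append(w)
                level[w] = level[v] + 1
                queue.append(w)
    # The height is the largest level index that gets used.
    return parent, children, max(level.values())
```

Notice that the height depends on the root: the tree with edges $\{A,B\}, \{A,C\}, \{B,D\}$ has height 2 when rooted at $A$, but height 3 when rooted at $D$.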

A particularly useful tree in computer science is a binary tree:

Definition 6.5. A binary tree is a rooted tree in which every vertex has either no children, one child, or $2$ children. A binary tree is called full if every vertex only ever has no children or two children.

You can generalize this to $m$ -ary trees for any $m$ , by changing the restriction here to ask that every vertex has at most $m$ children.

Example 6.2. Three binary trees are drawn below.

To finish out our chapter and practice working with this concept, let’s study a quick result about binary trees:

Exercise 6.3. Suppose that $T$ is a full binary (i.e. 2-ary) tree with 100 leaves. How many vertices does $T$ have in total?

Answer to Exercise 6.3. There are three kinds of vertices in $T$ :

The root vertex $r$ . Because this is a full binary tree that contains more than one vertex, $r$ must have exactly two children, and thus has degree 2.
All of the other parent vertices. Each of these has two children because this is a full binary tree, and exactly one parent, because it is not the root and no vertex in a rooted tree has two distinct parents (see the practice problems). So these all have degree 3.
All of the leaves, which have degree 1.

Suppose that there are $p$ parent vertices other than the root in our graph; then the sum of degrees in this graph is $2 + 3p + 100$ . As shown in class, the sum of degrees in any graph is twice the number of edges. Because this is a tree on $1 + p + 100$ vertices (100 leaves, one root, and $p$ other parents), it has $p + 100$ edges, as any tree has one less edge than it has vertices; so the sum of degrees is also equal to $2p + 200$ .

So: $2 + 3p + 100 = 2p + 200$ implies that $p=98$ , and therefore that we have $\fbox{199}$ vertices in total!
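We can double-check this count experimentally. The Python sketch below grows a full binary tree by starting from a lone root and repeatedly giving some randomly chosen leaf two children, until there are the requested number of leaves. The function name, the random choice of leaf, and the `seed` parameter are all assumptions made for this illustration; the trees grown this way are just examples, not a replacement for the argument above.

```python
import random


def grow_full_binary_tree(num_leaves, seed=0):
    """Grow a full binary tree; return (number of vertices, number of leaves)."""
    rng = random.Random(seed)
    leaves = [0]  # start with just the root, labelled 0
    vertex_count = 1
    while len(leaves) < num_leaves:
        # Pick a leaf and give it two new children. The old leaf becomes
        # a parent, so we gain exactly one leaf and two vertices.
        i = rng.randrange(len(leaves))
        leaves[i] = vertex_count  # first child takes the old leaf's slot
        leaves.append(vertex_count + 1)  # second child is a brand-new leaf
        vertex_count += 2
    return vertex_count, len(leaves)
```

Whatever seed we pick, `grow_full_binary_tree(100)` returns `(199, 100)`: each step adds one leaf and two vertices, so 100 leaves always come with 199 vertices.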

Practice Problems

(-) In a rooted tree on $n$ vertices, what is the maximum number of children a vertex can have?
Show that in a rooted tree, no vertex has two distinct parents.
Suppose that $T$ is a full ternary (i.e. $3$ -ary) tree with exactly 99 leaves. How many vertices in total does $T$ have?
Suppose that $G$ is a graph with the following two properties:
- $G$ is connected.
- If we delete any edge from $G$ , $G$ is no longer connected.
Show that $G$ is a tree.
Suppose that $T$ is a rooted full binary tree on 99 vertices. What is the maximum height of $T$ ? What is the minimum height of $T$ ? Justify your claims.
(+) A graceful labeling of a graph with $E$ edges is a labeling $l(v)$ of its vertices with distinct integers from the set $\{0,1,2, \ldots E\}$ , such that each edge $\{u, v\}$ is uniquely determined by the difference $|l(u) - l(v)|$ .

Let $P_n$ denote the $n$ -vertex path graph, formed by drawing $n$ vertices $v_1, \ldots v_n$ in a row and connecting each $v_i$ to $v_{i+1}$ .

Show that each $P_n$ tree is graceful.
(+) A caterpillar tree is a tree such that deleting all of its leaves leaves us with a single path (i.e. they kinda look like caterpillars.)

Show that all caterpillar trees are graceful.
(++) Show that all trees are graceful.
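If you want to test your own labelings while working on the graceful labeling problems, a checker like the following Python sketch may help. It assumes a graph is given as a list of edges and a labeling as a dictionary from vertices to integers, and it reads "each edge is uniquely determined by the difference" as "no two edges have the same difference". The name `is_graceful` is invented here.

```python
def is_graceful(edges, label):
    """Check a labeling against the definition of a graceful labeling."""
    num_edges = len(edges)
    values = list(label.values())
    # Labels must be distinct integers taken from {0, 1, ..., E}.
    labels_ok = len(set(values)) == len(values) and all(
        isinstance(x, int) and 0 <= x <= num_edges for x in values
    )
    # No two edges may share the same difference |l(u) - l(v)|.
    differences = [abs(label[u] - label[v]) for u, v in edges]
    return labels_ok and len(set(differences)) == num_edges
```

This only checks a labeling you supply; it does not search for one.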