Trees are fundamental data structures that are important to know for interviews or on any computer science course. They are actually a type of graph without cycles. An example of a tree we use daily is the file structure in a computer. Another example is the DOM or Document Object Model of a webpage. There are different types of trees. The one we will focus on here is the binary search tree.

cluster-create

Figure 1. Parts of a Binary Search Tree

Tree Terminology

Algorithmic Trees and Trees of the Earth

An example of a simplified binary tree is shown above. These have similarities to a natural tree, flipped upside-down so that the roots are at the top. Note that a natural tree is represented diagrammatically above only for an illustration of the analogy and this is not typically done.

In keeping with this idea, we have roots at the top, which are analogous to the root node in the binary tree. There are branches which are analogous to edges in the binary tree. These connect the child nodes and leaves (leaf nodes in the binary tree) to their parent nodes.

In the diagram above, node 26 and node 49 are the children of the root node 35. All the child nodes have parent nodes. It follows that the root node 35 is the parent of node 26 and node 49. Every node except the root node has one incoming edge. Though leaf nodes have no children they can be considered to have null pointers to their would-be children.

Levels

The level of a node denotes the number of generations a node is from the root. Hence, to find it, count the number of edges to the node from the root.

Sibling Nodes

Sibling nodes are nodes that live on the same level of the tree. That is, they have the same parent. For example, 26 and 49 are siblings whose parent is 35.

Paths

A path is what we get when we move from one node to another in the tree. Highlighting the edges between the nodes will show the path. An example of a path is shown as two blue dotted edges in the diagram above.

Subtrees

Any node can be considered the start of a subtree. All the descendants of the node become a part of the subtree. This includes the children and those children’s children etc. In the illustrated subtree, 49 is the root and the descendants are 43 and 55.

Keys

Keys are shown in the circle of the node. It is the value designated to the data field of the node in the implementation below. In this post it also called the data of the node.

 Binary Trees

Binary Tree Properties

In a binary tree, one node can have 0, 1 or 2 child nodes. Hence the terminology binary! The first child node is the left node and the second is the right node.

In other types of trees where the number of child nodes can vary (like a Trie, a type of tree commonly used for character lookups and autocompleting words) you can use an array to represent the children of a node, but since there are only two nodes here, two pointers implemented as properties of the node are sufficient.

Binary Search Tree Properties

In order for a binary tree to be considered a binary search tree, it has to fulfill certain ordering properties. For any subtree, the left child must be less than or equal to the root and the right child must be greater than the root.

Additionally, all of the descendants of a root node or the root node of a subtree which are on that node’s left side must also be less or equal to that root node. Similarly, for the right side descendants of a root or subtree root node, all the descendants must be greater than that root or subtree node.

Inserting a Value

To build a tree such as the one in Figure 1 above, we have to insert the nodes in a particular order. Insert 35 as the root node. Then consider 26, it is smaller than 35 so insert it as the left child of 35.

Then for 9, it is less than 35, so go to the left node. 9 is less than 26, so go to the left pointer of the node with the key value of 26. Since there is no node there insert 9 as the left node to the node with the key 26. See below for an implementation.

Searching for a value

The benefit of this structure is that it makes searching very efficient. The runtime is O(log N). This is because at every step half of the nodes are eliminated from the search. Searching for a value is similar to inserting a value as previously described.

For example, if searching for the key 31 in the tree, first look at the root node and compare it to the key. Since 31 is less than 35 we can eliminate the right half of the tree since if 31 is present in the tree it must be in the left subtree.

Next, compare 31 to 26. 31 is larger than 26 so the left subtree can be eliminated. Hence, compare 31 to the right node key. They match and we have found the search value successfully.

This search path is the same as the one highlighted by the blue dotted line in the diagram above.

Shortcomings

One major issue that may arise due to this binary search tree insertion property is that on insertion an unbalanced tree may form. For example, if we first insert a node with a key of 5 at the root, then nodes with keys of 4, 3, 2, and 1 successively we essentially end up with a linear data structure and lose the benefits of the O(log N) runtime. The worst case runtime becomes O(N) just as if we had to search to the end of a linked list.

cluster-create

Figure 2. An Unbalanced Binary Tree


AVL trees and Red-Black trees offer solutions to these problems. Red-Black trees, for instance, have specific additional properties and techniques they use like rotations which keep the tree balanced. They are beyond the scope of this discussion.

Traversals

There are three types of traversals to consider: in-order, pre-order, and post-order. For an in-order traversal visit the left node then the root then the right node. Use this to get all the nodes in the tree in order. An implementation is shown in the code below.

Pre-order traversals print the root node first then the left node and then the right node. Post-order traversals print the left node, the right node and then the root node last.

Implementation

Below (Example 1) is an implementation of a binary search tree in Python. Only a Node class is needed since the first node inserted is the root and the next nodes are inserted from that node using the insert function as described before.

Each node is initialized with a data value or key, and a pointer to a left and right child which in Python is None to indicate that there is no child yet.

The Insert Function

The insert function is implemented recursively. Essentially, the function keeps pushing further down the tree until you find an empty and appropriate spot to create a new node.

Consider the case of inserting 31, by imagining it is not yet in Figure 1. Compare 31 to 35. It is less than 35, so check if there is a pointer to a left node. Since there is a node there call insert on the node with 26. Then compare 31 to 26. It is larger so look at the right pointer of 26. There is no node there so create a new node with the key/data 31 and set the right pointer of 26 to this new node.

In-Order Traversal

A recursive implementation of an in-order traversal is shown. To do a pre-order traversal, move the print statement to line 1 of the body of the function. To do a post-order traversal, move the print statement to line 3 of the function.

This code goes all the way down to the left side of the tree and prints the leftmost leaf. Then it goes back up the recursion stack to print the parent of that node and goes to the right node all the way down to the leftmost leaf associated with that right node, then back up to the parent of that leaf and continues this process until all the nodes are printed.

Example 1. Implementation of a Binary Search Tree

class Node(object):
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
    
    def insert(self, val):
        if val <= self.data:
            if self.left != None:
                self.left.insert(val)
            else: self.left = Node(val)
        else:
            if self.right != None:
                self.right.insert(val)
            else: self.right = Node(val)

    
    def in_order(self):
        if self.left != None: self.left.in_order()
        print(self.data)
        if self.right != None: self.right.in_order()

Conclusion

In this post, we went back to basics and covered binary search trees! We covered terminology relevant to all types of trees, binary tree and binary search tree properties, as well as functions for binary search trees like insertions, searching and traversing.

Finally, we ended with an implementation of a binary search tree in Python. A key point to remember is that for any subtree, the left child key must be less than or equal to the root and the right child key must be greater than the root. Also, all the descendants on the left of a node must have a key less than or equal to that node and all the descendants on the right of a node must have a key greater than that node.

 References

  1. McDowell, Gayle Laakmann. Cracking the Coding Interview: 189 Programming Questions and Solutions. CareerCup, LLC, 2015.

  2. Lafore, Robert. Data structures and algorithms in Java. Sams Publishing, 2017.