# Trees Interview Questions & Tips

By Kenny Polyak | Published: June 26, 2023

## What is a Tree?

A tree is a hierarchical data structure in computer science consisting of nodes connected by edges. Each node is an element which can store data, representing some scalar value or a key, and the edges contained in that node, representing references to other nodes.

In the most general sense, trees are simply directed acyclic graphs (DAGs) where each node can only have a single other node pointing to it, containing N nodes and N-1 edges. As a hierarchical structure, the topmost node in a tree is called the root node, which has no parent. Every other node in the tree has exactly one parent node and contains connections to an arbitrary number of child nodes; a node that has no children is called a leaf node.

## Node Structure

There is no superstructure that can randomly access any particular node in the tree. Instead, trees are inherently recursive: each child of a node is a subtree in itself. When handling trees in software engineering, we usually pass around a reference to the root, from which we can access and manipulate the tree data.

Here's an example of tree node definition:

``````class TreeNode {
Integer data = null;
List<TreeNode> children = new ArrayList<TreeNode>();

TreeNode(Integer value) {
data = value;
}
}``````

Typically, children don't have references to their parents. While this is a possible modification to make on the tree node if the need justifies the extra space, it is rarely done given that the tree is almost always traversed from the root. To learn more about graph and tree traversal algorithms, read about Depth-First Search and Breadth-First Search.

## Different Types of Trees

Trees can be tall or wide and everything in between. Since the only restriction on a tree is that the edges are directed and there are no cycles, we can find really tall trees where each node only has one child, or really wide trees where each node has thousands of children.

Trees can have rulesets that enforce the order of nodes on insertion, or they can be randomly built top-down.

By applying specific rulesets to trees, we can define different subtypes with their own advantages and tradeoffs. Typically, these subtypes enforce some criteria related to the height of the tree - the height of a tree is the number of edges in the longest path from the roof to a leaf - or to the order that tree nodes must be in.

There are many types of tree implementations, each with their own set of rules and advantages. Below are a just a few examples, with links to further discussion for the tree types we see most often in interviews:

• Binary Trees: a tree where each node has up to two children.
• N-ary Trees: a tree where each node can have any number of children.
• Binary Search Trees: a binary tree where each node value follows a specific ordering property - the node is greater than all the nodes in its left subtree and smaller than all the nodes in its right subtree.
• AVL and Red-black Trees: a height-balanced tree, such that the heights of the left and right subtrees do not differ more than a certain amount.
• Binary Heap Trees: a complete binary tree that satisfies the heap property, used for priority queue operations.
• Tries: an n-ary tree specifically used to store string data.

## Common Operations on Trees

Although the implementation of these operations will differ greatly based on the type of tree, let's look at the simplest case with an unbalanced N-ary tree - this is a tree without any constraints on the positions of nodes or the number of children per node.

### Insert

If there are no constraints on where a node needs to be in a tree, then insertion can be as simple as finding a leaf and adding it as a new child. Below we implement a version of insert where we add the new node as a child to a specific parent.

``````public static void insertNode(TreeNode root, Integer parentValue, Integer newValue) {
if (root == null) return;
if (root.val == parentValue) {
TreeNode newNode = new TreeNode(newValue);
return;
}
for (TreeNode child: root.children) {
insertNode(child, parentValue, newValue);
}
}``````

### Basic Search

Searching in a tree is often explored in terms of traversal algorithms, as these will determine the path taken to search. But in its simplest form, searching in a tree is merely a recursive function - taking advantage of the recursive nature of a tree - that calls itself on each child of a node until the target node is found.

`````` public static TreeNode searchNode(TreeNode root, Integer target) {
if (root == null || root.val == target) {
return root;
}

for (TreeNode child : root.children) {
TreeNode result = searchNode(child, target);
if (result != null) {
return result;
}
}

return null;
}``````

### Delete

Deleting nodes from a tree is sometimes considered an advanced topic, especially when we want to preserve subtrees or adhere to constraints on the tree structure itself. We won't be diving into those here. But, if we want to delete an entire subtree, its as easy as performing the searchNode method above, and once the target is identified, removing it from the list of children from its parent.

## When to Use Trees In Technical Interviews

Trees come up often in technical interview questions because they’re the right amount of difficulty to challenge candidates without taking an unreasonable amount of time for those who understand them. As a result, trees come up in both coding and system design interviews.

### Using Trees in Coding Interviews

The most common tree questions involve either manipulation or traversal of trees. Manipulation can look like building a tree, converting a tree from one format to another, converting a linked list to a tree, inserting/removing nodes from a tree, and the like. These really test that you thoroughly understand how the data is structured.

Tree traversals generally involve iterating over the tree or searching for data in the tree [link to Tree and Graph traversals article]. This is where we will use tools like depth-first search (DFS) and breadth-first search (BFS) to our advantage.

Most tree interview problems will explicitly have a tree data structure as an input. These are usually cases where we'll be asked to traverse the tree, but it's also important to be familiar with adding and deleting nodes from a tree.

There are some common tree questions that are derived from well-known problems in computer science that come up often in interviews. A few examples include:

• The Lowest Common Ancestor (LCA) problem: Given two nodes in a binary tree, find their lowest common ancestor node.
• Checking if a tree is a subtree of another tree: this requires being able to write an algorithm that traverses the tree and can identify duplicate sub-structures.
• Maximum Path Sum: Find the maximum sum of values along any path from the root to a leaf node in a tree.
• Find the Height of a Tree: Calculate the height of a tree (the number of edges on the longest path from the root to a leaf node).

Finally, there are problems where the use of a tree is not so obvious. One common example is using a trie to find string prefixes. If a problem asks us to find strings by their prefix, for example when looking up a term in a dictionary or when implementing an autocomplete tool, then tries are inherently a great way to do that.

### Using Trees in System Design Interviews

Here are some examples of where trees are commonly used:

1. File Systems: Many file systems, such as the hierarchical file systems used in operating systems, utilize tree structures. Directories and subdirectories can be represented as nodes in a tree.
2. Search Engines: Search engines use tree-based indexing structures like tries to efficiently store and retrieve data related to keywords or phrases.
3. User Interfaces: Tree structures are used in various user interface components, such as menus, navigation bars, and organization charts. They enable hierarchical representation and navigation of elements.
4. Compilers: Control flow graphs (CFGs) and other tree-based structures are used in compiler optimizations to analyze program flow and optimize code generation. Trees can also be used to map dependencies.
5. XML/HTML Parsing: Tree structures are used in parsing and representing XML and HTML documents. The Document Object Model (DOM) represents these documents as trees, allowing for efficient manipulation and traversal.

## Common Mistakes in Interviews Featuring Trees

• Forgetting to handle edge cases. When implementing an algorithm involving a tree, be sure to explicitly check cases where the tree has none or only one node. This is often missed when working through solutions. Given that trees are often used in recursive algorithms, we also typically need to include base cases.
• Mishandling leaf nodes. All trees with nodes have leaves, and it’s easy for our logic to unintentionally try to access the children of a leaf, which could lead to a null pointer exception.
• Not using a diagram to problem-solve and communicate. Tree logic can become very complex, given its recursive nature. Make sure to visualize your algorithm with a tree diagram before you start coding, and focus on simple and readable code.
• Not considering tradeoffs in time and space complexity. Since we can customize tree nodes to fit out use cases, be sure to discuss the space implications of your solutions with your interviewer. Often we want to optimize for time complexity, so added space can be a useful tool when looking for optimizations.
• Misunderstanding tree terminology. Make sure to correctly understand and use terms such as root, parent, child, leaf, subtree, depth, and height. For example, interview questions often ask us to find the depth or height of a tree, or count the leaves of a tree - it's important to be familiar with these concepts.
• Misusing BFS or DFS. Although there are cases where both traversal algorithms are applicable without any meaningful complexity tradeoff, like simply searching through an unstructured n-ary tree, a candidate needs to be confident about which situations call for a specific traversal. A common use-case for BFS, for example, is search for the shortest path between two nodes.

1. Can we prioritize time complexity over space? All data structures involve some kind of tradeoff, and trees are no different. In some cases, storing more metadata on the node can help make searching more efficient, but it will come at the cost of extra memory. A great example of this tradeoff is when working with tries - we can store prefix data on each node which will limit the distance needed to traverse, but we can imagine the memory implications are high when representing an entire dictionary with a trie!
2. How frequently will tree operations be performed? Knowing if the problem will be more read or write heavy will help inform the approach, as well as the kind of scale that your tree would have to support. For instance, we may be working with a problem or a system that demands a lot of reads but the dataset will not change often, like in the case of representing a family tree that will need to be queried often. In these cases, we can afford the extra complexity when constructing the tree in order to take advantage of more efficient search.
3. What operations need to be supported? If you'll be implementing a tree, make sure to ask your interviewer what operations to prioritize during the interview. In some cases, they can allow you to skip the implementation of some less-important operations. For example, if the problem merely asks us to build a tree, we can potentially skip implementing the delete operation and instead focus all our time on making an efficient insert implementation. Mention that in production we would want our data structures to support all CRUD operations (create, read, update, delete), but in the interest of time you would prefer to focus on what is essential for the problem.
4. What are the characteristics of the input tree? Be sure to determine if there are any constraints that the input tree adheres to, such as balancing or sorting. If so, this would be a clue as to what kind of tree data structure you should focus on during the interview. For instance, if the input tree is a BST, we will be expected to take advantage of this structure for efficient search. Alternatively, if the input tree is not already sorted and is not balanced, it is likely not efficient to sort and balance the tree, and instead we might be expected to introduce additional data structures like a hashmap to arrive at an optimal solution.
5. Is the input tree an n-ary tree, a binary tree or or a binary search tree? Be sure to ask your interviewer if the input tree is a specific type of tree. Not only does it never hurt to ask, but interviewers may sometimes intentionally omit details to see if you will ask follow-up questions. Don't ever assume the type of input tree!

## About the Author

Kenny Polyak

Kenny is a software engineer and technical leader with four years of professional experience spanning Amazon, Wayfair, and U.S. Digital Response. He has taught courses on Data Structures and Algorithms at Galvanize, helping over 30 students land new software engineering roles across the industry, and has personally received offers from Google, Square, and TikTok.