Time complexity - Algorithm to find the lowest common ancestor of all deepest leaves

https://cs.stackexchange.com/questions/120186

29-09-2020
|

Question

This is the problem statement I came across today.

Given a binary tree, find the lowest common ancestor of all deepest leaves.

I came up with a correct algorithm, but I would like to confirm the running time of this approach.

The algorithm is as follows:

Traverse the tree and find the deepest level of the tree, dmax.
Traverse the tree and find all leaves at depth dmax.
Given that LCA(A,B,C) = LCA(LCA(A,B),C), traverse all nodes found at step 2 and calculate LCA.

The subroutine for LCA(A,B) is simple. Starting at A, go all the way up to the root and store each visited node in a hashset. Then, starting at B, go up until you find a node which is contained in the hashset.

I know the first two steps of the algorithm are both O(n), where n corresponds to the number of nodes in the tree. However, I am uncertain about the last step.

LCA(A,B) subroutine has a linear complexity. But how many nodes, in the worst scenario, can be found at step 2?.

Intuitively, I would argue that it has to be far less than n nodes.

Solution

As @Rick Decker explained, you could have $n/2$ leaves at the max depth in the one case. In this case, step 3 is $O(n\log n)$. This post shows the worst case. Consider a tree $T$ consists of a chain of $n/2$ nodes where the remaining $n/2$ nodes are attached as a balanced tree at the bottom of the chain. This gives every leaf depth $n/2+\log_2(n)=\Theta(n)$ With $n/4$ leaves at depth $\Theta(n)$ we have $\Theta(n^2)$ runtime for step 3 in this case. This is asymptotically the worst case since we have $n$ nodes at max depth $n$.

There's a better way to do this. Lets define a function $f$

$ f(v) = \begin{cases} v & \text{if}\quad \texttt{height}(v.left) = \texttt{height}(v.right) \\ f(v.left) & \text{if}\quad \texttt{height}(v.left) > \texttt{height}(v.right) \\ f(v.right) & \text{if}\quad \texttt{height}(v.left) < \texttt{height}(v.right) \end{cases} $

If the heights of the children of a node $v$ are the same, then clearly $v$ is the LCA of the deepest nodes of the subtree rooted at $v$. If the left subtree is taller, then we want the LCA of the deepest nodes of the subtree rooted at $v.left$, since they are deeper than the deepest nodes in the subtree rooted at $v.right$. The same logic follows for $v.right$ when it is taller.

The values for $\texttt{height}$ and $f(v)$ can be computed in a post-order traversal of $T$ in linear time.

Calling $f(root)$ should return the LCA of the deepest nodes in the tree.

OTHER TIPS

Well, I don't know what you intend by "far less than" but it's clear that you could have about $n/2$ leaves at maximum depth: take a complete binary tree, for example, with the lowest level full.

I'd write a function which for each tree calculates the lowest common ancestor of all deepest nodes, and the height of the tree. This is quite simple:

If the root R has no left or right branch, then the height is 1 and the common ancestor is R. If the root R has either a left or right branch, then calculate height and lowest common ancestor of that branch, and the height with root R is one higher, but the lowest common ancestor is the same.

If there is a left and a right branch, then calculate height and lowest common ancestor of both branches. If the heights are different, then we return (height + 1) plus the lowest ancestor of that branch. If the heights are the same, then we return (height + 1), and R as the lowest common ancestor.

That should take cN steps with a small constant c if the number of nodes is N. And since we must visit every node N (at least to know that it has no branches), this cannot be improved upon.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange