The tree data structure is one of the most fascinating data structures in computer science. Many interesting problems can be framed based on this data structure. Today we will be discussing one such interesting problem. Here is how the problem can be defined -
Given a rooted tree with n nodes, you have to answer q queries. In each query you have to find the k^th ancestor of a node.
Let us first discuss a simple solution. For each query, we iterate over all the ancestors of the given node. We return the required ancestor node. Here is a pseudo implementation.
def solveQuery(node, k):
while k>0:
k--;
node = parent[node];
return node;
Though the implementation is simple, the worst-case time complexity is O(N.Q)
. Thus the solution won’t work when quadratic implementation is too costly. This is where Binary Lifting comes into the picture. It is an interesting idea of solving the same problem in O(log(N).Q)
.
With our failed attempt in solving the problem efficiently, we can feel the need for some pre-processing. The pre-processing that we perform should help us answer each query quickly. The big question is, what kind of pre-processing does the problem need?
Pre-processing is the process of performing (some) computation beforehand to support the handling of each query in a better and more efficient manner. Some meta-data is maintained as a part of this process, which is used when answering actual queries.
Let us think about it. We currently know for each node its parent. So from any node, we can only find 1^st
ancestor of a node in constant time. So when we are faced with the problem of finding the k^th
ancestor we are forced to take at least k
hops. In other words, we are representing k
as 1 + 1 + … + 1
, where 1
appears k
times. So, what if we can find any (2^x)th
parent of a node in constant time? We can certainly represent k
as the sum of the powers of 2
. Since k
can be at most n
, we can always represent k
as the sum of log(n)
elements. Thus answering each query now takes O(log(n))
time.
Thus, if we can perform the preprocessing efficiently, the overall time complexity becomes O(PP + Q.log(N))
.
The complete process of Binary Lifting is divided into two parts - (1) Pre-Processing and (2) Query Resolution. We will be discussing these in the following sub-sections.
According to the ideas discussed in the motivation section, we will need to store (2^x)th
ancestor for each node. Let us store this information in a table ancestor
. The dimensions of this table are n*log(n)
. Here is what each entry in this table represents -
ancestor[n][x] -> (2^x)th parent of node n
Note that since the first ancestor of any node is its own parent we can write ancestor[n][0] = parent[n]
for each node n
. Now to fill the remaining entries we can follow the following recursive relation -
ancestor[n][x] = ancestor[ancestor[n][x-1]][x-1]
The idea with which we can arrive upon the above relation is as follows - To reach the (2^x)th
ancestor we can make two jumps of size (2^(x-1))
. Since (2^(x-1)) + (2^(x-1)) = (2^(x))
, we will ultimately reach the required ancestor. Thus we have all the required knowledge for performing the pre-processing step. Here is a pseudo implementation.
for node in [0:n-1]:
ancestor[node][0] = parent[node]
// We can not go up from root
for x in [1:log(n)]:
ancestor[root][x] = root
for x in [1:log(n)]:
for node in [0:n-1]:
if node is root then continue
ancestor[node][x] = ancestor[ancestor[node][x-1]][x-1]
Once we have completed the pre-processing part, we can answer each query efficiently. The idea is the same as that discussed in the motivation section, we will express k as the sum of powers of 2. We will then move up the tree following these powers of 2. The pseudo-code below expresses this idea more succinctly.
def solveQuery(node, k):
for i in [31:0]:
if k & (1<<i) > 0: // Checking if 2^i is present in binary representation of k
node = ancestor[node][i] // Move up to (2^i)th ancestor
return node
The below code assumes that the tree consists of n nodes from 0 to n-1. Since a tree with n nodes has n-1 edges, the input provides these n-1 edges in parent to child format. The tree is rooted at 0.
#include <bits/stdc++.h>
using namespace std;
vector<int> parent;
vector<vector<int>> ancestor;
int main() {
int n; cin>>n; // n is the number of nodes numbered 0 to n-1
int LOG = log2(n);
parent.assign(n, 0);
ancestor.assign(n, vector<int>(LOG+2));
// Assuming node 0 is the root of the tree
parent[0] = 0;
// Assuming next n-1 lines contain edges in format: parent_node child_node
for(int i=0; i<n-1; ++i) {
int p, c; cin>>p>>c;
parent[c] = p;
}
for(int i=0; i<n; ++i) ancestor[i][0] = parent[i];
for(int x=1; x<=LOG+1; ++x) ancestor[0][x] = 0;
for(int x=1; x<=LOG+1; ++x) {
for(int i=0; i<n; ++i) {
ancestor[i][x] = ancestor[ancestor[i][x-1]][x-1];
}
}
int q; cin>>q; // Number of Queries
while(q--) {
int node, k; cin>>node>>k;
for(int i=LOG+1; i>=0; i--) {
if( k & (1<<i) ) node = ancestor[node][i];
}
cout<<node<<endl;
}
}
Having understood the interesting ideas of Binary Lifting, we can focus our attention on some of its interesting applications. One such application is to find the lowest common ancestor of a given pair of nodes in a tree. Also, assume that there can be Q
queries of such kinds. Surprisingly we can use the process of Binary Lifting to solve this task quite elegantly.
We begin with a description of the algorithm. We will then see where Binary Lifting comes into play.
We are given two nodes and say n1 and n2 are pointers to these nodes. We have to find their LCA.
Move up the pointer pointing to the node at a larger depth (farthest from the root) until the depth of both the nodes become equal.
If both n1 and n2 point to the same node, the node being pointed is the LCA. We return this node and terminate. Else, we move to step 4.
We advance both node pointers up the tree until they can no longer be moved without pointing to the same node.
Return the parent of the node pointed by n1 as the LCA.
The algorithm is fairly simple and a naive implementation will take O(n) time to return the LCA. However, the catch in this algorithm is that we can use Binary Lifting to speed up steps 2 and 4. We first see how it fits in step 2.
Consider that the node pointer n1
points to the node at a larger depth, if not swap n1
and n2
. Now according to step 2, we need to move the node pointer n1
up until the depth of the newly pointed node becomes equal to that of the node pointed by n2
. We can see this as finding [depth(n1)-depth(n2)]th
ancestor of the node pointed to by n1
. This is solvable in logarithmic time using Binary Lifting.
Now on to step 4. At the point of reaching step 4, we are certain that nodes pointed by n1
and n2
are different. We have to move up both the node pointers simultaneously while ensuring they do not point to the same node. The following code snippet describes how we can use binary lifting to perform this step.
n1, n2 <- Node Pointers
for (int i=log(n); i>=0; i--) {
if(ancestor[n1][i]!=ancestor[n2][i]) {
n1 = ancestor[n1][i];
n2 = ancestor[n2][i];
}
}
return parent[n1] as LCA
This completes this amazing process of Binary Lifting. It is often fascinating how simple ideas can be used to solve problems of great difficulty. We discussed one of the many applications of Binary Lifting i.e., Lowest Common Ancestor(LCA). Can you figure out any others? Do let me know in the comment section😊.
You can follow me on Twitter at @rrriiisss03. Stay tuned for more interesting articles.