Childhood moments with father Today, I am remembering my childhood moments with my father. My father is a dreamer and he always encouraged me to dream bigger. His encouragement for me to dream and his support for me in pursuing my dreams have shaped greatly what I am today. I was born and raised in the days when gadgets and the internet were not common. At that time, my father loved to scribble when he was free, thanks to the unavailability of smartphones and the internet in my home. Sometime, I would find him writing a poem and other times, I would find him designing an interior architecture of a house (for fun). He loved cards. He used to see if the day is fortunate by playing the cards. I think cards speak the truth. The day he could not crack the cards fortune puzzle, he used to spend the whole day trying to make it right. I remember my father playing with “Regmi ancestry tree”, sometime in his big notebook and sometime in chart papers. He was fond of drawing the tree with a marker pen and giving a rectangular boundary to each person (node). His edges would be linear block edges with the outgoing arrows. Remembering my father’s love with the tree and thanks to my love for the computer science tree, I to build the tree using Python. tried Regmi ancestry tree created using Python in 2017 Random “Regmi” Today, I want to take a random “Regmi” from there (just for fun and not any genetic study) and was thinking of a programmatic way of doing it. It turns out it is a popular problem of computer science — getting a random node from a tree with equal probability. Just as how I wrote a about getting a random line from a file in one pass previously with code and proof of correctness, I do the same for getting a random node from a tree. blog Random node from a tree For the selection of node to be completely random process, each node should have an equal probability of being selected. As the total number of nodes is denoted by , the probability of any node being selected should be . n 1 / n Here, the time and space solution will be presented with the proof of correctness at the end. O(n) O(n) Algorithm Traverse through the whole tree and save the total number of nodes for all the subtrees in the tree. The total number of nodes for all the subtrees can be saved in any format and that takes the space . O(n) Traverse through the tree from the root and traverse on the following rule. Suppose the left subtree and right subtree of the node that we currently are on have and elements respectively. If , return the node. Otherwise, get a random integer ( ) in the range . If , go left. If , select the element. If , go right. Continue the process until a node is returned. The node returned is the random node. b c b = c = 0 i [1, b + c + 1] i ≤ a i = a+1 i > a+1 Code Let’s suppose we have the tree as shown in the figure below. For simplicity, we have made all the nodes’ value unique. A binary tree The function counts the number of nodes in all the subtrees and saves in the attribute of the instance. The function gets the random node of the tree. We have returned here the value of the random node. count size Node get_random_node random root: count(root.left) + + count(root.right) a = root.size
    b = root.left.size root.left c = root.right.size root.right b == c == : root.val
    
    rand_num = random.randint( , a) rand_num <= b: get_random_node(root.left) rand_num == b + : root.val : get_random_node(root.right) import # one pass in the whole tree to count the size of all the subtrees : def count (root) if not return 0 return 1 # function to get the random node : def get_random_node (root) if else 0 if else 0 if 0 return 1 if return elif 1 return else return Let’s simulate the random events and see the percentage frequency of each node's value. As expected, the percentage frequency is nearly the same for all the node values. Proof of correctness We do the proof of correctness here by induction. The total number of nodes in the tree is . n For any node in the tree, let , and be the number of nodes in the subtree (containing the node), left subtree and right subtree respectively. a b c Differentiation is made between the selection and arrival process as the selection is selecting (and returning) the node whereas arrival is arriving at the node while traversing. Arrival at some node does not guarantee the selection of the node but arrival at some node is mandatory for the selection of the node. First, let’s prove the randomness of the arrival process. Based on the randomness of the arrival process, we prove the randomness of the selection process. Let’s suppose the arrival process is random up to some node. So, the probability of we coming at that node is . a / n Now, the probability of arriving at the left subtree of that node is the probability of arriving at the node multiplied by the probability of choosing the left subtree. So, the probability of arriving at the left subtree: = a / * b / a
= b / n n Similarly, the probability of arriving at the right subtree: = a / * c / a
= c / n n So, if the arrival process at some node is random, the arriving process is random on its children too. At the root node, the arrival probability according to our formula is . As we always start from the root node, the process satisfied the calculated probability. n / n = 1 Hence, it is proved that the arrival process is random. Now, let’s prove that the selection process is random. As the arrival process is random, the probability of arrival at some node is: no of nodes in the subtree / total number of nodes in the tree (n) Now, the probability of the node being selected is: = probability of arrival at that node * probability of selecting that node
= no of nodes in the subtree / * / no of nodes in the subtree
= / n 1 1 n As the probability of selection of any node is , the selection process is random. 1 / n Proved . Beauty of algorithms I remember a quote from Nelson Mandela, “It always seems impossible until it is done” and think about the algorithmic/mathematical puzzles. There are a number of mathematical/algorithmic puzzles that have taken decades to get solved. Some are still to be solved and many smart minds are burning the midnight oil to solve them. For example, you can find “ ” posted by Clay Institute which have one million USD cash prize for the solutions. Once a problem is solved, the solution is studied by many other researchers, scientists, and engineers. The solution is refined, optimized and studied more by other researchers and scientists. The solution is used as far as practicable by engineers in as many applications. Millennium Problems What do the problem-solvers see in the problems, solutions, proofs, optimizations, and implementations whether they are researchers, scientists or engineers? — Beauty. And it starts with some small problem, small algorithm, small proof. If you found this or any other algorithm beautiful, comment here. I would love to know which algorithm’s beauty are you most fascinated by.

How to Select a Random Node from a Tree

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

Breaking Down Secretary Problem

10 Things Everyone Should Know About Machine Learning

10 Repositories that Will Transform the Way You Approach Technical Interviews

10 (Free) Data Structure and Algorithm Courses Junior Developers Should Explore

10 Data Structure & Algorithms Books Every Programmer Should Read

The Noonification: How to Develop a DSL in Kotlin (12/12/2023)

Breaking Down Secretary Problem

10 Things Everyone Should Know About Machine Learning

10 Repositories that Will Transform the Way You Approach Technical Interviews

10 (Free) Data Structure and Algorithm Courses Junior Developers Should Explore

10 Data Structure & Algorithms Books Every Programmer Should Read

The Noonification: How to Develop a DSL in Kotlin (12/12/2023)

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps