paint-brush
Efficiently Manage Data with Hierarchical Tree Structure in .NET C#by@ahmedtarekhasan
556 reads
556 reads

Efficiently Manage Data with Hierarchical Tree Structure in .NET C#

by Ahmed Tarek HasanMay 2nd, 2023
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

Design a data structure for Hierarchical Tree Form data and its related operations in .NET C#. In simple words, this is data presented into parent-child nodes. In this article, I would provide you with one of the possible designs for such cases.
featured image - Efficiently Manage Data with Hierarchical Tree Structure in .NET C#
Ahmed Tarek Hasan HackerNoon profile picture

Sometimes, you may find yourself in need of dealing with Hierarchical Tree Form data. In simple words, this is data presented in parent-child nodes.


In such situations, you might struggle with the complexity of the implementation, especially when dealing with a large amount of data.


In this article, I will provide you with one of the possible designs for such cases. However, you need to keep in mind that, as usual, having a generic design can offer abstracted capabilities, but it might lack specific enhancements for certain scenarios.


Therefore, I recommend treating the design presented in this article as a mind opener. You should study it, learn from it, gain ideas from it, and ultimately adapt it to your own needs.


Photo by Markus Winkler on Unsplash

Main Goals

These are the main goals of this design:

  1. Represent Hierarchical data into a tree data structure.
  2. Have the capability of creating nodes in isolation.
  3. Have the capability of attaching and detaching nodes.
  4. Have the capability of searching nodes with an appropriate performance.
  5. Don’t lose the edge of strongly typed data.


Therefore, while moving across our design, we need to keep in mind these requirements.


Photo by Med Badr  Chemmaoui on Unsplash

The Design

First, we need to keep in mind that the data itself could be any business entity. However, the Hierarchical structure is something else. This is the structure controlling the relation between nodes, how to add, how to remove, how to search,… and so on.


Payload

namespace HierarchicalData.Abstractions
{
    public abstract class Payload
    {
    }
}


What we can notice here:

  1. This is the main entity representing the business entity to be wrapped into our Hierarchical structure.
  2. We didn’t add here any members but for sure you can add some if you think that this is common between all your business entities.



NodeType

namespace HierarchicalData.Abstractions
{
    public enum NodeType
    {
        Structural = 1,
        Leaf = 2
    }
}


What we can notice here:

  1. This is an enum representing the type of the node.
  2. We have two node types only; Structural and Leaf.
  3. Structural node is the node which only provides info about the Hierarchical structure. In simple words, it is like a folder or a node which would have child node(s).
  4. Leaf node is the node which would only provide the business data. In simple words, it is the node that wraps the data and would not have any children.



INode

namespace HierarchicalData.Abstractions
{
    public interface INode
    {
        string Id { get; }
        string Name { get; }
        string PathPart { get; }
        NodeType Type { get; }
        IStructuralNode Parent { get; set; }
    }
}


What we can notice here:

  1. This is the interface representing the common members of any node whether it is Structural or Leaf.
  2. Id is the unique identifier of the node. This should be absolute unique. I implemented it as a string but for sure you are free to change it as per your needs.
  3. Name is the name of the node. This should at some point be used for the presentation layer.
  4. PathPart is the name to be used to represent the path of the node.
  5. Parent is the parent node. Its type is IStructuralNode as it could not be a Leaf because a Leaf would never have a child.



IStructuralNode

using System.Collections.Generic;

namespace HierarchicalData.Abstractions
{
    public delegate void ChildNodeAddedEventHandler(object sender, IStructuralNode node, INode added);

    public delegate void ChildNodeRemovedEventHandler(object sender, IStructuralNode node, INode removed);

    public interface IStructuralNode : INode
    {
        event ChildNodeAddedEventHandler ChildNodeAdded;
        event ChildNodeRemovedEventHandler ChildNodeRemoved;

        IReadOnlyCollection<INode> Children { get; }
        new NodeType Type => NodeType.Structural;

        void AddChild(INode child);
        void RemoveChild(INode child);
        IReadOnlyDictionary<string, INode> GetFlatChildren();
    }
}


What we can notice here:

  1. We defined the delegate ChildNodeAddedEventHandler to represent a handler of an event to be fired when a node is added as a child under another node.
  2. We also defined the delegate ChildNodeRemovedEventHandler to represent a handler of an event to be fired when a node is removed from the children of another node.
  3. We defined IStructuralNode interface which extends INode.
  4. It has the two events; ChildNodeAdded and ChildNodeRemoved.
  5. It has a read-only collection of children nodes.
  6. Its node type by default would be Structural.
  7. It has a method to add child.
  8. It has a method to remove child.
  9. It also has a method to return a flat collection of all nested child nodes.



ILeafNode and ILeafNode<TPayload>

namespace HierarchicalData.Abstractions
{
    public interface ILeafNode : INode
    {
        Payload Payload { get; }
        new NodeType Type => NodeType.Leaf;
    }

    public interface ILeafNode<out TPayload> : ILeafNode where TPayload : Payload
    {
        new TPayload Payload { get; }
    }
}


What we can notice here:

  1. The ILeafNode is extending the INode.
  2. It has the Payload property which returns the wrapped data of type Payload.
  3. It hides the parent Type and defaults it to Leaf.
  4. The ILeafNode<out TPayload> hides the parent Payload and defines a typed one.



Node

using System.Linq;
using HierarchicalData.Abstractions;

namespace HierarchicalData.Implementations
{
    public abstract class Node : INode
    {
        protected Node(
            string id,
            string name,
            string pathPart,
            NodeType type,
            IStructuralNode parent)
        {
            Id = id;
            Name = name;
            PathPart = pathPart;
            Type = type;
            Parent = parent;
        }

        public string Id { get; }
        public string Name { get; }
        public string PathPart { get; }
        public NodeType Type { get; }
        public IStructuralNode Parent
        {
            get => m_Parent;
            set
            {
                if (m_Parent != value)
                {
                    if (value != null && value.Children.All(c => c.Id != Id))
                    {
                        value.AddChild(this);
                    }
                    else if (value == null)
                    {
                        m_Parent.RemoveChild(this);
                    }

                    m_Parent = value;
                }
            }
        }
    }
}


What we can notice here:

  1. It implements INode.
  2. The implementation is simple.
  3. If you like, you can extend this to implement IEquatable<Node>.
  4. We also allow the user to set the Parent directly through the Parent property. However, the implementation is delegated to the AddChild and RemoveChild methods.



LeafNode<TPayload>

using HierarchicalData.Abstractions;

namespace HierarchicalData.Implementations
{
    public class LeafNode<TPayload> : Node, ILeafNode<TPayload> where TPayload : Payload
    {
        public LeafNode(
            string id,
            string name,
            string pathPart,
            IStructuralNode parent,
            TPayload payload) : base(id, name, pathPart, NodeType.Leaf, parent)
        {
            Payload = payload;
            parent?.AddChild(this);
        }

        Payload ILeafNode.Payload => Payload;

        public TPayload Payload { get; }
    }
}


What we can notice here:

  1. It inherits from Node and implements ILeafNode<TPayload>.
  2. Worth to mention here is that in the constructor implementation, if a parent is set, we call the AddChild on the parent passing in this.



StructuralNode

using System.Collections.Generic;
using System.Linq;
using HierarchicalData.Abstractions;

namespace HierarchicalData.Implementations
{
    public class StructuralNode : Node, IStructuralNode
    {
        private readonly List<INode> m_Children;
        private readonly Dictionary<string, INode> m_Catalog;

        public event ChildNodeAddedEventHandler ChildNodeAdded;
        public event ChildNodeRemovedEventHandler ChildNodeRemoved;

        public StructuralNode(
            string id,
            string name,
            string pathPart,
            IStructuralNode parent = null,
            List<INode> children = null) : base(id, name, pathPart, NodeType.Structural, parent)
        {
            m_Children = new List<INode>();
            m_Catalog = new Dictionary<string, INode>();

            if (children != null && children.Any())
            {
                foreach (var child in children)
                {
                    AddChild(child);
                }
            }

            parent?.AddChild(this);
        }

        public IReadOnlyCollection<INode> Children => m_Children;

        public void AddChild(INode child)
        {
            if (child != null && Children.All(c => c.Id != child.Id))
            {
                child.Parent?.RemoveChild(child);

                m_Catalog.Add(child.Id, child);
                m_Children.Add(child);

                OnDirectChildNodeAdded(child);

                if (child.Type == NodeType.Structural && child is IStructuralNode structuralNode)
                {
                    foreach (var keyValuePair in structuralNode.GetFlatChildren())
                    {
                        m_Catalog.Add(keyValuePair.Key, keyValuePair.Value);
                        OnNestedChildNodeAdded(keyValuePair.Value.Parent, keyValuePair.Value);
                    }

                    structuralNode.ChildNodeAdded += AddHandler;
                    structuralNode.ChildNodeRemoved += RemoveHandler;
                }

                child.Parent = this;
            }
        }

        public void RemoveChild(INode child)
        {
            if (child != null && Children.Any(c => c.Id == child.Id))
            {
                m_Catalog.Remove(child.Id);
                m_Children.Remove(child);

                OnDirectChildNodeRemoved(child);

                if (child.Type == NodeType.Structural && child is IStructuralNode structuralNode)
                {
                    foreach (var keyValuePair in structuralNode.GetFlatChildren())
                    {
                        m_Catalog.Remove(keyValuePair.Key);
                        OnNestedChildNodeRemoved(keyValuePair.Value.Parent, keyValuePair.Value);
                    }

                    structuralNode.ChildNodeAdded -= AddHandler;
                    structuralNode.ChildNodeRemoved -= RemoveHandler;
                }

                child.Parent = null;
            }
        }

        public IReadOnlyDictionary<string, INode> GetFlatChildren()
        {
            return m_Catalog;
        }

        protected void OnDirectChildNodeAdded(INode added)
        {
            ChildNodeAdded?.Invoke(this, this, added);
        }

        protected void OnNestedChildNodeAdded(IStructuralNode node, INode added)
        {
            ChildNodeAdded?.Invoke(this, node, added);
        }

        protected void OnDirectChildNodeRemoved(INode removed)
        {
            ChildNodeRemoved?.Invoke(this, this, removed);
        }

        protected void OnNestedChildNodeRemoved(IStructuralNode node, INode removed)
        {
            ChildNodeRemoved?.Invoke(this, node, removed);
        }

        private void AddHandler(object sender, IStructuralNode node, INode added)
        {
            m_Catalog.Add(added.Id, added);
            OnNestedChildNodeAdded(node, added);
        }

        private void RemoveHandler(object sender, IStructuralNode node, INode removed)
        {
            m_Catalog.Remove(removed.Id);
            OnNestedChildNodeRemoved(node, removed);
        }
    }
}


Before analyzing this code, we need to understand how it is going to work.


As we said, nodes should be able to be created in isolation. This means that the end user should be able to create a node without setting any parent or even children.


Then he should be able to attach this node as a child to another node. Also, he should be able to attach other nodes as children to the children of this node,… and so on.


For all these actions, our code should keep the structure and relations between nodes intact.


Also, we want to make it easy to search for nodes. A special case would be searching by Id. This should be as fast as we can.


Therefore, keeping this in mind, let’s analyze the code.


What we can notice here:

  1. In the constructor, if a collection of children is set, we call the AddChild method passing in each child of the children.
  2. Also, if a parent is set, we call the AddChild on the parent passing in this.
  3. To make it easy to search for a node, we defined Dictionary<string, INode> m_Catalog. This is a catalog where we keep references to all the nodes -even the nested ones- below the current node.
  4. To keep this catalog always synchronized, the current node subscribes to the ChildNodeAdded and ChildNodeRemoved events of every direct child nodes at the moment they are added.
  5. Following the same rule, the current node unsubscribes from the ChildNodeAdded and ChildNodeRemoved events of every direct child nodes at the moment they are removed.
  6. Also, we need to keep in mind that whenever a node is added to the current node as a direct child, this node might have other nested subtree. In this case, the current catalog must be updated with all the nested children of the subtree as well.
  7. So, having this said, let’s have a look on the implementation of AddChild(INode child).
  8. child.Parent?.RemoveChild(child); makes sure that the new child is detached from its old parent before attaching it to the new parent which is the current node.
  9. m_Catalog.Add(child.Id, child); adds the new child to the catalog.
  10. m_Children.Add(child); adds the child to the Children collection.
  11. OnDirectChildNodeAdded(child); fires the ChildNodeAdded event to notify the direct parent that a new child is added and eventually new updates should be applied to its catalog.
  12. Then we check if the new child is a Structural node because in this case we need to listen to the changes that could be applied to it.
  13. So, if this is true, we loop on the new child’s flat children, add them one by one to the current catalog and also fire the ChildNodeAdded event but this time with the proper values of node and added arguments. This is done to notify the direct parent with the nested updates as well. Note that the direct parent, would also do the same and notify its direct parent,… and so on.
  14. As we said, the current node should subscribe to the ChildNodeAdded and ChildNodeRemoved events of the new child so that it is always updated. This is what we do on structuralNode.ChildNodeAdded += AddHandler; and structuralNode.ChildNodeRemoved += RemoveHandler;.
  15. child.Parent = this; sets the parent of the child node to the current node.
  16. Now, let’s have a look on the implementation of RemoveChild(INode child).
  17. m_Catalog.Remove(child.Id); removes the child from the catalog.
  18. m_Children.Remove(child); removes the child from the Children collection.
  19. OnDirectChildNodeRemoved(child); fires the ChildNodeRemoved event to notify the direct parent that a child is removed and eventually new updates should be applied to its catalog.
  20. Then we check if the new child is a Structural node because in this case we need to listen to the changes that could be applied to it.
  21. So, if this is true, we loop on the child’s flat children, remove them one by one from the current catalog and also fire the ChildNodeRemoved event but this time with the proper values of node and added arguments. This is done to notify the direct parent with the nested updates as well. Note that the direct parent, would also do the same and notify its direct parent,… and so on.
  22. As we said, the current node should unsubscribe from the ChildNodeAdded and ChildNodeRemoved events of the child to be removed. This is what we do on structuralNode.ChildNodeAdded -= AddHandler; and structuralNode.ChildNodeRemoved -= RemoveHandler;.
  23. child.Parent = null; resets the parent to null.

I know that this part was tricky, but, if you just go through it again, you will find it not that complicated.



TreeNavigator

using System;
using System.Collections.Generic;
using System.Linq;
using HierarchicalData.Abstractions;

namespace HierarchicalData.Implementations
{
    public class TreeNavigator
    {
        public string[] GetNodeFullPath(INode node)
        {
            var result = new List<string> { node.PathPart };

            var parent = node.Parent;

            while (parent != null)
            {
                result.Insert(0, parent.PathPart);
                parent = parent.Parent;
            }

            return result.ToArray();
        }

        public INode SearchById(IStructuralNode root, string id)
        {
            if (root?.GetFlatChildren().ContainsKey(id) ?? false)
            {
                return root.GetFlatChildren()[id];
            }

            return null;
        }

        public IList<INode> Search(IStructuralNode root, Func<INode, bool> predicate)
        {
            var flat = root?.GetFlatChildren();

            if (flat != null)
            {
                return flat
                       .Where(entry => predicate(entry.Value))
                       .Select(entry => entry.Value)
                       .ToList();
            }

            return new List<INode>();
        }

        public int FindNodeDepth(IStructuralNode root, INode node)
        {
            var foundNode = SearchById(root, node.Id);

            if (foundNode != null)
            {
                var depth = 0;
                var parent = node.Parent;

                while (parent != null)
                {
                    depth++;

                    if (parent.Id == root.Id)
                    {
                        break;
                    }

                    parent = parent.Parent;
                }

                return depth;
            }

            return 0;
        }
    }
}


This is like a Helper class which provides extra capabilities like searching the nodes, evaluating full paths, finding node depth,… and so on.


Internally, it is making a good use of the well defined and established structure we defined on our nodes classes.


I didn’t abstract this class but for sure you can do so and you can extend it with as many capabilities as you need.


Photo by Elisha Terada on Unsplash, adjusted by Ahmed Tarek

Final Words

I was going to write a sample application to show you how efficient our design is, but, after a second thought, I decided to leave this to you as an exercise.


So, I know that when you look at this design you might feel that it is too hard to understand. However, you need to keep in mind how complex the job is in the first place.


As I said at the beginning, you need to use the design in this article as a mind opener, give it some thought, learn from it, drop what you don’t need, adapt it, enhance it, and finally start using it. This would always be the right way to learn and gain new skills.


That’s it, hope you found reading this article as interesting as I found writing it.


Also published here.