
What Should You Do to Trust Event Data? Part 1 – Events Catalogue

by Shagane Mirzoian, June 28th, 2023

Too Long; Didn't Read

Unlock the power of event data and leverage it for insightful analysis. Start with simple, practical steps: define events once and reuse the definitions at the code, documentation, and data-validation level.


When it comes to event-based analytics, most teams struggle to leverage its power and get stuck at “our event data is a mess!”. I’ve been there, done that, so I have some learnings to share: theory plus practical steps on how to get it right.


Today we will talk about best practices for organizing event definition data and why you are likely documenting yesterday’s news.

Why should I care about the events catalog?

Google Sheets is usually the place where teams store their events along with triggers, definitions, and properties. While it's a good option to start with, you cannot use it as a long-term solution.


When documentation relies on manual upkeep, it is only a matter of time before it becomes outdated. One day you will not update the document after yet another iteration with the dev team, and it will store wrong information about the events being sent. Also, as the number of teams grows, you will inevitably face multiple files aiming to describe the existing events (and of course they will contradict each other).


While multiple documents try to describe the nature of events, the real behaviour is hidden in the source code and never revealed


Additionally, Google Sheets has a limited structure for capturing complex events, is not handy for validating event consistency, and, most importantly, is not connected to the source code that actually fires the events in your application.


So the main lesson on event definitions is this: there should be one source of truth for the events, and it should not be maintained manually.


The magic here is in reusing the event definition data. When stored properly, definitions can serve multiple valuable purposes: software engineers can leverage them to generate the code that fires the events, and they can be presented in a user-friendly interface to act as comprehensive event documentation for analysts and product managers.


Additionally, they can be used to validate the events sent to your data warehouse, particularly if you store the raw data. What you get as a result: your documentation always shows what is actually sent and stored – pure magic.
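To make the code-generation idea concrete, here is a minimal sketch in Python. The definition format and the generated helper are purely illustrative inventions, not the output of any specific tool:

```python
# Hypothetical sketch: generate a typed tracking helper from an event
# definition. The definition shape and the emitted code are illustrative.

def generate_tracker(event: dict) -> str:
    """Emit Python source for a function that fires this event."""
    params = ", ".join(f"{p}: str" for p in event["properties"])
    payload = ", ".join(f'"{p}": {p}' for p in event["properties"])
    return (
        f'def track_{event["name"]}({params}):\n'
        f'    send_event("{event["name"]}", {{{payload}}})\n'
    )

definition = {"name": "signup_completed", "properties": ["plan", "platform"]}
print(generate_tracker(definition))
```

Because the helper is generated from the same definition the documentation is rendered from, code and docs cannot drift apart.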



Define once and reuse – the pure magic of making events work



How can this be achieved in practice?

  1. Start by communicating with the development team to define an event description format they can use to generate code. For example, you can use JSON or YAML to define each event in a separate file. Remember that these files should contain all the information you want to have in your documentation: event name, trigger, properties with possible values and their meaning, supported platforms, comments, etc.
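For instance, a single event file in JSON could look like the sketch below. The exact schema is up to your team; every field name and value here is illustrative:

```json
{
  "name": "purchase_completed",
  "trigger": "Fired when the user lands on the order confirmation screen",
  "platforms": ["ios", "android", "web"],
  "owner": "@shagane",
  "properties": {
    "payment_method": {
      "type": "string",
      "allowed_values": ["card", "paypal", "apple_pay"],
      "description": "How the user paid for the order"
    },
    "order_value": {
      "type": "number",
      "description": "Total order value in USD"
    }
  },
  "comments": "order_value excludes shipping costs"
}
```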


  2. Choose a tool for rendering the chosen format into a frontend interface. It's important to do this research in advance so you can amend the schema if needed. In my previous project, we used JSON Schema Reader from Atlassian to transform JSON files into handy documentation.


    How it can look – parsing a JSON event definition into a page any teammate can enjoy using
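If you don't have a rendering tool yet, even a small script gets you readable pages. A minimal sketch in Python, assuming the illustrative definition fields from step 1 rather than any prescribed schema:

```python
import json

def render_event_doc(definition: dict) -> str:
    """Render a single event definition as a Markdown documentation page."""
    lines = [
        f"# {definition['name']}",
        f"**Trigger:** {definition['trigger']}",
        "",
        "| Property | Type | Description |",
        "| --- | --- | --- |",
    ]
    for prop, spec in definition["properties"].items():
        lines.append(f"| {prop} | {spec['type']} | {spec['description']} |")
    return "\n".join(lines)

definition = json.loads("""
{
  "name": "purchase_completed",
  "trigger": "User lands on the order confirmation screen",
  "properties": {
    "payment_method": {"type": "string", "description": "How the user paid"}
  }
}
""")
print(render_event_doc(definition))
```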


  3. Collect the event definitions in the chosen format and place them in a single location, e.g. a GitHub repository. This might not be a simple exercise, but it is surely worth it. To make it easier, ask the engineers whether they can pull directly from the code which events are fired and with what structure; in that case, you will only have to fill in the definitions.


  4. Set up a process for how events are added, reviewed, and updated. To provide inspiration, let me share an example of how it can be set up.


    1. Each event should have an analyst who owns it – someone who knows the business logic, the edge cases, and the metrics calculated from this event. This person should be the one to update the event as the product evolves and to review changes made by other analysts. To enforce this at the role level, you can set GitHub code owners for each event in your repository.
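A CODEOWNERS file in the repository makes this enforceable – GitHub will automatically request a review from the owner when their file changes. The paths and handles below are made up for illustration:

```
# .github/CODEOWNERS – map each event file to its owning analyst
/events/purchase_completed.json   @maria-analytics
/events/signup_*.json             @shagane
```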


    2. Updating an event should be approved by both the event owner and the development team. Software engineers can challenge unneeded properties, understand their priorities, raise questions and edge cases, and validate that the required logic can be implemented. It's a win-win for analysts and engineers, as the review step reduces misunderstanding and ambiguous naming. To enable this, each change to an event's structure should be made by opening a Pull Request (PR) that requires approvals from the dev team before it can be merged.


    3. Provide a help page for cases when an analyst does not know whom to ask for approval, decide on SLAs for PRs, collect feedback, and iteratively update the process.


  5. Validate the data you receive. Sometimes we do not get what we send – data injections introduce unauthorized data, bots generate fake events, and encoding issues lead to data loss or misinterpretation. These are all serious problems, as they can skew metrics and lead to inaccurate analysis results → wrong decisions → customer loss. To keep this under control, you should validate that you receive what you expect, and that you don't receive what you don't expect.
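A minimal sketch of such a check in Python, assuming the illustrative definition structure from earlier (your own schema and field names may differ):

```python
def validate_event(payload: dict, definition: dict) -> list[str]:
    """Compare an incoming event payload against its definition.

    Returns a list of human-readable problems; an empty list means valid.
    """
    problems = []
    expected = definition["properties"]
    # Flag properties we expect but did not receive.
    for prop in expected:
        if prop not in payload:
            problems.append(f"missing property: {prop}")
    # Flag properties we received but never defined (a common sign of drift).
    for prop in payload:
        if prop not in expected:
            problems.append(f"unexpected property: {prop}")
    # Check allowed values where the definition restricts them.
    for prop, spec in expected.items():
        allowed = spec.get("allowed_values")
        if allowed and prop in payload and payload[prop] not in allowed:
            problems.append(f"bad value for {prop}: {payload[prop]!r}")
    return problems
```

Running this against the raw data in your warehouse turns the definitions into an automatic data-quality monitor – any drift between the documented schema and what is actually sent shows up as a concrete list of problems.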


To sum up, event documentation is probably the most underestimated tool for improving data quality and ease of use. Define once and reuse – this is the mantra for a trustworthy data description. When the same definition is used in the documentation, the production code, and data validation, you unlock trustworthy events and the insights that come from them.


P.S. Don't forget to subscribe to my blog – I post and will keep posting about events and the insights they bring.

See you soon.