The Operational Analytics Loop: From Raw Data to Models to Apps, and Back Again

Written by borisjabes | Published 2021/06/07

TLDR: Over the next decade, we’ll see an incredible transformation in how companies collect, process, transform, and use data. DataOps has always struggled with a gap between the data stakeholders who use data (e.g. RevOps) and the data team that delivers it, and it doesn’t solve for one vital element: the request-and-wait cycle. Reverse ETL tools have made it possible to put collected and cleaned data into action. Closing this loop creates a self-reinforcing virtuous cycle of data, where daily data-driven decisions are made better.

Over the next decade or so, we’ll see an incredible transformation in how companies collect, process, transform, and use data. Though it’s tired to trot out Marc Andreessen’s “software will eat the world” quote, I have always believed in the corollary: “Software practices will eat the business.” That takeover is starting with data practices.
To understand modern DataOps, we can look to DevOps principles. DevOps completely changed engineering, using practices like continuous integration and continuous deployment (CI/CD) to close the gap between development and operations. Today, we’re starting to see DataOps wash over the analytics world, pushing it toward the same kind of repeatability, flexibility, and speed in data operations and processes. However, DataOps doesn’t solve for one vital element: the request-and-wait cycle.
Unfortunately, DataOps hasn’t been able to fully parallel the success of DevOps. Where DevOps leverages a continuous, functional loop of code deployment between teams, DataOps has always struggled with a gap between the data stakeholders who use data (e.g. RevOps) and the data team that delivers it. 
At least, that was true until reverse ETL tools came on the market.
Reverse ETL tools have made it possible for collected and cleaned data to be put into action and, furthermore, for that data to be operationalized to eliminate the request-and-wait cycle. This new genre of data tools closes the feedback loop that separated DataOps from DevOps and makes it possible for teams to deploy near-real-time data and insights to core apps and services. With this operational analytics loop in place, teams can automate data deployment and realize the continuous integration and continuous delivery of data.

How the operational analytics loop works

The cycle of the operational analytics loop pulls raw data from individual apps, models that data, and then sends the modeled data back into each app. It’s made up of three core tools that all revolve around a cloud data warehouse: an ETL tool, a data modeling tool, and a reverse ETL tool.
The cycle starts with an ETL tool, which collects raw data coming in from apps and sends it to the data warehouse. Once the data is in the warehouse, data teams can model it efficiently using a modeling tool. Finally, a reverse ETL tool automatically validates the modeled data and deploys it back to the apps, saving the data team from writing a mess of SQL to move the data from point A to point B. Closing this loop creates a self-reinforcing virtuous cycle of data (to steal some verbiage from DevOps), where daily data-driven decisions are made better by quality data, and quality data is, in turn, made better by future good decisions.
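To make the shape of the loop concrete, here is a minimal, purely illustrative Python sketch. The in-memory list standing in for the warehouse, the function names, and the sample records are all hypothetical; a real loop would run through an ETL tool, a cloud warehouse, a modeling tool, and a reverse ETL tool rather than three functions.

```python
# Illustrative stand-ins only: a real setup would use an ETL tool, a cloud
# warehouse, a modeling layer, and a reverse ETL tool instead of these functions.

warehouse = []  # stand-in for the cloud data warehouse


def extract_and_load(app_records):
    """ETL step: collect raw records from apps and land them in the warehouse."""
    warehouse.extend(app_records)


def model():
    """Modeling step: turn raw records into a tidy, per-account summary."""
    spend_by_account = {}
    for record in warehouse:
        spend_by_account[record["account"]] = (
            spend_by_account.get(record["account"], 0) + record["amount"]
        )
    return spend_by_account


def sync_back(modeled, app):
    """Reverse ETL step: push modeled fields back into the app where work happens."""
    for account, spend in modeled.items():
        app.setdefault(account, {})["total_spend"] = spend


# One pass around the loop with made-up data.
crm = {}
extract_and_load([{"account": "acme", "amount": 500}, {"account": "acme", "amount": 120}])
sync_back(model(), crm)
print(crm)  # {'acme': {'total_spend': 620}}
```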
Without this final step of what is essentially putting data back where it came from, the loop is a sad line. Modeled data is left to languish in a data warehouse, in some dashboard once made and long forgotten, or in an executive report getting ever-more-buried in someone’s inbox. 
The most important part of the operational analytics loop is that it makes data immediately actionable. Instead of, for example, only using Stripe transaction data to calculate monthly recurring revenue, a team can use that transaction data to automatically prioritize customer support tickets by metrics like customer spend, through the following steps (a rough sketch of the ranking model appears after the list):
  • Step 1. Use an ETL tool to pull raw data from Stripe, Zendesk, and Segment.
  • Step 2. Use a data modeling tool to create a library of models that identify high-priority accounts, ranking by total spend amount, on-site and in-product behavior, or type of subscription. 
  • Step 3. Use a reverse ETL tool to pull that modeled data out of the data warehouse and send it back into Zendesk, where incoming support tickets are automatically ranked by importance.
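As a rough sketch of the ranking model referenced above, the snippet below uses an in-memory SQLite database as a stand-in warehouse. The table names, columns, and the spend-based priority rule are assumptions for illustration; in practice the model would live in a modeling tool and a reverse ETL tool would write the priority field back into Zendesk.

```python
import sqlite3

# Hypothetical warehouse tables and a made-up priority rule.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE stripe_charges (account_id TEXT, amount_usd REAL);
    CREATE TABLE zendesk_tickets (ticket_id INTEGER, account_id TEXT);
    INSERT INTO stripe_charges VALUES ('acme', 900.0), ('acme', 300.0), ('initech', 50.0);
    INSERT INTO zendesk_tickets VALUES (1, 'initech'), (2, 'acme');
""")

# Modeling step: rank open tickets by each account's total spend.
ranked = db.execute("""
    SELECT t.ticket_id,
           t.account_id,
           COALESCE(SUM(c.amount_usd), 0) AS total_spend
    FROM zendesk_tickets t
    LEFT JOIN stripe_charges c USING (account_id)
    GROUP BY t.ticket_id, t.account_id
    ORDER BY total_spend DESC
""").fetchall()

# "Reverse ETL" step: in a real loop this would update a priority field in Zendesk.
for rank, (ticket_id, account_id, total_spend) in enumerate(ranked, start=1):
    print(f"ticket {ticket_id} ({account_id}) -> priority {rank}, spend ${total_spend:.0f}")
```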
Customer support (CS) reps can act on the data without interrupting their existing workflow. By spending less time prioritizing, these CS reps can focus on more important metrics than the number of tickets closed, such as the product activity coefficient. 
The product activity coefficient measures how often a customer used a product in the last week, which requires raw product usage data: How many times did they log in? When did they log in? What features did they use? 
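The exact formula isn’t pinned down here, but one plausible reading is the fraction of the last seven days on which a customer was active. The sketch below assumes that definition and a simple list of login dates; both are illustrative choices, not a standard.

```python
from datetime import date, timedelta


def product_activity_coefficient(login_dates, today=None):
    """Fraction of the last seven days on which the customer used the product.

    One plausible interpretation of the metric, not a standard definition.
    """
    today = today or date.today()
    window = {today - timedelta(days=i) for i in range(7)}
    active_days = {d for d in login_dates if d in window}
    return len(active_days) / 7


# Example with made-up usage data: active on 3 of the last 7 days.
today = date(2021, 6, 7)
logins = [today, today - timedelta(days=2), today - timedelta(days=2), today - timedelta(days=5)]
print(product_activity_coefficient(logins, today))  # ~0.43
```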
Because high-priority customers get the help they need quickly through ticket prioritization, they’re more likely to spend time using the product as intended. This, in turn, produces more raw behavioral data that can provide insight into product activity, which can be used in the future to prioritize support tickets.
At the same time, CS reps will come to understand what’s possible with data, creating a concurrent loop of communication. The more the CS reps use the data, the more data literate they’ll become: they’ll know what to ask for from their data team and how to ask for it. They’ll think, “If the data team can automatically prioritize support tickets, what other day-to-day operations need optimization?” Once CS reps have this realization, the impact of the operational analytics loop goes beyond metrics and automation.

The operational analytics loop makes what’s next possible

With the operational analytics loop in place, data drives daily operational decisions. Data becomes infused into the daily lives of every employee thanks to reverse ETL placing data within the context it’s used. These new capabilities have a huge impact on organizations, ranging from the workflow of individual contributors up to the organization’s approach to data as a whole.
For the individual contributor
For a day-to-day individual contributor, the operational analytics loop all but eliminates time-to-insight by placing data into the context in which it gets used (in Salesforce, Zendesk, Mixpanel, etc.). As time-to-insight disappears, everyday users will turn into internal data champions.
In the support ticket example above, the CS rep doesn’t have to leave Zendesk to look at a dashboard or send a Slack message to a data analyst. The question of “Which ticket should I tackle first?” is answered automatically within the context of the work that’s getting done.
Don’t underestimate the visceral impact this automation has on a CS rep. It’s stressful to be responsible for an ongoing stream of support tickets, each one competing for attention. Operational analytics eliminates that stress, and eliminating it is vitally important work for a data team. If data isn’t actively making someone’s life easier, data leaders will have a hard time demonstrating to the rest of the organization how essential it is.
For the data team
By gaining internal champions using the operational analytics loop, data teams will start to assert their role as the center of operations for a data-driven company. The data team’s role will move beyond just being the traditional secluded caretakers of the data infrastructure. Instead, they’ll be stewards of how stakeholders from around the company interact with that data infrastructure.
This new role will lead to the data-as-a-product (DaaP) approach becoming more commonplace. With a continuous flow of data, the data team can package and serve data and insights to fit stakeholders’ needs. These stakeholders will learn first-hand what’s possible with data and how to communicate their needs more effectively.
As communication around data improves, so will the data, the understanding of its role in your operations, and its impact on your growth. The DaaP approach is just one of many larger trends, including the rise of analytics engineers, the modern data stack, and the ability to turn data warehouses into a customer data platform (CDP), that are driven by the operational analytics loop and are changing how data teams are treated within organizations.
For the organization
The operational analytics loop turns data and data teams into the central nervous system of the business. As data-driven decisions become baked into the daily life of all employees and data teams step into their new role as facilitators of insights, data will start to permeate throughout every aspect of daily operations.
A continuous flow of data from the app to the warehouse and data team, and back into the app, mirrors the CI/CD approach of DevOps, which is traditionally depicted as an infinite loop. The operational analytics version of that loop works the same way: continuous integration with an ETL tool, modeling and testing in the data warehouse, and continuous deployment with a reverse ETL tool.
Continuous integration and deployment of data allow the data team to start thinking of “data as code,” with versioning, testing, standardization, and more. When companies treat their data as code, they can, by extension, help analysts leverage the same development best practices as engineers and hold data to a single, high standard.
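As a hint of what “data as code” can look like in practice, here is a toy example of versioned, reviewable data tests that could run before a reverse ETL sync. The table shape, column names, and rules are hypothetical; most teams would express these checks in their modeling tool rather than in hand-rolled Python.

```python
def test_modeled_accounts(rows):
    """Toy data tests in the "data as code" spirit: versioned, reviewable checks
    that run before modeled data is deployed back to apps. All fields are hypothetical."""
    ids = [row["account_id"] for row in rows]
    assert len(ids) == len(set(ids)), "account_id must be unique"
    assert all(row["total_spend"] >= 0 for row in rows), "total_spend must be non-negative"
    assert all(row.get("priority") in {"high", "medium", "low"} for row in rows), \
        "priority must be a known value"


# Hypothetical output of the modeling step, checked before a reverse ETL sync.
test_modeled_accounts([
    {"account_id": "acme", "total_spend": 1200.0, "priority": "high"},
    {"account_id": "initech", "total_spend": 50.0, "priority": "low"},
])
print("all data tests passed")
```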
Using this framework, data-driven companies can weather the sharp spikes in demand, such as those seen as a result of COVID-19 lockdowns, and achieve sustainable scale simultaneously.
It’s not just the collection of data that makes this possible, but the everyday data-driven decisions facilitated by an operational analytics loop that, over time, make high-caliber operations possible with agility and scale.

The operational analytics loop will eat the business

While Andreessen was right about software eating the world, I’ve always thought that the way that software is made is much more important. The operational analytics loop is the key that can unlock a truly data-driven business – one that builds strategy and takes action based on insights that are driven by the data team, using the data warehouse as its brain.
We’re not there yet, but it’s what the analytics world will spend the next 10 years building.

Written by borisjabes | Boris Jabes is a co-founder and CEO of Census.