Dbt is a wildly popular tool. When it comes to the data pipeline, companies do (and should!) use dbt tests to mitigate problems. Dbt tests are great for organizations who want to improve data quality and reliability. But sometimes, dbt might not go far enough. Testing has inherent limitations. For example - tests only provide information from the areas that are tested. But what about the areas you don’t have time to test, because of resource or time limitations? When an organization gets big enough, it might need to fold in some additional . In this post, we’ll talk about 7 ways to level up your dbt tests, and some indicators that it’s time to move beyond the test. data observability 1. Get a handle on the different dbt test types When it comes to dbt tests, you have two options: , and . data tests schema tests Data tests are specific queries that you run against a specific model. They live in the “tests” folder of your dbt project. Schema tests are generic. They can be configured and applied to multiple models. Out of the box, dbt has four schema tests that can be immediately used: unique, not_null, accepted_values, and relationships. You can also write custom schema tests, as macros with the prefix test and stored in the folder. macros If you have one model that you want to validate in a specific way, you can simply write a data test in the tests folder. But if you find that you’re copying and pasting that query and configuring it. across multiple models - you might want to transition your data test into a schema test. 2. Use plugins and packages Level up your dbt with integrations, like: - an open platform for data lineage collection and analysis. It tracks metadata about datasets, jobs, and runs. OpenLineage consumes data produced by dbt runs. OpenLineage - a dbt plugin that allows users to deploy GE-like tests in their data warehouse directly from dbt dbt_expectations - small test cases for Macros that verify data is working as expected Unit tests 3. For all dbt runs, schedule dbt tests For daily dbt jobs, you likely run associated tests on the same schedule. You can configure your dbt job to have multiple commands, including tests, running, freshness checks, and more. You can set up automated emails and Slack notifications in the case of failure. 4. Run dbt tests on a model change / new PR When a PR is created in your database repo with a migration, use dbt tests to verify changes won’t break your dbt models before merging the PR. You can do this by using data model blue-green deployment. Your CI/CD tool should do the and commands to target a staging database environment or a staging schema within the production environment. dbt run dbt test If the tests pass, continue on with against the production environment or production schema. dbt run Also, make sure that your staging environment is fully replicated. That way, the intitial staging results are an accurate proxy for how production will play out. dbt test 5. Run dbt tests on your source models To catch potential problems as early as possible, run “dbt test” against your source models first: Run dbt tests against the source Run the dbt job Test the resulting outputs In other words: dbt test --models source:[sources] dbt run -m [models] dbt test -m [models] 6. Distribute dbt test results Set up an interactive way to display and visualize dbt test results. That way, you’ll always be aware when something in your dbt transformations goes wrong. Several tools on the market can help you with these displays. For example, and do these visualizations for you. Or, you can create a custom visualization. re_data elementary For example, the team at leveraged dbt artifacts and the query log to build reporting and alerts on dbt model/job runs and performance. This may also be a good point to begin considering a more generic data observability tool like . Snapcommerce Bigeye 7. Add data observability into your pipeline Testing and data observability have the same end goal - detecting problems. However, they approach this end results differently. Testing is opinionated - it originates in a human expressing a condition they think should be true. Observability is neutral - it asks, “Do we know the state of the data system at all times? How has it changed?” Much like a dashboard on a vehicle, data observability provides insight into how fast you’re going, whether the engine light is on, and whether you’re due for an oil change. In conclusion: Optimize your dbt testing sophistication Data engineers will go to some trouble to bridge the gap between dbt tests and complete data observability. It often results in a hacked-together setup that leaves out large parts of the data stack, outside the realm of dbt. As organizations scale, there’s infinite opportunity to test. While you can only write tests for so many things, you can layer in data observability on top of your set of tests. It will help you catch the things for which you don’t want to (or can’t) write tests.