Do you have many if/else conditions in your codebase? Or duplicated logic in both backend and UI? Does enabling a new set of similar behaviors mean a code change and a deployment? If so, it may be time to rethink the architecture.
One alternative is data-driven architecture. Instead of encoding every nuance in code, we let data—configuration, schemas, and metadata—drive behavior. One configuration becomes the single source of truth for both backend and UI, allowing us to adapt to new rules and fields without fragmenting our codebase.
This post walks through what data-driven architecture is, why it’s worth it, how to apply it—and, for production, how to track lineage and debug behavior when config is in the driver’s seat.
In this post:
- What Is Data-Driven Architecture?
- Why Data-Driven Architecture?
- Trade-offs
- When Not to Generalize Too Soon
- When Data-Driven Is Overkill
- How to Do It: Configuration at the Center
- Data Lineage: Where Config Comes From and Who Uses It
- Debugging and Observability
- Summary
What Is Data-Driven Architecture?
Data-driven architecture means the structure and behavior of our system are determined by data—configuration, schemas, metadata, rules—rather than by hardcoded logic. The code is generic; the data defines what actually happens.
Data Defines Behavior
- Data = config, schemas, rules, metadata keyed by the properties that determine the difference in behavior (e.g., entity type, region, tenant).
- Code = generic engines that load and interpret that data (validation, workflows, rendering).
- Change = update data or config, not necessarily redeploy code.
So “what should happen” lives in data; “how we execute it” lives in code.
The Opposite: Code-Driven Architecture
The opposite is code-driven (or logic-driven) architecture:
- Behavior is encoded in conditionals, branches, and switch statements.
- New rules or new countries mean new code paths and new deployments.
- The application code is the source of truth; config is minimal or an afterthought.
| Data-driven | Code-driven |
|---|---|
| Behavior from config, schemas, rules | Behavior from code branches |
| Change by updating data | Change by editing and deploying code |
| One generic engine + many configs | Many special cases in code |
| Declarative (“what”) | Imperative (“how”) |
Data-driven doesn’t mean “no code”—it means code is generic and data is authoritative.
Code-Driven vs Data-Driven: A Visual
The diagram below contrasts the two approaches. In code-driven architecture, the application contains multiple branches (e.g., per document type, tenant, or region); each new case adds a new path in code. In data-driven architecture, the application has a single path and reads from a central config keyed by context; new cases are new config entries, not new code.
A Concrete Example: Approval Workflow by Document Type
Suppose we need approval workflows that differ by document type: expense reports → Manager then Finance; contracts → Legal, Manager, Finance; leave requests → Manager only.
Less suitable (code-driven): A branch in code for each document type; every new type means a new branch and a new deployment.
// A branch per document type; every new type means another branch
let steps;
if (documentType === "ExpenseReport") {
  steps = [{ role: "Manager" }, { role: "Finance" }];
  runApprovalFlow(document, steps);
} else if (documentType === "Contract") {
  steps = [{ role: "Legal" }, { role: "Manager" }, { role: "Finance" }];
  runApprovalFlow(document, steps);
} else if (documentType === "LeaveRequest") {
  steps = [{ role: "Manager" }];
  runApprovalFlow(document, steps);
}
More suitable (data-driven): One engine: load the workflow for this document type, execute the steps in order. Each type has its own config entry (with different steps and roles). New document type = new config entry, no code change.
1. Engine (coded once): A workflow is a list of steps (role, optional conditions). Load by document type, execute in order—no branches on document type.
// Coded once; never branches on document type
const configKey = documentType; // e.g. "ExpenseReport", "Contract"
const workflow = configService.getWorkflow(configKey);
for (const step of workflow.steps) {
  assignToRole(document, step.role);
  const approved = waitForApproval(document, step);
  if (!approved) break;
}
2. Config (data): One entry per document type; each defines its own steps and roles.
{
"ExpenseReport": {
"steps": [
{ "order": 1, "role": "Manager" },
{ "order": 2, "role": "Finance" }
]
},
"Contract": {
"steps": [
{ "order": 1, "role": "Legal" },
{ "order": 2, "role": "Manager" },
{ "order": 3, "role": "Finance" }
]
},
"LeaveRequest": {
"steps": [
{ "order": 1, "role": "Manager" }
]
}
}
3. New document type = config only. Add an entry (e.g. PurchaseOrder); the engine already runs any list of steps.
"PurchaseOrder": {
"steps": [
{ "order": 1, "role": "Manager" },
{ "order": 2, "role": "Procurement" }
]
}
The same config can drive the UI (e.g. progress view of steps).
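As a sketch of that idea, the snippet below renders a progress view straight from the workflow config shape used above. The `renderProgress` function and the plain-string output are illustrative, not part of the post's example; a real UI would map the same data to components.

```javascript
// Hypothetical: the UI reads the same workflow config the backend uses,
// so a new document type gets a progress view with no front-end change.
const workflows = {
  ExpenseReport: {
    steps: [
      { order: 1, role: "Manager" },
      { order: 2, role: "Finance" },
    ],
  },
};

// Describe each step's status relative to the current step.
function renderProgress(documentType, currentStepOrder) {
  const { steps } = workflows[documentType];
  return steps.map((step) => {
    const status =
      step.order < currentStepOrder ? "done" :
      step.order === currentStepOrder ? "in progress" : "pending";
    return `${step.order}. ${step.role}: ${status}`;
  });
}

renderProgress("ExpenseReport", 2);
// ["1. Manager: done", "2. Finance: in progress"]
```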
Why Data-Driven Architecture?
1. One Place to Control Behavior, Less Fragmentation
When config is the source of truth, we have a single place that defines what the backend validates and what the UI shows (fields, order, labels, widget types). Backend and UI share the same contract; no scattered logic per type or segment, no separate “UI config” that drifts from backend rules. Changes are centralized, and there are fewer places to look when debugging or evolving behavior.
2. Easier to Change Over Time
- New document type, segment, or dimension? Add config (and maybe schema), not new if branches or new UI components.
- New field or rule? Update config; often no code deploy.
- A/B test or gradual rollout? Tweak config by segment (e.g., tenant, plan, region) rather than using feature flags buried in code.
We get faster iteration and fewer code paths to maintain.
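A minimal sketch of segment-based rollout via config, assuming a per-tenant override layer on top of a default; the names (`rolloutConfig`, `resolveVariant`, `newCheckout`) are made up for illustration.

```javascript
// Hypothetical: rollout controlled by config keyed by segment, with a
// default fallback. Enabling a pilot tenant is a config edit, not a deploy.
const rolloutConfig = {
  default: { newCheckout: false },
  byTenant: {
    "acme-corp": { newCheckout: true }, // pilot tenant
  },
};

// Tenant overrides win over the default.
function resolveVariant(tenantId) {
  return { ...rolloutConfig.default, ...(rolloutConfig.byTenant[tenantId] || {}) };
}

resolveVariant("acme-corp").newCheckout; // true: pilot gets the new flow
resolveVariant("other-co").newCheckout;  // false: everyone else keeps the old one
```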
3. Consistency and Governance
- Schemas define the contract (field names, types, required vs optional).
- Config defines the actual behavior per context (e.g., entity type, tenant, region).
- Metadata and lineage (if added) improve discoverability and governance.
So data-driven architecture supports consistent behavior and clear ownership of “what is true where.”
4. Scalability Without Code Explosion
A generic engine, combined with data, scales to many entities without a proportional explosion of code. We add data, not code paths.
Trade-offs
Data-driven architecture has real benefits—but it also has trade-offs. Acknowledging them keeps the approach realistic and helps us invest in the right places.
- Config can get complex. Once behavior lives in config, we may have many keys. Each key is built from the constraints that define the behavior—e.g. tenant (which customer or organization), region (which geographic or logical partition), entity type (which kind of document or resource). The product of those dimensions (entity type × tenant × region) can grow quickly, plus nested structures and conditional rules. Planning for good tooling—and often a UI or admin surface to edit and preview config—helps so non-developers can change behavior without editing raw JSON or YAML by hand.
- Debugging “why did it do that?” can be harder. When behavior is in data, the answer isn’t always in a single stack trace. We need visibility into which config was loaded (e.g., entity type + tenant), versioning, and possibly audit logs or feature flags so we can reproduce and trace decisions.
- Config must be versioned and tested like code. Config changes can break production just as easily as code. Treating config as part of the delivery pipeline—versioning, review, and testing (e.g., validate against schemas, run smoke tests with representative configs) before rollout—helps keep production stable.
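To make the last point concrete, here is a sketch of the kind of smoke test a CI pipeline might run before a config rollout, assuming the workflow config shape from the earlier example. A real setup might validate against a JSON Schema instead; `validateWorkflows` is illustrative.

```javascript
// Hypothetical CI check: every workflow entry must have at least one step,
// orders must be sequential, and every step needs a role.
function validateWorkflows(workflows) {
  const errors = [];
  for (const [type, { steps }] of Object.entries(workflows)) {
    if (!Array.isArray(steps) || steps.length === 0) {
      errors.push(`${type}: steps must be a non-empty array`);
      continue;
    }
    steps.forEach((step, i) => {
      if (step.order !== i + 1) errors.push(`${type}: step ${i + 1} has order ${step.order}`);
      if (!step.role) errors.push(`${type}: step ${i + 1} is missing a role`);
    });
  }
  return errors;
}

validateWorkflows({ LeaveRequest: { steps: [{ order: 1, role: "Manager" }] } }); // []
validateWorkflows({ Broken: { steps: [] } }); // one error, rollout blocked
```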
None of these are showstoppers—they’re the price of moving behavior into data. It’s worth planning for tooling, observability, and config hygiene from the start.
When Not to Generalize Too Soon
Data-driven architecture is powerful—but over-engineering and generalizing too early can backfire. It’s tempting to build the “perfect” config-driven system before we've seen real variety. The catch: it's hard to predict what use cases our application will actually see. We might design for dimensions (document type, tenant, product type) or rules that never matter, or miss the ones that do.
A better approach: code the first couple of cases directly, then generalize by the third.
- First case: Implement it in code. Get it working, ship it, learn from real usage.
- Second case: Implement again, also in code. We'll notice repetition and start to see what’s common vs. what’s different.
- By the third: We have enough concrete examples to know which dimensions truly vary (e.g. document type, entity type) and what belongs in config vs. what stays in code. Now is the time to introduce a keyed configuration model and a generic engine.
If we generalize on the first or second case, we risk building a flexible system around the wrong abstractions. Real use cases will then fight the design, and we’ll pay for “flexibility” we don’t need. Let the problem show itself before we build the generic solution. Data-driven architecture shines when it’s informed by real variation—not by speculation.
When Data-Driven Is Overkill
Sometimes, data-driven is the wrong choice altogether. Skip it when:
- Behavior is fixed and will never vary. If we have one schema, one set of rules, and no plans to support multiple contexts (countries, tenants, products), keeping it in code is simpler. A generic config layer adds indirection with no payoff.
- We only ever have one or two cases. For example, if we only ever support two countries, the US and the UK, and the difference is a handful of fields, a few code paths or a small hardcoded config might be enough. We don't need a full keyed-config engine.
Data-driven architecture pays off when we have real, recurring variation and multiple consumers (e.g. backend + UI) that benefit from one source of truth. If we don't have that yet, we stay simple.
How to Do It: Configuration at the Center
1. Define a Keyed Configuration Model
We can decide what dimensions matter (e.g. document type, entity type, tenant) and key the config by those dimensions.
Example structure:
- Key: e.g. documentType, or (entityTypeId, tenantId) for multi-tenant.
- Value: field definitions (id, type, required), validation rules, workflow references, and UI hints (labels, widget types, order, visibility).
Both backend behavior and UI display are configured from the same place, keyed by the same dimensions, so they stay in sync by design.
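A sketch of one such entry, following the structure above: keyed by document type, with field definitions, validation rules, a workflow reference, and UI hints collocated under one key. All field names and values here are illustrative assumptions, not a prescribed schema.

```javascript
// Hypothetical config entry: backend concerns (fields, validation, workflow)
// and UI concerns (order, labels, widgets) live under the same key.
const config = {
  ExpenseReport: {
    fields: [
      { id: "amount",   type: "number", required: true },
      { id: "currency", type: "string", required: true },
      { id: "notes",    type: "string", required: false },
    ],
    validation: { amount: { min: 0 } },
    workflow: "expense-approval-v2",
    ui: {
      order: ["amount", "currency", "notes"],
      labels: { amount: "Amount", currency: "Currency", notes: "Notes" },
      widgets: { amount: "number-input", currency: "select", notes: "textarea" },
    },
  },
};
```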
2. Backend: Generic Engine, Not Branches
- One code path: “Given the dimensions that drive behavior (e.g. document type, tenant), load the right config.”
- Use that config to:
- Validate incoming data.
- Persist only allowed fields.
- Run any rule engine or workflow (rules themselves can live in config).
No if (type == "X") in business logic—only “load config for this context and apply it.”
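As a sketch of that single code path, the generic validator below applies whatever field definitions the config declares for the current context, and keeps only declared fields for persistence. It assumes a config entry with a `fields` array (`id`, `type`, `required`); the function name and return shape are illustrative.

```javascript
// Hypothetical generic validator: one code path, driven entirely by the
// config entry for this context. It never branches on entity type.
function validate(payload, entry) {
  const errors = [];
  for (const field of entry.fields) {
    const value = payload[field.id];
    if (field.required && value == null) {
      errors.push(`${field.id} is required`);
    } else if (value != null && typeof value !== field.type) {
      errors.push(`${field.id} must be a ${field.type}`);
    }
  }
  // Persist only allowed fields: drop anything the config doesn't declare.
  const allowed = Object.fromEntries(
    entry.fields
      .filter((f) => payload[f.id] != null)
      .map((f) => [f.id, payload[f.id]])
  );
  return { errors, allowed };
}

const entry = { fields: [{ id: "amount", type: "number", required: true }] };
validate({ amount: 12, hacker: "x" }, entry);
// { errors: [], allowed: { amount: 12 } } — the undeclared field is dropped
```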
3. UI: Data-Driven UIs
Data-driven UI means the UI is also driven by config rather than hardcoded layouts. For a given context, the UI reads the fields to show, their order, labels, widget types, and visibility from config, and renders accordingly.
The UI has one generic flow: resolve config for this context, then render from it. Backend and UI each use the parts of the config relevant to them, and because everything lives in the same place under the same key, they naturally stay aligned.
Result: the UI “loads differently” per context because the data is different, not because the code is different. No separate front-end code per type or segment.
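A minimal render-from-config sketch: the UI walks the config's field order and emits one widget description per field. The output here is plain data; a real front end would map each entry to a component. The `uiConfig` shape and `renderForm` name are assumptions for illustration.

```javascript
// Hypothetical: the UI's single generic flow is "resolve config, render it".
const uiConfig = {
  order: ["amount", "notes"],
  labels: { amount: "Amount", notes: "Notes" },
  widgets: { amount: "number-input", notes: "textarea" },
};

// One field descriptor per configured field, in the configured order.
function renderForm(ui) {
  return ui.order.map((id) => ({ id, label: ui.labels[id], widget: ui.widgets[id] }));
}

renderForm(uiConfig);
// [{ id: "amount", label: "Amount", widget: "number-input" },
//  { id: "notes",  label: "Notes",  widget: "textarea" }]
```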
4. Keep Configuration in One Place
Storing config in a single system (e.g., config service, database, or versioned files), keyed by the dimensions that drive behavior, gives us one source of truth. Both the backend and the front end read from it—either directly via the API or with the backend passing the right slice to the UI.
Backend and UI configs don't have to be the same structure—decoupling and denormalizing them by concern (behavior vs. display) is often preferred. What matters is collocating them under the same key in the same system. That's what keeps them aligned without extra coordination: when we add or change a context, both sides are updated in one place.
Treating config as versioned and reviewable—with the same rigour as code but the flexibility of data—is what makes this sustainable in production.
5. Evolve Gradually
We don’t have to go all-in on day one—and it’s better not to before we’ve seen a few real cases (see When Not to Generalize Too Soon above). Once we’re ready:
- Start with one dimension (e.g., document type or tenant).
- Move the most variable parts (fields, validations, maybe one screen) to config.
- Once the pattern works, extend to more entities and more UI surfaces.
Running Config-Driven Systems in Production
Moving behavior into config introduces a question that doesn’t often come up in code-driven systems: why did it do that? The answer depends on which config was loaded, when it changed, and who is affected. Two practices keep this under control: lineage and observability.
Lineage: knowing where config comes from and who it affects
Config should be versioned and identified—every read path knows which version it used. If config is layered (base + tenant + region overrides), recording the resolution path (e.g. “base v3 + tenant-A v1 + region EMEA v2”) makes behavior reproducible and auditable. Attaching provenance to each entry—who changed it, when, and why—turns config into an auditable artifact.
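A sketch of layered resolution with a recorded resolution path, in the spirit of the “base + tenant + region” example above. The `resolve` function and layer shape are illustrative assumptions.

```javascript
// Hypothetical: later layers override earlier ones, and the resolution
// path is recorded so any behavior can be reproduced and audited.
function resolve(layers) {
  const config = Object.assign({}, ...layers.map((l) => l.values));
  const resolutionPath = layers.map((l) => `${l.name} ${l.version}`).join(" + ");
  return { config, resolutionPath };
}

const { config, resolutionPath } = resolve([
  { name: "base",        version: "v3", values: { maxItems: 10, theme: "light" } },
  { name: "tenant-A",    version: "v1", values: { maxItems: 25 } },
  { name: "region-EMEA", version: "v2", values: { theme: "dark" } },
]);
// config:         { maxItems: 25, theme: "dark" }
// resolutionPath: "base v3 + tenant-A v1 + region-EMEA v2"
```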
Equally important is knowing who reads which config keys. Tracking consumers (backend validation, UI renderer, batch jobs) lets us answer “what breaks if we change this?” before we deploy, and tells us which consumers to notify about a change. If config drives stored data, lineage can also link a record to the config version that wrote it—critical for interpreting historical data when schemas evolve.
Observability: making config-driven behavior inspectable at runtime
Attaching config key and version to every request—in logs, trace spans, or response headers—means any log line or trace answers “which config drove this.” Structured logging at resolution time (configKey, configVersion, cache hit or fallback) and metrics on resolution latency and validation failures give dashboards and alerts that catch regressions quickly.
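As a sketch, a structured log line emitted at resolution time might look like this; the field names mirror the ones suggested above, and the function is illustrative.

```javascript
// Hypothetical structured log entry emitted whenever config is resolved,
// so every trace answers "which config drove this request?".
function resolutionLog({ configKey, configVersion, source }) {
  return JSON.stringify({
    event: "config.resolved",
    configKey,     // e.g. "ExpenseReport"
    configVersion, // e.g. "v42"
    source,        // "cache" | "store" | "fallback"
  });
}

console.log(resolutionLog({
  configKey: "ExpenseReport",
  configVersion: "v42",
  source: "cache",
}));
```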
When a bug is reported, we can look up the config key and version from the trace, re-fetch that exact version, and reproduce the behavior locally. When someone reports “it worked yesterday,” we compare config versions across that window to see exactly what changed.
For high-risk changes, a staged rollout (canary tenant or percentage of traffic) limits the blast radius. Treating config as a dependency—alerting on load failures, validation errors, or unexpected fallback rates—keeps production stable.
With lineage and observability in place, config-driven behavior stays inspectable and reproducible—we can change behavior with confidence even when the logic lives in data.
Summary
| | What | Why | How |
|---|---|---|---|
| Core idea | Data (config, schemas, rules) defines behavior; code is generic. | Single source of truth, easier changes, less fragmentation, consistency, scalability. | Keyed config (e.g. by document type, tenant); one engine in backend and UI. |
| Backend | No branches per type or segment; load config and apply. | Fewer code paths, faster iteration. | Config-driven validation, persistence, and rules. |
| UI | Same config drives layout and behavior (data-driven UI). | Same contract as backend; no separate code per type or segment. | Generic render-from-config; config defines fields, order, labels, and widgets. |
| Production | Config is versioned and traceable; behavior is reproducible. | Safe rollouts, impact analysis, and auditability. | Lineage (source, version, consumers); observability (logs, metrics, traces, alerts). |
Data-driven architecture puts configuration at the center: it controls how the backend behaves and how the UI works. That reduces fragmentation, makes future changes easier, and keeps one clear place to define “what’s true” for each context (e.g., document type, tenant). In production, lineage and observability make that config auditable and debuggable—so we can change behavior with confidence.
If we take one thing away: Let data define what should happen, and keep code generic so it just interprets that data.
Have you moved from code-driven to data-driven (backend or UI)? What worked best for you? Share your experience in the comments.
