With the Backends-for-frontends (BFF) pattern, you build servers that are an intermediary aggregation layer, orchestrating multiple API calls to provide one specific API, tailored to the needs of a specific client.
Why build BFFs? Well, if you’re building even moderately sized apps, you’re going to have to deal with a whole bunch of underlying domain services — whether they’re internal microservices, CRM/content APIs, third-party APIs that are business critical, or just databases — and you’ll need to bring them together somehow. That’s a lot of API calls.
10 times out of 10, I’m going to aggregate those calls on a BFF server layer with a ~10Gbps downstream interconnect with much lower latencies between my services, rather than on my client, which would have anywhere from ~10–100 Mbps upstream (half that on mobile) with bigger, slower, wildly unpredictable hops.
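To make the aggregation idea concrete, here’s a minimal, framework-free sketch. The `fetchUser` and `fetchOrders` services, their response shapes, and the `getDashboard` function are all hypothetical stand-ins; they are injected so the example runs without a network.

```typescript
// Hypothetical shapes for two downstream services (assumptions for illustration).
type User = { id: string; name: string };
type Order = { id: string; total: number };

// Downstream calls are injected, so the BFF logic can run against fakes.
type Downstream = {
  fetchUser: (userId: string) => Promise<User>;
  fetchOrders: (userId: string) => Promise<Order[]>;
};

// One client-facing call that fans out to two services on the server side
// and reshapes the results into exactly what this client needs.
async function getDashboard(userId: string, services: Downstream) {
  const [user, orders] = await Promise.all([
    services.fetchUser(userId),
    services.fetchOrders(userId),
  ]);
  return {
    name: user.name,
    orderCount: orders.length,
    totalSpent: orders.reduce((sum, order) => sum + order.total, 0),
  };
}

// In-memory fakes standing in for real services.
const fakeServices: Downstream = {
  fetchUser: async (userId) => ({ id: userId, name: "Ada" }),
  fetchOrders: async () => [
    { id: "o1", total: 20 },
    { id: "o2", total: 22 },
  ],
};
```

The client makes one request and gets one tailored payload back; the fan-out happens where the network is fast.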
But because BFFs can aggregate and map downstream data however you like, from various sources, each with its own architecture (and idiosyncrasies), testing becomes even more critical.
Why? And how should we approach testing for BFFs, then? Let’s talk about it. I’ve been using WunderGraph — an open-source BFF framework — to build BFFs for data-heavy apps for a while now, and its integrated testing library that lets me write framework-agnostic integration tests (mocking data when needed) has been a godsend.
But first, let’s answer the one burning question most developers have:
TL;DR: with the Backends-for-frontends pattern you have multiple points of failure.
For any other app, if you’re using TypeScript and performing runtime validation with Zod, sure, maybe you can get away with no testing. But what happens when you need to orchestrate multiple downstream calls on the BFF server, with data sources numbering in the dozens?
Because so many disparate underlying services are inherently inconsistent, building and maintaining your BFF around them becomes a pain. You need to know your BFF does what you’ve coded it to do, so testing becomes crucial.
You need to:
Make sure the BFF server works correctly and can handle requests from the client(s), and that the client(s) can reach the corresponding backend services through it, including any necessary data transformations, API calls, or business logic.
Make sure each of the downstream services (calls to APIs, data sources, etc) is working and returning data according to the agreed-upon API specification.
Make sure the BFF aggregates and processes the data returned from downstream services correctly and returns data in the format the client needs. I.e. verify the contract.
Simulate failure scenarios, to evaluate the error handling and resilience mechanisms of the BFF, making sure it gracefully handles failures, retries, timeouts, or degraded service scenarios.
Simulate scenarios with exceeded limits/resource constraints to evaluate the performance and scalability of the BFF to ensure the system can handle these edge cases.
Unit tests help, sure, but integration tests are the name of the game when it comes to BFFs because they are specialized for evaluating component interaction — exactly what you want with the Backends-for-frontends pattern.
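As a rough illustration of simulating failure scenarios, here’s a small sketch that wraps a call in a retry loop and exercises it against a deliberately flaky fake. Both `withRetry` and `flakyService` are hypothetical helpers written for this example, not part of any library.

```typescript
// Retry an async call up to `attempts` times (attempts >= 1),
// rethrowing the last error if every attempt fails.
async function withRetry<T>(call: () => Promise<T>, attempts: number): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// A fake downstream service that fails a fixed number of times before succeeding.
function flakyService(failures: number) {
  let calls = 0;
  return {
    call: async (): Promise<string> => {
      calls++;
      if (calls <= failures) throw new Error(`simulated failure #${calls}`);
      return "ok";
    },
    callCount: () => calls,
  };
}
```

A test can then assert both outcomes: `withRetry(flakyService(2).call, 3)` recovers on the third attempt, while `withRetry(flakyService(5).call, 3)` gives up and rejects after three calls.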
Using WunderGraph, you define a number of heterogeneous data dependencies — internal microservices, databases, as well as third-party APIs — as config-as-code, and it will introspect each, aggregating and abstracting them into a namespaced virtual graph.
https://github.com/wundergraph/wundergraph/
First of all, you’ll have to name your APIs and services as data dependencies, much like you would add project dependencies to a package.json file.
// Data dependency #1 - API to get the capital of a country
const countries = introspect.graphql({
apiNamespace: "countries",
url: new EnvironmentVariable(
"COUNTRIES_URL",
"https://countries.trevorblades.com/"
),
});
// Data dependency #2 - API to get the weather of a given city
const weather = introspect.graphql({
apiNamespace: "weather",
url: new EnvironmentVariable(
"WEATHER_URL",
"https://weather-api.wundergraph.com/"
),
});
💡 You could just use the actual URL, but defining data sources as ENV variables here allows you to replace them with a mock server URL when testing. We’ll get to that later.
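Here’s a simplified model of why that indirection helps. This is not how `EnvironmentVariable` is implemented; `resolveUrl` is a hypothetical helper and the mock URL on port 4000 is made up. It only shows the idea: an override wins when it’s set, otherwise the default URL is used.

```typescript
type Env = Record<string, string | undefined>;

// Return the override from `env` when present and non-empty,
// otherwise fall back to the default URL.
function resolveUrl(env: Env, name: string, defaultUrl: string): string {
  const override = env[name];
  return override !== undefined && override !== "" ? override : defaultUrl;
}

const defaultCountriesUrl = "https://countries.trevorblades.com/";

// No override set: the real API is used.
const inProduction = resolveUrl({}, "COUNTRIES_URL", defaultCountriesUrl);

// Override set (hypothetical local mock server): the real API is never touched.
const inTests = resolveUrl(
  { COUNTRIES_URL: "http://localhost:4000/" },
  "COUNTRIES_URL",
  defaultCountriesUrl
);
```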
You can then write GraphQL operations (queries, mutations, subscriptions) or async resolver functions in TypeScript — or a combination of the two — to aggregate your downstream data in whatever fashion you want, process them, and compose them into the final response for the client your BFF is serving.
To get a country’s capital:
query ($code: String) {
countries_countries(filter: { code: { eq: $code } }) {
code
name
capital
}
}
To get the weather by city:
query ($city: String!) {
weather_getCityByName(name: $city, config: { units: metric }) {
name
country
weather {
summary {
title
}
temperature {
actual
feelsLike
min
max
}
}
}
}
Combine the two by writing an async function in TypeScript, to get the weather of a given country’s capital.
import { createOperation, z } from "../generated/wundergraph.factory";
export default createOperation.query({
input: z.object({
code: z.string(),
}),
handler: async ({ input, operations }) => {
const country = await operations.query({
operationName: "CountryByCode",
input: {
code: input.code,
},
});
const weather = await operations.query({
operationName: "WeatherByCity",
input: {
city: country.data?.countries_countries[0]?.capital || ""
},
});
return {
country: country.data?.countries_countries[0]?.name,
capital: country.data?.countries_countries[0]?.capital,
weather: weather.data?.weather_getCityByName?.weather,
};
},
});
Now you can call any of these operations client-side to get the data you want, using a fully typesafe client that WunderGraph generates for you if you’re using a React-based framework or a data fetching library like SWR or TanStack Query.
import { useQuery } from "../components/generated/nextjs";
const { data, isLoading } = useQuery({
operationName: "CountryByCode",
input: {
code: "DE", // insert country ISO code here
},
});
return(
<div>
{isLoading ? (
<CardSkeleton />
) : (
<Card
code={data?.countries_countries[0].code}
name={data?.countries_countries[0].name}
/>
)}
</div>
...
)
If you aren’t, WunderGraph always mounts each operation as its own endpoint by default, serving data as JSON over RPC, so regardless of framework, you could just use your library of choice to make a regular HTTP GET request to the WunderGraph BFF server:
http://localhost:9991/operations/WeatherByCountryCapital?code=INSERT_COUNTRY_ISO_CODE_HERE
…to get the data you want, in JSON.
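If you’d rather not hand-assemble that URL, a tiny helper can do it. This is a sketch that assumes simple string inputs passed as query parameters, exactly as in the URL shown here; `buildOperationUrl` is a hypothetical name, not something WunderGraph ships.

```typescript
// Build the GET URL for an operation, URL-encoding each input value.
function buildOperationUrl(
  baseUrl: string,
  operationName: string,
  input: Record<string, string>
): string {
  const url = new URL(`/operations/${operationName}`, baseUrl);
  for (const key of Object.keys(input)) {
    url.searchParams.set(key, input[key]);
  }
  return url.toString();
}

const weatherUrl = buildOperationUrl(
  "http://localhost:9991",
  "WeatherByCountryCapital",
  { code: "DE" }
);
// weatherUrl is "http://localhost:9991/operations/WeatherByCountryCapital?code=DE"
```

From there, any HTTP client works, for example `fetch(weatherUrl)` followed by `response.json()`.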
All good, but as said before, the more data dependencies you have, the more important testing becomes. Whenever you’re interacting with multiple APIs and using disparate data to craft a final client response, you’re going to have multiple potential points of failure in your app. Which brings us to testing.
WunderGraph’s testing library makes writing tests for all of your data sources (whether they’re GraphQL, REST, databases, Apollo Federations, and more) in a single test suite dead simple — setting up a testing server for you, with full typesafe access to your data. It comes with Jest out of the box, but you could use it with any testing framework at all.
Let’s make sure each of our downstream calls works right, first. We can test integration points after.
createTestServer() returns a WunderGraphTestServer object that wraps the test server and the type-safe client WunderGraph auto-generated for you, so you’ll still have autocomplete when writing Jest assertions for your data structure.
You call the WunderGraph-generated client by calling testServer.client() within a test, and choosing the query to run (including its inputs, if any).
import { expect, describe, it, beforeAll, afterAll } from "@jest/globals";
import {
createTestServer
} from "../.wundergraph/generated/testing";
import { WunderGraphTestServer } from "@wundergraph/sdk/testing";
let testServer: WunderGraphTestServer;
/* Start up a test server */
beforeAll(async () => {
testServer = createTestServer()
return testServer.start();
});
/* Tear down test server once tests are done */
afterAll(() => testServer.stop());
/* Test individual API calls here */
describe("Test downstream calls", () => {
// Operation 1 : CountryByCode
it("Should be able to get country based on country code", async () => {
const result = await testServer.client().query({
operationName: "CountryByCode",
input: {
code: "DE",
},
});
const country = result.data?.countries_countries[0];
expect(country).toHaveProperty('code', 'DE');
expect(country).toHaveProperty('name', 'Germany');
expect(country).toHaveProperty('capital', 'Berlin');
});
// Operation 2 : WeatherByCity
it("Should be able to get weather based on city name", async () => {
const result = await testServer.client().query({
operationName: "WeatherByCity",
input: {
city: "Berlin"
},
});
const data = result.data?.weather_getCityByName // you get autocomplete here
const weather = data?.weather
expect(data?.name).toBe('Berlin');
expect(data?.country).toBe('DE');
expect(typeof weather?.summary?.title).toBe('string');
expect(typeof weather?.temperature?.actual).toBe('number');
expect(typeof weather?.temperature?.feelsLike).toBe('number');
expect(typeof weather?.temperature?.min).toBe('number');
expect(typeof weather?.temperature?.max).toBe('number');
});
});
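Those repeated `typeof` checks can be folded into a small contract helper. Here’s one possible sketch; `isTemperature` and the `Temperature` type are hypothetical, modeled on the fields the weather query selects.

```typescript
// The shape we expect the weather API to return for `temperature`.
type Temperature = { actual: number; feelsLike: number; min: number; max: number };

// Type guard: true only if every expected field is present and numeric.
function isTemperature(value: unknown): value is Temperature {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["actual", "feelsLike", "min", "max"].every(
    (key) => typeof record[key] === "number"
  );
}
```

Inside a Jest test, that collapses four assertions into one: `expect(isTemperature(weather?.temperature)).toBe(true)`.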
Next, let’s test our main integration point: the BFF. This API’s response, and its ability to aggregate data, are critical for the client and need to be tested.
import { expect, describe, it, beforeAll, afterAll } from "@jest/globals";
import { createTestServer } from "../.wundergraph/generated/testing";
let testServer: ReturnType<typeof createTestServer>;
/* Start up a test server */
beforeAll(async () => {
testServer = createTestServer();
return testServer.start();
});
/* Tear it down after tests are done */
afterAll(() => testServer.stop());
/* Test your BFF response here */
describe("Test BFF API response", () => {
it("Should be able to get weather data based on country code", async () => {
const result = await testServer.client().query({
operationName: "WeatherByCountryCapital",
input: {
code: "DE",
},
});
const country = result.data?.country;
const capital = result.data?.capital;
const weather = result.data?.weather;
// Assert structure of response
expect(country).toBe("Germany");
expect(capital).toBe("Berlin");
expect(weather).toHaveProperty("summary");
expect(typeof weather?.summary?.title).toBe("string");
expect(weather).toHaveProperty("temperature");
expect(typeof weather?.temperature?.actual).toBe("number");
expect(typeof weather?.temperature?.feelsLike).toBe("number");
expect(typeof weather?.temperature?.min).toBe("number");
expect(typeof weather?.temperature?.max).toBe("number");
});
});
These are mostly happy-path tests, but now that you know the framework is there, and how easy setting up and tearing down test servers is, you could apply these principles to implement fuzzing, limit testing, or whatever else you need.
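As a starting point for going beyond the happy path, here’s a crude sketch that throws a table of malformed payloads at the logic that picks a capital out of a downstream response. `capitalFrom` is a hypothetical standalone version of that lookup, written so it can be exercised without a server; the payload list is hand-picked, not a real fuzzer.

```typescript
// Defensively extract the first country's capital from an untrusted payload,
// returning "" when the payload doesn't have the expected shape.
function capitalFrom(response: unknown): string {
  if (typeof response !== "object" || response === null) return "";
  const countries = (response as { countries_countries?: unknown }).countries_countries;
  if (!Array.isArray(countries) || countries.length === 0) return "";
  const first = countries[0] as { capital?: unknown } | null | undefined;
  return typeof first?.capital === "string" ? first.capital : "";
}

// Payloads a misbehaving downstream service might plausibly send back.
const malformedPayloads: unknown[] = [
  undefined,
  null,
  42,
  "DE",
  {},
  { countries_countries: null },
  { countries_countries: [] },
  { countries_countries: [null] },
  { countries_countries: [{ capital: 7 }] },
];

// None of these should throw; all should degrade to the empty string.
const degraded = malformedPayloads.map((payload) => capitalFrom(payload));
```

The same table-driven shape carries over to a Jest `it.each`, with the test server’s client in place of the local function.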
Any time any of your data sources change, WunderGraph will regenerate the client (with wunderctl generate, which runs each time you start up the BFF server and your frontend), and your assertions will fail on the next npm test, letting you know immediately.
All of this is great, but you don’t want to actually call the individual services/APIs every single time you iterate in development, right? Or, another scenario: what if those services are things you know will exist come production time, but ones the backend team hasn’t finished building yet?
This makes for the perfect segue into…
While writing tests, you’ll often need to fake, or simulate data. The purpose of this “mocking” is to control the behavior of dependencies/external functions, making it easier to isolate and verify the correctness of the actual code you’re testing — without messing with your actual data sources. Otherwise, it’s way too easy to write tests that accidentally manipulate data, end up running 10–20x slower, and still pass (because they’re technically correct).
WunderGraph’s testing library provides a createTestAndMockServer() function that works much the same way as the createTestServer() we used before, wrapping a test server and the auto-generated typesafe client. It also lets you intercept calls to HTTP data sources (the ones you’ve defined as environment variables in wundergraph.config.ts; see why that was needed?) and mock their responses.
import { expect, describe, it, beforeAll, afterAll } from "@jest/globals";
import {
createTestAndMockServer,
TestServers,
} from "../.wundergraph/generated/testing";
let testServer: TestServers;
beforeAll(async () => {
testServer = createTestAndMockServer();
return testServer.start({
mockURLEnvs: ["COUNTRIES_URL", "WEATHER_URL"],
});
});
afterAll(() => testServer.stop());
...
Including our two data dependencies, COUNTRIES_URL and WEATHER_URL, in the mockURLEnvs array tells WunderGraph’s test server, “Capture all requests made to these two URLs within each test, and mock their responses instead.”
Then you can set up that mock with mock(), which watches for a matching HTTP request and runs a handler function that returns the data you want.
The argument to mock() is an object; the example here shows its properties in use, and each one is described right after it.
// Step 1 : Set up the Mock
it("Should be able to get country based on mocked country code", async () => {
const scope = testServer.mockServer.mock({
times: 1,
persist: false,
match: ({ url, method }) => {
return url.path === "/" && method === "POST";
},
handler: async ({ json }) => {
const body = await json();
expect(body.variables.code).toEqual("DE");
expect(body.query).toEqual(
"query($code: String){countries_countries: countries(filter: {code: {eq: $code}}){code name capital}}"
);
return {
body: {
data: {
countries_countries: [
{
code: "DE",
name: "Germany",
capital: "Berlin",
},
],
},
},
};
},
});
// Step 2 : Call the real WunderGraph mounted endpoint for this operation (it'll be intercepted)
// Step 3 : Assert the mocked response
times: The number of times the mock should be called. Defaults to 1.
persist: If true, the mock will not be removed after any number of calls, and you’ll have to remove it manually with testServer.mockServer.reset() after tests are done. Defaults to false.
match: A function that returns true if the HTTP request for this test is a match.
handler: A function that mocks data when there is a match, and either returns the response or throws an error.
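To build some intuition for how those four properties interact, here’s a deliberately tiny in-memory model. It is not WunderGraph’s mock server: `TinyMockServer`, the request and response shapes, the 404 fallback, and the exact `done()` rule (exactly `times` calls, or at least one call when `persist` is true) are all assumptions made for this sketch.

```typescript
type MockRequest = { path: string; method: string };
type MockResponse = { status?: number; body: unknown };

type MockOptions = {
  times?: number; // how many calls this mock serves; defaults to 1
  persist?: boolean; // when true, the mock is never used up; defaults to false
  match: (req: MockRequest) => boolean;
  handler: (req: MockRequest) => MockResponse;
};

class TinyMockServer {
  private mocks: { options: MockOptions; calls: number }[] = [];

  // Register a mock and return a scope whose done() verifies it was used.
  mock(options: MockOptions) {
    const entry = { options, calls: 0 };
    this.mocks.push(entry);
    const expected = options.times ?? 1;
    return {
      done: () => {
        const satisfied = options.persist
          ? entry.calls > 0
          : entry.calls === expected;
        if (!satisfied) throw new Error(`mock called ${entry.calls} time(s)`);
      },
    };
  }

  // Route a request to the first mock that matches and still has calls left.
  handle(req: MockRequest): MockResponse {
    for (const entry of this.mocks) {
      const { options } = entry;
      const remaining = options.persist || entry.calls < (options.times ?? 1);
      if (remaining && options.match(req)) {
        entry.calls++;
        return options.handler(req);
      }
    }
    return { status: 404, body: { error: "no mock matched" } };
  }

  // Drop every registered mock, including persistent ones.
  reset() {
    this.mocks = [];
  }
}
```

With the defaults, a mock answers one matching request and is then spent; a second identical request falls through to the fallback, which is why persistent mocks need an explicit reset.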
With the mock set up and HTTP requests being intercepted, you can simply call the real mounted endpoint for this operation using the WunderGraph-generated client again and get back the mocked response, without ever making the actual request to the data source.
Here’s the full test:
import { expect, describe, it, beforeAll, afterAll} from "@jest/globals";
import {
createTestAndMockServer,
TestServers,
} from "../.wundergraph/generated/testing";
let testServer: TestServers;
beforeAll(async () => {
testServer = createTestAndMockServer();
return testServer.start({
mockURLEnvs: ["COUNTRIES_URL", "WEATHER_URL"],
});
});
afterAll(() => testServer.stop());
/* If downstream services don't exist yet... */
describe("Mock http datasource", () => {
// Operation 1 : CountryByCode
it("Should be able to get country based on mocked country code", async () => {
const scope = testServer.mockServer.mock({
match: ({ url, method }) => {
return url.path === "/" && method === "POST";
},
handler: async ({ json }) => {
const body = await json();
expect(body.variables.code).toEqual("DE");
expect(body.query).toEqual(
"query($code: String){countries_countries: countries(filter: {code: {eq: $code}}){code name capital}}"
);
return {
body: {
data: {
countries_countries: [
{
code: "DE",
name: "Germany",
capital: "Berlin",
},
],
},
},
};
},
});
// call the real WunderGraph mounted endpoint for this operation
const result = await testServer.testServer.client().query({
operationName: "CountryByCode",
input: {
code: "DE",
},
});
// If the mock was not called or nothing matches, the test will fail
scope.done();
expect(result.error).toBeUndefined();
expect(result.data).toBeDefined();
expect(result.data?.countries_countries[0].name).toBe("Germany");
});
// Operation 2 : WeatherByCity
it("Should be able to get weather based on mocked city name", async () => {
const scope = testServer.mockServer.mock({
match: ({ url, method }) => {
return url.path === "/" && method === "POST";
},
handler: async ({ json }) => {
const body = await json();
expect(body.variables.city).toEqual("Berlin");
expect(body.query).toEqual(
"query($city: String!){weather_getCityByName: getCityByName(name: $city){name country weather {summary {title} temperature {actual feelsLike min max}}}}"
);
return {
body: {
data: {
weather_getCityByName: {
name: "Berlin",
country: "DE",
weather: {
summary: {
title: "Clouds",
},
temperature: {
actual: 294.42,
feelsLike: 293.88,
min: 292.79,
max: 295.96,
},
},
},
},
},
};
},
});
// call the real WunderGraph mounted endpoint for this operation
const result = await testServer.testServer.client().query({
operationName: "WeatherByCity",
input: {
city: "Berlin",
},
});
// If the mock was not called or nothing matches, the test will fail
scope.done();
expect(result.error).toBeUndefined();
expect(result.data).toBeDefined();
expect(result.data?.weather_getCityByName?.name).toBe("Berlin");
});
});
Because WunderGraph’s generated typesafe client is accessible here (by calling .client().query()), you’re not limited to GraphQL operations. You can use TypeScript operations in your tests and mocks too: just pass in the namespaced operationName and you’re golden.
Finally, for end-to-end (E2E) testing, WunderGraph doesn’t provide a dedicated library, but it is fully compatible with Playwright. Just make sure playwright.config.ts starts WunderGraph’s BFF server and the frontend before the tests run (WunderGraph’s default npm run start script does both).
…
/* Run your BFF + Frontend before starting the tests */
webServer: process.env.WG_NODE_URL
? undefined
: {
command: 'npm run start',
port: 3000,
},
//...
Where WG_NODE_URL is a default WunderGraph environment variable pointing to the base URL for the WunderGraph server (http://localhost:9991 by default).
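The conditional in that config reads more easily as a function. Here’s a sketch of the same decision in isolation; `webServerFor` and the `WebServerConfig` type are hypothetical names, and the environment is passed in so the logic can be checked without touching the real process environment.

```typescript
type WebServerConfig = { command: string; port: number };

// If WG_NODE_URL is set, assume a server is already running and start nothing.
// Otherwise, ask the test runner to start the BFF and frontend itself.
function webServerFor(
  env: Record<string, string | undefined>
): WebServerConfig | undefined {
  return env["WG_NODE_URL"]
    ? undefined
    : { command: "npm run start", port: 3000 };
}

const startedByRunner = webServerFor({});
const alreadyRunning = webServerFor({ WG_NODE_URL: "http://localhost:9991" });
```

So in CI, where nothing is running yet, the runner boots everything; locally, pointing WG_NODE_URL at a server you already have up skips the startup.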
Hopefully, now you have a better idea of why testing is so critical for building better, more maintainable Backends-for-frontends.
WunderGraph’s built-in testing library makes it easier to write better tests for pretty much any integration point you want. It also lets you mock responses, so you can test your app and BFF implementation during development without ever calling the actual data sources, using up your quota, or getting rate-limited before your app is even in production.