After more than two years of building serverless applications on Firebase and writing countless Cloud Functions, I realised that a lot of the code I was writing had begun to repeat.
They were all functions that were triggered by updates to the project’s Cloud Firestore database, i.e. when a document was created, updated, or deleted. These involved basic operations such as copying one document’s data to another collection, automatically computing a specific value based on document data, and syncing data to a third-party service such as Algolia.
Since Cloud Firestore triggers are based on per-document updates, I ended up copy-pasting the same code for each collection at each specific trigger. For example, I ended up with the functions profiles-onCreate, profiles-onUpdate, and profiles-onDelete, and the same or very similar code to sync to Algolia was present in each function. Then, this was repeated for other collections that required similar functionality: posts-onCreate, posts-onUpdate, and posts-onDelete.
As I continued building these applications to add more functionality, they became more complex and harder to maintain: copy-pasting code into various trigger functions caused rapid growth in the codebase, and having each individual function perform multiple tasks made them more complex and prone to failure. A bug in the code for one task could cause the entire function to crash, preventing any other tasks from being run, potentially causing data loss. And trying to fix that bug when you’ve copy-pasted that code to different functions? Have fun.
To solve this problem, I first took a step back and looked at the patterns in the code I was writing. Each cloud function was performing multiple tasks and running the same code for different collections. But this goes against the very idea of a function. Wikipedia defines it as:
A sequence of program instructions that performs a specific task, packaged as a unit.
Rather than combining all the logic in a single cloud function, each task should be separated into its own cloud function triggered by the same event. There are no technical limitations preventing multiple functions subscribing to the same trigger.
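To make the failure-isolation argument concrete, here is a small dependency-free simulation. It does not use Firebase at all: the task names are hypothetical, and the try/catch in runSeparately only stands in for the platform running each cloud function as its own invocation.

```typescript
type Task = { name: string; run: () => void };

const log: string[] = [];

// Hypothetical tasks; the middle one contains a bug.
const tasks: Task[] = [
  { name: "copyData", run: () => { log.push("copied"); } },
  { name: "computeValue", run: () => { throw new Error("bug in computeValue"); } },
  { name: "syncToAlgolia", run: () => { log.push("synced"); } },
];

// One combined function: the first uncaught error stops every later task.
const runCombined = (): void => {
  for (const task of tasks) task.run();
};

// One function per task: each invocation succeeds or fails on its own.
// Returns the names of the tasks that failed.
const runSeparately = (): string[] => {
  const failed: string[] = [];
  for (const task of tasks) {
    try {
      task.run();
    } catch {
      failed.push(task.name);
    }
  }
  return failed;
};
```

In the combined version, syncToAlgolia never runs because computeValue throws first. In the separated version, only computeValue is reported as failed and the other two tasks still complete.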
Additionally, there was no good reason to be copy-pasting code that much anyway, so logic shared across different cloud functions should be written in its own function. For example, each -onUpdate cloud function should call a generalised syncFieldsToAlgolia function, where an argument can be passed to specify the fields to sync.
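As a rough sketch of what such a shared helper could look like, assuming a search client that exposes a saveObject method: the SearchIndex interface and the pickFields helper below are hypothetical names introduced for illustration, not the actual Algolia API surface or the code from my projects.

```typescript
// Hypothetical minimal shape of a search index client; a real client
// exposes far more than this.
interface SearchIndex {
  saveObject(record: Record<string, unknown>): Promise<unknown>;
}

// Copy only the listed fields, skipping any that are absent on the document.
const pickFields = (
  data: Record<string, unknown>,
  fields: string[]
): Record<string, unknown> => {
  const picked: Record<string, unknown> = {};
  for (const field of fields) {
    if (field in data) picked[field] = data[field];
  }
  return picked;
};

// Shared task: each -onUpdate function calls this with its own field list.
const syncFieldsToAlgolia = (
  index: SearchIndex,
  docId: string,
  data: Record<string, unknown>,
  fields: string[]
): Promise<unknown> =>
  index.saveObject({ objectID: docId, ...pickFields(data, fields) });
```

A profiles-onUpdate function might pass ["name", "bio"] while posts-onUpdate passes ["title"]; the syncing logic itself lives in one place.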
While those two steps improve the reliability of the resulting cloud functions, they don’t do much to solve the poor developer experience of having to explicitly write a cloud function for each combination of collection, Firestore trigger, and task. To re-implement the three tasks I mentioned above with this solution, I would need to explicitly write 3 cloud functions for each of the 3 Firestore triggers for the 2 collections mentioned — that’s 18 cloud functions in total!
Since each individual cloud function task has been simplified to a single function call with specific arguments, I could write a higher-order function that generates these cloud functions for me. All it needs is to take the collection name and any configuration arguments to be passed to the function. This allows me to separate the business logic, such as what fields need to be synced, from how this logic is implemented in the code.
Essentially, this is a declarative approach to adding business logic where you don’t have to touch the underlying code.
In fact, this is the same approach that makes React so popular: it lets developers declare what should appear on screen without worrying about how to manipulate and update the underlying DOM to achieve it.
Business requirement: We need to display a list of new users who haven’t been verified yet, sorting them by sign-up time.
Problem: Firestore does not allow querying for documents that do not have a specific field, i.e. the field is undefined.
Solution: All user documents must have the verified field set. We can securely ensure all documents have this field with a cloud function that runs when they are created.
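The problem can be illustrated with an in-memory stand-in for the query. This is only a simulation under the assumption that an equality filter matches a document solely when the field exists and equals the value; UserDoc and whereVerifiedEquals are hypothetical names, and no Firestore call is made.

```typescript
interface UserDoc {
  id: string;
  createdAt: number; // sign-up time as epoch milliseconds
  verified?: boolean;
}

// Mimics an equality filter followed by ordering on sign-up time.
const whereVerifiedEquals = (docs: UserDoc[], value: boolean): UserDoc[] =>
  docs
    .filter((doc) => doc.verified === value)
    .sort((a, b) => a.createdAt - b.createdAt);

const users: UserDoc[] = [
  { id: "a", createdAt: 3, verified: true },
  { id: "b", createdAt: 2 }, // verified was never set
  { id: "c", createdAt: 1, verified: false },
];

// User "b" is unverified but missing the field, so the filter skips it.
console.log(whereVerifiedEquals(users, false).map((d) => d.id)); // [ 'c' ]

// Once every document has the field, "b" appears in the list.
const initialised = users.map((doc) => ({ verified: false, ...doc }));
console.log(whereVerifiedEquals(initialised, false).map((d) => d.id)); // [ 'c', 'b' ]
```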
Generalising this solution, we can create cloud functions that ensure documents have the correct initial values set.
Let’s create a new function that returns a cloud function triggered by an onCreate event. When the document is created, it updates the document with the specified initial values. This function doesn’t need to know what collection it will be listening to or what the initial values should be.
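import { firestore } from "firebase-functions";

// Returns a cloud function that writes the configured initial values
// to every newly created document in the configured collection.
const initializeFn = (config) =>
  firestore
    .document(`${config.collection}/{docId}`)
    .onCreate(async (snapshot) =>
      snapshot.ref.update(config.initialValues)
    );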
Then we can store the configurations separately, where we can simply declare which documents should have which initial values. In this case, we want to ensure all documents in the users collection have the verified field default to false and a createdAt field set to the server timestamp.
import * as admin from "firebase-admin";

// Declares which collections get which initial values.
const initializeConfigs = [
  {
    collection: "users",
    initialValues: {
      verified: false,
      createdAt: admin.firestore.FieldValue.serverTimestamp(),
    },
  },
];
Putting it all together, we can use the configs to create the actual cloud functions. Here, we reduce the array of config objects into a single object, where the key is the name of the collection and the value is the cloud function code.
// Build one exported group of the form { users: <cloud function>, ... }.
export const initialize = initializeConfigs.reduce(
  (acc: any, config) => ({
    ...acc,
    [config.collection]: initializeFn(config),
  }),
  {}
);
This code is then added to the index file of the cloud functions repo. The Firebase CLI creates cloud functions from the file’s exports. I’ve deliberately exported the functions as a single object so that they are grouped: the CLI automatically prefixes each function name with initialize-.
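To see the shape of the exported group without deploying anything, the reduce step can be run against a stub. In this sketch, initializeFnStub replaces the real initializeFn so that no Firebase dependency is needed, and the posts entry is a made-up second config added only to show more than one key.

```typescript
interface InitializeConfig {
  collection: string;
  initialValues: Record<string, unknown>;
}

// Stub standing in for the real initializeFn so the sketch runs without Firebase.
const initializeFnStub = (config: InitializeConfig) => (): string =>
  `sets ${Object.keys(config.initialValues).join(", ")} on new ${config.collection} documents`;

// Hypothetical configs; "posts" is included only as a second example.
const exampleConfigs: InitializeConfig[] = [
  { collection: "users", initialValues: { verified: false } },
  { collection: "posts", initialValues: { published: false } },
];

// Same reduce as in the article: one key per collection.
const group = exampleConfigs.reduce<Record<string, () => string>>(
  (acc, config) => ({ ...acc, [config.collection]: initializeFnStub(config) }),
  {}
);

// Grouped functions are deployed as <export name>-<key>.
const deployedNames = Object.keys(group).map((key) => `initialize-${key}`);
console.log(deployedNames); // [ 'initialize-users', 'initialize-posts' ]
```

Adding a collection is then a matter of appending one config object; the group picks up the new function automatically.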
Then we can deploy all these functions with one simple command:
firebase deploy --only functions:initialize
And here’s the cloud function in action!
Because the code has been moved to a single generalised function, we’ve entirely eliminated repeated code. This has a number of advantages:
There’s less code to maintain. We can easily add functionality or fix bugs in the one place the code is written. 🐛 🔫
Zero errors from copy-pasting code. No more forgotten unchanged variable names that crash your function. 🚧
Testing can be streamlined. It only needs to be run on the generalised function, making it easy to improve code coverage.
And since we’ve separated out the business logic by making it declarative, we have:
Improved readability of business logic. It’s significantly easier to read what business logic is implemented and what the expected result is.
A much easier way to add new business logic. We could even build a GUI to create these cloud functions and have non-technical users add to this: all it needs to do is output a correct configuration object.
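If configuration objects were to come from a GUI or any other external source, it would be prudent to check them before generating cloud functions. The following is a minimal sketch of such a check; the InitializeConfig type and the isInitializeConfig guard are assumptions for illustration and are not part of Firebase or of the code above.

```typescript
interface InitializeConfig {
  collection: string;
  initialValues: Record<string, unknown>;
}

// Type guard: accepts only objects with a non-empty collection name
// and a plain (non-array, non-null) object of initial values.
const isInitializeConfig = (value: unknown): value is InitializeConfig => {
  if (typeof value !== "object" || value === null) return false;
  const { collection, initialValues } = value as Record<string, unknown>;
  return (
    typeof collection === "string" &&
    collection.length > 0 &&
    typeof initialValues === "object" &&
    initialValues !== null &&
    !Array.isArray(initialValues)
  );
};

console.log(isInitializeConfig({ collection: "users", initialValues: { verified: false } })); // true
console.log(isInitializeConfig({ initialValues: { verified: false } })); // false
```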
This approach does have some drawbacks, however:
Slower deploys. Since the individual cloud functions are generated at deploy time, the Firebase CLI does not immediately recognise them, so the entire function group must be deployed every time.
Higher costs. As the original “mega”-cloud functions were broken down into single-operation cloud functions, we incur more costs from increased compute time (as a result of more functions being booted) and from more invocations (if the project uses more than the free limit).
Recently, I’ve used this pattern on many of our cloud functions in an open-source project, Firetable, a spreadsheet-like GUI for Firestore. It’s used to integrate with Algolia, sync data between different collections, maintain a basic version history of documents, and more.
Declarative cloud functions such as these are easily adaptable for different use cases and scale across different collections.