This blog post is part of a series in which I share our migration from monolithic applications (each with its own source repository) deployed on AWS to a distributed services architecture (with all source code hosted in a monorepo) deployed on Google Cloud Platform.
Let’s list some of the things we need to manage with a repository:
For some things, such as managing dependencies, services like Greenkeeper may help. However, if a dependency releases a new major version, you have to manually apply that to all repositories and run the tests.
It became clear that none of us enjoyed any of these maintenance tasks and that we would rather spend the time making our market research chatbots more valuable to our customers.
Our code is mainly written in JavaScript, which led us to look at Lerna.
Lerna is a tool for managing JavaScript projects with multiple packages.
We decided to take this one step further. Instead of managing our npm packages only, we configured Lerna to also manage our services, which live in the same monorepo.
Our monorepo directory structure is as follows:

    .
    ├── lerna.json
    ├── package.json
    ├── packages
    └── services

The `lerna.json` file is straightforward:

    {
      "lerna": "2.4.0",
      "npmClient": "yarn",
      "useWorkspaces": true, // See "Yarn Workspaces" below
      "packages": ["packages/*", "services/*"],
      "version": "independent"
    }

With this configuration, our services can depend on packages and Lerna takes care of symlinking them. For example, we can run `yarn add package-z` within the `services/service-a` directory and Lerna symlinks `package-z` properly. No more dealing with `yarn link`.
To Lerna, `packages/*` and `services/*` are considered packages. Most Lerna commands support the `--scope` flag, but that only works if you follow a strict naming convention for your `name` properties in the `package.json` files.
We decided to separate packages from services by using different scoped packages. Since `packages/*` get published to npm, they use the company default scope (e.g. `@my-company`). Services in `services/*` use a `@my-company-services` scope. Packages and services are further prefixed with `web-*` vs `svr-*` to distinguish between different types of packages and services.
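To make the convention concrete, here is a minimal sketch of a helper that derives the kind and runtime target from a package name. The `classify` function and the `SCOPES` and `TARGETS` tables are hypothetical and not part of our repository; the scope names are the placeholders used in this post, and the `iso-*` prefix is the isomorphic variant that comes up in the testing section.

```javascript
// Hypothetical helper: classify a package name by the naming convention
// described above. The scopes are this post's placeholders.
const SCOPES = {
  '@my-company': 'package',
  '@my-company-services': 'service'
}

// Prefix to runtime target. 'iso' stands for isomorphic packages.
const TARGETS = { web: 'browser', svr: 'node', iso: 'isomorphic' }

function classify (name) {
  // Expect "@scope/prefix-rest", e.g. "@my-company/web-button".
  const match = /^(@[^/]+)\/(web|svr|iso)-(.+)$/.exec(name)
  if (!match) return null

  const [, scope, prefix, shortName] = match
  const kind = SCOPES[scope]
  if (!kind) return null

  return { kind, target: TARGETS[prefix], shortName }
}

console.log(classify('@my-company/web-button'))
console.log(classify('@my-company-services/svr-api'))
console.log(classify('lodash')) // null: does not follow the convention
```

A check like this could run in CI to reject names that do not follow the convention, which matters because `--scope` globs rely on it.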
Lerna is great at managing inter-dependencies and running npm scripts or even arbitrary commands across all packages or subsets thereof.
However, each package and service gets its own `node_modules` folder by default. That is a lot of duplication…
The fine folks who give us Yarn released “Workspaces” and kindly enough blogged how to use it with Lerna: https://yarnpkg.com/blog/2017/08/02/introducing-workspaces/
Besides the `"useWorkspaces": true` in the `lerna.json`, you also have to add `"workspaces": ["packages/*", "services/*"]` to your root `package.json` file. That’s it.
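Since the same globs now live in two files, a small sanity check can catch drift between them. The following is a sketch only: `findGlobMismatch` is a hypothetical helper, and the two objects stand in for the parsed contents of `lerna.json` and the root `package.json`.

```javascript
// Hypothetical check: do lerna.json "packages" and package.json "workspaces"
// list the same globs?
function toGlobList (workspaces) {
  // Yarn accepts either an array or an object with a "packages" array.
  if (Array.isArray(workspaces)) return workspaces
  return (workspaces && workspaces.packages) || []
}

function findGlobMismatch (lernaConfig, rootPackage) {
  const lernaGlobs = new Set(lernaConfig.packages || [])
  const workspaceGlobs = new Set(toGlobList(rootPackage.workspaces))

  return {
    missingFromWorkspaces: [...lernaGlobs].filter(glob => !workspaceGlobs.has(glob)),
    missingFromLerna: [...workspaceGlobs].filter(glob => !lernaGlobs.has(glob))
  }
}

// Stand-ins for the parsed config files.
const lernaConfig = { useWorkspaces: true, packages: ['packages/*', 'services/*'] }
const rootPackage = { private: true, workspaces: ['packages/*'] }

console.log(findGlobMismatch(lernaConfig, rootPackage))
```

In a real script you would load both files with `require` instead of the inline objects.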
Now when you run `yarn` and `lerna bootstrap`, your root `node_modules` folder contains nearly all the npm packages you need. This saves both time and disk space. The following compares our monorepo with and without Yarn Workspaces. The stats are based on 20 packages managed by Lerna, run on a 2016 MacBook Pro.
Without Yarn Workspaces

    +-----------------+--------+
    | Command         | Time   |
    +-----------------+--------+
    | yarn install    | 13.23s |
    | lerna bootstrap | 72.33s |
    +-----------------+--------+

This adds 96,112 files totalling 666.4 MB to disk.
With Yarn Workspaces

    +-----------------+--------+
    | Command         | Time   |
    +-----------------+--------+
    | yarn install    | 17.26s |
    | lerna bootstrap | 3.85s  |
    +-----------------+--------+

This adds 32,008 files totalling 267.1 MB to disk.
Waiting an extra 4 seconds to install the root packages is worth the savings we get with `lerna bootstrap`. With a bit of caching on the continuous integration server, things look even better, but I’m getting ahead of myself.
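To put the two measurements side by side, here is a short calculation of the relative savings. The numbers are the ones from the tables above; the variable and function names are just for illustration.

```javascript
// Measurements copied from the two tables above.
const withoutWorkspaces = { installSec: 13.23, bootstrapSec: 72.33, files: 96112, megabytes: 666.4 }
const withWorkspaces = { installSec: 17.26, bootstrapSec: 3.85, files: 32008, megabytes: 267.1 }

const totalSeconds = run => run.installSec + run.bootstrapSec
const reduction = (before, after) => (before - after) / before
const asPercent = ratio => `${(ratio * 100).toFixed(1)}%`

// Relative savings for combined run time, file count and disk usage.
const savings = {
  time: reduction(totalSeconds(withoutWorkspaces), totalSeconds(withWorkspaces)),
  files: reduction(withoutWorkspaces.files, withWorkspaces.files),
  disk: reduction(withoutWorkspaces.megabytes, withWorkspaces.megabytes)
}

for (const [metric, ratio] of Object.entries(savings)) {
  console.log(`${metric}: ${asPercent(ratio)} less`)
}
```

Combined, the two commands drop from about 85.6 seconds to about 21.1 seconds, which is roughly three quarters less time, with about two thirds fewer files and about 60% less disk usage.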
We use Jest, but decided to let Lerna manage the test runner instances. (FYI, Jest comes with a multi-project-runner that may be useful in your use case.)
In our case, we like the `--scope` flag Lerna provides to run commands in certain directories only. More importantly, we have a variety of packages and services: some run in Node.js, others in the browser, and some are isomorphic.
To accommodate that, we set up our Jest configuration as follows:

    .
    ├── jest.config.js
    ├── packages
    │   ├── iso-package
    │   │   └── jest.config.js
    │   ├── svr-package
    │   │   └── jest.config.js
    │   └── web-package
    │       └── jest.config.js
    ├── services
    │   ├── svr-service
    │   │   └── jest.config.js
    │   └── web-service
    │       └── jest.config.js
    └── tests-setup
        ├── polyfill.js
        └── setup.js

The root-level `jest.config.js` contains the base Jest configuration we apply across all packages and services. It looks something like this:

    // jest.config.js
    module.exports = {
      collectCoverageFrom: ['**/*.js'],
      resetMocks: true,
      verbose: true
    }

A `web-*` package or service uses the following `jest.config.js` within its root directory:

    // packages/web-*/jest.config.js or services/web-*/jest.config.js
    const jestBase = require('../../jest.config.js')

    module.exports = {
      ...jestBase,
      coverageThreshold: {
        global: {
          statements: 100,
          branches: 100,
          functions: 100,
          lines: 100
        }
      },
      browser: true,
      setupFiles: [
        '<rootDir>/../../tests-setup/polyfill.js',
        '<rootDir>/../../tests-setup/setup.js'
      ]
    }

An `iso-*` or `svr-*` package or service uses the following `jest.config.js` within its root directory:

    const jestBase = require('../../jest.config.js')

    module.exports = {
      ...jestBase,
      coverageThreshold: {
        global: {
          statements: 100,
          branches: 100,
          functions: 100,
          lines: 100
        }
      },
      testEnvironment: 'node'
    }

Notice how we configure the `coverageThreshold` on a per package / service level? This allows individual teams to set their own thresholds. Managing that per package / service is significantly simpler than at the monorepo root level.
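One detail worth knowing about this pattern is that the object spread is a shallow merge: a key set in a package-level config replaces the base value entirely instead of being merged into it. The sketch below illustrates that with a hypothetical `createJestConfig` factory; the inline `jestBase` object is a stand-in for the root `jest.config.js`, and the lower thresholds are made-up values for a team that has not reached 100% yet.

```javascript
// Stand-in for the root jest.config.js shown earlier.
const jestBase = {
  collectCoverageFrom: ['**/*.js'],
  resetMocks: true,
  verbose: true
}

// Hypothetical factory: base config plus per-package overrides.
// The spread is shallow, so overriding a key replaces its whole value.
function createJestConfig (overrides = {}) {
  return { ...jestBase, ...overrides }
}

// A team that is not at 100% coverage yet sets its own (made-up) thresholds.
const svrConfig = createJestConfig({
  testEnvironment: 'node',
  coverageThreshold: {
    global: { statements: 80, branches: 70, functions: 80, lines: 80 }
  }
})

// Overriding an array replaces it; it is not concatenated with the base.
const narrowed = createJestConfig({ collectCoverageFrom: ['src/**/*.js'] })

console.log(svrConfig)
console.log(narrowed.collectCoverageFrom)
```

Because the base object itself is never mutated, each package gets an independent config and one team's thresholds cannot leak into another package.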
The root `package.json` file contains a `"test": "lerna exec yarn test"` script. Each package and service has its own `test` script that simply invokes Jest: `"test": "jest"`. The same pattern applies to `test:coverage` as well.
We can now use Lerna’s flags to do all sorts of nice things:
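- Test all services: `yarn test --scope @my-company-services/*`.
- Collect test coverage for all web packages: `yarn test:coverage --scope @my-company/web-*`.
- Test the `@my-company/iso-package` package together with all packages it depends on (the flag pulls in the dependencies of the scoped package, not its dependents): `yarn test --scope @my-company/iso-package --include-filtered-dependencies`.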
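If you are wondering which packages a given `--scope` glob selects, the following sketch approximates the matching. It is not Lerna's implementation (Lerna delegates to a glob library); `globToRegExp` and `filterByScope` are hypothetical helpers that only understand `*`, and the package names are the examples from the directory tree above.

```javascript
// Approximate a --scope glob: "*" matches any run of characters except "/".
// This is a simplification for illustration, not Lerna's actual matcher.
function globToRegExp (glob) {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, '\\$&')
  return new RegExp('^' + escaped.replace(/\*/g, '[^/]*') + '$')
}

function filterByScope (packageNames, scopeGlob) {
  const pattern = globToRegExp(scopeGlob)
  return packageNames.filter(name => pattern.test(name))
}

// Names following the convention used in this post.
const packageNames = [
  '@my-company/iso-package',
  '@my-company/svr-package',
  '@my-company/web-package',
  '@my-company-services/svr-service',
  '@my-company-services/web-service'
]

console.log(filterByScope(packageNames, '@my-company-services/*'))
console.log(filterByScope(packageNames, '@my-company/web-*'))
```

Note that `@my-company/*` does not match anything under `@my-company-services`, which is why separate scopes make it easy to target packages and services independently.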
Why use **lerna exec** to execute an npm script when **lerna run** does exactly that?
From what we encountered, `lerna run` swallows the output of the npm scripts. With the `--stream` flag, we get the output but it’s neither formatted nor does it have coloured console output.
While I could imagine Jest’s multi-project-runner to be more performant than our solution, we like Lerna’s powerful flags and decided to forgo Jest’s approach. This may very well change as more and more tests get added to the monorepo. (Happy to chat about that if anyone has some thoughts)
No special consideration was necessary. Simply add your config files to the repository root and it works as expected.
The pull request template is configured once in the `.github/PULL_REQUEST_TEMPLATE.md` file. It applies across all packages and services.
Compared to multiple repositories, managing pull requests in a monorepo requires a bit more thinking. At the time of this writing, we have not yet decided how we will deal with that. A few notes from initial discussions include:
The benefits of a monorepo were immediately apparent to the team. Prior to that, we used `yarn link` to deal with a small SDK we use to integrate with the backend API. It works if you’re careful and don’t use Docker for local development as we do. Even then, it is still a mental burden on each individual developer who works on the SDK.
Getting everything configured took time; I am not going to sugarcoat that. Thanks to an amazing and curious team who showed patience throughout that transition period, we’re now in a place to spend more time building software rather than maintaining source repositories. Thank you!