Does the team or app size affect the release process? Well, it depends. Let’s imagine a startup with one small team. In this case, the team usually makes a feature, and then just releases it. Now, let’s imagine a large project, for example, a banking app, with many teams working on it.
In this case, there should probably be a process, release cycles, and maybe some bureaucracy. Without that, there will be chaos.
So, when does it become clear that it’s time to set up such a process for your app?
In this article, I’ll share my experience of implementing Release Train for the Dodo Pizza app (Android and iOS) and the problems we faced that made our team implement Release Train.
If you are the Team Lead/Tech Lead of an Android or iOS project that is growing fast and you have not managed the release process yet, I hope that our experience will help you.
In 2021, we were already using a Trunk-based Development (TBD) approach in our teams. We covered the code with feature toggles, decomposed tasks, and ran unit and UI tests. Our feature branches didn’t live long, and we had CI working.
The release process was very simple: whoever was ready to roll their feature out, rolled it.
Here’s roughly what our branch workflow looked like. Several teams (gray, blue, orange, and green) worked on different features. Since we were working according to TBD, each feature could live through several consecutive branches.
For example, the gray team made their feature in 4 steps, the blue and orange teams made theirs in 1 step, and the green team made theirs in 2 steps.
When a team finished a feature, they could roll out a release. For example, if the blue team finished a feature, they could do a release. Then, the orange team would finish a feature and do another release.
We had the perfect flow, as it seemed then. It worked great up to a point, but all good things come to an end.
Something went wrong: tough, crowded, and unpredictable
The first problem we encountered was that releases started accumulating a lot of features and became too big.
The teams didn’t always want to release their features right away. The release and regression process was time-consuming and took 3–4 days. So if your feature was small and not urgent, you didn’t always get to release it yourself because probably, some other team would do a release soon, and it would be included in that release. Roughly it looked like this:
This was quite a fragile arrangement, especially when the number of teams started to grow. Many teams developed many small features, and the total volume of new code in each new release became huge. When someone got to release their big feature, they had to release a whole mammoth together with it.
The mammoth-releases resulted in:
We needed to make it so that even if the blue and orange team from the example didn’t want to release, the release would be done somehow.
Every team is unique, and every feature is different. Sometimes, things happened in such a way that several teams would finish their features around the same time. In this case, there was a lot of “wait for me, I’ll merge it tomorrow morning, I promise!” going around.
In the end, such bottlenecks resulted in:
We needed to make 2 crucial changes:
The releasing team shouldn’t need to wait for anyone;
every other team should know when the next release is expected.
Imagine, the blue team made a small feature and expected the orange team to release it soon. But something went wrong, and the orange team didn’t roll out the release either because of some problems of their own.
As a result, the blue team told the business that the feature would be in production soon, but it turned out to be not soon enough. As a result, it is impossible to predict when the feature will be in production.
This doesn’t mean that the blue team is irresponsible. If they have a super important or urgent feature, then of course they will release it themselves. But in other cases, there is no way to tell exactly when the feature will be available to the users.
As you can guess, we were experiencing such issues quite often. We needed to be able to tell exactly when customers would get a feature in production regardless of its size or urgency. All 3 problems (mammoth releases, bottlenecks, and lack of predictability) are closely related and complement each other.
However, probably the most fundamental and important of them all is the lack of predictability. It generates other problems.
We've had enough; it was time to make a change. The Release Train was supposed to help us do that.
The term Release Train means different things: a scheduled release process, or a dedicated team that manages the release process. Here, we’re going to talk about the scheduled release process.
I like the way Release Train is described by Martin Fowler in the “Patterns for Managing Source Code Branches” article, and the definition given by Thoughtworks in their tech radar (maybe it belongs to Martin as well).
This is how we have defined Release Train for ourselves:
Release Train is the process of coordinating releases between teams. All releases happen on a fixed schedule, independent of whether the features are ready or not. The train doesn't wait for anyone. If you are late, you have to wait for the next one.
Let's break it down with a couple of examples using our color-coded teams.
Release Train happens on schedule and does not depend on who has merged what into the main branch. In the example below, the features from the blue and orange teams will be released. The rest will wait for the next train. We could wait a bit longer, but then we'd get a mammoth.
At the same time, the Release Train helps us plan our work more efficiently. Let’s say that the blue team originally planned to finish a feature later. But since everyone knows the release date, they can slightly rearrange their plans to finish the feature early.
Or, on the contrary, they can realize that they will definitely not be on time for the next train, and hence, they can finish the feature safely because they know the whole schedule.
In the example below, the blue team wanted to get to the release and merge all their changes before the release. Otherwise, there might have been a bottleneck.
Most importantly, Release Train gave us predictability by design.
These examples may seem obvious to some, but we solved problems as they arose. When there were no problems with releases, we didn’t bother using Release Train. When problems accumulated, we realized that the time had come.
The first thing we did was write an RFC. An RFC refers to both the process itself and the design document that many companies use before they start working on a project. Some use RFCs specifically, some use ADRs, and some just call them by the more generic term Design Doc.
At Dodo Engineering, we use both RFCs and ADRs.
Our RFC process looked like this:
We drafted an RFC document;
we discussed it in a small group, collected comments, and made adjustments;
then the RFC was communicated to a wider group;
then we implemented it;
after that, we gathered feedback, tracked metrics, and evaluated results.
The structure of the RFC document for our Release Train was as follows:
In drafting the RFC, we relied on the experience of other companies:
First, we came up with this process:
1 iOS and 1 Android developers from one of the feature teams;
2 QA engineers.
Schematically, our Release Train looked like this:
After a month, it became clear that although the first experience felt great,
In 2021, our regression test used to take 3–4 days on average. We managed to shorten it to 2–3 days in 2022, but sometimes, it would exceed that time frame. We continued covering regression cases with e2e tests, but we didn’t have 100% coverage yet. We had about 70% and 60% coverage of regression cases on each platform respectively.
The takeaway from this is that as long as you have regression tests taking a few days to complete, it will likely be uncomfortable to run a release cycle every week.
We ended up moving to 2-week release cycles, and Release Train now looks like this:
1 iOS and 1 Android developers from one of the feature teams;
2 QA engineers.
This is how the process looks if everything goes according to plan:
It all looks like a weekly cycle except there’s plenty of time left for potential hotfixes. This is what it will look like in case of extended regression tests:
No big deal either; there’s still time even for hotfixes.
The main goal for us was to increase predictability. It can be broken down into two parts:
We answered the question “When will there be a release?” by implementing the Release Train process. Now, each team will be able to answer the question “in which release will my feature end up?” independently at the moment when they plan and evaluate the feature.
Before, it was impossible to say for sure because another team might or might not do the release. Now, everything depends only on that team’s own planning.
To further confirm this, we conducted surveys amongst mobile developers, QA, and product managers, where together, with other questions, we asked:
Release Train has also helped us with code freezes and release freezes. We had several of them, in addition to New Year’s Eve (for example, September 1st and some holidays). Now, with Release Train, we don’t have to adjust to these dates by creating release branches, regress testing, and all that. Releases work on schedule; we just open them in the stores later.
Beyond just solving problems, we also measured metrics. Let’s take a look at the main ones.
The first important metric we measured was the Lead Time commitment to release.
This is what the graph looks like. I marked the point when we started the Release Train process with an arrow.
The graph shows that Lead Time went down to somewhere around six days. Is six days a long or short time?
There are
I believe that for standard mobile apps, lead time should ideally aim for half the length of the release cycle. This is equivalent to merging a task into the main branch every day. This said, if the release cycle is 14 days, Lead Time should aim for 7 days.
Another metric we track is the number of bugs per regression. It describes how stable the release candidate is. If the previous release was a long time ago, then we probably created a lot of new code that might potentially contain a high number of bugs, and we might spend more time on regression and fixes.
At one point, this metric was down to three bugs. The specific numbers aren’t that crucial, but overall, you can see that the trend has gone down.
I’ll briefly discuss what metrics were also monitored as part of the Release Train.
We like the current process as we think it has achieved its goals. We also know how to improve it further:
While we were relatively small, we didn’t need a Release Train. When we faced the fact that we couldn’t predict releases, their size, and number, we decided to implement Release Train. At first, we tried weekly release cycles, but because of time-consuming regressions, we had to switch to two-week release cycles. We’ve been living this way ever since.
Now, we have predictability of releases, and metrics show positive dynamics. We plan to increase the coverage of regression cases with e2e-tests, automate the process of working with branches, and optimize the process of translations.
I hope this article and our experience will help you, especially if you have already faced similar issues and they made you think about the release process.
Thank you so much for reading my article. I hope you enjoyed it. If you have any questions or suggestions, drop me a line in the comments, and I’ll be sure to read it!
And
Also published here