Peter Pito

@peterpito

How do you estimate for teams with different development speeds?

“You have a new mention from Slack”, read my email: “how do you estimate when you have multiple teams working towards the same goal, but they have different development speeds and they track the work differently?”

I thought that this was an interesting question, partially addressed to me, partially addressed to the others in the Slack group. Since I was mentioned I felt that I ought to say something. I wasn’t sure, though, that a few sentences would do it justice, and I was afraid that my answer would get lost amongst other threads. Plus, I was aware of other teams battling with the same question, so I thought that I’d write up an answer.

To set the context, there are three teams at play — one ‘API team’ that builds services and two ‘Front-end (FE)’ teams that build Web applications by consuming those services. The FE teams use User stories to develop and track their work, but the API team don’t develop against those stories; they develop at a group of stories or epic level, creating their own separate tickets to track the work. The API and FE teams work in parallel.

The result is that the FE teams may finish their work on stories quicker but they need to wait for services to be delivered to them before ‘truly’ completing those stories. This gets reflected by a group of tickets that will get stuck in ‘development’ until the services are finished; they will move into ‘test’ together.

Another problem is that this makes reporting tough, because it appears that the FE teams are not working quickly and their burnup will ‘jump’ in steps.

It is easy to jump to (wrong) conclusions quickly. Even though I knew the context beforehand, having spent a few days with the teams, I wanted to make sure that I really understood the problem before answering how to estimate in this case. I was trying to build a mental view of the situation.

Something else was bothering me. The person who asked the question led with ‘despite having spent a good half a day with Peter, I’m not entirely clear on how you do this for stories that span full stack, with component teams that all work at different speeds’. While I take some solace that he called it a ‘good’ half day, and while I know that we spent it discussing other topics too, not just estimation, I was disappointed with myself for having left them still puzzled about this topic.

What follows is an attempt to help both me and the team answer the question better.

Visualising the problem first

“The formulation of a problem is often more essential than its solution”, Albert Einstein.

The first step I took in answering the question was to truly understand the problem at hand. I didn’t want to jump straight to solutions. I saw this as a flow problem, so I thought it best to visualise it using flow techniques.

So how does this problem manifest itself? I took the iPad and started drawing some sketches.

I started by imagining what a snapshot in time would look like. The FE teams are busy working on a number of stories.

Snapshot of the FE teams’ Kanban

They have done everything they could: they analysed the stories, developed them, peer reviewed them and tested them using mocks. Then the stories got stuck and could not move to the next phase (the SIT phase) because the services they required were not ready.

FE Teams’ blocked stories

The services are developed by the API team. They use their own type of tickets and they batch those tickets into releases with cadences set by them. They choose to release mature services that might be used by more than one client. They feel the need to be in control of their own destiny.

The services for our FE teams are not released yet, blocking their stories from continuing. These stories will stay in the queue for the SIT phase.

Stories blocked waiting for the API team’s release

When the API team releases the services those stories depend on, the FE teams’ stories will move to SIT together.

FE stories moving together in one bigger batch
FE stories moved, allowing SIT to commence

I then looked at the same problem from a different angle. This time instead of looking at a snapshot we can look at the work as it progressed over time. We will represent the situation by using a Cumulative Flow Diagram (CFD), which would look something like this:

A CFD representing accumulation of work per work phases over time

Those stuck stories would be represented by the highlighted slice.

Stuck stories appear as a ‘bulge’ on the CFD

This slice highlights two aspects:

  • the height of this slice is bigger than the height of other slices, showing the accumulation of stories before they can enter SIT
  • a ‘staircase’ shape in the SIT stage indicates that bigger batches of work enter and leave SIT. While the current batch is being tested, the second is already forming, waiting for its own dependent services before it can enter SIT.

For simplicity we assume that the stories entering SIT can be tested easily; if this assumption did not hold and SIT testing took longer, the situation would become even more complicated, resulting in an ever-increasing SIT slice. This would have a significant impact on the end-to-end flow. How often in life do we get an easy passage?
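As a rough illustration of how the slices of such a CFD come about, here is a minimal Python sketch. It assumes that each story records the day on which it entered each stage. The stage names, the story IDs and all the numbers are made up for the example, and the board is simplified to four stages. Counting the stories per stage for every day gives the heights of the CFD slices.

```python
from collections import Counter

# Simplified, hypothetical stage list; a real board has more columns.
STAGES = ["Dev", "INT", "SIT", "Done"]

# Invented data: the day on which each story entered each stage.
# FE-1 to FE-3 wait in INT until the API release on day 8.
stories = {
    "FE-1": {"Dev": 0, "INT": 2, "SIT": 8, "Done": 10},
    "FE-2": {"Dev": 1, "INT": 3, "SIT": 8, "Done": 10},
    "FE-3": {"Dev": 2, "INT": 5, "SIT": 8, "Done": 11},
    "FE-4": {"Dev": 4, "INT": 7},
}

def stage_on(entered, day):
    """Return the latest stage a story had entered by `day`, or None."""
    current = None
    for stage in STAGES:
        if stage in entered and entered[stage] <= day:
            current = stage
    return current

def cfd_counts(stories, days):
    """For each day, count how many stories sit in each stage."""
    rows = []
    for day in days:
        counts = Counter(stage_on(e, day) for e in stories.values())
        rows.append({stage: counts.get(stage, 0) for stage in STAGES})
    return rows

rows = cfd_counts(stories, range(12))
for day, row in enumerate(rows):
    print(day, row)
```

With this data the INT count grows to four stories by day 7 and then drops on day 8, when three of them enter SIT together. That is the bulge followed by the step of the staircase.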

Reporting — addressing the question partially

At this point I was content. I was sure that I had a clear view of the situation, or at least that I had made clear the context in which I’m offering further observations.

With a better view of the situation we can now ask ourselves what we can do next. For now, let’s assume that we don’t want to change the development processes, that we are happy to have the API and FE teams working in the way described above.

Reporting is one area that can be explored. We can increase the visibility of the overall situation, showing amongst other things the two different speeds of delivery. This should put to rest at least some of the concerns about the perceived slowness of the FE teams.

We start by looking at the Kanban systems, focusing on delimiting the boundaries of work in progress (WIP) for these systems. This is not to be confused with the concept of limiting work in progress.

I have used systems in plural form because there are two work systems at play. This might not have been a conscious system design decision and might not be the way the teams look at it, but we are dealing with two systems.

System 1 is the work the FE teams perform until their dependent services are made available to them, typically developing and testing against mocks.

Boundaries of Work system 1 — from Next to INT; items depart from System 1 when they leave the ‘IP sub-column’ of the INT phase

System 2 is a superset of System 1 and contains the SIT phase in addition. While some could define System 2 as containing only the SIT phase, I define it from Next to SIT. Doing it this way gives a better picture of the entire flow and offers better opportunities for analysis.

Boundaries of Work system 2 — measured from Next to ‘SIT IP’

If we want to produce data driven reports, then it is important to have a system that we can measure. In order to measure a system we need to clearly define what we are going to measure, and in this context we will measure the work in progress. Once we’ve defined the boundaries of System 1 and System 2 we can look at the world using those two lenses. We can produce two reports, one for each system.

Scope of the Report for System 1 is highlighted by the black border
Scope of System 2 is wider
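To make the two boundaries concrete, here is a small Python sketch that counts the work in progress inside each system. The column names and the board contents are assumptions made for the illustration; only the boundaries themselves (Next to the ‘IP’ sub-column of INT for System 1, Next to ‘SIT IP’ for System 2) come from the definitions above.

```python
# Hypothetical column layout, in board order. A real board may differ.
COLUMNS = ["Next", "Analysis", "Dev", "INT IP", "INT Done", "SIT IP", "Done"]

def columns_between(first, last):
    """Return the set of columns from `first` to `last`, inclusive."""
    start, end = COLUMNS.index(first), COLUMNS.index(last)
    return set(COLUMNS[start:end + 1])

# System 1 ends when an item leaves the IP sub-column of INT.
SYSTEM_1 = columns_between("Next", "INT IP")
# System 2 is a superset: it also covers the wait for SIT and SIT itself.
SYSTEM_2 = columns_between("Next", "SIT IP")

# Invented snapshot: the column each story currently sits in.
board = {
    "FE-1": "INT Done",  # finished by the FE team, waiting for the API release
    "FE-2": "INT Done",
    "FE-3": "Dev",
    "FE-4": "SIT IP",
    "FE-5": "Done",
}

def wip(board, system):
    """Count the stories that are currently inside the given system."""
    return sum(1 for column in board.values() if column in system)

print("System 1 WIP:", wip(board, SYSTEM_1))
print("System 2 WIP:", wip(board, SYSTEM_2))
```

In this snapshot the stories that are waiting for the API release count as work in progress for System 2 but have already departed from System 1, which is exactly the difference the two reports are meant to expose.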

To show the rate of delivery for the FE teams we can create a Burnup or Throughput chart for Work System 1. This will show the rate at which the FE teams burn stories up to the Integration phase, settling the allegation of ‘slow’ FE teams. A word of caution, though: I would always show this report in the context of a second ‘big picture’ report (see below). Stating a rate of delivery covering only up to the Integration phase can be misinterpreted or misused in many dangerous ways. I do believe that there are other issues at play here, which I’ll explore a bit later on, but for now we are assuming that we don’t want to change any of the ways of working, we just want to improve the reporting.

To look at the big picture, we can produce a second report for System 2. This will show a slower delivery rate compared to the other report, but a more accurate one from an end-to-end point of view, a view that takes into account everything needed to complete the work. This slower rate is due to the wait time on the API team.
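The two reports can be sketched from the same set of stories. The Python example below assumes that, for each story, we know the day it departed from System 1 and the day it departed from System 2 (or `None` if it has not left yet). The story IDs and the days are invented for the illustration.

```python
from collections import Counter

# Invented data: (day the story left System 1, day it left System 2 or None).
departures = {
    "FE-1": (3, 15),
    "FE-2": (5, 15),
    "FE-3": (9, 16),
    "FE-4": (12, 16),
    "FE-5": (16, None),
    "FE-6": (19, None),
}

def weekly_throughput(days, weeks):
    """Count departures per week (week 0 covers days 0 to 6, and so on)."""
    counts = Counter(day // 7 for day in days)
    return [counts.get(week, 0) for week in range(weeks)]

system1_days = [s1 for s1, _ in departures.values()]
system2_days = [s2 for _, s2 in departures.values() if s2 is not None]

system1 = weekly_throughput(system1_days, weeks=3)
system2 = weekly_throughput(system2_days, weeks=3)

print("System 1 throughput per week:", system1)
print("System 2 throughput per week:", system2)
```

With these numbers System 1 shows a steady two stories per week, while System 2 shows nothing for two weeks and then four stories at once. Shown side by side, the two series make both the FE teams’ real pace and the batching caused by the API releases visible.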

Estimation — original question

What about the original question — how do you estimate in this scenario?

Maybe there are other, more important questions to answer than this one. Hopefully visualising the problem helps to identify those questions. Here are some that sprang to mind while I was sketching, grouped around the following themes:

  • decomposition strategy: is the decomposition strategy of the scope right? Is the scope sliced to suit the current company organisation or to help this project deliver value faster? Would a different slicing optimise this project but hurt other projects?
  • dependency management: dependencies are terrible because they slow you down. If dependencies cannot be avoided by a different decomposition/slicing strategy, then what could be done to minimise the impact? Could the FE teams start the work only when the API team has completed the dependent services?
  • pursuing the right organisational goals: focusing on reducing the SIT cycle time should be a much bigger goal than focusing on reducing the INT cycle time (OK, reducing the cycle time from idea to production deployment and realising the benefits of those deployments are bigger goals still, but that is for another time). So the bigger, overarching question is: are the work processes designed to realise organisational goals faster?

The real prize is handed out when value can be realised

However, I sense that once you have read the above, a ‘but’ is coming: ‘Yes, that makes sense, and yes I can see the value in pursuing some of those questions, but I cannot change all of that now, at least not overnight. So with this in mind, how can you help me now?’ In this case, my answer would be:

  • I’d use some of the visualisations presented here to kick off those harder, longer-term discussions about slicing, synchronisation of slices and organisational structures
  • I’d report using two reports, one for System 1 and one for System 2, always shown together
  • to estimate a release I’d produce a forecast based on System 2, using the FE User stories as currency; this will indirectly reflect the wait time on the API team. Release estimations are more accurate if they model what happens in reality.
  • to understand the minimal capacity needed for the FE teams I’d forecast using System 1. If there is spare capacity, I might use it somewhere else.
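The text above does not prescribe a forecasting method. One common option is a Monte Carlo simulation that resamples past weekly throughput, and the Python sketch below shows what that could look like. The function names, the throughput history and the backlog size are all assumptions made for the illustration, not figures from the teams.

```python
import random

def forecast_weeks(throughput_history, backlog, runs=10_000, seed=1):
    """Simulate how many weeks it takes to finish `backlog` stories.

    Each simulated week draws a throughput value at random from the
    observed history. Returns the sorted week counts of all runs.
    """
    if max(throughput_history) <= 0:
        raise ValueError("history needs at least one week with throughput > 0")
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    results = []
    for _ in range(runs):
        remaining, weeks = backlog, 0
        while remaining > 0:
            remaining -= rng.choice(throughput_history)
            weeks += 1
        results.append(weeks)
    results.sort()
    return results

def percentile(sorted_values, p):
    """Return the value at or below which roughly p percent of the runs fall."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]

# Invented System 2 history: stories leaving SIT per week, in FE User stories.
# The zero weeks are the weeks spent waiting for an API release.
system2_history = [0, 0, 4, 0, 5, 3]

results = forecast_weeks(system2_history, backlog=20)
print("50% of runs finish within", percentile(results, 50), "weeks")
print("85% of runs finish within", percentile(results, 85), "weeks")
```

Because the history is measured at the System 2 boundary, the weeks with zero throughput carry the wait on the API team into the forecast without having to estimate the API work separately. Feeding the same function a System 1 history instead gives a view of how quickly the FE teams alone get through their part of the work.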

“You have a new mention from Slack”: “how do you estimate when you have multiple teams working towards the same goal, but they have different development speeds and they track the work differently?” There is no simple answer, but hopefully the problem is not so complicated that it cannot be solved. I hope that this write-up helps! Time to go to bed now.
