How great teams learn from failure.

by Ed Nunes (@ed.a.nunes)

credit: https://bit.ly/2H6WGwo license: (CC BY 2.0)

Mistakes, when handled appropriately, can be pretty great. Your team has identified a failure point, addressed it, and everyone has learned from the experience. Making mistakes is one of the main ways people experience professional growth.

All the benefits of mistakes go away if the team does not handle the failure effectively. Below is an explanation of how teams turn failure into a purely negative experience, followed by strategies for benefiting from failure.

“What are they going to do about it?” I asked my wife after she described a mistake someone made at her work. She explained that senior leadership would perform an investigation to:

  1. Determine who was at fault for the error, and
  2. Determine what repercussions the perpetrator should receive.

And while this investigation is going on, everyone else awkwardly avoids the topic.

This is Bad. And because my wife works in healthcare, it’s also Scary. Unfortunately, this is a common strategy that managers undertake to deal with failure.

Correcting bad behavior

In the above example, the people behaving badly are the senior leaders conducting the investigation.

The purpose of their investigation was to find a person to blame. This technique causes several problems:

  1. Identifying a person gives the false impression that the point of failure has been addressed, allowing the failure to recur.
  2. Morale is lowered as team members become adversarial towards each other in attempts to shift blame.
  3. Team members are encouraged to work contrary to the goals of leadership, as they attempt to avoid punishment for failure.
  4. Team members do not collaborate on how to correct the problem for fear of getting caught up in the investigation.
  5. The opportunity for the team to learn from the failure is wasted.

The result is a team that is worse off than before the investigation started.

The first priority should be to ensure that the failure does not recur. This can only be done if the cause of the failure is correctly identified and systemic fixes are put in place to prevent a recurrence.

Instead of asking the question “Who is to blame?”, the investigators should ask “What can be done to ensure this problem doesn’t recur?” Rather than having senior managers engage in an investigation by interviewing subordinates, management and subordinates should work together to identify and correct systemic problems.

Mistakes happen

One of the mistakes that misguided managers make is to set an expectation that mistakes are not acceptable. Instead of establishing a culture of excellence on a team, this expectation encourages team members to hide their mistakes. This tendency can turn small learning moments into major problems.

Punishing someone for making a mistake is like punishing someone for sneezing. Mistakes are inevitable. What is not inevitable is identifying mistakes and learning from them.

Failures are always systemic, not personal

People don’t intend to fail. They’d prefer to do good work that others appreciate. Failure happens despite this intention.

When a person is unable to realize their intention of doing good work, the reason is always a systemic problem. You can tell that a manager doesn’t get this concept if their plan for avoiding a particular failure in the future is to “work harder” or “be more diligent”.

Take this hypothetical example:

An analyst sends out incorrect information to a client. Upon examination, the analyst figures out she ran a query against the wrong database. The correct database would have been obvious from an examination of the query, but the analyst instead relied on an annotation above the query. That annotation is supposed to identify the correct database, but it was not properly updated by the last person to change the query. The team as a whole does not consistently update these annotations, despite knowing it’s their responsibility to do so. Everyone has been working under tight deadlines and has prioritized writing code over keeping documentation up to date.

The cause of failure in this scenario is not a person. It’s not the analyst who sent out the wrong data, nor is it the analyst who failed to update the annotation. Rather, there are two systemic failures:

  1. annotations above queries can be unreliable, and
  2. the team has failed to keep up with their technical debt.

Fortunately, because failures are systemic, solutions are too. In the above example, one solution might be to remove the annotations, since the information can be gathered from the query itself. Another solution might be to hire another analyst to free up team capacity to update documentation. Disciplining the analysts involved in this failure will not be beneficial to the team.
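To make the first solution concrete, here is a minimal sketch of how a team might read the target database from the query text and flag an annotation that disagrees with it. Everything specific in it is an assumption for illustration: the `-- database: name` comment format, the `database.table` naming in FROM and JOIN clauses, and the function names are hypothetical and not taken from any real team's tooling.

```python
import re

# Hypothetical convention: the annotation is a comment like "-- database: sales_prod".
ANNOTATION = re.compile(r"^\s*--\s*database:\s*(\w+)", re.IGNORECASE | re.MULTILINE)
# Assumes tables are written as database.table after FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+(\w+)\.\w+", re.IGNORECASE)


def annotated_database(sql):
    """Return the database named in the annotation comment, or None."""
    match = ANNOTATION.search(sql)
    return match.group(1) if match else None


def referenced_databases(sql):
    """Return the set of databases the query text actually reads from."""
    # Drop comment lines first so an annotation is never mistaken for SQL.
    code = "\n".join(
        line for line in sql.splitlines() if not line.lstrip().startswith("--")
    )
    return set(TABLE_REF.findall(code))


def annotation_is_stale(sql):
    """True when an annotation exists but disagrees with the query itself."""
    noted = annotated_database(sql)
    return noted is not None and {noted} != referenced_databases(sql)


query = """-- database: sales_prod
SELECT client_id, total
FROM sales_archive.invoices
"""
print(referenced_databases(query))  # {'sales_archive'}
print(annotation_is_stale(query))   # True
```

A check like this could run before a query is sent to a client, or it could justify deleting the annotations altogether. Either way the fix lives in the process rather than in anyone's diligence.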

In the very rare case that failure is the result of a malicious act by an employee, it’s still a systemic problem. Examining hiring practices, employee termination procedures, workplace morale, and other issues can be used to mitigate this sort of failure.

Blame flows up, not down

Because failure is systemic, low-level workers are not in a position to correct the issue, even if the issue is from their own work. Their responsibility should be to communicate failures that occur to their manager or team lead. This requires trust between the worker and the manager, as an employee will be hesitant to identify problems that may be used by management as an indication that the employee is a poor performer.

It is up to management to ensure that failures are addressed in a systemic fashion. This requires establishing a culture of trust between team members and management. Management has no one to blame but themselves if their team is repeating mistakes.

A better way: blameless retrospectives

Once a problem has been identified, the people involved with the failure (not just management) should gather to discuss it.

The output of the meeting should be concrete steps that will reduce the likelihood of the failure recurring. The steps must be specific and actionable: “Be better about annotating code” is too vague, while “Create a code review checklist instructing the reviewer to reject code that lacks annotations” is actionable. The people assigned to the tasks must be given the time to carry them out.

Most importantly, there should be no finger-pointing going on at the meeting. This can be hard, but is easier if everyone understands that mistakes happen and that failure is always systemic, not personal.


People have a strong desire to blame someone in the wake of failure, but that desire is misguided. A team’s leadership must place the good of the team above their desire for retribution, even if that retribution seems justified. Establishing an environment where teams can best learn from their mistakes requires fostering a culture of trust and excellence. It’s hard work to get going, but it’s an amazing experience to work with a great team.

