“I am having nightmares that pull (request) 17 is never going to get merged”

A story about whack-a-mole PR comments

A few minutes ago, I came back to the hotel room after wrapping up a great day at Assert(js) Testing Conf in beautiful, cold 🌧 San Antonio. If you weren’t there in 2018, I highly recommend you visit next year!

The conference had a couple of themes: Mocking (or not mocking), unit vs integration tests or whether they’re the same, reducing cyclomatic and cognitive complexity, among other great content.

With my head spinning slightly (mainly due to the extensive amount of knowledge dumped onto us during the day, less so due to the consequences of attending the conference’s after-party…), I sat down and scanned my Slack notifications.

One caught my attention right away:

😴 👻

What is “Pull (Request) 17”?

I like to reference to situations like that as black holes. It’s like a piece of code someone wrote years ago and nobody understands anymore, you don’t want to go anywhere near that.

Well, PR 17 turned into a black hole. It all started on November 1, 2017 — that was 113 days ago.

The beginning

For various reasons, we are moving towards an architecture with smaller, independent services that each take care of a certain functionality of the overall platform.So this profile-variables service is one such service, meant to be developed and deployed in small pieces until it’s feature-complete.

What happened?

In two words: the unthinkable.Originally, the service was developed by a contractor, written in standard Node.js / Express with Docker as the deployment environment in mind. Shortly after the first few commits, we decided to deploy to Google Cloud Functions (GCF) and use PubSub instead (insert a long, unrelated story here). The profile-variables service was picked as the first service to run on GCF.

A team member (let’s call him fictional Jimmy) took over the existing code and started with the migration to GCF. Following best practices, he regularly pushed new code to Github and reviewers left comments. New team members joined the company and also left notes, suggestions, thoughts, etc.

Fictional Jimmy diligently answered questions, replied to comments and applied suggestions made by reviewers.

“I can address them all but every time I address issues it seems like 10 more pop up.” — Fictional Jimmy, today

As of today, PR 17 has 6 contributors and the overall stats look like this:

The end (?)

How could this happen?

Most accidents are the result of multiple things gone wrong. So too is PR 17.

We focused too much on the micro, and didn’t pay attention to the macro.
Processes were followed, but they were either not well defined or possibly followed too strictly.
No alarm bells rang when PR 17 crossed a certain number of comments, commits or files changed.
Tools, automation, CI/CD everything worked as expected, but we did not plan for a situation like the one we encountered in this PR.
We should have not allowed conversations regarding the overall GCF deployment to be part of a pull request.

What do we learn from that?

This brings me back to today’s AssertJS conference and the themes I mentioned earlier.

Speakers also mentioned topics like pull request reviews, automated tests, use tools to help with the process, the list goes on. As I was taking notes, I kept thinking:

“Pull request reviews and no force-push”: ✅ We’ve got that.
“Test in isolation”: ✅ Yep, couldn’t agree more.
“Use tools and bots to be the bad cops”: ✅ Got that in place.

I felt, considered all things, our team is doing a pretty good job at keeping code quality high.

Then, I read fictional Jimmy’s Slack message and it hit me:

Tools, processes, automated test suites, etc. none of that matters if we forget to see the big picture.

Many takeaways I have from the conference focus on improved code quality, fewer bugs, delivering software that works and performs well. These are all super important things each and every team has to think about.

But on top of that, our PR 17 situation taught me that once in a while, as a team, we have to step back and discuss the daily processes, ask each team member to share what they think could be improved. It’s easy to get carried away focusing on the best tool for the job or the most elegant way to write integration tests.

With that in mind, I encourage you to go for a walk, think about your processes and come up with a list of three things you could improve in your workflow to avoid a “PR 17" (synonym for “a black hole”, “your worst nightmare”, or similar).

PS: Thanks to fictional Jimmy for the inspiration for this blog post 🙏!