The Case of the Missing Money: How I Used Detective Skills to Track Down a Bug in an App

Written by mikeynyzw | Published 2023/01/10
Tech Story Tags: mobiledebugging | debugging | mobile-app-development | android | technology | hackernoon-top-story | programming | software-development | hackernoon-es | hackernoon-hi | hackernoon-zh | hackernoon-vi | hackernoon-fr | hackernoon-pt | hackernoon-ja

TL;DR: A popular remittance Android app was failing to send money to a specific region. The app was making the usual network requests, but there was no indication of what was causing the problem. It was going to take all of my detective skills, as well as a little bit of luck, to track down the root cause.

As an Android developer working on a popular remittance app supporting over 100 thousand users, it is my job to ensure the app runs smoothly and all reported issues are resolved promptly. I was used to getting the occasional support ticket from users reporting issues with the app. But one day, I received a deluge of tickets from people who were failing to send money to a particular region through the app. This was especially concerning because it was close to the holiday season, a time when many people rely on our app to send money to their loved ones.

I knew I had to get to the bottom of this issue as quickly as possible. So I turned to the wisdom of Sherlock Holmes and began my detective work. But as I soon found out, solving this problem was going to be no easy task. It was going to take all of my detective skills, as well as a little bit of luck, to track down the root cause of the bug and get the app working properly again.

The Setup

I started by reproducing the issue on my own device, trying to send money to the affected region. Sure enough, the app froze and displayed an error message saying, "Transaction failed. Please try again later." I tried using a profiler in Android Studio to see if there were any performance issues that could be causing the problem. No dice; the app was working as expected.

I quickly checked the logs to see if there was any information there that could help me understand what was going on. Unfortunately, the logs were not very helpful. They showed that the app was making the usual network requests, but there was no indication of what was causing the problem. The one pattern I could see was that an error occurred every time we made a POST request to the transactions endpoint, but only for that specific region. No matter what I tried, I couldn't get any closer to solving the mystery.

Next, I pulled the latest code and checked out the production branch to see if any recent commits might be relevant to the problem at hand. I also tried making individual requests using Postman and noticed something peculiar. The request returned a response code of 400 (Bad Request), which normally means the client sent a malformed request or left out information the backend requires. The response, however, did not include a meaningful error message detailing what data was missing. Given that this request was working before, it seemed like the problem was on the server side.

To test this theory, I used a debugger to dig deeper into the code. I set breakpoints at key points in the code and tried to send a transaction again, this time paying close attention to what was happening under the hood. I checked whether the request contained all the data that the backend required and even logged the request and response. Alas, everything was as expected, but I was still getting the Bad Request error.

The Plot Thickens

As I continued my investigation, I couldn't help but feel like I was missing something. It seemed like there should be an obvious explanation for the issue, but no matter how hard I looked, I couldn't find it. I could feel it taunting me: my own Moriarty, the unsolved bug.

Just when I was starting to lose hope, I had an idea. I remembered that two weeks before the issue had started, the backend team had pushed out an update to the app's server-side code. Could it be that the update was causing the problem?

I pinged the backend developers on Slack to see if they had any insights into the issue. They told me that they had recently pushed out an update to the server-side code, but they weren't sure if it could be related to the issue I was experiencing. They were currently swamped fixing another issue and could only look into mine later. They mentioned that the update had been rolled out gradually, with only a small percentage of users receiving it at first. Due to a new policy, our rollouts were now phased, and the users would receive it over a period of two weeks. Could it be that the update had caused the issue only for users who received it?

The Lightbulb Moment

I quickly checked my Firebase Crashlytics logs to see if there was any correlation between the timing of the update and when the users started experiencing the issue. And sure enough, I found a clue!

After some back and forth with the team, my theory was confirmed. The update included a change to the way the server handled certain types of transactions. And it turned out that the change was causing issues specifically for transactions to the affected region. Upon further investigation, I discovered that the backend now required an extra field to be included in transaction requests, a field that had previously been optional. This change had been made due to new regulations in the region, but it had unfortunately been rushed, poorly documented, and not thoroughly tested. As a result, the field was not included in transaction requests for the affected region, causing the transactions to fail.
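A client-side pre-flight check is one way to make this kind of failure visible before the request ever leaves the device. The following is only a minimal Java sketch under stated assumptions, not the app's actual code: the class name `TransactionValidator`, the field names (`amount`, `currency`, `recipientId`, `purposeOfTransfer`), and the region code `"XX"` are all hypothetical, since the real field and region are not named here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical pre-flight check; every field name and region code here is an illustrative assumption. */
final class TransactionValidator {

    // Fields assumed to be required for every transaction.
    private static final Set<String> BASE_REQUIRED = Set.of("amount", "currency", "recipientId");

    // Extra fields assumed to be required only for specific regions.
    private static final Map<String, Set<String>> REGION_REQUIRED =
            Map.of("XX", Set.of("purposeOfTransfer"));

    private TransactionValidator() {}

    /** Returns the names of required fields that are absent or blank, sorted alphabetically. */
    static List<String> missingFields(String region, Map<String, String> payload) {
        List<String> required = new ArrayList<>(BASE_REQUIRED);
        required.addAll(REGION_REQUIRED.getOrDefault(region, Set.of()));

        List<String> missing = new ArrayList<>();
        for (String field : required) {
            String value = payload.get(field);
            if (value == null || value.trim().isEmpty()) {
                missing.add(field);
            }
        }
        Collections.sort(missing); // deterministic order for logging and error messages
        return missing;
    }
}
```

With a check like this, a request to the hypothetical region `"XX"` that lacks `purposeOfTransfer` would be flagged by name on the client, instead of surfacing as an unexplained 400 from the server.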

I couldn't believe it. After all my detective work, I finally found the root cause of the bug. It had taken a lot of Sherlock-like deduction and some creative thinking, but I had finally solved the case of the missing money.

I immediately reached out to the backend team to let them know what I had discovered. They were shocked to learn that their update had caused the issue and apologized for the oversight. We agreed on a two-pronged solution. The backend team would release a hotfix with a default value for the now-mandatory field, allowing users to transact in the meantime, while the mobile app team released an updated version of our Android app that would ask for this extra information.
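The server-side half of that fix can be pictured as a small shim that fills in a placeholder when the field is missing. This is a hedged Java sketch only: the backend's real language, the field name `purposeOfTransfer`, and the default value `UNSPECIFIED` are assumptions made for illustration, and `MandatoryFieldHotfix` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical server-side shim; the field name and default value are illustrative assumptions. */
final class MandatoryFieldHotfix {

    static final String FIELD = "purposeOfTransfer";   // assumed name of the now-mandatory field
    static final String DEFAULT_VALUE = "UNSPECIFIED"; // assumed placeholder value

    private MandatoryFieldHotfix() {}

    /** Returns a copy of the payload with the default filled in when the field is absent or blank. */
    static Map<String, String> withDefault(Map<String, String> payload) {
        Map<String, String> patched = new HashMap<>(payload); // leave the caller's map untouched
        String value = patched.get(FIELD);
        if (value == null || value.trim().isEmpty()) {
            patched.put(FIELD, DEFAULT_VALUE);
        }
        return patched;
    }
}
```

A default like this keeps older app versions working during the transition. It is a stopgap, though, because the placeholder carries none of the information the new regulations ask for, which is why the updated app still had to collect the real value.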

Conclusion

Solving this bug was a challenging and rewarding experience. It reminded me of the importance of thinking creatively and not being afraid to try different approaches when debugging an issue. And just like the strong bond between Sherlock and Watson, it also reminded me of the power of collaboration and teamwork. Without the help of the backend team, I might never have been able to solve this problem.

I hope that this story of my detective work will serve as a reminder to other developers to always be on the lookout for clues and to never give up on solving a challenging problem. As Sherlock Holmes once said, "When you have eliminated the impossible, whatever remains, however improbable, must be the truth." With that in mind, I know that any bug can be conquered, no matter how tricky it may seem at first.


Written by mikeynyzw | Changing the world, one line of code at a time
Published by HackerNoon on 2023/01/10