Software engineers place a lot of emphasis on code performance. There is often tension between non-engineers who want more time to be spent on building new features and engineers who want to spend time on improving performance. The app seems perfectly fast to you. So why do engineers care so much about shaving a few milliseconds off of something that already seems to be pretty good?
Performance absolutely impacts your company’s bottom line. Customers won’t tell you directly, “if you make the home screen load 200 milliseconds faster, I’ll try your app again,” but even subtle performance improvements make users more likely to come back to your app.
Amazon conducted an experiment early on that found that every additional 100 milliseconds (that’s a tenth of a second) of page load time cost them 1% in sales (source). At that rate, one second of added latency would add up to billions in lost revenue (source). I saw this firsthand when I worked there: the company placed enormous importance on maintaining and improving the performance of its web pages, and people were even fired over latency increases.
You don’t need to be a billion-dollar company for this to matter; a slower app will yield higher attrition, lower conversion, lower user satisfaction, and lower virality. Kissmetrics analyzed user behavior and found that 40% of users will abandon a website that takes more than 3 seconds to load, and that each additional 1-second delay decreases customer satisfaction by 16% and conversion by 7%. I don’t have to tell you that a 7% conversion drop is huge, especially if it compounds over time.
If that’s not enough to convince you that speed matters, an added consideration for websites is that search engines measure sites’ speed and prioritize faster pages in their rankings. In other words, it directly affects SEO.
Code speed is a major focus of any computer science curriculum, so “code should be efficient” has been drilled deeply into our brains. Engineers have a visceral reaction to unoptimized code and we take pride in making code more performant. Slow code is considered a form of technical debt that will gnaw at engineers until it’s improved. Yes, this inherent bias favoring the importance of making code fast is sometimes taken too far — the phrase “premature optimization” refers to expending effort on improving things that don’t matter all that much or won’t matter for a long time — but in general, it’s a good thing that makes our applications better.
The running time of some code grows in proportion to scale. It may seem fine now, but if you double or 10x your user base as you hope to, it won’t be.
Let’s say you’re building the next Twitter, and it takes just 10 milliseconds to load the tweets of a single user someone is following. That’s reasonably fast. But suppose the total load time is proportional to the number of users that person is following. Once a user follows 100 people, which is something you probably want them to do, that adds up to a full second. Following 1,000 users, which is not rare on Twitter? That’s 10 seconds. The power user who follows 5,000 people would be waiting almost a minute every time they interact with your app. They won’t be power users for long. You might not even realize these power users exist and are having such a terrible experience until it’s too late. Experienced engineers are always thinking about extreme cases, and they might push for optimizing a situation that you yourself haven’t experienced while using the application.
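To make the arithmetic above concrete, here is a minimal Python sketch. The function name and the assumption that each followed account costs a fixed 10 milliseconds, loaded one after another, are illustrative simplifications taken from the example, not a model of how any real feed works.

```python
def feed_load_seconds(following_count: int, ms_per_followed_user: float = 10.0) -> float:
    """Estimated feed load time, assuming a fixed cost per followed account
    and that the accounts are loaded one after another (hypothetical model)."""
    return following_count * ms_per_followed_user / 1000.0


if __name__ == "__main__":
    for following in (1, 100, 1_000, 5_000):
        seconds = feed_load_seconds(following)
        print(f"following {following:>5} users: {seconds:.2f} s")
```

The point of the sketch is the shape of the growth: nothing about the per-user cost changed, yet the experience goes from instant to unusable as the input grows.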
The database query that your engineers say performs terribly may only take 20 milliseconds, but it is run 15 times for each API call. And your app needs to make 5 such API calls every time it loads, and then frequently while it is open. It quickly adds up: if those queries run one after another, that is 1.5 seconds of database time on every app load. Pragmatic engineers will focus more on high-ROI performance optimizations like this one than on a slower query that is only run once per hour.
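A rough way to compare the two kinds of query is to multiply each one’s duration by how often it runs. The sketch below uses the numbers from the paragraph above; the hourly query’s 2-second duration and the figure of 1,000 app loads per hour are made-up assumptions for illustration, and the model assumes queries run sequentially.

```python
def db_ms_per_app_load(query_ms: float = 20, queries_per_api_call: int = 15,
                       api_calls_per_load: int = 5) -> float:
    """Database time spent on one app load, assuming sequential queries."""
    return query_ms * queries_per_api_call * api_calls_per_load


def db_ms_per_hour(ms_per_run: float, runs_per_hour: int) -> float:
    """Total database time one kind of work consumes in an hour."""
    return ms_per_run * runs_per_hour


if __name__ == "__main__":
    per_load = db_ms_per_app_load()
    # Hypothetical traffic: 1,000 app loads per hour.
    frequent = db_ms_per_hour(per_load, runs_per_hour=1_000)
    # Hypothetical slow report query: 2,000 ms, run once per hour.
    hourly = db_ms_per_hour(2_000, runs_per_hour=1)
    print(f"per app load: {per_load:.0f} ms")
    print(f"frequent 20 ms query: {frequent / 1000:.0f} s of database time per hour")
    print(f"slow hourly query:    {hourly / 1000:.0f} s of database time per hour")
```

Under these assumptions the “fast” query costs far more in total than the “slow” one, which is why frequency matters as much as duration when deciding what to optimize.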
To oversimplify a bit, the longer it takes to run a certain piece of code on your server, the fewer resources that server has left to serve other users. For some classes of optimizations, decreasing the execution time of some code or some database query from 10 milliseconds to 5 milliseconds doubles your capacity; that is, you can serve more customers with less hardware. This is cheaper for your company and it delays some scaling limitations from becoming a problem (see my article on why engineers worry about scaling to learn why scaling is sometimes nonlinear). Even when speed does not directly impact users, as with big data processing tasks that run in the background, a small optimization could save hours of computing time and thousands of dollars for your company.
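The capacity claim can be sketched with one line of arithmetic. This assumes a deliberately simple model in which a single worker handles one request at a time and the code in question is the only cost; real servers have other bottlenecks, so treat it as an upper bound rather than a prediction. The function name is hypothetical.

```python
def max_requests_per_second(ms_per_request: float, workers: int = 1) -> float:
    """Upper bound on throughput when each worker handles one request at a
    time and spends ms_per_request on each (simplified model)."""
    return workers * 1000.0 / ms_per_request


if __name__ == "__main__":
    before = max_requests_per_second(10)
    after = max_requests_per_second(5)
    print(f"10 ms per request: up to {before:.0f} requests/s per worker")
    print(f" 5 ms per request: up to {after:.0f} requests/s per worker")
```

Halving the time per request doubles the ceiling, so the same traffic can be served with roughly half the workers under this model.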
Perhaps your app loads perfectly in your Silicon Valley office, with expensive broadband Internet access and the latest iPhone, but it might be a lot slower for users in less-than-ideal circumstances. In some cases, there’s not much that can be done: you can’t force users to upgrade or carriers to improve their network speed. But engineers can optimize an app’s code to run faster on low-end devices, and they can reduce or compress the data being transmitted.
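As one illustration of compressing transmitted data, the sketch below gzips a JSON payload using Python’s standard library and compares the sizes. The payload is invented sample data and the helper name is hypothetical; how much you save in practice depends entirely on what your app actually sends.

```python
import gzip
import json


def compress_json(obj) -> tuple[bytes, bytes]:
    """Serialize obj to JSON and return (raw_bytes, gzip_compressed_bytes)."""
    raw = json.dumps(obj).encode("utf-8")
    return raw, gzip.compress(raw)


if __name__ == "__main__":
    # Invented sample payload: repetitive records, similar to many API responses.
    payload = [{"id": i, "name": f"user{i}", "bio": "Hello, world!"} for i in range(200)]
    raw, compressed = compress_json(payload)
    print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes")
    # The receiver gets the original data back after decompressing.
    assert json.loads(gzip.decompress(compressed)) == payload
```

Repetitive, text-based responses like this tend to compress well, which means fewer bytes to push over a slow mobile connection.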
Writing performant code is among the most important and most specialized things your engineers do. It’s something we always think about when we write any code. And for good reason: it matters to your business. Sure, it can be taken too far; you can’t spend all of your engineering resources on performance improvements all the time. But listen to your engineers when they say something needs to be sped up and consider the implications above before dismissing a performance problem as no big deal.