The Impact of Web Performance
February 6, 2020 — 2000 words and lots of graphs
In this post, I’ll discuss what I did at ALDO to measure the revenue impact of web performance without having to spend time making performance improvements.
I’m going to highlight that performance is a spectrum.
And I’m going to show that rendering performance is more important than any of the Lighthouse page load metrics(*). (By a large margin!)
What we know
You’ve probably seen a lot of these knowledge bombs already:
- Every 100ms improvement brings Walmart up to 1% incremental revenue
- Cloudflare reports that going from 5.7s to 2.4s load time triples the conversion rate
- Akamai reports that a two-second delay in web page load time increases bounce rates by 103 percent
I could go on… But you get the gist:
Fast = $
Or “Fast is cash.”
What we don’t know
Are we supposed to obsess over page load times when there’s only one page loaded on what might amount to a 40-minute shopping session?
So the first question becomes:
Is it true… for us?
Like I said earlier, performance is a spectrum, and we can get an answer backed by data within a day.
How? Well, some of our users already have a slow experience, and some of them already have a fast experience (you can think of how Apple throttles iPhones when their battery reaches end of life). This means we can bucket people by experience level and observe the impact on conversion rates.
Following the trail paved by Lighthouse and the Chrome UX Dashboard, we followed User Centric Performance Metrics and sent the following data as events in GA:
- First Paint - (Chrome Only) That’s the time it takes to see anything
- Time to Interactive - (Chrome Only) Time it takes until the page responds to user interactions within 50 milliseconds
- Time to App Load - (Custom, All Browsers) That’s the time it takes for the app to load and `React.hydrate` to complete. At this point, the app responds to user interactions, but maybe not under 50 milliseconds.
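As a rough illustration of the list above, here is a minimal sketch of turning a paint timing into a GA event payload. It is not the code used at ALDO: the function names and the `"Performance"` category are hypothetical, and it assumes the timings come from `performance.getEntriesByType("paint")` and are sent with analytics.js.

```typescript
// Shape of the entries returned by performance.getEntriesByType("paint").
interface PaintEntry {
  name: string;
  startTime: number; // milliseconds since navigation start
}

// Event fields understood by analytics.js.
interface GaEventFields {
  hitType: "event";
  eventCategory: string;
  eventAction: string;
  eventValue: number;
  nonInteraction: boolean;
}

// Returns the first-paint time in ms, or undefined when the browser
// does not report one.
function firstPaintMs(entries: PaintEntry[]): number | undefined {
  const entry = entries.find((e) => e.name === "first-paint");
  return entry === undefined ? undefined : entry.startTime;
}

// Builds the payload for one timing. Category and action are made-up names.
function timingEvent(metric: string, ms: number): GaEventFields {
  return {
    hitType: "event",
    eventCategory: "Performance", // hypothetical category name
    eventAction: metric,
    eventValue: Math.round(ms), // GA event values must be integers
    nonInteraction: true, // keep the hit from affecting bounce rate
  };
}
```

In a browser, the result could then be sent with something like `ga("send", timingEvent("First Paint", ms))`, once per page load.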
And these are the kinds of results we observed:
Which means… it’s kind of(†) true!
Fast = $
In fact, what we observed was the following:
On mobile, per session, users who experienced fast load times bring 17% more revenue than average, and 78% more revenue than slow.
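A comparison like this boils down to revenue per session in each bucket. The sketch below shows one way to compute it; the `Session` shape and function names are assumptions for illustration, and the bucket thresholds are left out on purpose since they depend on the metric.

```typescript
type Bucket = "fast" | "average" | "slow";

// One session, already assigned to an experience bucket (hypothetical shape).
interface Session {
  bucket: Bucket;
  revenue: number;
}

// Average revenue per session for each bucket (0 for an empty bucket).
function revenuePerSession(sessions: Session[]): Record<Bucket, number> {
  const totals: Record<Bucket, { revenue: number; count: number }> = {
    fast: { revenue: 0, count: 0 },
    average: { revenue: 0, count: 0 },
    slow: { revenue: 0, count: 0 },
  };
  for (const s of sessions) {
    totals[s.bucket].revenue += s.revenue;
    totals[s.bucket].count += 1;
  }
  const avg = (b: Bucket): number =>
    totals[b].count === 0 ? 0 : totals[b].revenue / totals[b].count;
  return { fast: avg("fast"), average: avg("average"), slow: avg("slow") };
}

// How much more `a` brings than `b`, in percent.
function percentMore(a: number, b: number): number {
  return (a / b - 1) * 100;
}
```

With this, "fast brings X% more than slow" is `percentMore(r.fast, r.slow)` where `r` is the result of `revenuePerSession`.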
So, how many users are in each category? How much incremental revenue would we make by moving users from the slow to average, and average to fast?
Assuming a constant conversion rate per group, we’re looking at the following incremental revenue improvements:
A 9% revenue increase would be sweet indeed. Do we think we can achieve that? What about the effort? That’s a topic for later (keep reading).
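The kind of estimate described above can be sketched as follows. This is a simplified model, not the exact calculation behind the post's figures: it assumes each group keeps a constant revenue per session, and that the same fraction of sessions moves from slow to average and from average to fast. The names are hypothetical.

```typescript
// One experience group: how many sessions it has and what each one brings.
interface Group {
  sessions: number;
  revenuePerSession: number;
}

// Percent revenue uplift if `fraction` (0..1) of slow sessions become average
// and `fraction` of average sessions become fast, with revenue per session
// held constant in each group.
function upliftFromPromotion(
  slow: Group,
  average: Group,
  fast: Group,
  fraction: number,
): number {
  const before =
    slow.sessions * slow.revenuePerSession +
    average.sessions * average.revenuePerSession +
    fast.sessions * fast.revenuePerSession;

  const movedFromSlow = slow.sessions * fraction;
  const movedFromAverage = average.sessions * fraction;

  const after =
    (slow.sessions - movedFromSlow) * slow.revenuePerSession +
    (average.sessions + movedFromSlow - movedFromAverage) * average.revenuePerSession +
    (fast.sessions + movedFromAverage) * fast.revenuePerSession;

  return (after / before - 1) * 100;
}
```

Feeding it the observed session counts and revenue per session for each group gives an upper bound to compare against the effort involved.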
What about the other 39 minutes 50 seconds?
I mentioned earlier that some of ALDO’s users have shopping sessions that last for more than 40 minutes. On a SPA, we can imagine that the impact of a faster page load is not felt anymore. So how can we measure the impact of performance for the other 99.58% of the time a user spends on the site?
Well… it’s a road less traveled and, before I lose you, I’ll drop this bomb:
On mobile, per session, users who experienced fast rendering times bring 75% more revenue than average and 327% more revenue than slow.
327%!?!?!? That’s 4 times as much revenue! By Golly!
Got you back? Good. So what do you do? How do you actually measure the “quality” of your user experience? Google didn’t define one for us, SPA folks. We don’t have a metric that tells us: If that number is below that number, you’re having a good time!
What do you do? You invent one!
aldoshoes.com sells shoes. There are product listings. There are product pages. Other than swiping left and right on the occasional carousel, most of the interactions users have with the site are clicks.
Inspired by the measure of smoothness in video games, I decided to count the frames the browser managed to render in the second following each click and store the value as a frame rate.
It’s the frames per second (FPS) for one second after clicking on things. 60 is the max, 0 is awful.
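Here is a minimal sketch of how such a measurement might work. It is not the production code: `measureFpsAfter` is a hypothetical name, and the frame scheduler is passed in so the function can be exercised outside a browser. In a browser it would be `requestAnimationFrame`.

```typescript
// Schedules a callback for the next frame and passes it a timestamp in ms.
// requestAnimationFrame has this shape.
type FrameScheduler = (callback: (timestampMs: number) => void) => unknown;

// Counts the frames rendered during `windowMs` after `startMs`, then reports
// the count scaled to frames per second.
function measureFpsAfter(
  startMs: number,
  nextFrame: FrameScheduler,
  onResult: (fps: number) => void,
  windowMs: number = 1000,
): void {
  let frames = 0;
  const tick = (now: number): void => {
    if (now - startMs >= windowMs) {
      // The window is over: stop scheduling and report.
      onResult((frames * 1000) / windowMs);
      return;
    }
    frames += 1;
    nextFrame(tick);
  };
  nextFrame(tick);
}
```

In a click handler this could be called as `measureFpsAfter(performance.now(), (cb) => requestAnimationFrame(cb), report)`, where `report` is whatever sends the value to analytics. When the main thread is busy, fewer frames fire within the window and the reported value drops.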
With this definition, I broke down users into three categories:
- 40-60 FPS
- 20-40 FPS
- 0-20 FPS
I broke them down like that because I didn’t know what to expect. So even thirds seemed like a good idea(‡).
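Expressed as code, the split might look like the sketch below. The group labels follow the fast / average / slow wording used for the results, and treating each lower bound as inclusive is an assumption, since the ranges above overlap at 20 and 40.

```typescript
type RenderingGroup = "fast" | "average" | "slow";

// Maps a post-click frame rate to one of the three groups.
// Assumption: 40 counts as fast and 20 counts as average.
function renderingGroup(fps: number): RenderingGroup {
  if (fps >= 40) return "fast";
  if (fps >= 20) return "average";
  return "slow";
}
```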
Here’s what we observed:
What we saw was kind of insane. Like I said a bit earlier, 4x as much revenue per session in the fast group vs the slow group. That’s not negligible. In fact, it’s even crazier on desktop:
On desktop, per session, users who experienced fast rendering times bring 212% more revenue than average and 572% more revenue than slow.
So, how did we fare? How many users were in each category? How much incremental revenue would we make by moving users from slow to average, and average to fast?
Again, assuming a constant conversion rate per group, we’re looking at the following incremental revenue improvements:
For some companies, a 5.25% increase in revenue is no joke. 21%? Everyone is going nuts. So we got a go-ahead on making rendering performance improvements on the website.
Four months later, did it happen? Well, this is the actual graph of our FPS groups over time:
It’s closer to promoting 1/3 of the users. We did OK :)
How did we do it? We did stuff like this:
- Combing through the most used interactions with the React profiler and optimizing them:
  - Opening/closing the site navigation
  - Opening/closing the cart
  - Adding to cart
  - Loading a list of products
  - Infinite scroll
- Optimizing Reselect selectors to prevent useless re-rendering.
- Introducing a `shouldNotUpdateStore` Redux action creator & middleware to prevent some of the saga actions from causing useless re-renders.
- Adding the `enqueue` saga effect to batch Redux actions into a single store update on the next frame.
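To give an idea of the batching in the last item, here is a generic sketch of collecting actions and applying them once on the next frame. It is not the actual `enqueue` effect, whose implementation isn't shown here: `createBatcher` and its parameters are hypothetical, and the scheduler is injected so the logic stays independent of Redux and the browser.

```typescript
// Schedules `flush` to run later, for example on the next animation frame.
type Scheduler = (flush: () => void) => void;

// Returns an `enqueue` function that collects actions and hands them to
// `applyBatch` in one go the next time the scheduler fires.
function createBatcher<A>(
  applyBatch: (actions: A[]) => void,
  schedule: Scheduler,
): (action: A) => void {
  let pending: A[] = [];
  return (action: A): void => {
    pending.push(action);
    if (pending.length === 1) {
      // First action since the last flush: schedule exactly one flush.
      schedule(() => {
        const batch = pending;
        pending = [];
        applyBatch(batch);
      });
    }
  };
}
```

In a browser the scheduler could be `(flush) => requestAnimationFrame(flush)`, and `applyBatch` would be responsible for turning the collected actions into a single store update, so subscribers re-render once per frame instead of once per action.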
What about causality?
I hear you, older person in your rocking chair, brandishing your cane above your head left and right like a sword and screaming at the top of your lungs “Correlation does not imply causation!”
You’re right. Also, if you’ve been paying attention, I’ve been pointing out where I was making assumptions earlier.
In fact, if you were to challenge me and argue that the causality actually goes in the other direction (users with more buying power own more expensive devices, and more expensive devices perform better), I’d tell you that you are right.
You’d make that argument, and I’d tell you: “Let’s look at the data, we have it!”.
iPhone X/Xs perform better than iPhones 8/7/6/6S and have a greater revenue/session. I don’t think anybody is surprised.
But, let’s not forget that the same iPhone could have a dead battery — and we all know what Apple does to iPhones that have a battery near end of life. Throttling CPU power has a huge impact on performance. So that same expensive phone could have a slow or fast experience depending on the state of its battery.
So what do we do? We go down deeper! We slice the data and only look at the revenue/session by rendering category for one device.
Here are the results for iPhone 8 Plus(§) (results are similar for all iPhone categories):
What does this tell us? Well, for one, the smoothness of your animations matters. Second, the same thing I’ve been repeating over and over:
Fast = $
Now, you’ll see me circling back to what I skipped over earlier: where should you be spending your efforts?
Effort and ROI
Not all businesses are equal. Some are operating on millions of users per month, some aren’t. For some, an incremental revenue of single-digit percents is a big deal. For startups, it’s all about those triple digits.
Moreover, not all performance work is created equal. At ALDO, we ran out of quick wins when it came to page load performance. Our app is already code-split (although not architected to be code-split), our product images are responsive and lazy-loaded, and our components are loaded lazily.
There are still a couple of things we could work on, like splitting the CSS, splitting the sagas and reducers, using webp, responsive videos. But most of them are not quick wins, and they would need a good amount of effort. Moreover, they would bring at the most 9.3% in incremental revenue.
On the flip side, back in October, rendering performance was something we had never focused on. Nobody was really talking about it, so there must be nothing there. It was only after we started measuring that we saw the potential.
And this is what I want you to get out of this essay. Google is probably right in saying that first paint and time to interactive correlate well with conversion on most websites.
But maybe you’re not most websites. Maybe you’re running a SPA, maybe there’s stuff that could bring back more bang for your buck. The only way to tell is to measure. And the most beautiful thing is that, since performance is a spectrum, you can do that before making performance improvements!
And, you know what? Maybe your company is like most websites! Maybe Google’s advice applies perfectly to your company, and that’s OK! That’s actually awesome because that’s what most tools, checklists, and companies out there are helping you measure and improve. Good for you :)
(†) “Kind of” because we actually did not observe a correlation between Google’s Time to Interactive metric and revenue. In fact, most of the time slow meant better conversions for that metric!
Moreover, on tablet and desktop, the first paint and time to app load correlation seemed a lot weaker.
(‡) In reality, it’s a little more continuous than that, but I like the simplicity of having 3 buckets because I find it a lot more evocative and easier to reason about. When breaking it down a bit further, I also noticed that the 50-60 FPS group brought in a lot more than the 20-50 FPS group. Perhaps I should change the definition of the groups but I like the simplicity of splitting it into thirds.
(§) In case you’re wondering how I’m able to guess which Apple device users use in GA, we can cross-reference `device.screenResolution` and the `device.mobileDeviceBranding` (980x1000 is iPhone X or Xr, and so on.)
I’ve been reading on this topic while writing this essay, and thanks to Tammy Everts (@tameverts) and her book Time is Money: The Business Value of Web Performance (not an affiliate link), I gave another look at load speed vs business value. Turns out the Lighthouse buckets don’t paint the whole picture.
When we bucket users in 1-second intervals, we can see a much clearer correlation between load speed and business value:
And it seems like that’s attributable to how page speed affects the bounce rate:
Yet, the conclusion that working on rendering performance had better ROI for us remains the same. It seems unlikely that we could bring users with 5-6 second page loads down to 1 second or less.
And that would, at best, halve their bounce rate.