When users interact
By Philip Tellis on
This post is mirrored on the 2024 Performance Advent Calendar.
When looking at the Core Web Vitals, we often try optimizing each independently of the others, but that's not how users experience the web. A user's web experience is made up of many metrics, and it's important to look at these metrics together for each experience. Real User Measurement (RUM) allows us to do that by collecting operational metrics in conjunction with user actions, the combination of which can tell us whether our pages actually meet the user's expectations.
In this experiment, I decided to look at each of the events in a page's loading cycle, and break that down by when the user tried interacting with the page. For those interactions, I looked at the Interaction to Next Paint, and the rate of rage clicking to get an idea of user experience and whether that experience may have been frustrating or not.
Before I jump into the charts, I should note an important caveat about the data. This analysis was done using RUM data collected by Akamai's mPulse product, which collects data at or soon after page load. Not all page views resulted in an interaction before data was collected, so most of the analysis was restricted to page views with at least one interaction prior to data collection. Across sites, between 2% and 25% of collected beacons had an interaction, and most sites recorded an interaction on about 10% of beacons. I also separately looked at data collected during page unload/pagehide; while it captured more interactions, it did not have a noticeable effect on the results.
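As a rough sketch of how that restriction works (the beacon fields here are hypothetical, not mPulse's actual schema):

```typescript
// Hypothetical beacon shape -- not mPulse's actual schema.
interface Beacon {
  pageLoadTime: number;          // ms
  firstInteractionTime?: number; // ms from navigation start; absent if no interaction was captured
}

// Keep only page views where the user interacted before the beacon was sent.
function interactionBeacons(beacons: Beacon[]): Beacon[] {
  return beacons.filter(b => b.firstInteractionTime !== undefined);
}

// Fraction of beacons that captured at least one interaction.
function interactionRate(beacons: Beacon[]): number {
  return beacons.length === 0 ? 0 : interactionBeacons(beacons).length / beacons.length;
}
```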
Each of the following charts is from a different website in mPulse's dataset.
Exploring the chart
Let's now look at the various features of this chart.
The chart shows multiple dimensions of data projected onto a 2D surface, so some parts of it will appear wonky. We'll walk through that in this section.
Event labels
The first thing we'll describe is the events. These are the vertical colored lines with labels to their right, representing transition events in the page load cycle. The events we include are:
- First Paint (FP)
- First Contentful Paint (FCP)
- Largest Contentful Paint (LCP)
- Time to Visually Ready (TTVR)
- Time to Interactive (TTI)
- Page Load Time
You may have already noticed that in this particular chart, First Paint appears after First Contentful Paint, which is counter-intuitive. The reason is that the set of data points reporting First Paint is different from the set reporting First Contentful Paint: Safari and Firefox, for example, support FCP but not FP. When aggregating, the same percentile applied to two different data sets will likely pick values from two different experiences, and the effect is more prominent when the sample sizes differ. In general we would not expect the delta to be large, and in the data I've looked at it has been under 50ms.
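To make that concrete, here's a toy illustration with made-up numbers of how taking the same percentile over two differently sized populations can put FP after FCP, even though FP never follows FCP on any single page view:

```typescript
// Toy illustration: FP <= FCP on every page view that reports both,
// but Safari/Firefox report only FCP, so the two metrics are
// aggregated over different populations.
interface PaintTimings { fp?: number; fcp: number; }

function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))];
}

const pageViews: PaintTimings[] = [
  { fp: 800, fcp: 850 },   // Chromium: reports both FP and FCP
  { fp: 900, fcp: 950 },
  { fcp: 400 },            // Safari/Firefox: FCP only, faster experiences
  { fcp: 450 },
  { fcp: 500 },
];

const fpValues = pageViews.filter(v => v.fp !== undefined).map(v => v.fp!);
const fcpValues = pageViews.map(v => v.fcp);

// Median FP (900) ends up *after* median FCP (500), even though
// FP precedes FCP on every individual page view.
console.log(percentile(fpValues, 50), percentile(fcpValues, 50));
```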
The events to keep an eye on are the Largest Contentful Paint or Time to Visually Ready, the Time to Interactive, and the delta between them. LCP is not currently supported on Safari, so we use boomerang's cross-browser calculation of TTVR in those cases.
Time to Interactive is considered a lab measurement, but boomerang measures it in a cross-browser manner during RUM sessions and passes that data back to mPulse. It is approximately the time when interactions with the page are expected to be smooth, due to there being no more long animation frames or blocking time.
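boomerang's actual algorithm is more involved (see the references below), but a minimal sketch of the underlying idea, using the Long Tasks API where available and an assumed 500ms quiet window, might look like this:

```typescript
// Minimal sketch, not boomerang's actual TTI algorithm: after onload,
// wait for a window with no long tasks and call that point "interactive".
const QUIET_WINDOW_MS = 500; // assumed quiet-window length for this sketch

let lastLongTaskEnd = 0;

// Long Tasks API (not available in all browsers).
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    lastLongTaskEnd = Math.max(lastLongTaskEnd, entry.startTime + entry.duration);
  }
});
observer.observe({ type: 'longtask', buffered: true });

window.addEventListener('load', () => {
  const loadTime = performance.now();
  const check = () => {
    if (performance.now() - lastLongTaskEnd >= QUIET_WINDOW_MS) {
      // Rough estimate: the later of onload and the end of the last long task.
      console.log('approx. TTI (ms)', Math.max(lastLongTaskEnd, loadTime));
      observer.disconnect();
    } else {
      setTimeout(check, 100);
    }
  };
  check();
});
```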
The next thing to note is that these events are positioned on this projection based on when they occurred relative to interactions, as well as when they occurred relative to page load time. By definition this means that all interactions should show up after LCP, but they may show up differently on the chart due to the projection from multiple dimensions down to two. There's also the fact that TTVR calculations do not stop at first interaction, so on browsers that do not support LCP, we may see interactions before the proxy for that event.
The absolute value of each event is calculated across the entire dataset, even on pages without interactions, so it might look like events aren't placed where their values dictate they should be; however, the percentage of users interacting before & after an event is always correct.
The last label to take note of is the fraction of users that interacted before boomerang considered the page to be interactive. In this case, it's 12% of users.
Data distributions
There are a few different distributions shown on this chart (and even more when we look at the mouseover in the chart above).
The blue area chart is the population density. It shows, for every 5% interval of the page load time, how many users first interacted with the page at that point in the page's loading cycle.
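A rough sketch of the bucketing behind that density curve, again using hypothetical beacon fields:

```typescript
// Bucket first-interaction times into 5% intervals of each page view's
// own load time, and count how many users fall into each bucket.
interface InteractionBeacon {
  pageLoadTime: number;          // ms
  firstInteractionTime: number;  // ms
}

function densityBuckets(beacons: InteractionBeacon[], bucketPct = 5): Map<number, number> {
  const counts = new Map<number, number>();
  for (const b of beacons) {
    // Position of the first interaction as a percentage of this page's load time.
    const pct = (b.firstInteractionTime / b.pageLoadTime) * 100;
    const bucket = Math.floor(pct / bucketPct) * bucketPct; // 0, 5, 10, ...
    counts.set(bucket, (counts.get(bucket) ?? 0) + 1);
  }
  return counts;
}
```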
The blue dots that trace the population density chart show the median Interaction to Next Paint value for all of those interactions. Keep in mind that INP is not supported on Safari, whereas boomerang's own measurements for TTI do work across browsers.
The vertical position of the red dots shows the probability that interactions at that time resulted in rage clicks while the size of the red dots shows the intensity of these rage clicks. Rage clicks are collected across browsers.
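Definitions of rage clicking vary; the sketch below uses one common heuristic (several clicks in quick succession, close together), with assumed thresholds, and is not mPulse's actual implementation:

```typescript
// Illustrative rage-click heuristic: N or more clicks within a short window,
// all landing close together. The thresholds here are assumptions.
const RAGE_CLICK_COUNT = 3;
const RAGE_WINDOW_MS = 1000;
const RAGE_RADIUS_PX = 30;

const recentClicks: { t: number; x: number; y: number }[] = [];

document.addEventListener('click', (e) => {
  const now = performance.now();
  recentClicks.push({ t: now, x: e.clientX, y: e.clientY });

  // Keep only clicks that are recent and close to the latest click.
  const burst = recentClicks.filter(c =>
    now - c.t <= RAGE_WINDOW_MS &&
    Math.hypot(c.x - e.clientX, c.y - e.clientY) <= RAGE_RADIUS_PX);
  recentClicks.length = 0;
  recentClicks.push(...burst);

  if (burst.length >= RAGE_CLICK_COUNT) {
    // Intensity could be reported as the number of clicks in the burst.
    console.log('possible rage click, intensity', burst.length);
  }
});
```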
The thin orange line shows the Frustration Index for users that interacted within that window.
We also have the median Total Blocking Time for each of these interactions, though that's only visible in the live versions of these charts and not in most of the screenshots posted here.
In this second chart, we see that 59% of users interacted with the site before it became interactive. Its TTI is further from the LCP time compared to the first site.
Insights from the data
When we look at this data across websites, we see the same patterns. Users expect to be able to interact with the site once the page is largely visible, however, the user experience for interactions is sub-optimal until the time to interactive which can be much later in the page's loading cycle.
In most cases we see a high Total Blocking Time in the period between LCP and TTI, resulting in a slow INP, and higher probability of rage clicking.
When looking to optimize a site for user experience, we shouldn't look at each metric in isolation. A really fast LCP is a great first user experience, but it's also a signal to the user that they can proceed with interacting to complete their task. It's important that the rest of the page be ready for those interactions and keep up the good experience.
The elephant in the room
As an aside, has anyone else noticed that these charts almost always look like a sleeping elephant (or maybe a hat)? I've seen very few sites where this isn't the case, so I looked into that pattern.
The population distribution we see is a gradually increasing curve, then a dip that looks like the elephant's neck, a bump that could be its ears, and finally a sharp dip followed by a long flat region that could be its trunk.
It could well be a Normal distribution if it weren't for the dip and spike right around PLT.
The drop-off after onload is expected. boomerang.js sends a beacon on or soon after page load (sites can configure a beacon delay of a few seconds to capture post-onload events), which results in a drop-off in data with interactions after onload. The post-onload interactions are on pages that are faster than average.
The strange pattern is the spike in interactions just at or after onload (sometimes at 100% and sometimes at 105%). The dip at 95% and 100% shows up on most, but not all, sites, while the spike shows up everywhere.
I looked closer at the data around those buckets and there is very little difference in terms of experience. The page load time, LCP time, TTI time, etc. are all very similar at the 25th and 75th percentile (in other words, the experiences are comparable). The only difference is that more users prefer to interact with the site just after the onload event has fired than just before it. It's not a big delay - about 200-400ms on average across sites, but it does look like some portion of users still wait for the loading indicator to complete before they interact.
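A sketch of that comparison, with the same hypothetical beacon fields as before, might look like this:

```typescript
// Compare the experience of users who interacted just before onload
// (the 95-100% bucket) with those who interacted just after (100-105%).
interface TimedBeacon {
  pageLoadTime: number;
  lcp: number;
  tti: number;
  firstInteractionTime: number;
}

function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))];
}

function summarize(beacons: TimedBeacon[], lowPct: number, highPct: number) {
  const bucket = beacons.filter(b => {
    const pct = (b.firstInteractionTime / b.pageLoadTime) * 100;
    return pct >= lowPct && pct < highPct;
  });
  // 25th and 75th percentile of each timing within the bucket.
  const stats = (get: (b: TimedBeacon) => number) =>
    [25, 75].map(p => percentile(bucket.map(get), p));
  return {
    pageLoad: stats(b => b.pageLoadTime),
    lcp: stats(b => b.lcp),
    tti: stats(b => b.tti),
  };
}

// If these two summaries look alike, the pages themselves are comparable,
// and the difference is purely in when the user chose to interact:
// summarize(beacons, 95, 100) vs summarize(beacons, 100, 105)
```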
Conclusions
In conclusion, I think there's a lot to be learned from looking at when your users interact with your site. Which parts of the page have finished loading when that interaction happens? What's still in flight? What do they experience? Is there too much of a delay between your LCP and the site becoming usable?
A good loading experience needs your page to transition from state to state smoothly without too much delay between states. Looking at the loading Frustration Index can identify pages where this isn't the case.
When comparing different events on the page, look at the aggregate of deltas rather than the delta of aggregates.
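For example, with made-up numbers, the median of per-page-view (TTI − LCP) deltas can differ quite a bit from the difference of the two medians, and only the former describes gaps that real users actually experienced:

```typescript
interface PageView { lcp: number; tti: number; }

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

const views: PageView[] = [
  { lcp: 1000, tti: 1200 },
  { lcp: 2500, tti: 2600 },
  { lcp: 4000, tti: 7000 },
];

// Aggregate of deltas: the median gap a real user saw between LCP and TTI.
const aggregateOfDeltas = median(views.map(v => v.tti - v.lcp));   // 200

// Delta of aggregates: the gap between two medians taken independently,
// which may describe no real user's experience.
const deltaOfAggregates =
  median(views.map(v => v.tti)) - median(views.map(v => v.lcp));   // 100
```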
And lastly, keep an eye out for that elephant.
References
- Glossary on Mozilla Developer Network
- Web Vitals on Google's Web.Dev
Implementations in mPulse
- boomerang
- Frustration Index by Tim Vereecke
- Time to Visually Ready in boomerang
- Time to Interactive in boomerang
- Monitoring Interactions
- Frustration Index in mPulse
Other useful links
- Paul Calvano on different performance metrics
- The RUM Archive to do your own RUM analysis.