Web QoE Scoring Model

Note

This model is currently in beta and is still being evaluated in practice. We welcome feedback on the scoring methodology and curve parameters.

The Web QoE Score is an overall Quality of Experience score for web page loads, expressed on a 0–100 scale. It combines multiple web performance metrics into a single value using Lighthouse-style log-normal scoring curves. The score is provided as the statistic value web_qoe_score.

Background

The scoring methodology is adapted from Google's Lighthouse performance scoring, which Google continuously updates. In that model, each input metric is scored individually against a log-normal cumulative distribution function, and the per-metric scores are then combined as a weighted average. Each curve is defined by two parameters:

  • p10: the metric value that produces a score of approximately 90 (the "good" threshold)
  • median: the metric value that produces a score of approximately 50
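
As an illustration of how such a curve turns a metric value into a score, here is a minimal Python sketch of the log-normal scoring function in the form Lighthouse publishes it. The function name log_normal_score and the handling of non-positive values are assumptions of this sketch, and the model described here may differ in details.

```python
import math

# Inverse complementary error function of 0.2. Scaling by this constant
# places a score of exactly 0.9 at the p10 control point.
INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Return a score in [0, 1]; lower metric values give higher scores."""
    if value <= 0:
        # Assumption of this sketch: a metric of zero (e.g. CLS = 0)
        # is treated as the best possible result.
        return 1.0
    # Distance from the median in log space, scaled so that p10 maps to 0.9.
    standardized = (
        math.log(value / median) * INVERSE_ERFC_ONE_FIFTH / math.log(median / p10)
    )
    return 0.5 * math.erfc(standardized)


print(round(100 * log_normal_score(2400, median=2400, p10=1200)))  # 50
print(round(100 * log_normal_score(1200, median=2400, p10=1200)))  # 90
```

The two printed values correspond to the median and p10 control points listed above.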

Google provides a Lighthouse scoring calculator online, as well as a graphing function for exploring the behavior of the log-normal curve.

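Our model uses the following web performance metrics as inputs: First Contentful Paint (FCP), Largest Contentful Paint (LCP), Cumulative Layout Shift (CLS), Interaction to Next Paint (INP), and Time to First Byte (TTFB).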

Notably, we do not use Total Blocking Time (TBT) as an input metric, as it is not directly measurable in all environments. Similarly, the Speed Index (SI) is not used, as it is difficult to measure.

Metric Sets

The model works with different sets of input metrics, depending on what data the platform provides. The full set comprises FCP, LCP, CLS, INP, and TTFB. If TTFB is unavailable, the model still produces a score from FCP, LCP, CLS, and INP. If only FCP and LCP are available, a separate 2-metric scoring method is used with curves aligned to the Web Vitals thresholds.

Therefore, the scoring function accepts exactly one of three predefined metric combinations; any other combination produces no score.

  • Full (5 metrics): FCP + LCP + CLS + INP + TTFB — uses Lighthouse v10 curves and weights.
  • 4-metric: FCP + LCP + CLS + INP (no TTFB) — same Lighthouse curves; TTFB's weight is redistributed proportionally among the remaining four metrics.
  • 2-metric: FCP + LCP only — uses Web Vitals threshold-aligned curves with equal weights (see below).
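
The exact-match rule above could be sketched as follows. The lowercase metric keys, the set labels, and the function name select_metric_set are hypothetical names chosen for this illustration, and treating a None value as "unavailable" is an assumption.

```python
from typing import Optional

# The three accepted combinations. The labels are used only in this sketch.
METRIC_SETS = {
    "full": frozenset({"fcp", "lcp", "cls", "inp", "ttfb"}),
    "4-metric": frozenset({"fcp", "lcp", "cls", "inp"}),
    "2-metric": frozenset({"fcp", "lcp"}),
}


def select_metric_set(metrics: dict) -> Optional[str]:
    """Return the label of the matching metric set, or None if there is no exact match."""
    # Assumption: a metric whose value is None counts as unavailable.
    available = frozenset(name for name, value in metrics.items() if value is not None)
    for label, required in METRIC_SETS.items():
        if available == required:
            return label
    return None  # any other combination produces no score


print(select_metric_set({"fcp": 1200, "lcp": 2000}))               # 2-metric
print(select_metric_set({"fcp": 1200, "lcp": 2000, "cls": 0.05}))  # None
```

In this sketch, FCP + LCP + CLS yields no score because it is not one of the three predefined combinations.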

Scoring Curves

Full and 4-metric sets

The scoring curves for the full and 4-metric sets are as follows:

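| Metric | PC median | PC p10 | Mobile median | Mobile p10 | Weight |
|--------|-----------|--------|---------------|------------|--------|
| FCP    | 1600 ms   | 934 ms | 3000 ms       | 1800 ms    | 0.10   |
| LCP    | 2400 ms   | 1200 ms | 4000 ms      | 2500 ms    | 0.25   |
| CLS    | 0.25      | 0.1    | 0.25          | 0.1        | 0.25   |
| INP    | 500 ms    | 200 ms | 500 ms        | 200 ms     | 0.25   |
| TTFB   | 1800 ms   | 800 ms | 1800 ms       | 800 ms     | 0.15   |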

When TTFB is unavailable (4-metric set), its weight of 0.15 is redistributed proportionally among FCP, LCP, CLS, and INP. The resulting effective weights are approximately FCP 0.12, LCP 0.29, CLS 0.29, INP 0.29.
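
The proportional redistribution amounts to dividing each remaining weight by the sum of the remaining weights (0.85). A small sketch, with the hypothetical helper name redistribute:

```python
# Weights of the full set, from the curve table above.
FULL_WEIGHTS = {"fcp": 0.10, "lcp": 0.25, "cls": 0.25, "inp": 0.25, "ttfb": 0.15}


def redistribute(weights: dict, dropped: str) -> dict:
    """Drop one metric and rescale the remaining weights so they sum to 1."""
    remaining = {name: w for name, w in weights.items() if name != dropped}
    total = sum(remaining.values())  # 0.85 when TTFB is dropped
    return {name: w / total for name, w in remaining.items()}


four_metric_weights = redistribute(FULL_WEIGHTS, "ttfb")
for name, weight in four_metric_weights.items():
    print(f"{name}: {weight:.2f}")
# fcp: 0.12
# lcp: 0.29
# cls: 0.29
# inp: 0.29
```

The rounded values add up to 0.99 rather than 1.00, which is why the effective weights are stated as approximate.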

Generally, mobile device types use more lenient FCP/LCP curves than PC, reflecting the typically slower rendering on mobile devices.
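
To show the effect of the device-specific curves, the following sketch scores the same LCP value against the PC and mobile curves. It assumes the Lighthouse form of the log-normal function; DEVICE_CURVES, metric_score, and the device keys "pc" and "mobile" are hypothetical names used only here.

```python
import math

INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232

# (median, p10) in milliseconds, copied from the curve table above.
# CLS, INP, and TTFB are omitted because their curves do not differ by device.
DEVICE_CURVES = {
    "pc": {"fcp": (1600, 934), "lcp": (2400, 1200)},
    "mobile": {"fcp": (3000, 1800), "lcp": (4000, 2500)},
}


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Lighthouse-style log-normal score in [0, 1]."""
    if value <= 0:
        return 1.0
    x = math.log(value / median) * INVERSE_ERFC_ONE_FIFTH / math.log(median / p10)
    return 0.5 * math.erfc(x)


def metric_score(device: str, metric: str, value_ms: float) -> float:
    """Score one metric against the curve for the given device type."""
    median, p10 = DEVICE_CURVES[device][metric]
    return log_normal_score(value_ms, median, p10)


for device in ("pc", "mobile"):
    print(device, round(100 * metric_score(device, "lcp", 3000)))
```

Under these assumptions, an LCP of 3.0 s scores in the mid-30s on the PC curve and in the high 70s on the mobile curve.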

Two-metric scoring

When only FCP and LCP are available, the Lighthouse curves are too harsh to use directly. This is because they were calibrated for a multi-metric set where CLS, INP, and TTFB — which tend to score high at their "good" thresholds — compensate for the steeper FCP/LCP curves. Using the Lighthouse FCP/LCP curves alone would give a score of only ~45 for a page that meets all Web Vitals "good" thresholds.
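
The figure of roughly 45 can be reproduced with a short calculation. This sketch assumes the PC curves, the Lighthouse form of the log-normal function, and that the 0.10 / 0.25 weight ratio of FCP and LCP is kept and renormalized; the text above does not state which weighting was used.

```python
import math

INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Lighthouse-style log-normal score in [0, 1]."""
    if value <= 0:
        return 1.0
    x = math.log(value / median) * INVERSE_ERFC_ONE_FIFTH / math.log(median / p10)
    return 0.5 * math.erfc(x)


# Web Vitals "good" thresholds scored against the PC Lighthouse-style curves.
fcp_score = log_normal_score(1800, median=1600, p10=934)
lcp_score = log_normal_score(2500, median=2400, p10=1200)

# Assumption: keep the 0.10 / 0.25 weight ratio and renormalize it to 1.
combined = (0.10 * fcp_score + 0.25 * lcp_score) / (0.10 + 0.25)

print(f"FCP {100 * fcp_score:.1f}, LCP {100 * lcp_score:.1f}, combined {100 * combined:.1f}")
```

Both per-metric scores land below 50 because the "good" thresholds of 1.8 s and 2.5 s lie above the PC medians of 1.6 s and 2.4 s.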

In some cases only FCP and LCP are available, for example when interactivity cannot be simulated in automated measurement scenarios. For these cases, the 2-metric set uses curves aligned directly to the Web Vitals thresholds, with equal weights:

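| Metric | Median (= "poor" threshold) | p10 (= "good" threshold) | Weight |
|--------|-----------------------------|--------------------------|--------|
| FCP    | 3000 ms                     | 1800 ms                  | 0.50   |
| LCP    | 4000 ms                     | 2500 ms                  | 0.50   |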

These curves are the same for PC and mobile. They produce scores that align with the standard quality tiers: approximately 90 at the "good" threshold and 50 at the "poor" threshold, which marks the upper end of the "needs improvement" range.
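
A minimal sketch of the 2-metric calculation, assuming the Lighthouse form of the log-normal function and rounding to the nearest integer. The function name two_metric_score is hypothetical.

```python
import math

INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232

# (median, p10) in milliseconds, aligned to the Web Vitals thresholds.
TWO_METRIC_CURVES = {"fcp": (3000, 1800), "lcp": (4000, 2500)}


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Lighthouse-style log-normal score in [0, 1]."""
    if value <= 0:
        return 1.0
    x = math.log(value / median) * INVERSE_ERFC_ONE_FIFTH / math.log(median / p10)
    return 0.5 * math.erfc(x)


def two_metric_score(fcp_ms: float, lcp_ms: float) -> int:
    """Equal-weight average of the FCP and LCP scores on a 0-100 scale."""
    fcp = log_normal_score(fcp_ms, *TWO_METRIC_CURVES["fcp"])
    lcp = log_normal_score(lcp_ms, *TWO_METRIC_CURVES["lcp"])
    return round(100 * (0.5 * fcp + 0.5 * lcp))


print(two_metric_score(1800, 2500))  # "good" thresholds -> 90
print(two_metric_score(3000, 4000))  # "poor" thresholds -> 50
```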

Note

The 2-metric score only reflects loading performance (FCP + LCP). It cannot capture interactivity or visual stability issues. A page with fast paint times but terrible responsiveness will score well in the 2-metric set.

Example Scores

The following table shows example scores for the PC device type across representative performance scenarios.

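| Scenario          | Full (5) | 4-metric | 2-metric |
|-------------------|----------|----------|----------|
| Excellent         | 99       | 98       | 100      |
| Good (WV thresh)  | 74       | 71       | 90       |
| Needs improvement | 37       | 35       | 50       |
| Poor              | 13       | 11       | 12       |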

The "Good (WV thresh)" row uses the Web Vitals "good" threshold for each metric (FCP=1.8 s, LCP=2.5 s, CLS=0.1, INP=0.2 s, TTFB=0.8 s).
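
The "Good (WV thresh)" row can be reproduced with the sketch below. It assumes the Lighthouse form of the log-normal function, the PC curves and weights from the tables above, and rounding to the nearest integer. PC_CURVES, WEIGHTS, and weighted_score are hypothetical names for this illustration.

```python
import math

INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232

# (median, p10) per metric for the PC device type; times in milliseconds.
PC_CURVES = {
    "fcp": (1600, 934),
    "lcp": (2400, 1200),
    "cls": (0.25, 0.1),
    "inp": (500, 200),
    "ttfb": (1800, 800),
}
WEIGHTS = {"fcp": 0.10, "lcp": 0.25, "cls": 0.25, "inp": 0.25, "ttfb": 0.15}
TWO_METRIC_CURVES = {"fcp": (3000, 1800), "lcp": (4000, 2500)}
TWO_METRIC_WEIGHTS = {"fcp": 0.5, "lcp": 0.5}


def log_normal_score(value: float, median: float, p10: float) -> float:
    """Lighthouse-style log-normal score in [0, 1]."""
    if value <= 0:
        return 1.0
    x = math.log(value / median) * INVERSE_ERFC_ONE_FIFTH / math.log(median / p10)
    return 0.5 * math.erfc(x)


def weighted_score(values: dict, curves: dict, weights: dict) -> int:
    """Weighted average on a 0-100 scale; weights of the given metrics are renormalized."""
    total_weight = sum(weights[m] for m in values)
    total = sum(weights[m] * log_normal_score(v, *curves[m]) for m, v in values.items())
    return round(100 * total / total_weight)


good = {"fcp": 1800, "lcp": 2500, "cls": 0.1, "inp": 200, "ttfb": 800}
without_ttfb = {m: v for m, v in good.items() if m != "ttfb"}
loading_only = {"fcp": good["fcp"], "lcp": good["lcp"]}

full = weighted_score(good, PC_CURVES, WEIGHTS)
four = weighted_score(without_ttfb, PC_CURVES, WEIGHTS)
two = weighted_score(loading_only, TWO_METRIC_CURVES, TWO_METRIC_WEIGHTS)
print(full, four, two)  # 74 71 90
```

Renormalizing the weights of the supplied metrics is equivalent to the proportional redistribution of TTFB's weight in the 4-metric set.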

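| Scenario                     | FCP   | LCP   | CLS  | INP    | TTFB   | Full | 4-metric | 2-metric |
|------------------------------|-------|-------|------|--------|--------|------|----------|----------|
| Fast CDN-served landing page | 0.6 s | 0.9 s | 0    | 0.03 s | 0.15 s | 99   | 99       | 100      |
| Well-optimized news site     | 1.2 s | 2.0 s | 0.05 | 0.12 s | 0.4 s  | 87   | 85       | 98       |
| Typical corporate website    | 2.0 s | 3.0 s | 0.12 | 0.25 s | 0.9 s  | 66   | 63       | 81       |
| Heavy SPA (React/Angular)    | 2.8 s | 3.5 s | 0.03 | 0.40 s | 0.6 s  | 62   | 56       | 61       |
| Ad-heavy media site          | 3.5 s | 5.0 s | 0.35 | 0.80 s | 1.2 s  | 28   | 20       | 31       |
| Slow shared hosting blog     | 4.5 s | 7.0 s | 0.20 | 0.60 s | 2.5 s  | 31   | 31       | 11       |
| Overloaded e-commerce site   | 5.5 s | 8.0 s | 0.45 | 1.20 s | 3.5 s  | 10   | 10       | 5        |
| Broken/failing site          | 12 s  | 18 s  | 1.50 | 3.0 s  | 6.0 s  | 1    | 0        | 0        |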

Here's a comparison of the scores for PC vs. mobile device types across the same scenarios, using the full and 2-metric sets.

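| Scenario                 | PC Full | Mobile Full | PC 2-metric | Mobile 2-metric |
|--------------------------|---------|-------------|-------------|-----------------|
| Well-optimized news site | 87      | 98          | 98          | 98              |
| Typical corporate site   | 66      | 83          | 81          | 81              |
| Heavy SPA                | 62      | 77          | 61          | 61              |
| Slow shared hosting      | 31      | 33          | 11          | 11              |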