MOS Troubleshooting Guide

This guide helps you diagnose and resolve video playout issues using Surfmeter's KPI and KQI metrics. The guidance is organized by stakeholder perspective – whether you're an ISP optimizing network performance or an OTT provider tuning player behavior.

Where to Start

First, assess the severity:

MOS Score | Interpretation
≥ 4.0 | Excellent. Minor optimization opportunities may exist.
3.0–4.0 | Good, but worth investigating potential improvements.
2.0–3.0 | Noticeable degradation. Requires attention.
< 2.0 | Critical issues. Immediate investigation needed.
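
If you triage many measurements at once, the bands above can be encoded as a small helper. This is a minimal sketch: the function name is hypothetical, and it assumes that a boundary value belongs to the higher band (for example, exactly 3.0 counts as "good"), which the table does not specify.

```python
def mos_severity(mos: float) -> str:
    """Map a MOS value to the severity bands from the table above.

    Assumption: boundary values fall into the higher band.
    """
    if mos >= 4.0:
        return "excellent"
    if mos >= 3.0:
        return "good"
    if mos >= 2.0:
        return "noticeable degradation"
    return "critical"
```

For example, `mos_severity(2.4)` returns `"noticeable degradation"`.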

Next, identify the root cause category:

  • Stalling issues: Check p1203_stalling_quality (O.23). Values below 4 indicate buffering is the primary concern.
  • Quality issues: Check p1203_overall_audiovisual_quality (O.35). Values below 4 point to encoding or bitrate selection problems.
  • Network issues: Examine video_response_time and content_server_hostname, then correlate with network tests (ping, traceroute, etc.).
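
The first two checks can be automated per measurement. The sketch below assumes each measurement is available as a dictionary keyed by the metric names used in this guide. The function name and the choice to skip missing values are illustrative assumptions, not part of Surfmeter.

```python
from typing import Mapping, Optional

THRESHOLD = 4.0  # "values below 4" from the checklist above


def root_cause_categories(m: Mapping[str, Optional[float]]) -> list[str]:
    """Return the root cause categories that a measurement points to."""
    categories = []
    stalling = m.get("p1203_stalling_quality")
    quality = m.get("p1203_overall_audiovisual_quality")
    # Missing metrics are skipped rather than treated as a problem (assumption).
    if stalling is not None and stalling < THRESHOLD:
        categories.append("stalling")
    if quality is not None and quality < THRESHOLD:
        categories.append("quality")
    return categories
```

A measurement can fall into both categories. An empty result means neither score is below the threshold, so the network checks are the next place to look.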

For ISPs and Network Operators

As an ISP, you have limited control over player behavior, but you can optimize your network to improve QoE. Focus on identifying congestion points, optimizing CDN routing, and ensuring adequate bandwidth.

Stalling Issues (Low O.23)

Metric | What to Look For
initial_loading_delay | Values above 3–5 seconds suggest first-mile connectivity issues. Check your service baseline, as acceptable values vary.
total_stalling_time, number_of_stalling_events | Any stalling indicates problems – users should not experience mid-stream interruptions.
average_buffer_length, min_buffer_length | Low values suggest insufficient bandwidth. See the bufferTrace for buffer levels over time.
video_response_time | High values indicate server or routing problems.
content_server_ip_address, content_server_as | Use these to identify problematic CDN nodes.

Recommendations:

For slow startup:

  • Check DNS resolution times in your network
  • Investigate first-hop latency and congestion
  • Consider CDN cache warming for popular content

For frequent stalling with low buffer levels:

  • Analyze bandwidth utilization during peak hours
  • Check for congestion at peering points
  • Review QoS policies for video traffic

For slow server response:

  • Investigate routing to specific ASNs (use content_server_as)
  • Review peering agreements with content providers
  • Discuss local CDN deployment with providers
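
To find the ASNs worth investigating, you can rank them by typical server response time. This is a rough sketch that assumes measurements are dictionaries carrying content_server_as and video_response_time. The function name is hypothetical, and the median is used here only because it is less sensitive to single outliers than the mean.

```python
from collections import defaultdict
from statistics import median


def median_response_time_by_as(measurements):
    """Rank ASNs by median video_response_time, slowest first."""
    by_as = defaultdict(list)
    for m in measurements:
        by_as[m["content_server_as"]].append(m["video_response_time"])
    ranked = [(asn, median(values)) for asn, values in by_as.items()]
    # Slowest ASNs first, so the likely routing problems appear at the top.
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

The result is in whatever unit video_response_time is reported in. Compare ASNs against each other rather than against a fixed limit.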

Quality Issues (Low O.35)

Low video quality typically results from insufficient bandwidth or conservative ABR behavior on the OTT side.

Metric | What It Indicates
average_video_bitrate vs. available bandwidth | Bitrate significantly below available bandwidth suggests player misconfiguration – an OTT-side issue.
largest_played_video_size vs. initial_resolution | A large gap indicates the player is ramping up slowly, possibly due to past network issues.
quality_switch_down_count vs. quality_switch_up_count | More than 1–2 switches per session cause visible quality fluctuations.

Recommendations:

For low bitrate despite adequate bandwidth:

  • The player may be reacting to past packet loss or jitter – investigate those patterns
  • Check whether traffic shaping policies affect video streams

For excessive quality switching:

  • Network instability causes ABR oscillation
  • Check for Wi-Fi interference or cellular handover issues
  • Review bufferTrace to see if buffer levels fluctuate frequently

Advanced Analysis

Useful correlations to explore:

  • Cross-reference content_server_hostname with ping times to identify problematic CDN nodes
  • Analyze performance by time of day to find capacity constraints
  • Compare metrics across customer segments (residential vs. business, fiber vs. DSL)
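
As one way to run the time-of-day analysis, the sketch below averages MOS per hour of the day. The field names started_at (an ISO 8601 timestamp) and mos are hypothetical placeholders. Substitute whatever your export actually provides.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_mos_by_hour(measurements):
    """Average MOS per hour of day (0-23), ordered by hour."""
    by_hour = defaultdict(list)
    for m in measurements:
        # "started_at" and "mos" are assumed field names, not Surfmeter's.
        hour = datetime.fromisoformat(m["started_at"]).hour
        by_hour[hour].append(m["mos"])
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}
```

A dip that recurs in the same evening hours across several days is a typical sign of a capacity constraint rather than a one-off incident.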

For OTT and Content Providers

As an OTT provider, you have limited control over the network, but you control the player and encoding – significant leverage for improving QoE.

Stalling Issues (Low O.23)

Metric | What It Indicates
average_buffer_length + bufferTrace | Buffer should fill initially and remain high. Frequent dips indicate bandwidth or buffer management problems.
initial_loading_delay vs. initial_resolution | Low initial quality should yield fast startup. High initial quality means slower startup – aim for under 3 seconds on average.
p1203_max_mos_ratio | Shows potential vs. actual quality. Values well below 1.0 indicate suboptimal ABR decisions.
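
To quantify how often sessions fall "well below 1.0", you can compute the share of sessions under a cutoff. This is only a sketch. The function name and the 0.8 default are arbitrary assumptions, so choose a cutoff that matches your own baseline.

```python
def share_below_ratio(ratios, threshold=0.8):
    """Fraction of sessions whose p1203_max_mos_ratio is below threshold.

    The default threshold is an illustrative assumption, not a Surfmeter value.
    """
    ratios = list(ratios)
    if not ratios:
        return 0.0
    return sum(r < threshold for r in ratios) / len(ratios)
```

Tracking this share per player version or ABR configuration makes it easier to see whether a tuning change actually moved sessions closer to their potential quality.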

Recommendations:

For slow startup:

  • Start with a lower quality tier for faster time-to-first-frame
  • Pre-load content before rendering the player

For buffer management issues:

  • Analyze bufferTrace patterns to optimize target buffer levels
  • Tune ABR buffer-based decision thresholds
  • Implement more aggressive prefetching for likely content

Quality Selection Issues

Metric | What It Indicates
number_of_quality_switches | More than 1–2 per session is visible to users. The player should settle into a stable quality.
initial_resolution vs. largest_played_video_size | A large gap suggests tight bandwidth or overly conservative startup behavior.
average_video_bitrate vs. p1203_max_theoretical_mos | Bitrate well below theoretical max indicates bandwidth constraints or cautious ABR.

Recommendations:

For conservative quality selection:

  • Consider loosening ABR thresholds
  • Improve bandwidth estimation
  • Ensure your quality ladder covers the full bandwidth range
  • Implement quality ramping for stable connections

For excessive quality switching:

  • Reduce sensitivity to short-term bandwidth fluctuations
  • Add hysteresis to prevent oscillation (require larger changes before switching)
  • Consider user preferences: short clips may favor faster startup; movies may favor higher initial quality
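
The hysteresis idea can be illustrated with a simple rung selector. This is a generic sketch, not the logic of any particular player. The function name, the margins, and the ladder representation (bitrates in bits per second, sorted ascending) are all assumptions chosen for illustration.

```python
def next_rung(current: int, estimate_bps: float, ladder_bps: list[float],
              up_margin: float = 1.3, down_margin: float = 0.9) -> int:
    """Pick the next quality rung with hysteresis.

    Switch up only when a higher rung fits with headroom (up_margin), and
    switch down only when the estimate drops clearly below the current
    rung (down_margin). Otherwise keep the current rung.
    """
    # Try the highest rung first and require headroom before switching up.
    for i in range(len(ladder_bps) - 1, current, -1):
        if ladder_bps[i] * up_margin <= estimate_bps:
            return i
    # Tolerate small dips: only switch down below the down_margin band.
    if estimate_bps < ladder_bps[current] * down_margin:
        for i in range(current - 1, -1, -1):
            if ladder_bps[i] <= estimate_bps:
                return i
        return 0  # nothing fits, so fall back to the lowest rung
    return current
```

With a ladder of 1, 2.5, and 5 Mbit/s and the player on the middle rung, an estimate of 5.5 Mbit/s does not trigger a switch up because the top rung needs 6.5 Mbit/s with the default margin, and an estimate of 2.3 Mbit/s does not trigger a switch down. Widening the gap between the two margins reduces switching at the cost of slower adaptation.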

Encoding Considerations

Metric | What It Indicates
p1203_average_video_quality (O.22) | Break down by codec and resolution to identify encoding issues.
dropped_frames | High values indicate decoding complexity issues on test hardware. Generally not a concern for Surfmeter, which infers quality from codec/bitrate/fps/resolution.

Note on P.1203 Mode 0: This model does not account for content complexity. At the same codec, bitrate, frame rate, and resolution, a low-motion scene and a high-action scene receive the same score, even though their perceived quality may differ. For proper encoding evaluation, use a full-reference metric or a bitstream model like P.1204.3.

For low quality scores despite high bitrates:

  • Evaluate codec choice: H.264 vs. H.265 vs. AV1 vs. VP9

CDN Performance

CDN issues affect both ISPs and OTTs, but require different approaches.

ISPs should focus on routing efficiency to CDN nodes.

OTTs should focus on CDN selection and failover strategies.

Key metrics:

  • content_server_hostname and content_server_as distribution – identify problematic nodes
  • video_response_time by CDN node
  • Geographic performance patterns

Joint opportunities:

Correlating player behavior with network characteristics can reveal optimal quality ladders for specific conditions, or inform network-aware strategies for challenging environments (satellite, mobile, rural).


Capturing Network Data

When standard metrics don't tell the full story, enable network request logging to capture detailed timing, server IPs, and response data for every request the browser makes.

In Surfmeter Automator, use the --logNetworkRequests option:

./surfmeter-automator-headless startStudy --studyId YOUR_STUDY --logNetworkRequests

This enables additional metrics like video_response_time and content_server_hostname. For persistent issues, you can also send this data to the server with --sendNetworkRequests for analysis via the Export API.

See Logging Network Requests for setup details, or the Command Reference for all available options.


Analysis Best Practices

Use percentiles, not just averages. P95 and P99 values are more meaningful for SLA definitions. Outliers happen; consistent performance matters.
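
As a minimal sketch of this practice, the helper below reports the mean alongside P95 and P99 using Python's standard library. The function name is hypothetical. It suits metrics where higher is worse, such as initial_loading_delay or total_stalling_time; for MOS, where lower is worse, you would look at low percentiles instead.

```python
from statistics import mean, quantiles


def tail_summary(values):
    """Return mean, P95, and P99 for a list of at least two values."""
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = quantiles(values, n=100, method="inclusive")
    return {"mean": mean(values), "p95": cuts[94], "p99": cuts[98]}
```

A mean that looks healthy next to a much higher P95 or P99 indicates that a minority of sessions has a far worse experience than the average suggests.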

Segment your data. Break down by network type, region, and access technology. Problems often hide in specific cohorts.

Track trends over time. A single measurement is a data point. Trends reveal systematic issues.

Control for content. When benchmarking over time, measure the same VoD content repeatedly. Otherwise, MOS fluctuations may reflect content complexity rather than network changes.

Use p1203_max_mos_ratio as a benchmark. It shows how close you are to optimal quality. Compare across services and conditions to identify improvement opportunities.