Anomalies¶

The Anomalies report tells you automatically when something looks wrong. Surfmeter watches your measurements around the clock, learns what each service normally looks like, and raises a flag when quality drops — so you can spot a struggling service or a misbehaving probe without digging through charts yourself.

You can reach the Anomalies report from Reports > Anomalies in the sidebar.

Note

The Anomalies entry appears in the sidebar only when anomaly detection is enabled for your organization. Ask us if you're interested in trying it out!

How anomaly detection works¶

A background process runs every 30 minutes. For each monitored service it builds a statistical picture of what is normal — a baseline — from a rolling window of recent measurements (by default the last 7 days), then checks whether current values have drifted away from that normal.

Each probe builds its own baseline, separately for every service and target host. A probe on a slow connection has a slow baseline, so it is only flagged when it gets worse than its own usual behaviour — not just for being slower than a probe on a fast line.
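
As a rough illustration of the per-probe baseline idea, the sketch below groups a rolling 7-day window of values by probe, service, and target host, and summarizes each group with a median and a median absolute deviation (MAD). The record shape, the function name `build_baselines`, and the choice of median plus MAD as the summary are assumptions made for this example; they are not a description of Surfmeter's internal implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median

WINDOW = timedelta(days=7)  # default rolling window mentioned in the text


def build_baselines(measurements, now):
    """Summarize recent values per (probe, service, host).

    `measurements` is an iterable of (timestamp, probe, service, host, value)
    tuples. This record shape is hypothetical.
    """
    groups = defaultdict(list)
    for ts, probe, service, host, value in measurements:
        if now - WINDOW <= ts <= now:  # keep only the rolling window
            groups[(probe, service, host)].append(value)

    baselines = {}
    for key, values in groups.items():
        centre = median(values)
        # MAD: median of the absolute deviations from the median.
        spread = median(abs(v - centre) for v in values)
        baselines[key] = (centre, spread)
    return baselines


now = datetime(2024, 1, 8)
measurements = [
    (now - timedelta(days=1), "probe-a", "youtube", "host1", 4.0),
    (now - timedelta(days=2), "probe-a", "youtube", "host1", 4.2),
    (now - timedelta(days=3), "probe-a", "youtube", "host1", 4.4),
    (now - timedelta(days=10), "probe-a", "youtube", "host1", 1.0),  # too old
    (now - timedelta(days=1), "probe-b", "youtube", "host1", 3.0),
]
baselines = build_baselines(measurements, now)
```

Each key gets its own centre and spread, so a probe on a slow line is later compared only against its own history.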

Detection uses two complementary passes:

Spike looks at a short window (roughly one hour, adapted to how often the probe measures) to catch sudden problems — something just got worse.
Shift looks at a longer window (about 6 hours) to catch slow, lasting drift.
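
The difference between the two passes can be sketched as the same check run over two window lengths. This is a simplified illustration: the multiplier `K`, the use of a median over the window, and the fixed one-hour spike window are assumptions (the text says the real spike window adapts to how often the probe measures), and the example assumes a metric where higher values are worse, such as latency.

```python
from datetime import datetime, timedelta
from statistics import median

SPIKE_WINDOW = timedelta(hours=1)  # "roughly one hour" in the text
SHIFT_WINDOW = timedelta(hours=6)  # "about 6 hours" in the text
K = 3.0                            # hypothetical spread multiplier


def run_pass(samples, now, window, baseline, spread):
    """Flag if the median inside `window` exceeds baseline + K * spread."""
    recent = [v for ts, v in samples if now - window <= ts <= now]
    if not recent:
        return False
    return median(recent) > baseline + K * spread


now = datetime(2024, 1, 1, 12, 0)
# One sample every 30 minutes for 6 hours; only the last two are elevated.
samples = [
    (now - timedelta(minutes=30 * i), 40.0 if i < 2 else 21.0)
    for i in range(13)
]
spike = run_pass(samples, now, SPIKE_WINDOW, baseline=20.0, spread=2.0)
shift = run_pass(samples, now, SHIFT_WINDOW, baseline=20.0, spread=2.0)
```

With this data the spike pass flags the sudden jump, while the shift pass does not, because most of the 6-hour window still looks normal.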

A separate data quality pass catches values that are physically impossible for the metric — for example a MOS score above 5 or a negative latency. These point to a measurement bug rather than a real service problem.
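
A range check of this kind needs no baseline at all. The sketch below is a minimal illustration: the bounds table and the metric name `latency_ms` are hypothetical, and the lower MOS bound of 1 is the conventional bottom of the MOS scale rather than something stated on this page.

```python
# Physical bounds per metric as (low, high); None means unbounded on that side.
PHYSICAL_BOUNDS = {
    "p1203_overall_mos": (1.0, 5.0),  # MOS cannot exceed 5
    "latency_ms": (0.0, None),        # hypothetical metric name; no negative latency
}


def range_check(metric, value):
    """Return 'above' or 'below' for an impossible value, else None."""
    low, high = PHYSICAL_BOUNDS[metric]
    if low is not None and value < low:
        return "below"
    if high is not None and value > high:
        return "above"
    return None
```

For example, `range_check("p1203_overall_mos", 5.3)` returns `"above"` and `range_check("latency_ms", -4.0)` returns `"below"`, which corresponds to the deviation direction shown for data quality events.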

Probe data and Player SDK data¶

Two kinds of data feed the detector. Most of this page describes probe data: measurements from our own probes (the dedicated devices running the Surfmeter agent). The system also watches Player SDK data: quality reported by the SDK built into real end-users' video players, pooled across everyone watching a given domain. Because there is no single probe behind it, Player SDK data is checked as a whole and its anomalies are reported as fleet-wide. The per-probe, cross-probe, and diagnosis ideas below apply to probe data only.

Comparing a probe to itself, and to its peers¶

The passes above compare each probe against its own history. We call that per-probe detection, and it is good at catching a problem that is local to one probe.

But "this probe got slower" raises a question: is the probe itself having trouble, or is the service slow for everyone? To answer that, the system also runs a cross-probe check. It compares how far each probe has drifted from its own normal against how far the other probes measuring the same target have drifted. A probe that stands out from its peers most likely has a localized problem of its own.

The useful part is what cross-probe detection ignores. If a service genuinely degrades, every probe measuring it drifts by a similar amount — so no single probe stands out and no cross-probe anomaly is raised. This stops a real service outage from being mistaken for dozens of separate probe problems. Cross-probe detection needs at least three probes measuring the same target, so it has something to compare against.
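
One way to picture the cross-probe check is as an outlier test on the drifts themselves. In the sketch below, each probe's drift is assumed to be already expressed relative to its own baseline; the cutoff `PEER_K`, the minimum scale, and the median/MAD comparison are assumptions chosen for the example, not the documented algorithm. Only the three-probe minimum comes from the text.

```python
from statistics import median

MIN_PROBES = 3   # from the text: at least three probes on the same target
PEER_K = 3.0     # hypothetical cutoff
MIN_SCALE = 0.5  # hypothetical floor so tightly grouped peers do not flag noise


def cross_probe_outliers(drifts):
    """Return probes whose drift stands out from their peers.

    `drifts` maps probe name -> drift from that probe's own baseline.
    """
    if len(drifts) < MIN_PROBES:
        return []
    centre = median(drifts.values())
    mad = median(abs(d - centre) for d in drifts.values())
    scale = max(mad, MIN_SCALE)
    return sorted(p for p, d in drifts.items() if abs(d - centre) > PEER_K * scale)


# One probe drifts while its peers stay near normal: it is singled out.
local_problem = cross_probe_outliers(
    {"probe-a": 0.2, "probe-b": -0.1, "probe-c": 0.3, "probe-d": 6.0}
)
# Every probe drifts by a similar amount: nothing stands out.
service_outage = cross_probe_outliers(
    {"probe-a": 5.8, "probe-b": 6.1, "probe-c": 6.0, "probe-d": 5.9}
)
```

Here `local_problem` is `["probe-d"]` and `service_outage` is an empty list, which mirrors how a genuine service degradation produces no cross-probe anomaly.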

What the pattern tells you (Diagnosis)¶

Putting the two views together, the system tags each probe-side anomaly with a diagnosis that suggests where the problem most likely sits:

Probe-local – one probe is affected on one target. Suggests an issue specific to that probe reaching that target, such as its routing or DNS for that one service.
Probe infrastructure – one probe is affected across several targets. Suggests a problem with that probe's own connection or hardware rather than any one service.
Service-wide – several probes are affected on the same target. Suggests a problem with that service, or the network that delivers it.
Network-wide – several probes are affected across several targets at once. Suggests a broad upstream or internet-provider problem.

The diagnosis is the quickest way to tell "one of our probes is having a bad day" apart from "the service itself is degraded".
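
The four diagnoses can be read as a two-by-two grid: one or several probes, one or several targets. The sketch below is one possible reading of that grid, assuming "several" means more than one and that the input is simply the set of (probe, target) pairs with an open anomaly; the function name and input shape are hypothetical.

```python
from collections import defaultdict


def diagnose(affected):
    """Label each (probe, target) pair with a diagnosis.

    `affected` is a set of (probe, target) pairs that currently show an anomaly.
    """
    targets_by_probe = defaultdict(set)
    probes_by_target = defaultdict(set)
    for probe, target in affected:
        targets_by_probe[probe].add(target)
        probes_by_target[target].add(probe)

    result = {}
    for probe, target in affected:
        many_targets = len(targets_by_probe[probe]) > 1
        many_probes = len(probes_by_target[target]) > 1
        if many_probes and many_targets:
            label = "Network-wide"
        elif many_probes:
            label = "Service-wide"
        elif many_targets:
            label = "Probe infrastructure"
        else:
            label = "Probe-local"
        result[(probe, target)] = label
    return result
```

For instance, `diagnose({("p1", "netflix"), ("p2", "netflix")})` labels both pairs as Service-wide, while `diagnose({("p1", "netflix"), ("p1", "youtube")})` labels both as Probe infrastructure.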

Episodes¶

When the system detects an anomaly, it opens an episode: a single record that tracks the issue from start to resolution. If the same anomaly keeps appearing in later 30-minute checks, the existing episode is extended rather than piling up duplicate events. Episodes close automatically once the anomaly stops appearing.
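
The open-or-extend behaviour can be sketched with a small record keyed by whatever identifies "the same anomaly". The `Episode` fields other than `first_seen_at`, and the key layout, are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Episode:
    key: tuple
    first_seen_at: datetime
    last_seen_at: datetime
    occurrence_count: int = 1
    status: str = "new"


def record_detection(open_episodes, key, now):
    """Extend the open episode for `key`, or open a new one."""
    episode = open_episodes.get(key)
    if episode is None:
        episode = Episode(key, first_seen_at=now, last_seen_at=now)
        open_episodes[key] = episode
    else:
        episode.last_seen_at = now
        episode.occurrence_count += 1
    return episode


open_episodes = {}
key = ("probe-a", "youtube", "p1203_overall_mos", "spike")  # hypothetical key
start = datetime(2024, 1, 1, 12, 0)
for check in range(3):  # three consecutive 30-minute checks
    record_detection(open_episodes, key, start + timedelta(minutes=30 * check))
```

After the three checks there is still a single episode, with an occurrence count of 3 and a last-seen time one hour after it started.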

Anomaly list¶

The list page shows all detected anomaly episodes within the selected time range.

Time range and interval¶

Use the time range picker and interval selector at the top to adjust the reporting period. The list shows all episodes that overlap the selected window, including long-running episodes that started before the window but are still active within it.
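
The overlap rule amounts to a standard interval-intersection test. A minimal sketch, assuming each episode has a start time and a last-seen time (the parameter names are hypothetical):

```python
from datetime import datetime


def overlaps(first_seen_at, last_seen_at, window_start, window_end):
    """True if the episode's time span intersects the selected window."""
    return first_seen_at <= window_end and last_seen_at >= window_start


window_start = datetime(2024, 1, 1, 12, 0)
window_end = datetime(2024, 1, 1, 18, 0)

# Started before the window, still active inside it: included.
long_running = overlaps(
    datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 12, 30), window_start, window_end
)
# Ended before the window began: excluded.
already_over = overlaps(
    datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 11, 0), window_start, window_end
)
```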

Filters¶

Filter controls let you narrow down which anomaly events are displayed. They are grouped to put the things you are most likely to act on first, with the detection mechanics further down.

Overview:

Status – Active (new) or closed (resolved) episodes, or both. Defaults to new so current issues show first.
Severity – Warning or critical. Critical events are those that escalated by persisting, or that started with an extreme deviation.
Diagnosis – Where the problem most likely sits: Probe-local, Probe infrastructure, Service-wide, or Network-wide (see What the pattern tells you).
Source – Whether the anomaly comes from our own probes or from Player SDK data (real end-users).

Detection (how it was found):

Detection Type – Per-probe, Cross-probe, Fleet-wide, or Data quality.
Detection Pass – Spike, shift, or data quality.
Detection Method – The statistical method behind the flag (MAD, IQR, percentile, proportion, range check).
Confidence – High, medium, or low, reflecting how much the signal can be trusted given the number of measurements and how many probes were available to compare against.
Deviation Direction – Whether the value went above or below normal (or above/below the physical bounds for data quality events).

Subject (what was measured):

Measurement Type – Video, Web, Network, Speedtest, or Conferencing.
Subject – The service (e.g. "netflix", "youtube"). For network measurements this is the technology.
Statistic Name – The specific metric (e.g. p1203_overall_mos, download).
Domain – Target domain.
Hostname – Target hostname.

Client and location:

Client Label – Client device name.
ISP – Internet provider name.
Country / City – Where the probe is located.

You can hover over any filter badge to see a plain-language explanation of what the value means and how it relates to detection.

Note

Client and location filters (ISP, country, city) only match per-probe anomalies. Fleet-wide (Player SDK) anomalies are aggregated across a whole domain and do not carry these fields.

Timeline chart¶

A stacked bar chart shows the number of new anomaly episodes over time, broken down by a grouping dimension you can choose from a dropdown. For example, group by subject to see which services generated the most anomalies, by severity to see the ratio of warnings to critical events, or by diagnosis to see at a glance whether issues are mostly probe-side or service-side.

The chart buckets episodes by their start time (first_seen_at), so a long-running episode is counted once, in the interval where it started, rather than in every interval it spans.
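
The bucketing can be sketched as flooring each episode's `first_seen_at` to the chosen interval and counting per group. The dictionary shape of an episode and the function name are assumptions for this example.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_episodes(episodes, interval, group_by):
    """Count episodes per (interval start, group value), using first_seen_at."""
    counts = Counter()
    for ep in episodes:
        # Floor the start time to a multiple of `interval` since the epoch.
        steps = (ep["first_seen_at"] - EPOCH) // interval
        bucket = EPOCH + steps * interval
        counts[(bucket, ep[group_by])] += 1
    return counts


episodes = [
    {"first_seen_at": datetime(2024, 1, 1, 12, 10), "severity": "warning"},
    {"first_seen_at": datetime(2024, 1, 1, 12, 50), "severity": "warning"},
    {"first_seen_at": datetime(2024, 1, 1, 13, 5), "severity": "critical"},
]
counts = bucket_episodes(episodes, timedelta(hours=1), group_by="severity")
```

With a one-hour interval, the 12:00 bar holds two warnings and the 13:00 bar holds one critical episode, regardless of how long each episode went on to last.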

Events table¶

Below the chart, a paginated table lists individual anomaly episodes. By default, episodes are sorted by last activity (most recently active first). Each row shows:

Status and severity badges
The diagnosis (Probe-local, Service-wide, and so on; data-quality and fleet-wide events show their detection type instead)
When the episode started and was last seen (or resolved)
How many times the anomaly re-fired (occurrence count)
The measurement type, service subject or domain, and client
The affected metric, with its measured value and baseline value
A brief explanation of the anomaly

Click any row to open the detail page for that episode.

Resolving anomalies¶

You can manually mark anomaly episodes as resolved directly from the list. Select one or more episodes using the row checkboxes, then click the Resolve selected button in the toolbar above the table. A confirmation dialog asks you to confirm the action. Once resolved, the episodes move out of the default "new" filter view.

This is useful for known false positives, expected maintenance windows, or other cases where the episode has not yet been closed automatically.

Note

Only users with the admin, editor, or organization admin role can resolve anomalies.

Anomaly detail page¶

Clicking an anomaly row in the list opens its detail page, which provides the full context for the episode.

Info card¶

The top section shows:

Severity and status badges
A plain-language explanation of what was detected
The measurement type, subject, and hostname/domain
Timeline: when the episode started, when it was last seen or resolved, and how many times it re-fired
Detection metadata: detection type, pass, method, source, confidence, the diagnosis (for probe anomalies), and sample count. Hover over each badge for a brief explanation of what the value means.

Origin card¶

Shows the client that triggered the anomaly (linked to the client's detail page), along with ISP and location information when available. Fleet-wide anomalies display a "Fleet-wide" badge instead of a client link.

Metric vs. baseline card¶

A visual comparison of three key values:

Metric value – The measured value that triggered the anomaly, highlighted in red with a directional arrow.
Baseline value – The expected value learned from recent history. Hover for an explanation of what the baseline represents for this detection type.
Threshold – The cutoff that was crossed. For per-probe and fleet-wide anomalies, this is the baseline plus a multiple of the recent spread; for data quality anomalies, it is an absolute physical bound. Cross-probe anomalies are judged against the other probes rather than a fixed cutoff, so they show no threshold.

KPI context chart¶

A time-series chart of the affected metric, centered on the episode window. The chart shows the same metric for the same client and service, spanning from before the episode started to after it was last seen. This lets you see the degradation in context: the normal behavior before the episode, the anomaly itself, and the recovery (if any).

Affected measurements table¶

Below the chart, a table lists the individual measurements that fell within the episode window and scope. This mirrors what you would see in the Measurements Explorer if you filtered to the same client, service, and time range.

Actions¶

From the detail page you can:

View in Explorer – Jump to the Measurements Explorer pre-filtered to the same client, service, and time window.
Resolve – Mark this single episode as resolved (same confirmation flow as the list page).
Copy JSON – Copy the raw anomaly event data to your clipboard for further analysis.

Severity levels¶

Anomalies are classified as either warning or critical:

Warning is the initial severity for any anomaly that crosses the detection threshold.
Critical means the anomaly has either shown an extreme initial deviation or has persisted across several checks (roughly 2.5 hours for spike events, 3 hours for shift events), indicating a sustained issue rather than a passing blip. Once an episode reaches critical severity, it stays critical even if the deviation eases, preventing flapping between severity levels.
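
The escalation rule, including the fact that critical is sticky, can be sketched as a small function. The persistence limits come from the text above; the `EXTREME_FACTOR` cutoff and the idea of expressing the deviation as a ratio to the threshold are assumptions for the example.

```python
from datetime import timedelta

# Persistence limits from the text (approximate).
PERSIST_LIMIT = {
    "spike": timedelta(hours=2.5),
    "shift": timedelta(hours=3),
}
EXTREME_FACTOR = 2.0  # hypothetical: how far past the threshold counts as extreme


def next_severity(current, detection_pass, age, deviation_ratio):
    """Return the severity after a check; critical never downgrades."""
    if current == "critical":
        return "critical"
    if deviation_ratio >= EXTREME_FACTOR:
        return "critical"
    if age >= PERSIST_LIMIT[detection_pass]:
        return "critical"
    return "warning"
```

For example, a spike episode that has lasted one hour with a mild deviation stays at warning, the same episode at 2.5 hours becomes critical, and an episode that is already critical stays critical even when the deviation eases.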

Episode lifecycle¶

Each anomaly is tracked as an episode with the following lifecycle:

New – The episode is active. The anomaly has been detected and is still appearing in later checks.
Resolved – The episode is closed. This happens automatically once the anomaly has not appeared for a configurable timeout (default: 60 minutes, i.e. two missed checks). You can also resolve episodes manually from the list or detail page.
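
The automatic transition from New to Resolved can be sketched as a periodic sweep over open episodes. The dictionary fields and the function name are hypothetical; only the 60-minute default comes from the text.

```python
from datetime import datetime, timedelta

RESOLVE_TIMEOUT = timedelta(minutes=60)  # default from the text: two missed checks


def close_stale(episodes, now, timeout=RESOLVE_TIMEOUT):
    """Mark episodes as resolved once they have gone unseen for `timeout`."""
    for ep in episodes:
        if ep["status"] == "new" and now - ep["last_seen_at"] >= timeout:
            ep["status"] = "resolved"
            ep["resolved_at"] = now
    return episodes


now = datetime(2024, 1, 1, 13, 0)
episodes = [
    {"status": "new", "last_seen_at": datetime(2024, 1, 1, 12, 0)},   # two checks missed
    {"status": "new", "last_seen_at": datetime(2024, 1, 1, 12, 30)},  # one check missed
]
close_stale(episodes, now)
```

After the sweep, the first episode is resolved and the second is still new, since it has missed only one 30-minute check.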