Advanced Examples

This page shows practical examples of using the Export API with Python one-liners and uv for quick data extraction.

Prerequisites

Set the following environment variables before running any of the examples:

export EXPORT_API_KEY="your-api-key"
export EXPORT_API_URL="https://surfmeter-server.<customer>-analytics.aveq.info/export_api/v1"

Replace <customer> with your customer name.
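
Before running larger exports, you can verify the key and URL with a quick smoke test. This is a minimal sketch that assumes the /search endpoint accepts a standard Elasticsearch match_all query (the filtered examples below suggest it speaks the regular query DSL); it prints the first returned document, if any:

uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'match_all': {}}}},
    stream=True
)
r.raise_for_status()
print(next((line.decode() for line in r.iter_lines() if line), 'no documents returned'))
"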

Dump All Measurements for Specific Clients

Dump all measurement data for clients whose client.label starts with a specific prefix (e.g., "some-label"):

uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'prefix': {'client.label.keyword': 'some-label'}}}},
    stream=True
)
for line in r.iter_lines():
    if line:
        print(line.decode())
"

The API streams results as NDJSON (newline-delimited JSON), one measurement document per line. To save the stream to a file:

uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'prefix': {'client.label.keyword': 'some-label'}}}},
    stream=True
)
for line in r.iter_lines():
    if line:
        print(line.decode())
" > measurements.ndjson

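Each line of measurements.ndjson is a single JSON document, so wc -l measurements.ndjson tells you how many measurements were exported.
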
Filter by Time Range (Last Month)

To get only data from the last month, add a range filter using a bool query:

uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}}
    ]}}}},
    stream=True
)
for line in r.iter_lines():
    if line:
        print(line.decode())
" > measurements.ndjson

Time Range Syntax

Elasticsearch date math supports flexible relative time ranges:

  • now-1M/d = 1 month ago, rounded to the day
  • now-7d = 7 days ago
  • now-24h = 24 hours ago
  • now/d = now, rounded to the day

With /d rounding, gte rounds down to the start of the day and lte rounds up to its end, so gte: now-1M/d together with lte: now/d covers whole days including today.
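
If you need a fixed window instead of relative date math, a range clause also accepts explicit timestamps; assuming created_at uses Elasticsearch's default date format, ISO 8601 values work. The dates below are placeholders, and the clause is a drop-in replacement for the relative range in the examples above:

{'range': {'created_at': {'gte': '2024-01-01T00:00:00Z', 'lte': '2024-01-31T23:59:59Z'}}}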

Convert to CSV

To convert the NDJSON output to CSV, use pandas to flatten the nested JSON documents into columns:

uv run --with requests,pandas python -c "
import requests, json, os, pandas as pd

r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}}
    ]}}}},
    stream=True
)

data = [json.loads(line) for line in r.iter_lines() if line]
df = pd.json_normalize(data)
df.to_csv('measurements.csv', index=False)
print(f'Saved {len(df)} rows to measurements.csv')
"

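pd.json_normalize flattens nested objects into dot-separated column names (e.g., statistic_values.initial_loading_delay). To see which columns your export actually contains, you can list them from the NDJSON file saved earlier; a small sketch reusing measurements.ndjson:

uv run --with pandas python -c "
import json, pandas as pd
# Load the NDJSON dump from earlier and list its flattened column names
with open('measurements.ndjson') as f:
    df = pd.json_normalize([json.loads(line) for line in f if line.strip()])
print('\n'.join(sorted(df.columns)))
"
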
Extract Specific Columns

To extract only specific columns (e.g., for video quality analysis), filter to a single measurement type and select the relevant fields:

uv run --with requests,pandas python -c "
import requests, json, os, pandas as pd

r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}},
        {'term': {'type.keyword': 'VideoMeasurement'}}
    ]}}}},
    stream=True
)

data = [json.loads(line) for line in r.iter_lines() if line]
df = pd.json_normalize(data)

# Select key video quality columns
cols = [
    'id', 'created_at', 'client.label', 'study.subject',
    'statistic_values.initial_loading_delay',
    'statistic_values.total_stalling_time',
    'statistic_values.stalling_ratio',
    'statistic_values.p1203_overall_mos',
    'statistic_values.average_video_bitrate'
]
# Only keep columns that exist
cols = [c for c in cols if c in df.columns]
df[cols].to_csv('video_quality.csv', index=False)
print(f'Saved {len(df)} rows to video_quality.csv')
"

Full Script with Error Handling

For production use, here's a more robust script:

#!/usr/bin/env python3
# /// script
# dependencies = ["requests", "pandas"]
# ///
"""
Dump Surfmeter measurements to CSV.

Usage:
    export EXPORT_API_KEY="your-key"
    export EXPORT_API_URL="https://surfmeter-server.<customer>-analytics.aveq.info/export_api/v1"
    uv run dump_measurements.py --label-prefix some-label --days 30 --output measurements.csv
"""

import argparse
import json
import os
import sys

import pandas as pd
import requests


def main():
    parser = argparse.ArgumentParser(description="Dump Surfmeter measurements to CSV")
    parser.add_argument("--label-prefix", required=True, help="Client label prefix to filter")
    parser.add_argument("--days", type=int, default=30, help="Number of days to look back")
    parser.add_argument("--type", help="Measurement type (e.g., VideoMeasurement)")
    parser.add_argument("--output", default="measurements.csv", help="Output CSV file")
    args = parser.parse_args()

    api_key = os.environ.get("EXPORT_API_KEY")
    api_url = os.environ.get("EXPORT_API_URL")

    if not api_key or not api_url:
        print("Error: Set EXPORT_API_KEY and EXPORT_API_URL environment variables", file=sys.stderr)
        sys.exit(1)

    # Build query
    must_clauses = [
        {"prefix": {"client.label.keyword": args.label_prefix}},
        {"range": {"created_at": {"gte": f"now-{args.days}d/d", "lte": "now/d"}}},
    ]
    if args.type:
        must_clauses.append({"term": {"type.keyword": args.type}})

    query = {"body": {"query": {"bool": {"must": must_clauses}}}}

    # Fetch data with scrolling
    print(f"Fetching measurements from last {args.days} days...", file=sys.stderr)
    r = requests.post(
        f"{api_url}/search?scroll=1m",
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        json=query,
        stream=True,
        timeout=60,  # connect/read timeout in seconds
    )
    r.raise_for_status()

    data = [json.loads(line) for line in r.iter_lines() if line]

    if not data:
        print("No measurements found", file=sys.stderr)
        sys.exit(0)

    df = pd.json_normalize(data)
    df.to_csv(args.output, index=False)
    print(f"Saved {len(df)} rows to {args.output}", file=sys.stderr)


if __name__ == "__main__":
    main()

Save this as dump_measurements.py and run with:

uv run dump_measurements.py --label-prefix some-label --days 30 --output measurements.csv
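
To export only a single measurement type, add the --type flag:

uv run dump_measurements.py --label-prefix some-label --days 7 --type VideoMeasurement --output video_quality.csv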

Tips

  • Streaming: The scroll=1m parameter enables streaming for large datasets, avoiding pagination limits
  • Exact matching: Use .keyword suffix for exact text matches (e.g., client.label.keyword)
  • Memory: For very large datasets, process the NDJSON stream line-by-line instead of loading all data into memory (see the sketch after this list)
  • Compression: Save large files as .csv.gz by using df.to_csv('file.csv.gz', index=False, compression='gzip')
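
For the memory tip, here is a sketch of the line-by-line approach: each NDJSON document is reduced to a fixed set of columns and appended to the CSV immediately, so memory use stays flat regardless of export size. The COLUMNS list and the pluck helper are illustrative; adjust them to your data:

#!/usr/bin/env python3
# /// script
# dependencies = ["requests"]
# ///
"""Stream measurements straight to CSV without loading everything into memory."""

import csv
import json
import os

import requests

# Dot-separated paths into the nested measurement documents
# (same fields as the video quality example above; adjust as needed).
COLUMNS = [
    "id",
    "created_at",
    "client.label",
    "statistic_values.p1203_overall_mos",
]


def pluck(doc, dotted_path):
    """Follow a dot-separated path into nested dicts; return None if absent."""
    for key in dotted_path.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


r = requests.post(
    os.environ["EXPORT_API_URL"] + "/search?scroll=1m",
    headers={"X-API-KEY": os.environ["EXPORT_API_KEY"], "Content-Type": "application/json"},
    json={"body": {"query": {"prefix": {"client.label.keyword": "some-label"}}}},
    stream=True,
)
r.raise_for_status()

with open("measurements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(COLUMNS)
    for line in r.iter_lines():
        if line:
            doc = json.loads(line)
            writer.writerow(pluck(doc, col) for col in COLUMNS)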