# Advanced Examples
This page shows practical examples of using the Export API with Python one-liners and uv for quick data extraction.
## Prerequisites
Set the following environment variables before running any of the examples:
```bash
export EXPORT_API_KEY="your-api-key"
export EXPORT_API_URL="https://surfmeter-server.<customer>-analytics.aveq.info/export_api/v1"
```
Replace `<customer>` with your customer name.
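To verify that both variables are set and the endpoint is reachable, you can run a quick smoke test. This is a minimal sketch: it assumes the API accepts an Elasticsearch `match_all` query in the same `body` wrapper used throughout this page, and it only checks the HTTP status without consuming the stream:

```bash
uv run --with requests python -c "
import os, requests
# Fail early if the environment variables are missing
assert os.environ.get('EXPORT_API_KEY') and os.environ.get('EXPORT_API_URL'), 'set both variables first'
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'match_all': {}}}},
    stream=True,
)
r.raise_for_status()  # raises on 4xx/5xx, e.g. a wrong API key
print('OK:', r.status_code)
"
```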
## Dump All Measurements for Specific Clients
Dump all measurement data for clients whose `client.label` starts with a specific prefix (e.g., `some-label`):
```bash
uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'prefix': {'client.label.keyword': 'some-label'}}}},
    stream=True,
)
for line in r.iter_lines():
    if line:
        print(line.decode())
"
```
To save to a file:
```bash
uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'prefix': {'client.label.keyword': 'some-label'}}}},
    stream=True,
)
for line in r.iter_lines():
    if line:
        print(line.decode())
" > measurements.ndjson
```
## Filter by Time Range (Last Month)
To get only data from the last month, add a `range` filter inside a `bool` query:
```bash
uv run --with requests python -c "
import requests, os
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}}
    ]}}}},
    stream=True,
)
for line in r.iter_lines():
    if line:
        print(line.decode())
" > measurements.ndjson
```
**Time Range Syntax**

Elasticsearch supports flexible date math for time ranges:

- `now-1M/d` = 1 month ago, rounded to the day
- `now-7d` = 7 days ago
- `now-24h` = 24 hours ago
- `now/d` = today, rounded to the start of the day
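Absolute timestamps also work in `range` filters. A minimal sketch, assuming `created_at` is mapped as a date field that accepts ISO 8601 strings (the dates here are illustrative):

```python
# Query body with a fixed date window instead of date math.
# The window boundaries are hypothetical; substitute your own.
query = {'body': {'query': {'bool': {'must': [
    {'prefix': {'client.label.keyword': 'some-label'}},
    {'range': {'created_at': {'gte': '2024-01-01T00:00:00Z', 'lte': '2024-01-31T23:59:59Z'}}},
]}}}}
```

Pass this dict as the `json=` argument to `requests.post`, exactly as in the examples above.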
## Convert to CSV
To convert the NDJSON output to CSV, use `pandas`. This example flattens all nested fields into one column per field:
```bash
uv run --with requests,pandas python -c "
import requests, json, os, pandas as pd
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}}
    ]}}}},
    stream=True,
)
data = [json.loads(line) for line in r.iter_lines() if line]
df = pd.json_normalize(data)
df.to_csv('measurements.csv', index=False)
print(f'Saved {len(df)} rows to measurements.csv')
"
```
## Extract Specific Columns
To extract only specific columns (e.g., for video quality analysis):
```bash
uv run --with requests,pandas python -c "
import requests, json, os, pandas as pd
r = requests.post(
    os.environ['EXPORT_API_URL'] + '/search?scroll=1m',
    headers={'X-API-KEY': os.environ['EXPORT_API_KEY'], 'Content-Type': 'application/json'},
    json={'body': {'query': {'bool': {'must': [
        {'prefix': {'client.label.keyword': 'some-label'}},
        {'range': {'created_at': {'gte': 'now-1M/d', 'lte': 'now/d'}}},
        {'term': {'type.keyword': 'VideoMeasurement'}}
    ]}}}},
    stream=True,
)
data = [json.loads(line) for line in r.iter_lines() if line]
df = pd.json_normalize(data)
# Select key video quality columns
cols = [
    'id', 'created_at', 'client.label', 'study.subject',
    'statistic_values.initial_loading_delay',
    'statistic_values.total_stalling_time',
    'statistic_values.stalling_ratio',
    'statistic_values.p1203_overall_mos',
    'statistic_values.average_video_bitrate'
]
# Only keep columns that exist in this dataset
cols = [c for c in cols if c in df.columns]
df[cols].to_csv('video_quality.csv', index=False)
print(f'Saved {len(df)} rows to video_quality.csv')
"
```
## Full Script with Error Handling
For production use, here's a more robust script:
```python
#!/usr/bin/env python3
# /// script
# dependencies = ["requests", "pandas"]
# ///
"""
Dump Surfmeter measurements to CSV.

Usage:
    export EXPORT_API_KEY="your-key"
    export EXPORT_API_URL="https://surfmeter-server.<customer>-analytics.aveq.info/export_api/v1"
    uv run dump_measurements.py --label-prefix some-label --days 30 --output measurements.csv
"""
import argparse
import json
import os
import sys

import pandas as pd
import requests


def main():
    parser = argparse.ArgumentParser(description="Dump Surfmeter measurements to CSV")
    parser.add_argument("--label-prefix", required=True, help="Client label prefix to filter")
    parser.add_argument("--days", type=int, default=30, help="Number of days to look back")
    parser.add_argument("--type", help="Measurement type (e.g., VideoMeasurement)")
    parser.add_argument("--output", default="measurements.csv", help="Output CSV file")
    args = parser.parse_args()

    api_key = os.environ.get("EXPORT_API_KEY")
    api_url = os.environ.get("EXPORT_API_URL")
    if not api_key or not api_url:
        print("Error: Set EXPORT_API_KEY and EXPORT_API_URL environment variables", file=sys.stderr)
        sys.exit(1)

    # Build query
    must_clauses = [
        {"prefix": {"client.label.keyword": args.label_prefix}},
        {"range": {"created_at": {"gte": f"now-{args.days}d/d", "lte": "now/d"}}},
    ]
    if args.type:
        must_clauses.append({"term": {"type.keyword": args.type}})
    query = {"body": {"query": {"bool": {"must": must_clauses}}}}

    # Fetch data with scrolling
    print(f"Fetching measurements from last {args.days} days...", file=sys.stderr)
    r = requests.post(
        f"{api_url}/search?scroll=1m",
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        json=query,
        stream=True,
    )
    r.raise_for_status()

    data = [json.loads(line) for line in r.iter_lines() if line]
    if not data:
        print("No measurements found", file=sys.stderr)
        sys.exit(0)

    df = pd.json_normalize(data)
    df.to_csv(args.output, index=False)
    print(f"Saved {len(df)} rows to {args.output}", file=sys.stderr)


if __name__ == "__main__":
    main()
```
Save this as `dump_measurements.py` and run it with:
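```bash
uv run dump_measurements.py --label-prefix some-label --days 30 --output measurements.csv
```

The `# /// script` header is inline script metadata (PEP 723); `uv run` reads it and installs the listed dependencies automatically, so no manual package installation is needed.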
## Tips
- **Streaming:** The `scroll=1m` parameter enables streaming for large datasets, avoiding pagination limits.
- **Exact matching:** Use the `.keyword` suffix for exact text matches (e.g., `client.label.keyword`).
- **Memory:** For very large datasets, process the NDJSON stream line-by-line instead of loading all data into memory; see the sketch after this list.
- **Compression:** Save large files as `.csv.gz` by using `df.to_csv('file.csv.gz', index=False, compression='gzip')`.
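For the memory tip above, here is a minimal sketch of line-by-line processing that writes CSV rows as they arrive instead of building a DataFrame. The `match_all` query and the `FIELDS` list are placeholders; substitute the filter and columns you actually need:

```python
# /// script
# dependencies = ["requests"]
# ///
"""Stream NDJSON from the Export API straight into a CSV file."""
import csv
import json
import os

import requests

# Illustrative column selection; dotted paths address nested fields
FIELDS = ["id", "created_at", "client.label"]

def pluck(doc, dotted):
    """Walk a dotted path like 'client.label' through nested dicts."""
    for key in dotted.split("."):
        doc = doc.get(key, {}) if isinstance(doc, dict) else {}
    return "" if isinstance(doc, dict) else doc

r = requests.post(
    os.environ["EXPORT_API_URL"] + "/search?scroll=1m",
    headers={"X-API-KEY": os.environ["EXPORT_API_KEY"], "Content-Type": "application/json"},
    json={"body": {"query": {"match_all": {}}}},  # placeholder query
    stream=True,
)
r.raise_for_status()

with open("measurements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(FIELDS)
    # One JSON document per NDJSON line; only one row in memory at a time
    for line in r.iter_lines():
        if line:
            doc = json.loads(line)
            writer.writerow([pluck(doc, field) for field in FIELDS])
```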