ITU-T P.1204

The ITU-T Recommendation P.1204 is a family of standards that specifies video quality models for sequences of up to 4K/UHD resolution. Unlike P.1203, which predicts integral quality for longer streaming sessions, P.1204 focuses on short-term video quality assessment for video segments of 5–10 seconds duration. P.1204 is not a successor to P.1203, but rather provides alternative Pv (video quality estimation) modules that can be used standalone or integrated with P.1203.3 for session-level QoE prediction.

The P.1204 series was developed through a competition within ITU-T Study Group 12 in collaboration with the Video Quality Experts Group (VQEG), using a large subjective test dataset of approximately 5,000 test sequences rated by human viewers. For details on the development process, see Raake et al.: Multi-Model Standard for Bitstream-, Pixel-Based and Hybrid Video Quality Assessment of UHD/4K: ITU-T P.1204.

P.1204 Model Types

The P.1204 standard series comprises several model types, each using different input information (see Model Classification):

| Recommendation | Released | Model Type | Reference Required | Input Information |
|---|---|---|---|---|
| P.1204.1 | 2025 | Bitstream Mode 0 (Metadata) | No | Bitrate, resolution, framerate, codec |
| P.1204.3 | 2020 | Bitstream Mode 3 | No | Full bitstream (QP, frame sizes, motion vectors) |
| P.1204.4 | 2020 | Pixel-based (RR/FR) | Yes (reduced) | Reference + processed pixels |
| P.1204.5 | 2020 | Hybrid (NR) | No | Metadata + processed pixels |

All models support H.264, H.265, and VP9, with resolutions from 240p to 2160p and frame rates between 15 and 60 fps. The P.1204 models output quality predictions on the 5-point ACR MOS scale, providing both per-segment scores (O.27) and per-one-second scores (O.22) suitable for integration with P.1203.3.
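To make the two output types concrete, the following is a minimal sketch of how per-segment (O.27) and per-one-second (O.22) scores might be held and prepared for a session-level model. The class `SegmentQuality` and the helper `concat_per_second` are hypothetical names chosen for illustration; they are not part of the Recommendation or of any Surfmeter API.

```python
from dataclasses import dataclass
from typing import List

# Bounds of the 5-point ACR MOS scale.
MOS_MIN, MOS_MAX = 1.0, 5.0


@dataclass
class SegmentQuality:
    """Hypothetical container for the P.1204 outputs of one video segment."""
    per_segment_mos: float       # O.27: one score for the whole segment
    per_second_mos: List[float]  # O.22: one score per second of video

    def __post_init__(self) -> None:
        # Reject any score that falls outside the MOS scale.
        for score in [self.per_segment_mos, *self.per_second_mos]:
            if not MOS_MIN <= score <= MOS_MAX:
                raise ValueError(f"MOS {score} is outside [{MOS_MIN}, {MOS_MAX}]")


def concat_per_second(segments: List[SegmentQuality]) -> List[float]:
    """Join the O.22 scores of consecutive segments into one per-second series."""
    return [score for seg in segments for score in seg.per_second_mos]


if __name__ == "__main__":
    session = [
        SegmentQuality(4.2, [4.1, 4.2, 4.3, 4.2, 4.2]),
        SegmentQuality(3.1, [3.3, 3.0, 3.0, 3.1, 3.1]),
    ]
    print(concat_per_second(session))
```

The concatenated per-second series is the kind of input a session-level integration step such as P.1203.3 would consume; how that step weights the scores is outside the scope of this sketch.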

In the following sections, we provide an overview of the two main P.1204 models that can be used in practice when working with Surfmeter tools: P.1204.1 (metadata-based) and P.1204.3 (bitstream-based).

ITU-T P.1204.1

ITU-T Rec. P.1204.1 specifies a metadata-based (Mode 0) video quality model that uses only basic encoding parameters—bitrate, resolution, framerate, and codec type—to predict video quality. This makes it suitable for scenarios where deep bitstream inspection is not possible, such as encrypted streams or lightweight monitoring applications.

The model architecture follows P.1204.3's degradation-based approach, computing three degradation components:

  • Quantization degradation (Dq): Estimates encoding quality loss by synthesizing the quantization parameter (QP) from bitrate, resolution, and framerate metadata
  • Upscaling degradation (Du): Models quality loss from spatial upscaling when encoding resolution is lower than display resolution
  • Temporal degradation (Dt): Accounts for jerkiness when encoding framerate is lower than display framerate
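The structure of the three components can be sketched as follows. This is an illustration only: it assumes that the degradations are additive and subtracted from the top of the MOS scale, and the logarithmic shapes and `weight` parameters are placeholders, not the functions or coefficients defined in the Recommendation. The quantization degradation is passed in as a precomputed value because the QP synthesis step is not reproduced here.

```python
import math


def upscaling_degradation(enc_height: int, display_height: int, weight: float = 1.0) -> float:
    """Placeholder Du: zero unless the video must be upscaled to the display."""
    scale = max(display_height / enc_height, 1.0)
    return weight * math.log(scale)


def temporal_degradation(enc_fps: float, display_fps: float, weight: float = 1.0) -> float:
    """Placeholder Dt: zero unless the encoding framerate is below the display framerate."""
    ratio = max(display_fps / enc_fps, 1.0)
    return weight * math.log(ratio)


def combine(dq: float, du: float, dt: float, min_mos: float = 1.0, max_mos: float = 5.0) -> float:
    """Subtract the degradations from the best score and clamp to the MOS scale (assumed form)."""
    return min(max(max_mos - dq - du - dt, min_mos), max_mos)


if __name__ == "__main__":
    du = upscaling_degradation(enc_height=1080, display_height=2160)
    dt = temporal_degradation(enc_fps=30, display_fps=60)
    print(combine(dq=0.8, du=du, dt=dt))
```

The point of the sketch is the triggering logic: Du and Dt vanish when the encoding resolution and framerate already match the display, so only Dq contributes in that case.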

P.1204.1 has been validated for H.264, H.265, and VP9 codecs, with resolutions up to 4K/UHD-1 (3840×2160) for PC/TV displays and QHD (2560×1440) for mobile/tablet devices, and framerates up to 60 fps.
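Before running the model, a caller may want to check whether a stream falls inside this validated range. The sketch below encodes the limits stated above; the names `Mode0Input`, `out_of_scope_reasons`, and the device keys are hypothetical, and applying the resolution limit to the video resolution (rather than to the display) is an assumption made for simplicity.

```python
from dataclasses import dataclass
from typing import List

SUPPORTED_CODECS = {"h264", "h265", "vp9"}
# Maximum validated resolution (width, height) per device class.
MAX_RESOLUTION = {"pc_tv": (3840, 2160), "mobile_tablet": (2560, 1440)}
MAX_FPS = 60.0


@dataclass(frozen=True)
class Mode0Input:
    """Hypothetical container for the metadata a Mode 0 model consumes."""
    bitrate_kbps: float
    width: int
    height: int
    fps: float
    codec: str


def out_of_scope_reasons(meta: Mode0Input, device: str = "pc_tv") -> List[str]:
    """Return a list of reasons why the input lies outside the validated range."""
    reasons = []
    if meta.codec.lower() not in SUPPORTED_CODECS:
        reasons.append(f"codec {meta.codec!r} not validated")
    max_w, max_h = MAX_RESOLUTION[device]
    if meta.width > max_w or meta.height > max_h:
        reasons.append(f"{meta.width}x{meta.height} exceeds {max_w}x{max_h} for {device}")
    if meta.fps > MAX_FPS:
        reasons.append(f"{meta.fps} fps exceeds {MAX_FPS} fps")
    return reasons


if __name__ == "__main__":
    stream = Mode0Input(bitrate_kbps=8000, width=3840, height=2160, fps=60, codec="vp9")
    print(out_of_scope_reasons(stream, device="pc_tv"))
    print(out_of_scope_reasons(stream, device="mobile_tablet"))
```

Returning a list of reasons instead of a single boolean makes it easier to log why a prediction should be treated with caution.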

ITU-T P.1204.3

ITU-T Rec. P.1204.3 specifies a bitstream-based (Mode 3) video quality model that performs deep inspection of the encoded video stream. Unlike the metadata-based P.1204.1, this model extracts detailed information from the bitstream including quantization parameters (QP), frame sizes, frame types, and motion vectors. The model computes quality predictions using per-frame QP values (directly extracted from the bitstream rather than estimated), enabling more accurate assessment of compression artifacts. It also incorporates frame-level complexity effects based on motion information.
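As a rough illustration of working with per-frame QP values, the sketch below groups frames into one-second windows and averages their QP. It assumes the bitstream has already been parsed into `(pts_seconds, qp)` tuples by some external tool; that tuple format and the function name `mean_qp_per_second` are hypothetical, and the feature computation in the actual Recommendation is more involved than a plain average.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_qp_per_second(frames: Iterable[Tuple[float, float]]) -> Dict[int, float]:
    """Average the per-frame QP within each one-second window.

    `frames` holds (presentation time in seconds, frame QP) pairs.
    """
    buckets = defaultdict(list)
    for pts, qp in frames:
        # Frames with 0 <= pts < 1 fall into second 0, and so on.
        buckets[int(pts)].append(qp)
    return {second: mean(qps) for second, qps in sorted(buckets.items())}


if __name__ == "__main__":
    parsed = [(0.0, 30), (0.5, 32), (1.0, 40), (1.5, 44)]
    print(mean_qp_per_second(parsed))
```

A per-second aggregation like this lines up naturally with the per-one-second output scores, which is why it is used here as the example.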

P.1204.3 achieves higher prediction accuracy than P.1204.1 due to its access to actual encoding parameters, even outperforming the state-of-the-art VMAF model in some cases. It has been validated for H.264, H.265, and VP9 codecs, with resolutions up to 4K/UHD-1 (3840×2160) and framerates between 15 and 60 fps.

For technical details, see Rao et al.: P.1204.3 – An ITU-T Recommendation for Bitstream-based Video Quality Assessment Supporting Modern Video Codecs and Resolutions.

Note

P.1204.3 is not used in Surfmeter by default due to its higher computational requirements. The metadata-based P.1204.1/AVQBits|M0 model provides a good balance between accuracy and performance for most use cases.