A New Day Dawning... HDR Delivery
A look at the proposed distribution methods for HDR
May 27, 2016
By Jim DeFilippis
LOS ANGELES—My last post introduced high dynamic range concepts and some background on the technology. We discussed the concept of ‘whiter whites’ and ‘darker shadows’: modern display technology can not only output more light but also increase the dynamic range of the displayed image by reducing the minimum black level.
SMPTE standardized an HDR EOTF (electro-optical transfer function) called PQ (perceptual quantization) as ST 2084. The PQ transfer function has been optimized to cover a wide range of light values (from 0.0001 to 10,000 cd/m2) while minimizing the visual effect of 10-bit or 12-bit quantization (contouring).
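To make the PQ curve concrete, here is a minimal Python sketch of the ST 2084 EOTF and its inverse. The constants are the ones published in ST 2084; the function names and the clamping of out-of-range inputs are my own choices for illustration, not part of the standard.

```python
# Constants from SMPTE ST 2084, written as their exact rational values.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # top of the PQ range in cd/m2


def pq_eotf(signal: float) -> float:
    """Map a normalized PQ signal in [0, 1] to display light in cd/m2."""
    e = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return PEAK_NITS * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)


def pq_inverse_eotf(nits: float) -> float:
    """Map display light in cd/m2 to a normalized PQ signal in [0, 1]."""
    y = (min(max(nits, 0.0), PEAK_NITS) / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2
```

With these definitions, a 100 cd/m2 level lands at roughly half of the signal range, which shows how much of the code space PQ reserves for highlights above traditional SDR white.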
We also touched on an alternate approach, the hybrid log gamma (HLG) transfer curve. HLG is not tied to absolute light values; instead it carries relative light values based on an assumed dynamic range and peak white. HLG can be used as an image capture curve as well as the final display transfer curve. While HLG has no metadata associated with the HDR signal, there must be an agreed-upon peak white reference value (typically 1000 cd/m2) so that a display can process the HLG HDR signal and render the image appropriately for its capabilities.
I promised to talk in this next article about the distribution of HDR video over a variety of channels such as Blu-ray, OTT, OTA, satellite and cable. Each mode of distribution has its own unique challenges and options for delivering video content.
A common element for the delivery of HDR is HEVC (high-efficiency video coding). The latest video codec from MPEG not only has the ability to encode 4K video but also enables full 10-bit resolution to the consumer display. HEVC also supports the wide color gamut defined in BT.2020. However, HEVC still relies on the non-constant luminance equations (YCrCb) that are defined in BT.709 and BT.2020. HEVC can signal HDR metadata in a variety of ways (SEI and VUI metadata) to help the HDR display process the HDR imagery properly. While MPEG-4 AVC does have a mode that supports 10-bit encoding, it has not been deployed in the consumer product space, so consumer delivery over AVC is effectively limited to 8 bits.
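As an illustration of the VUI signaling, the sketch below maps a few of the numeric code points that a bitstream can carry to readable names. The code point values are those listed in ITU-T H.273, which mirrors the HEVC VUI tables; the dictionaries are deliberately incomplete, and the function names are hypothetical helpers rather than any decoder's API.

```python
# A small subset of the colour description code points (ITU-T H.273).
COLOUR_PRIMARIES = {1: "BT.709", 9: "BT.2020"}
TRANSFER_CHARACTERISTICS = {
    1: "BT.709",
    16: "SMPTE ST 2084 (PQ)",
    18: "ARIB STD-B67 (HLG)",
}
MATRIX_COEFFS = {1: "BT.709", 9: "BT.2020 non-constant luminance"}


def describe_vui(colour_primaries, transfer_characteristics, matrix_coeffs):
    """Translate three VUI code points into human-readable names."""
    return {
        "primaries": COLOUR_PRIMARIES.get(colour_primaries, "unknown"),
        "transfer": TRANSFER_CHARACTERISTICS.get(transfer_characteristics, "unknown"),
        "matrix": MATRIX_COEFFS.get(matrix_coeffs, "unknown"),
    }


def is_hdr(transfer_characteristics):
    """True when the transfer function code point is PQ or HLG."""
    return transfer_characteristics in (16, 18)
```

A PQ stream with BT.2020 primaries and non-constant luminance YCrCb would therefore be described by the triple (9, 16, 9), and a display can decide how to process the picture from those three numbers before it looks at any SEI metadata.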
I’ll go over the different methods of distribution for HDR in the order that they have been adopted and/or proposed:
BLU-RAY
The Blu-ray spec was amended for 4K (UHDTV), including HDR and wide color (BT.2020). This spec, known as HDR10, is summarized as:
— HEVC Main 10 profile encoding
— 10-bit
— (PQ) EOTF (HDR)
— BT.2020 color space (wide color)
— 4:2:0 subsampling (YCrCb)
This format for 4K/HDR has also been adopted by some over-the-top (OTT) delivery platforms, including Netflix and Amazon. HDR10 is one of the simpler methods of HDR distribution, but it does require static metadata (MaxFALL and MaxCLL) to inform the display device of the maximum frame-average brightness and the peak brightness of the content. HDR10 is not backward compatible with non-HDR displays, although some Blu-ray players may provide conversion to SDR if the detected display is not HDR compatible. OTT services stream the appropriate format based on the type of display.
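The two static light-level values can be illustrated with a short Python sketch. It assumes the commonly used definitions: the content light level is the largest single R, G or B component found anywhere in the program, and the frame-average light level is the largest per-frame mean of each pixel's brightest component. The frame representation (a list of linear RGB tuples in cd/m2) and the function name are hypothetical.

```python
def light_levels(frames):
    """Return (max_content_light_level, max_frame_average_light_level).

    `frames` is an iterable of frames; each frame is a list of (R, G, B)
    tuples holding linear light in cd/m2.
    """
    max_cll = 0.0
    max_fall = 0.0
    for frame in frames:
        # Brightest colour component of every pixel in this frame.
        pixel_peaks = [max(rgb) for rgb in frame]
        max_cll = max(max_cll, max(pixel_peaks))
        max_fall = max(max_fall, sum(pixel_peaks) / len(pixel_peaks))
    return max_cll, max_fall
```

Because both values are maxima over the whole program, they can only be finalized once all of the content has been analyzed.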
LIVE LINEAR BROADCAST OF HDR
While HDR10 works for optical media and internet delivery of content, it has some drawbacks for live broadcasting:
— Requires static HDR metadata
— Requires ‘two layer’ approach (simulcast of HDR and SDR).
A joint proposal from Samsung, Sharp and Qualcomm supports the use of HDR10 for ATSC 3.0. Here are the other proposals being considered by S34-1 of the ATSC 3.0 Technical Standards Group:
HYBRID LOG GAMMA (HLG), PROPOSED BY BBC AND NHK
As mentioned above, HLG uses a dual curve approach: a gamma curve in the dark region and a log function in the bright region. By optimizing the coefficients of the HLG equation, the tone mapping for HDR and SDR can be accommodated without metadata or additional processing.
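The dual curve is easy to see in code. The sketch below uses the HLG coefficient a = 0.17883277 from ARIB STD-B67, with scene light normalized to the range 0 to 1 (the normalization later used in ITU-R BT.2100); b and c are derived from a so that the two segments meet at a signal level of 0.5. The function name and the input clamping are my own assumptions.

```python
import math

A = 0.17883277
B = 1 - 4 * A                    # about 0.28466892
C = 0.5 - A * math.log(4 * A)    # about 0.55991073


def hlg_oetf(scene_light: float) -> float:
    """Map normalized scene light in [0, 1] to an HLG signal in [0, 1]."""
    e = min(max(scene_light, 0.0), 1.0)
    if e <= 1 / 12:
        # Dark region: a square-root (gamma 0.5) curve, like a camera gamma.
        return math.sqrt(3 * e)
    # Bright region: a logarithmic curve that compresses the highlights.
    return A * math.log(12 * e - B) + C
```

The lower half of the signal range follows a conventional gamma-like curve, which is why an SDR display can show an HLG signal directly with a plausible picture, while the upper half carries the highlight range.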
In practice, however, it has been shown that protecting the SDR signal in this way limits the overall dynamic range of the HLG HDR signal. In addition, while there is no defined metadata, there needs to be an assumed reference peak white level so that the displayed image tonal range can match the image as ‘graded’ by the video operator. Finally, there is the challenge of color space conversion between BT.2020 and BT.709 (HDTV color gamut).
HLG is documented in ARIB STD-B67 and will be included in an update to ITU-R BT.1886.
DOLBY VISION
Dolby’s proposal is based upon the use of the PQ EOTF curve and, optionally, a new color space with a new set of color difference equations called ITP (Intensity, Tritanope, Protanope). Like the established YCrCb color space, ITP has three components: I (lightness, similar to Y’), CP (red-green dimension, similar to C’r) and CT (yellow-blue dimension, similar to C’b). The underlying color space is LMS, which is based on the long-medium-short cone color response of human vision. Dolby summarizes this format as ITP-PQ. The key benefit of ITP is the property of isoluminance, which minimizes the chroma/luma cross-talk that can happen with the classic non-constant luminance YCrCb approach and also ‘linearizes’ hue versus saturation.
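A rough sketch of the conversion follows: linear BT.2020 RGB goes through a matrix to LMS, each LMS component is PQ encoded, and a second matrix produces I, CT and CP. The matrix coefficients are the ones later published for ICtCp in ITU-R BT.2100, and I am assuming they correspond to the ITP described here; the function names are illustrative only.

```python
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

# Integer coefficients; each matrix is divided by 4096 when applied.
RGB_TO_LMS = [[1688, 2146, 262], [683, 2951, 462], [99, 309, 3688]]
LMS_TO_ITP = [[2048, 2048, 0], [6610, -13613, 7003], [17933, -17390, -543]]


def pq_inverse_eotf(nits):
    """ST 2084 inverse EOTF: cd/m2 to a normalized signal in [0, 1]."""
    y = (min(max(nits, 0.0), 10000.0) / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def apply_matrix(matrix, vector):
    """Multiply a 3x3 integer matrix (scaled by 4096) with a 3-vector."""
    return [sum(c * v for c, v in zip(row, vector)) / 4096 for row in matrix]


def rgb2020_to_itp(r, g, b):
    """Convert linear BT.2020 RGB in cd/m2 to a dict with I, CT and CP."""
    lms = apply_matrix(RGB_TO_LMS, [r, g, b])
    lms_pq = [pq_inverse_eotf(component) for component in lms]
    i, ct, cp = apply_matrix(LMS_TO_ITP, lms_pq)
    return {"I": i, "CT": ct, "CP": cp}
```

For a neutral gray, the three LMS values are equal, so both color difference components come out as zero and I carries the PQ-coded level by itself, which is the isoluminance property in its simplest form.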
In the encoding process, the ST 2084 PQ-based HDR video (along with static ST 2086 metadata) is converted to ITP-PQ space and subsampled to 4:2:0. Adaptive reshaping is applied prior to the HEVC encoder to improve compression efficiency. Specific Dolby metadata is combined with the converted HDR signal and transmitted as part of the HEVC encoding (SEI messages).
On decoding, the signal is processed through tone mapping and then converted to full 4:4:4 color. This reconstructed 10-bit/4:4:4 HDR/BT.2020 signal is converted from 10-bit to 12-bit, and then the inverse ITP matrix is applied to output HDR RGB.
Additional metadata (ST 2094) can be created to provide tone mapping of full range HDR signals to displays with constrained HDR performance or to SDR displays, for either professional or consumer conversions.
PRIME SINGLE, PROPOSED BY TECHNICOLOR/PHILIPS
Prime Single is a single layer approach that converts the HDR signal to an SDR signal with dynamic metadata to provide both tone re-mapping and color gamut correction. Prime Single supports PQ, HLG, Log or SDR input video signals. On the decoding side, Prime Single takes the decoded SDR and, using the tone mapping and CRI (colour remapping information) color correction metadata, can provide HDR output signals (PQ or HLG) as well as a native SDR output (without any further processing or metadata).
This approach provides a backward compatible SDR output, which is determined by the encoder’s pre-processing. The tone mapping and CRI metadata are used to re-create the HDR signal from the SDR. The Prime Single metadata is carried within the HEVC data stream as SEI messages, with the ability to update on a frame-by-frame basis.
ERICSSON PROPOSAL
Ericsson has proposed a pre-processing approach for an HDR10 signal to mitigate errors caused by 4:2:0 subsampling in the HEVC process. Basically, the pre-process calculates the error due to conversion of RGB to YCrCb, quantization to 10-bit and downsampling to 4:2:0, and then compensates the luma samples to minimize errors on the decode/reconstruction end.
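The general idea of compensating luma for chroma subsampling error can be sketched as follows. This is not Ericsson's actual algorithm, only an illustration under my own assumptions: chroma subsampling is simulated by averaging a pixel's chroma with one neighbor, 10-bit quantization is ignored, and the luma value is found by bisection so that the luminance a decoder would reconstruct matches the original. All function names are hypothetical.

```python
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
KR, KG, KB = 0.2627, 0.6780, 0.0593  # BT.2020 luma weights


def pq_eotf(v):
    e = min(max(v, 0.0), 1.0) ** (1 / M2)
    return 10000.0 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1 / M1)


def pq_inverse(nits):
    y = (min(max(nits, 0.0), 10000.0) / 10000.0) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2


def to_ycbcr(rgb_nits):
    """Linear RGB in cd/m2 to non-constant luminance (luma, Cb, Cr)."""
    r, g, b = (pq_inverse(c) for c in rgb_nits)
    y = KR * r + KG * g + KB * b
    return y, (b - y) / 1.8814, (r - y) / 1.4746


def luminance(y, cb, cr):
    """Linear luminance a decoder would reconstruct from luma and chroma."""
    r = y + 1.4746 * cr
    b = y + 1.8814 * cb
    g = (y - KR * r - KB * b) / KG
    return KR * pq_eotf(r) + KG * pq_eotf(g) + KB * pq_eotf(b)


def adjust_luma(target_nits, cb, cr, iterations=30):
    """Bisect on luma; reconstructed luminance never decreases as luma rises."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if luminance(mid, cb, cr) < target_nits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A saturated red pixel next to a dark gray one shares averaged chroma.
pixel, neighbor = (1000.0, 50.0, 10.0), (10.0, 10.0, 10.0)
y, cb, cr = to_ycbcr(pixel)
_, ncb, ncr = to_ycbcr(neighbor)
cb_sub, cr_sub = (cb + ncb) / 2, (cr + ncr) / 2

target = KR * pixel[0] + KG * pixel[1] + KB * pixel[2]
naive_error = abs(luminance(y, cb_sub, cr_sub) - target)
adjusted_error = abs(luminance(adjust_luma(target, cb_sub, cr_sub), cb_sub, cr_sub) - target)
```

In this example the unadjusted luma paired with the averaged chroma reconstructs a noticeably different luminance, while the adjusted luma brings the reconstructed luminance back to the original. The price is that the encoder has to model the decoder's reconstruction for every luma sample.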
In addition, there is an optimization of the HEVC chroma QP offset values to mitigate chroma errors.
The proposal does not support conversion to SDR or to the BT.709 color space.
QUALCOMM PROPOSAL
Qualcomm’s proposal uses pre-processing analysis of the input HDR signal (HDR10) to create a set of dynamic range adjustment parameters that minimize errors in the HEVC (4:2:0 YCrCb) encoding. These parameters are carried in private SEI messages inside the HEVC bit stream. As with Ericsson’s proposal, there is no support for conversion to SDR or to the BT.709 color space.
SUMMARY
While there are common features among the HDR proposals, they take different approaches to fitting the full HDR signal into the limitations of HEVC and to providing a solution for multiple display and production formats. The key test will be the head-to-head evaluation of the proposals scheduled for this June at the ATSC 3.0 S34-1 committee meeting.