Chris Chinnock on Charles Poynton

    A Day with Charles Poynton
    DISPLAY DAILY – Chris Chinnock – 4 hours 14 mins ago

    I recently had the opportunity to attend a workshop with Charles Poynton, a renowned mathematician and video expert. The informal workshop took place in New York with just a few others, providing a great opportunity to listen and discuss hot-button issues around High Dynamic Range (HDR).

    Charles Poynton

    Poynton prepared a thick handout of content and had the same material ready as slides to review – but we may have looked at only a half dozen during the whole day. The time was spent with Poynton verbally explaining the topics and fielding questions along the way. This can be an effective learning technique, but it lacked the structure of a more formal workshop – which has its own value.

    He first presented his 0th Axiom of Digital Imaging (an axiom is self-evident and can’t be proved the way a theorem can). It goes something like this: “In the making of commercial images, the only thing that matters is what happens at the approval process. Everything that happens downstream of this should not alter the image.” In other words, the “artistry” happens during the mastering process and, once approved, should be faithfully delivered to the end user.

    Poynton hammered home some basics on light and light measurement. For example, he wanted us all to remember a few key light levels:

    • 32,000 nits – luminance of a white card illuminated by sunlight plus skylight at noon on a clear day
    • 32,000 nits plus 2 f-stops – headroom needed for specular highlights
    • 320 nits – luminance of diffuse white rendered by a typical consumer TV
    • 32 nits – luminance of typical diffuse white in cinema
    • 3.2 nits – the light from a single candle
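    The f-stop arithmetic in the list above is simple enough to sketch in a couple of lines (each stop doubles the luminance – this is my own illustrative sketch, not Poynton's material):

```python
# Each photographic f-stop doubles (or halves) the luminance.
def stops_above(luminance_nits: float, stops: float) -> float:
    """Luminance after opening up by `stops` f-stops."""
    return luminance_nits * (2.0 ** stops)

sun_white = 32_000                     # nits: white card in noon sun + sky
highlight = stops_above(sun_white, 2)  # 2 stops of specular headroom -> 128,000 nits
```

    So the specular-highlight ceiling implied by Poynton's figures is 128,000 nits, four times the diffuse white card.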
    He next turned to three significant effects. To illustrate the Hunt effect, Poynton suggested we think of the color of flowers in sunlight, then think of the color of those same flowers at twilight. The color of the flowers has not changed, but our perception of their colorfulness has – they don’t look as colorful in dim light. His point: “If you capture the flowers in daylight and show them on a display with only 300 nits of brightness, the flowers will look like they were captured at twilight.” Since displays cannot show the full dynamic range of the natural world, we have to apply “artistic intent” to alter the image to produce what is desired. In other words, if you want the flowers rendered at 300 nits to have the visual colorfulness that they did in sunlight, you have to add more color. It is not accurate, but it conveys the artistic intent.

    The Stevens effect is loosely comparable to the Hunt effect, but applies to visual contrast: “apparent contrast decreases with decreasing luminance.”  Therefore, you also need to increase the contrast of the flowers example above to restore the perception of viewing them in sunlight.

    The Bartleson–Breneman effect states that “grayscale tones surrounded by black appear visually different from the same tones surrounded by white.” This means the flowers described above would appear less colorful when surrounded by black than when surrounded by white – which to me seems counterintuitive. Since most content is mastered in a room with a dim, dark surround, this effect argues for adding more colorfulness to the image to compensate.
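    A toy illustration of the kind of compensation these effects call for (my own sketch – not Poynton's method, and far simpler than a real color-appearance model): scale each pixel's chroma about the luma axis to restore apparent colorfulness on a dimmer display.

```python
def boost_colorfulness(r: float, g: float, b: float, gain: float = 1.2):
    """Scale chroma about the luma axis; gain > 1 adds colorfulness."""
    # Rec. 709 luma weights
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return tuple(y + gain * (c - y) for c in (r, g, b))

# Neutral grays are untouched; saturated colors move further from gray.
gray = boost_colorfulness(0.5, 0.5, 0.5)  # stays ~ (0.5, 0.5, 0.5)
red  = boost_colorfulness(0.8, 0.2, 0.2)  # red channel pushed above 0.8
```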

    We ended up talking a lot about Perceptual Quantizer (PQ) and Hybrid Log Gamma (HLG) curves, why they are needed and how they work.
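    The HLG curve mentioned above is compact enough to sketch. The constants are the published ITU-R BT.2100 values; the function is my own minimal transcription, illustrative rather than production code. HLG is square-root (camera-like) in the shadows and logarithmic in the highlights:

```python
import math

# ITU-R BT.2100 HLG OETF constants
A = 0.17883277
B = 1 - 4 * A                   # 0.28466892
C = 0.5 - A * math.log(4 * A)   # 0.55991073

def hlg_oetf(e: float) -> float:
    """Scene-linear light E in [0, 1] -> non-linear HLG signal in [0, 1]."""
    if e <= 1 / 12:
        return math.sqrt(3 * e)          # square-root segment
    return A * math.log(12 * e - B) + C  # logarithmic segment

# The two segments join at E = 1/12, where the signal is exactly 0.5.
```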

    Poynton is well respected in the industry, and he has made a number of informative, but sometimes provocative or controversial statements. Below is a quick summary of some I remember.

    • The color filters in an acquisition camera do not limit the color gamut of the captured content. Unfortunately, he never got around to explaining why this should be so, as it is not obvious to me.
    • Using Imperial units like footcandles [fc] and footlamberts [fL] impedes understanding of radiometry and photometry.
    • Confusion is rampant in ITU, SMPTE, EBU and MPEG standards regarding the reference white and peak white luminance levels. For SDR content, reference white is standardized by SMPTE (but not ITU) as code value 940 in a 10-bit word, at 100 nits. Peak white is specified as code value 1019 according to ITU-R BT.1886, which equates to about 122 nits. There is inconsistent standardization on how to handle out-of-range luminance values and whether to clip them – so implementations are inconsistent too.
    • There is no definition of a reference level for diffuse white for HDR, which is a big oversight. He recommends using 180 nits for diffuse white for UHD, with specular highlights up to 3X diffuse white (540 nits).
    • He recommends reserving code values 64–940 in 10-bit video for the reference data, with 941–1019 for specular highlights.
    • Poynton uses the terminology OECF and EOCF instead of the more widely used OETF and EOTF.
    • Poynton would like to see the familiar “brightness” adjustment eliminated from consumer TVs, with the “contrast” knob relabeled “white level” and made to adjust the diffuse white level, not the peak white level.
    • HLG has been developed as a scene-referred OECF, but he thinks it should be explained as a display-referred transform, the way PQ is.
    • When referring to the EOCF (EOTF), he uses the notation PQ and HLG⁻¹. When referring to the OECF (OETF), he uses PQ⁻¹ and HLG, to emphasize that one is scene-referred and the other display-referred.
    • Poynton considers HDR10 to refer to 10-bit PQ or HLG⁻¹ in a BT.2020 container (although there does not seem to be any document that states this explicitly).
    • The MaxCLL (maximum content light level) specification of ST-2086, which is part of the mastering metadata requirement, is based on Max [R’, G’, B’] and is not useful in determining maximum luminance, maximum display power or diffuse white luminance.
    • The MaxFALL (maximum frame-average light level) specification has the same issue, as it is based on Max [R’, G’, B’]. It is not useful for determining the average or maximum luminance, average or maximum display power, or diffuse white luminance.

    It is hard to summarize an entire day of discussion in one article, but I think you get the idea. There remains a lot of work to be done on HDR/WCG, to say the least.
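    Poynton's suggested 10-bit code-value allocation can be sketched numerically (my own illustrative arithmetic, not text from any standard): codes 64–940 carry the reference signal and 941–1019 sit above reference white.

```python
def code_to_signal(code: int) -> float:
    """Map a 10-bit code to a signal where 0.0 = black and 1.0 = reference white."""
    black, ref_white = 64, 940
    return (code - black) / (ref_white - black)

# Reference white lands exactly at 1.0; the specular codes exceed it.
top = code_to_signal(1019)   # ~1.09, i.e. ~9% of code range above reference white
```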

  • #2
    Chris’s associate at Display Daily/Insight Media adds this ...

    As a follow-up to my HDR10 vs Dolby Vision article, I had a chance to talk to Patrick Griffis of Dolby Laboratories, Inc. on Monday about Dolby’s view of high dynamic range (HDR) systems. This balances the article Samsung, HDR and Industry Experts (subscription required), which I wrote after visiting the Samsung 837 venue, where I attended a workshop on HDR that focused on HDR10. Griffis is the Vice President of Technology in Dolby’s Office of the CTO and the SMPTE Vice President for Education. He has been at Dolby since 2008; before that he held positions at Microsoft and Panasonic. At Dolby, he has been active in HDR for 4–5 years.

    Two questions that haunt HDR, not just at Dolby, but everywhere the technology is being implemented, are “How black is black?” and “How bright is white?”

    Dolby Luminance Levels

    According to Griffis, the human visual system (HVS) can see just a handful of photons; perhaps 40. But to see this, it takes several hours of adaptation in total darkness – not a realistic TV-watching scenario. Dolby ran human viewing tests showing that a “black” this black is not necessary. Their tests showed that a black level of 0.005 nits (cd/m²) satisfied the vast majority of viewers. While 0.005 nits is very close to true black, Griffis says Dolby can go down to a black of 0.0001 nits, even though there is no need – or ability – for displays to get that dark today. This 0.0001-nit black level is what you would see if your scene were lit on a dark night by starlight alone. As a comparison, Standard Dynamic Range (SDR) content is typically mastered with a black level of 0.1 nit, comparable to what you would see on a moonlit night.

    How bright is white?  Dolby says the range of 0.005 nits – 10,000 nits satisfied 84% of the viewers in their viewing tests – no need to go to blacker blacks and little need to go to brighter whites. While Griffis personally can see advantages to setting the white point to 20,000 nits instead of 10,000 nits, he can see the disadvantages, too. Displays now and in the near future simply cannot achieve 20,000 nits. He says the brightest consumer HDR displays today are about 1,500 nits. Professional displays where HDR content is color-graded can achieve up to 4,000 nits peak brightness, so Dolby Vision HDR content is mastered at a higher brightness than it is viewed by consumers. This is the reverse of the situation for SDR where the content is typically color-graded at 100 nits and then viewed by consumers at a higher brightness of 300 nits or more.

    Dolby uses the Perceptual Quantizer Electro-optical Transfer Function (PQ EOTF) to encode light levels for Dolby Vision. This isn’t surprising, since Dolby specifically developed the PQ EOTF for use in HDR systems like Dolby Vision. The PQ EOTF is also used for HDR10 content, except HDR10 uses a 10-bit version and Dolby uses a 12-bit version of the PQ function. The PQ EOTF has been standardized as SMPTE ST-2084.
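    The PQ curve is compact enough to sketch directly. The constants below are the published ST-2084 values; the functions are my own minimal transcription, ignoring the narrow-range code mapping:

```python
# SMPTE ST-2084 (PQ) constants
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875

def pq_eotf(signal: float) -> float:
    """Non-linear PQ signal in [0, 1] -> absolute luminance in nits."""
    e = signal ** (1 / M2)
    return 10_000 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1 / M1)

def pq_inverse_eotf(nits: float) -> float:
    """Absolute luminance in nits -> non-linear PQ signal in [0, 1]."""
    y = (nits / 10_000) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

# Full-scale signal reaches the 10,000-nit ceiling; SDR reference white
# (100 nits) lands at roughly half signal (~0.508).
```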

    If too few bits are used in the EOTF, viewers may be able to see “contouring” – visible steps between two gray values whose code values differ by one count. Dolby claims 12 bits are necessary to avoid this problem and presented the figure below to demonstrate the issue.


    If, at any given luminance level, the EOTF step size exceeds the threshold for the minimum visible contrast step, a viewer could see contouring in the image at that luminance level. In this image, Dolby uses the Barten Ramp as the threshold of visibility for contouring. It would take a 13-bit Log EOTF, a 15-bit Gamma EOTF or the 16-bit OpenEXR EOTF to remain below this threshold. Sensibly, Dolby chose the EOTF that required the fewest bits to encode the image without visible contouring: a 12-bit PQ EOTF. A 10-bit PQ EOTF would be above the threshold defined by this Barten Ramp at all luminance levels.

    The question here is, “Is the Barten Ramp the correct threshold?” The Barten Ramp is calculated from P. G. J. Barten’s 1999 Contrast Sensitivity Function (CSF), a model said to incorporate most of the important variables in the ability of the HVS to detect contouring in digital display systems. Note that this is modeled rather than measured data, and the model itself is built on measurements that predate 1999. Measured data from W. F. Schreiber’s 1992 book shows the eye is not nearly as sensitive to contrast steps as Barten’s CSF model indicates, as shown in the image below from ITU-R Report BT.2246-5 (2015).



    While a 10-bit version of the PQ EOTF would exceed the threshold based on the Barten CSF model, it would stay well below the threshold established by Schreiber’s measured data. The satisfactory application of the 10-bit PQ function by HDR10 supporters would seem to indicate that Barten sets too strict a limit. For example, at 1 nit of luminance, Barten says the HVS can detect a 0.5% brightness step, but Schreiber says the HVS can only detect a 2.0% step. This is a huge difference. Which is right? Relying on 1992 or 1999 data to settle something as important as the choice between 10 and 12 bits doesn’t seem right – someone should repeat the experiments to find the correct answer.
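    One way to make the bit-depth argument concrete (a sketch of my own using the ST-2084 PQ formula, not Dolby's or anyone's official test code) is to compare the relative luminance step between adjacent code values near 1 nit for 10-bit and 12-bit quantization; whichever visibility threshold you believe, the 12-bit step is roughly a quarter of the 10-bit step:

```python
# SMPTE ST-2084 (PQ) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_eotf(signal: float) -> float:
    """Non-linear PQ signal in [0, 1] -> absolute luminance in nits."""
    e = signal ** (1 / M2)
    return 10_000 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1 / M1)

def step_percent_at(target_nits: float, bits: int) -> float:
    """Relative luminance step (%) between the adjacent codes nearest target_nits."""
    top = 2 ** bits - 1
    # Find the code whose luminance is closest to the target...
    code = min(range(1, top), key=lambda c: abs(pq_eotf(c / top) - target_nits))
    lo, hi = pq_eotf(code / top), pq_eotf((code + 1) / top)
    # ...and report the step up to the next code as a percentage.
    return 100 * (hi - lo) / lo

step10 = step_percent_at(1.0, 10)  # step size of 10-bit PQ near 1 nit
step12 = step_percent_at(1.0, 12)  # ~4x finer than the 10-bit step
```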

    Another reason HDR10 can use a 10-bit PQ EOTF is that HDR10 content is mastered over a narrower luminance range than Dolby Vision. HDR10 masters over a range of 0.05 – 1,000 nits, a 20,000:1 range of brightness levels, while Dolby Vision is mastered over a range of 0.0001 – 10,000 nits, a brightness range of 100,000,000:1. HDR10 doesn’t need code values for brightness levels from 1,000 – 10,000 nits or from 0.0001 – 0.05 nits, so fewer code values are needed and 10 bits are enough to represent all of them. The implementation of HDR10 in UltraHD Premium OLEDs runs from 0.0005 – 540 nits, a brightness range of 1,080,000:1. Since the HVS is relatively insensitive to contouring at very low brightness, according to either the Barten Ramp or Schreiber, no additional steps are required for this wider range compared to the LCD version of HDR10.

    Dolby Vision is a proprietary system and brands must pay a license fee to Dolby to use it, while HDR10 is an open system without license fees. There are advantages to both approaches. A proprietary system is well defined and interoperability is not a problem. An open system, particularly one early on the development curve, can have multiple contradictory approaches. While HDR10 seems to work and be compatible across systems, the deeper I looked into HDR10 and its underlying standards from SMPTE, the ITU and others, the more confused I got and the more worried I became about its lack of standardization.

    Griffis said that for streaming video, Dolby Vision provides a backward-compatible framework for HDR. There is a base layer of SDR content encoded with a Gamma EOTF and decodable by any streaming video decoder for showing on an SDR display. Then there is an enhancement layer that includes all the necessary data to convert the SDR stream into an HDR stream and metadata that tells the Dolby Vision decoder exactly how to use this enhancement layer data to perform this SDR to HDR conversion. Griffis said the enhancement layer and the metadata adds just 15% on average to the size of the bitstream. This allows streaming companies like Netflix to keep just one version of the file and then stream it to either SDR or HDR customers. An SDR (or HDR10-only) system discards the extra data and shows the SDR image. This saves storage space on the server, at the expense of higher streaming bit rates to SDR customers. Of course, in 2016, the vast majority of streaming customers are SDR customers and this streamed enhancement data is mostly discarded.

    I’m not sure that this is a worthwhile saving – storage space is cheap and streaming bandwidth is relatively expensive. Besides, a streaming company like Amazon or Netflix doesn’t store just one version of a file – they store multiple versions to allow for adaptive bit rates. If a customer can’t accept the full bit rate, the streaming company doesn’t decimate the file in real time to stream at a lower bit rate. Instead, it starts to stream a version of the file that was compressed off-line at a higher compression ratio so it can be sent at a lower bit rate. Having one more file, a premium HDR version, doesn’t add much to their storage requirements.


    Dolby Vision per-frame metadata includes information on the minimum, mean and maximum brightness in a scene.

    One thing Dolby Vision does that HDR10 does not do is send dynamic metadata on a per frame or per scene basis. HDR10 sends metadata on a per movie basis. Griffis said this would be OK if the viewer was watching the content on a display that was identical to the display the content was mastered on. Since it never is, it is necessary for the TV to adjust the content to match the artistic intent. The dynamic metadata provides information useful in this content adjustment.

    HiSilicon Hi3798C V200 SoC Demo Platform with Digital Tuner and Dolby Vision capability at IBC 2015. This SoC has a Quad core Cortex A53 processor at its heart. (Photo Credit: CNXSoft-Embedded Systems News)

    To use the Dolby Vision proprietary system, it is necessary to buy a system on a chip (SoC) ASIC from one of Dolby’s partners: MStar, HiSilicon, MediaTek or Sigma Designs. According to Griffis, these four companies have 75% of the market for these types of SoCs, so HDR system makers are not particularly limited in their SoC sources. Griffis added that the SoC that can decode Dolby Vision can also decode HDR10, so it is not necessary for a TV maker to include two decoder chips in a dual Dolby Vision/HDR10 system.

    When asked about the added cost of Dolby Vision compared to HDR10, Griffis said, “the incremental cost is minimal and frankly a fraction of the total TV cost which is much more driven by the display electronics and electronic components, not royalties.”

    A final question: “Which will prevail, Dolby Vision or HDR10?” Tune in next year, or perhaps next decade, to find out. Since HDR brings something to TV that consumers can actually see at a fairly modest cost in both dollars and bits, HDR itself isn’t likely to fade away. –Matthew Brennesholtz

    Patrick Griffis from Dolby is my contact to get Digital Cinema Dolby Vision for a home, but I have to remind him I am the NAB 16 Future of Cinema symposium troublemaker for him to remember me. LOL
