Re: [EXTERNAL] Re: cICP wording feedback

Hi Chris, all,

Apologies for the long-winded reply, I just want to make sure we’re all coming from the same place.

There are 2 separate design goals in use:


  1.  HLG is a relative, scene-referred imaging system whose design goal is to be able to encode a dynamic range slightly greater than the instantaneous dynamic range of the human visual system (approx. 16-17 stops) and replicate this in a perceptually similar way in a range of viewing environments.  Whilst the display side mathematics has a nominal viewing condition of a 1000 cd/m2 monitor in an ambient environment of 5 cd/m2, there is no normalisation to this range. The standard and accompanying reports give methods to calculate the offset for a wide range of viewing conditions – I have tested this with monitors from 350 – 4000 cd/m2 under viewing conditions from very dark to indoor daylight viewing – again the system is not limited to this range, but that’s the range of monitoring and reproducible lighting I had for the tests.  HLG also has a degree of backwards compatibility with previous generation SDR monitors – it displays a usable image as the transfer function looks like an SDR camera function with a defined highlight compression.



  2.  PQ is an absolute, display-referred imaging system whose design goal is to encode the range 0.001-10000 cd/m2  (approx. 30 stops) which is well beyond the simultaneous dynamic range of the human visual system. The absolute nature of PQ was designed to replicate an encoded image as exactly as possible on a range of monitors.  It was not designed to be used in a relative manner.  If a monitor has less capability than the grading monitor, clipping would occur; therefore there are a number of metadata schemes available to aid with this: HDR10, Dolby Vision, HDR10+, SL-HDR.  There has been a recent addition, Dolby Vision IQ, to deal with changing viewing environments.
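
To make the "absolute" point concrete, here is a minimal Python sketch of the PQ EOTF and its inverse, using the constants published in SMPTE ST 2084 / ITU-R BT.2100. The function names and the `displayed_luminance` helper (a hard clip at the monitor peak, with no metadata-driven mapping) are illustrative assumptions, not part of any standard or product.

```python
# Sketch of the PQ (ST 2084 / BT.2100) EOTF: signal -> absolute cd/m2.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PQ_PEAK = 10000.0  # cd/m2 represented by a signal value of 1.0


def pq_eotf(signal: float) -> float:
    """Map a non-linear PQ signal in [0, 1] to display luminance in cd/m2."""
    p = signal ** (1 / M2)
    return PQ_PEAK * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def pq_inverse_eotf(luminance: float) -> float:
    """Map display luminance in cd/m2 (0..10000) to a PQ signal in [0, 1]."""
    y = (luminance / PQ_PEAK) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2


def displayed_luminance(signal: float, monitor_peak: float) -> float:
    """Hypothetical monitor with no tone mapping: it simply clips at its peak."""
    return min(pq_eotf(signal), monitor_peak)
```

A given signal value always means the same luminance, so on the hypothetical clipping monitor every signal above `pq_inverse_eotf(monitor_peak)` collapses to the same output. That is the clipping case the metadata schemes are meant to help with.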

If you would like to see some use cases, the W3C Color on the Web group have been discussing this for some time and there are a number of demonstrations, with pseudo-code available at: https://bbc.github.io/w3c-tone-mapping-demo/


These use transforms that go via linear display light with, in the case of the PQ to HLG transform, a pseudo-display with a luminance range of 0-1000 cd/m2.  Different brightness levels of pseudo-display can be used in the mathematics to match the expected display characteristics of, say, sRGB or ITU-R BT.1886, provided the OOTF is calculated correctly.  This has been shown in the Color on the Web group.  Transforms can also go via scene light; these are in common use for matching the look of a camera to those using a different standard, e.g. using BT.709 cameras in a BT.2100 HLG production.
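
As a rough illustration of a transform that goes via linear display light, the Python sketch below converts a PQ RGB triple to HLG through a 1000 cd/m2 pseudo-display. The constants and the OETF/OOTF forms are those of ITU-R BT.2100; the function names, and the choice to hard-clip anything above the pseudo-display peak rather than tone-map it, are simplifying assumptions. This is not the code used in the linked demonstrations.

```python
import math

# PQ constants (ST 2084 / BT.2100)
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

# HLG constants (BT.2100)
HLG_A = 0.17883277
HLG_B = 1 - 4 * HLG_A
HLG_C = 0.5 - HLG_A * math.log(4 * HLG_A)

PSEUDO_DISPLAY_PEAK = 1000.0  # cd/m2, the pseudo-display used for the transform
HLG_GAMMA = 1.2               # BT.2100 system gamma at a 1000 cd/m2 nominal peak


def pq_eotf(signal):
    """PQ signal in [0, 1] -> absolute display light in cd/m2."""
    p = signal ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def hlg_oetf(e):
    """Normalised scene light -> HLG signal."""
    if e <= 1 / 12:
        return math.sqrt(3 * e)
    return HLG_A * math.log(12 * e - HLG_B) + HLG_C


def pq_to_hlg(rgb_pq):
    """Convert one PQ RGB triple to HLG via linear display light."""
    # 1. PQ signal -> display light, normalised to the pseudo-display.
    #    Assumption: values above the pseudo-display peak are simply clipped.
    r, g, b = (min(pq_eotf(v), PSEUDO_DISPLAY_PEAK) / PSEUDO_DISPLAY_PEAK
               for v in rgb_pq)
    # 2. Inverse HLG OOTF: display light -> scene light (BT.2100 luminance weights).
    yd = 0.2627 * r + 0.6780 * g + 0.0593 * b
    if yd == 0.0:
        return (0.0, 0.0, 0.0)
    scale = yd ** ((1 - HLG_GAMMA) / HLG_GAMMA)
    # 3. HLG OETF: scene light -> HLG signal.
    return tuple(hlg_oetf(scale * c) for c in (r, g, b))
```

With these assumptions a PQ white at 1000 cd/m2 lands on an HLG signal of 1.0, and everything brighter is clipped to the same value. A different pseudo-display peak would need a matching system gamma for the OOTF step.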

Best Regards

Simon


Simon Thompson
Senior R&D Engineer

BBC Research & Development


From: Seeger, Chris (NBCUniversal) <Chris.Seeger@nbcuni.com>
Date: Sunday, 4 February 2024 at 01:22
To: jbowler@acm.org <jbowler@acm.org>, Simon Thompson - NM <Simon.Thompson2@bbc.co.uk>
Cc: jbowler@acm.org <jbowler@acm.org>, public-png@w3.org <public-png@w3.org>
Subject: Re: [EXTERNAL] Re: cICP wording feedback

I hope I’m translating your sentence correctly:

“Because HLG is relative a subsequent frame can, I assume, change the base (resulting in the display upping the luminance across the whole screen) and accommodate the previously clipped values.”

HLG was not designed to shift the luminance of the content based on a scene or frame change to prevent clipping. HLG was designed to process an HLG signal (OETF; the signal from the camera) based on a specific display’s peak luminance capability, which determines a variable gamma to use through HLG’s EOTF (for translation of the signal to a display). HLG’s variable gamma shifts shadows and midtones in an attempt to preserve the perceptual look of the images on displays of different luminance levels.  It captures a smaller dynamic range compared to PQ and then it scales up/down in luminance. HLG’s normalized range is between 0 and 1,000 nits (up to 1,810 nits if overshoot is used).
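
The variable gamma can be sketched in a few lines of Python. The formula is the system gamma given in ITU-R BT.2100, which, as far as I recall, is intended for nominal peak luminances of roughly 400 to 2000 cd/m2 (BT.2390 describes an extended model for a wider range). The function names and the achromatic, zero-black-level simplification are assumptions made for illustration.

```python
import math


def hlg_system_gamma(peak_luminance: float) -> float:
    """BT.2100 HLG system gamma for a display's nominal peak luminance (cd/m2)."""
    return 1.2 + 0.42 * math.log10(peak_luminance / 1000.0)


def hlg_display_luminance(scene_y: float, peak_luminance: float) -> float:
    """Display luminance (cd/m2) for a normalised achromatic scene value.

    Assumes a black level of zero, so the OOTF reduces to Lw * Ys ** gamma.
    """
    return peak_luminance * scene_y ** hlg_system_gamma(peak_luminance)


# Same scene value on three displays: peak white tracks the display,
# while shadows and midtones move with the display-dependent gamma.
for peak in (400.0, 1000.0, 2000.0):
    print(peak, round(hlg_system_gamma(peak), 3),
          round(hlg_display_luminance(0.2, peak), 2))
```

Nothing here depends on frame content: the gamma is fixed by the display's peak luminance, not adjusted scene by scene.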

PQ is designed to capture a larger dynamic range from the scene, between 0 and 10,000 nits.  It displays these absolute values on reference displays or on TVs using a reference setting.  PQ can also be adjusted up and down in luminance (in a relative fashion) using systems like Dolby Vision IQ or even different picture modes on a TV, depending on room ambient lighting. PQ is designed to use dynamic and static metadata which can assist in remapping content to a specific TV’s capabilities.

Best,
Chris

From: John Bowler <john.cunningham.bowler@gmail.com>
Date: Saturday, February 3, 2024 at 7:11 PM
To: Simon Thompson - NM <Simon.Thompson2@bbc.co.uk>
Cc: jbowler@acm.org <jbowler@acm.org>, public-png@w3.org <public-png@w3.org>
Subject: [EXTERNAL] Re: cICP wording feedback
On Fri, Feb 2, 2024 at 1:40 AM Simon Thompson - NM
<Simon.Thompson2@bbc.co.uk> wrote:
> I think the range 0-1 is a target in live video production, not necessarily always possible, if you think of a sports game under natural lighting or an outdoor interview, then as the lighting changes, the camera operator will slowly adjust the iris to prevent any large shifts being visible.  Any still images exported from a live video will probably include values outside the 0-1 range.  The various regional production specifications prevent clipping to the range 0-1 as this would cause ringing when filtering and increase bitrate requirements in DCT based encoders.

My understanding is that you are a BBC employee and HLG is a BBC
invention.  I based my comments on the actual wording of the cICP text
and, indeed, the use of the magic numbers (16,235) by Kodak but now
that I have read the H.273 equations my comments were entirely wrong.
A "narrow band" image is called such because it really is only
transmitting a part of the encoding space; so, using 8-bit numbers, it
transmits encoded values in the range 16..235 by transmitting values
in the range 0..1.  Hence the equations 20 through 22 in the spec,
specifically the part:

> 219 * E′r + 16

That's from equation (20); E'r is a linear value that has been
obtained from the inverse of the transfer characteristics (EOTF; the
encoding function).  It typically (when the transfer characteristics
are not 11 or 12) is limited to a value in the range 0..1

So if a PNG contains a value '0' in RGB the equation evaluates to 16
and if it contains the maximum value the equation evaluates to 235
(the magic numbers).  This is then scaled back to the range 0..1 (or
at least that is the intent of equations 20..22; some equations are
more clearly wrong, such as that for Clip3.)

So the guys in the Beeb who did this knew what they were doing; the
transmitted signal can contain only a sub-range, specifically
0.063..0.922 (3dp) of the encoding range 0..1 and the transmission is
"narrow range".
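
For reference, the arithmetic above can be checked with a small Python sketch of the narrow-range quantisation in the style of H.273 equation (20). The function name and the generalisation to other bit depths via a `bit_depth` parameter follow my reading of the spec and should be treated as assumptions.

```python
import math


def narrow_range_code(e_prime: float, bit_depth: int = 8) -> int:
    """Quantise a normalised component value to a narrow-range integer code.

    Computes Round((1 << (bit_depth - 8)) * (219 * E' + 16)), then clips to
    the representable range 0 .. 2**bit_depth - 1.
    """
    scale = 1 << (bit_depth - 8)
    value = scale * (219 * e_prime + 16)
    rounded = math.floor(value + 0.5)  # round half up; negatives clip to 0 anyway
    return max(0, min((1 << bit_depth) - 1, rounded))


print(narrow_range_code(0.0), narrow_range_code(1.0))          # 16 235
print(narrow_range_code(0.0, 10), narrow_range_code(1.0, 10))  # 64 940
print(round(16 / 255, 3), round(235 / 255, 3))                 # 0.063 0.922
```

Note that the clip is to the full integer range rather than to 16..235, so an input slightly above 1.0 still maps to a distinct code above 235.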

This does depend on my correct interpretation of the direction of the
equations, which is why I would like you to go back and ask your
colleagues.  I've been troubled by the description "full range"
because, by the definition in the PNG specification, it was backwards.
The explanation above makes sense to me; only a subrange of the
encodable range is transmitted (so it is narrow) and out of range
values (still within the representable range of the encoding) are
clipped.  HLG is a relative encoding in contrast with PQ when "1.0"
means 10000cd/m^2, making it "absolute".  Because HLG is relative a
subsequent frame can, I assume, change the base (resulting in the
display upping the luminance across the whole screen) and accommodate
the previously clipped values.

Received on Tuesday, 6 February 2024 12:22:31 UTC