
[ttml2] Feedback on HDR compositing

From: Nigel Megitt via GitHub <sysbot+gh@w3.org>
Date: Fri, 07 Jul 2017 11:34:52 +0000
To: public-tt@w3.org
Message-ID: <issues.opened-241237762-1499427291-sysbot+gh@w3.org>
nigelmegitt has just created a new issue for https://github.com/w3c/ttml2:

== Feedback on HDR compositing ==
Filing on behalf of Lars Haglund of SVT, who asked me to send this on his behalf (I have slightly edited it to make it more applicable as an issue on this repo but kept the substantive comments untouched):

SVT has not been involved in any W3C work regarding subtitles, but has looked at the HDR compositing section in TTML2 (i.e. W3C’s initial work on timed text markup, which will be used for UHDTV subtitles).
 
As you can see, W3C’s plan is to use the sRGB colour space and transfer function to transmit subtitles and then to transform them in the device for display on the relevant screen. As such, the transform needs to be simple enough to use in a consumer device’s graphics plane processor.
 
There are two example transforms, for PQ and HLG, given in the W3C document, but to me/SVT there are problems with the PQ transform.
 
1) DVB, in TS 101 154, has chosen 10-bit Narrow Range (64-940), not Full Range (0-1023), for PQ-based transmissions, hence it is likely a problem that W3C bases its example transform on Full Range (0-1023).

I.e. if a CE manufacturer is following that recommendation, then a pre-scaling (a stretch) of the received video from 64-940 to 0-1023 would be needed, which in any bit-depth-limited processing would introduce quantisation errors.

Perhaps instead: keep the narrow-range video as it is received; instead, squeeze-scale the subtitles' full-range values (0-255 in 8-bit) into narrow-range values (16-235 in 8-bit, hence 64-940 in 10-bit) before compositing. Note: if compositing in ‘linear light’, then processing in very high bit depth (ideally ‘float’) is needed to not lose nuance steps.
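To make the two scalings concrete, here is a minimal sketch (my own illustration, not taken from the W3C document), assuming 10-bit narrow-range video (64-940) and 8-bit full-range subtitle values (0-255):

```python
def stretch_narrow_to_full_10bit(v):
    """The problematic pre-scaling: stretch received 10-bit narrow-range
    video (64-940) to full range (0-1023). The integer rounding here is
    where quantisation errors creep in for bit-depth-limited processing."""
    return round((v - 64) * 1023 / (940 - 64))

def squeeze_full8_to_narrow10(s):
    """The proposed alternative: squeeze an 8-bit full-range subtitle
    value (0-255) into 10-bit narrow range (64-940), leaving the video
    untouched as received."""
    return round(64 + s * (940 - 64) / 255)

# Black and white subtitle pixels land exactly on the narrow-range limits:
print(squeeze_full8_to_narrow10(0))    # 64
print(squeeze_full8_to_narrow10(255))  # 940
```

The function names and the rounding choice are mine for illustration; the point is only that the squeeze of the (already coarse) subtitle values avoids re-quantising the received video.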
 
2) As the subtitle range, in the W3C recommended formulas, seems to be mapped to potentially cover the complete PQ range – i.e. subtitles can be as crazy bright as 10 000 cd/m2 (why?) – “tts:luminanceGain” is actually not about “gain”, but always about “attenuation”. I.e. why isn't this dynamic metadata called “tts:luminanceAttenuation”?
 
2.1) Subtitles/graphics composited with ‘Constrained PQ’ video (constrained by upcoming ITU-recommended production practices) will most often stay at the very same luminance levels as SDR subtitles/graphics, i.e. inside the 0-100 cd/m2 portion of the extreme 0-10 000 cd/m2 range. The W3C recommendation regarding “tts:luminanceGain” may therefore give too coarse level steps for each metadata value (creatively/authoring-wise), since always attenuating from the extreme 10 000 cd/m2 loses a lot of the available metadata values in that initial step.

Perhaps instead: if the metadata values are too few, map the subtitles' narrow range into a moderate, realistic portion of PQ's complete range as a default, bundled with a default attenuation?
(Example, without knowing how “tts:luminanceGain” is defined or works, here exemplified as a percentage: a default mapping into a maximum 0-3200 cd/m2 portion, bundled with a “tts:luminanceGain” of 3.125%, gives a default compositing into the legacy 0-100 cd/m2 portion, plus the possibility to reach a subtitle luminance 5 stops brighter when needed for readability because of a bright video background area, by releasing the attenuation up to 100% – corresponding, in this example, to 3200 cd/m2.)
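The arithmetic behind that example can be checked as follows; note that the percentage interpretation of “tts:luminanceGain” is purely illustrative here, not the attribute's defined behaviour:

```python
import math

peak_mapping = 3200.0   # cd/m2: assumed default ceiling of the mapped portion
default_gain = 0.03125  # 3.125%: assumed default attenuation

# Default compositing lands in the legacy SDR portion:
default_peak = peak_mapping * default_gain
print(default_peak)     # 100.0 cd/m2

# Releasing the attenuation up to 100% makes this much headroom available:
headroom_stops = math.log2(peak_mapping / default_peak)
print(headroom_stops)   # 5.0 stops brighter
```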
 

Please view or discuss this issue at https://github.com/w3c/ttml2/issues/400 using your GitHub account
Received on Friday, 7 July 2017 11:34:58 UTC
