Re: Mapping of SDR into HDR

Dear All,


I don’t agree that “SDR should be mapped into HDR as when played in a 
reference environment, the composited result should then be adapted to 
the viewing environment”. In particular I disagree that, in compositing, 
there needs to be a reference environment.


If you mapped to HDR in a reference environment you would have to invert 
that rendering, and then re-render for the desired display and viewing 
environment. Instead I think that all signals (SDR and HDR) should be 
converted to a neutral un-rendered form. Then the display can render, as 
appropriate, to that display and viewing environment.


An appropriate un-rendered form is a linear scene referred signal. A 
scene referred signal is what is, almost universally, captured by 
cameras (both still and video). It is not the signal that is displayed 
on a reference monitor in reference conditions. All signals (SDR and HDR) 
may be converted to a linear scene referred form. Scene referred linear 
light is what is generated in CGI, where it is commonly stored as half 
floats with nominal diffuse white at a value of 1.0. A format for linear 
scene referred video is specified in BT.2100.
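
As a minimal sketch of the conversion described above: assuming the SDR 
signal was produced by the BT.709 reference camera characteristic (OETF) 
with no artistic adjustment, a non-linear value can be returned to linear 
scene light by inverting that curve. The function names are illustrative 
only and do not come from any library.

```python
def bt709_oetf(scene_linear: float) -> float:
    """BT.709 reference OETF: linear scene light (0..1) to non-linear signal (0..1)."""
    if scene_linear < 0.018:
        return 4.5 * scene_linear
    return 1.099 * scene_linear ** 0.45 - 0.099


def bt709_inverse_oetf(signal: float) -> float:
    """Inverse of the BT.709 OETF: non-linear signal (0..1) to linear scene light.

    Assumption: the signal really was encoded with the reference OETF above;
    real cameras often add knees and other adjustments that this ignores.
    """
    if signal < 4.5 * 0.018:  # linear segment near black
        return signal / 4.5
    return ((signal + 0.099) / 1.099) ** (1 / 0.45)


if __name__ == "__main__":
    for value in (0.0, 0.01, 0.18, 0.5, 1.0):
        encoded = bt709_oetf(value)
        print(f"scene {value:.3f} -> signal {encoded:.4f} -> scene {bt709_inverse_oetf(encoded):.4f}")
```

The result is dimensionless scene light, which a display could then 
render for its own brightness and surround.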


I beg to differ with Craig Todd’s interpretation of SDR and of the ITU 
documents to which he refers. There is, to my mind, a fundamental 
difference, including in philosophy, between PQ and 
HLG/SDR. Claims that display referred PQ is the same as scene referred 
SDR seem to have engendered a great deal of confusion over the past few 
years.


Fundamentally PQ is display referred, whereas SDR and HLG are scene 
referred. This means that the PQ signal is defined in terms of the 
signal that is displayed on a reference monitor in reference viewing 
conditions, whereas a scene referred signal, such as SDR or HLG, is that 
which is captured by the camera. Since it is necessary to render a 
signal for the actual display and viewing environment, and 
this rendering is non-linear, scene and display referred signals are not 
linearly related to each other. They are, fundamentally, different.


A good quote, which I think clarifies this issue, comes from “Cinematic 
Color” (full reference and URL below). It says: “Broadly speaking, film 
negatives encode an HDR scene-referred image, and the print embodies a 
display-referred tone mapping.” Note that the negative, corresponding to 
HLG/SDR, is scene referred and that the print, corresponding to the PQ 
signal, is display referred. The same paper provides a good discussion 
of scene versus display referred signals and also says: “This mismatch 
between the dynamic range of the real world and the dynamic range of 
display-technology makes working in display-referred color spaces (even 
linear ones) ill suited for physically-based rendering, shading, and 
*/compositing/*.”


From my perspective PQ appears to have been designed with the cinema 
workflow in mind. That is, the final image is rendered, or “graded”, 
manually (by a colourist), in a reference cinema environment. The idea 
is that the movie will be seen in the same reference environment and, if 
this is true, the final grading is WYSIWYG. Unfortunately this isn’t 
always the case because movies are not always seen in a dark environment 
at the same brightness. For TV distribution (where the viewing 
environment is much brighter) the studios go to enormous trouble 
re-grading. Furthermore for 3D theatrical releases, where the display 
luminance is significantly less than 2D, the signal has to be separately 
graded. This leads, in part, to the enormous number of versions of each 
movie held by studios since, because each is display referred, there has 
to be a different version for each display and viewing environment.


SDR and HLG, by contrast, are scene referred, which means that the 
signal does not imply any specific display or viewing environment. This 
is obviously important for the web, where there is no control of the 
viewing device or environment. Scene referred signals may be rendered 
for a wide range of display brightnesses and environments. This 
rendering is specified (for a reference viewing environment) in BT.2100. 
Extended range rendering, taking into account the effect of surrounding 
luminance, is described in "Display of high dynamic range images under 
varying viewing conditions" (full citation below).
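
A rough sketch of the BT.2100 HLG rendering mentioned above, for an 
achromatic signal and a display black level of zero. The constants and 
the system gamma formula are those of BT.2100; the function names are 
illustrative, and the adjustment for surround luminance is deliberately 
left out.

```python
import math

# HLG OETF constants from BT.2100
HLG_A = 0.17883277
HLG_B = 1 - 4 * HLG_A
HLG_C = 0.5 - HLG_A * math.log(4 * HLG_A)


def hlg_inverse_oetf(signal: float) -> float:
    """HLG signal (0..1) to normalised linear scene light (0..1)."""
    if signal <= 0.5:
        return signal ** 2 / 3
    return (math.exp((signal - HLG_C) / HLG_A) + HLG_B) / 12


def hlg_system_gamma(peak_nits: float) -> float:
    """System gamma for a display of the given nominal peak luminance.

    Simplification: reference surround only, and the formula is applied
    without checking that peak_nits is in the range it is intended for.
    """
    return 1.2 + 0.42 * math.log10(peak_nits / 1000)


def hlg_display_luminance(signal: float, peak_nits: float) -> float:
    """Displayed luminance in cd/m^2 for a grey (achromatic) HLG signal."""
    scene = hlg_inverse_oetf(signal)
    return peak_nits * scene ** hlg_system_gamma(peak_nits)


if __name__ == "__main__":
    for peak in (500, 1000, 2000):
        gamma = hlg_system_gamma(peak)
        white = hlg_display_luminance(0.75, peak)
        print(f"peak {peak} cd/m^2: gamma {gamma:.3f}, 75% signal -> {white:.0f} cd/m^2")
```

The same signal is rendered differently for each display, which is the 
point of a scene referred format.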


Craig Todd suggests that SDR, like PQ, is display referred. This is not 
the case. He mentions BT.1886, which specifies the EOTF (the display 
non-linearity, or gamma) for SDR. He rightly says that this was only 
belatedly specified, actually only in 2011. Whilst a specification for 
the assumed CRT characteristic was welcome, it was only specified 18 
years after BT.709 (and a full 29(!) years after the standard definition 
specification, BT.601, which defines the same camera non-linearity as 
BT.709). This does rather suggest that it is less important than the 
camera (gamma correction) characteristic specified decades earlier. And 
this is because BT.601 and BT.709 specify the signal, as captured by the 
camera (not the way it should be rendered on a reference monitor). It is 
worth noting that BT.1886 does NOT specify a display brightness. Rather 
the brightness is scaled to the actual display brightness. PQ, on the 
other hand, explicitly specifies the brightness of a pixel on the 
(reference) display. So SDR signals are dimensionless, whereas PQ 
signals have dimensions of cd/m^2. All this makes it clear that PQ and 
SDR/HLG are fundamentally different.
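
To make the dimensionless versus cd/m^2 distinction concrete, here is a 
small sketch of the two EOTFs. The BT.1886 function needs the display's 
white (and black) luminance as an input, whereas the PQ function returns 
an absolute luminance from the signal alone. The constants are those 
published in BT.1886 and BT.2100; the function names are illustrative.

```python
def bt1886_eotf(signal: float, white_nits: float, black_nits: float = 0.0) -> float:
    """BT.1886 EOTF: the output scales with the display's own white level."""
    gamma = 2.4
    lw = white_nits ** (1 / gamma)
    lb = black_nits ** (1 / gamma)
    a = (lw - lb) ** gamma
    b = lb / (lw - lb)
    return a * max(signal + b, 0.0) ** gamma


# PQ constants from BT.2100
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(signal: float) -> float:
    """PQ EOTF: signal (0..1) to absolute luminance in cd/m^2."""
    p = signal ** (1 / M2)
    return 10000 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def pq_inverse_eotf(nits: float) -> float:
    """PQ inverse EOTF: absolute luminance in cd/m^2 to signal (0..1)."""
    y = (nits / 10000) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2


if __name__ == "__main__":
    # The same SDR signal gives a different luminance on each display.
    for white in (100, 300):
        print(f"SDR signal 0.5 on a {white} cd/m^2 display: {bt1886_eotf(0.5, white):.1f} cd/m^2")
    # A PQ signal names one luminance, whatever the display.
    print(f"100 cd/m^2 as a PQ signal: {pq_inverse_eotf(100):.3f}")
```

The last line comes out at roughly 0.51, in line with the "51% PQ" 
figure in the BT.2390 excerpt quoted below.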


Craig further suggests that BT.2035 is somehow part of the SDR signal 
specification. This is clearly not the case, since BT.2035 was only 
approved 31 years after BT.601.


So what does happen in practice with SDR? Viewers do indeed watch on 
brighter displays, in brighter environments, than the reference 
environment. However the signal is not simply stretched. It is 
re-rendered to suit the display and (assumed) viewing environment. TV 
manufacturers often include their own “secret sauce” to make the 
programs look “good”. But, typically, TV display gamma is 2.2, not the 
2.4 specified in BT.1886, presumably because this makes the pictures look 
psychovisually correct on brighter displays in brighter environments.


It is correct that PQ is likely to be displayed at a higher luminance 
than is specified for a reference display. This is a bad thing. It 
requires that PQ is re-rendered, not merely stretched in brightness to 
match the display. Simply stretching the luminance results in poor 
quality pictures because a psychovisual adjustment is needed. Simply 
stretching the luminance makes the mid tones (e.g. flesh tones) look too 
bright and “misty” or “washed out”. Unfortunately PQ, unlike HLG, does 
not specify how to re-render PQ signals on a brighter display to make 
them look correct. For content producers, such as studios, this is a 
problem because they do not know how any specific TV or monitor will 
implement the display rendering. Their careful grading may be spoilt by 
unspecified display rendering.
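
The difference between stretching and re-rendering can be illustrated 
numerically. This is only a sketch: since no re-rendering is standardised 
for PQ, the second function borrows the HLG system gamma rule from 
BT.2100 purely as a stand-in for "some psychovisual adjustment". Both 
function names and the choice of adjustment are assumptions made for 
illustration.

```python
import math


def stretch(nits: float, ref_peak: float, new_peak: float) -> float:
    """Naive approach: scale every luminance by the ratio of peak luminances."""
    return nits * new_peak / ref_peak


def rerender_with_gamma(nits: float, ref_peak: float, new_peak: float) -> float:
    """Hypothetical re-rendering: scale, then apply a gamma adjustment.

    Assumption: the adjustment follows the HLG rule
    gamma = 1.2 + 0.42 * log10(peak / 1000); PQ itself specifies nothing here.
    """
    gamma_ref = 1.2 + 0.42 * math.log10(ref_peak / 1000)
    gamma_new = 1.2 + 0.42 * math.log10(new_peak / 1000)
    return new_peak * (nits / ref_peak) ** (gamma_new / gamma_ref)


if __name__ == "__main__":
    mid_tone = 203.0  # cd/m^2 on a 1000 cd/m^2 reference display
    print(f"stretched:   {stretch(mid_tone, 1000, 2000):.0f} cd/m^2")
    print(f"re-rendered: {rerender_with_gamma(mid_tone, 1000, 2000):.0f} cd/m^2")
```

Under these assumptions the stretched mid tone is noticeably brighter 
than the re-rendered one, while peak white lands in the same place in 
both cases.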


A further problem with re-rendering PQ is that it makes the effects of 
quantisation (“banding”) worse. PQ stands for perceptual quantisation 
and is based on setting quantisation levels so that quantisation can 
barely be seen on a reference display. This allows it to maximise 
dynamic range on that reference display. But, if you re-render PQ, you 
upset this carefully optimised quantisation strategy. Quantisation 
levels that were invisible on a reference display are stretched apart so 
that banding becomes visible on the brighter picture.


So there are two serious problems in re-rendering PQ for a brighter 
display. Firstly no algorithm is standardised to do this. Hence 
producers cannot know how their carefully produced and graded content 
will appear to the final viewer. Secondly in stretching the luminance 
the effects of quantisation become more visible, thereby degrading the 
dynamic range. In contrast, rendering HLG on brighter displays is 
specified in BT.2100 and the effects of banding on HLG are (slightly) 
reduced as displays get brighter.


These differences between PQ and HLG stem from the differences in their 
philosophy.


PQ appears to have been designed to try to produce the ultimate quality 
in a controlled, dark environment (such as a home cinema); whether it 
actually achieves this is a moot point. By assuming that PQ signals will 
be displayed in a controlled environment it is free to use increases in 
display brightness to produce brighter and brighter highlights. Since 
diffuse white is (now) set at 203 cd/m^2 it is, theoretically, possible 
for a PQ signal to have highlights up to about 50 times brighter than 
the diffuse part of the picture. Whether this is actually worth the 
bother is another moot point.
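
The "about 50 times" figure follows directly from the 10,000 cd/m^2 
upper limit of the PQ signal range and the 203 cd/m^2 diffuse white 
level. A quick check, with illustrative variable names:

```python
import math

PQ_PEAK_NITS = 10000.0      # upper limit of the PQ signal range
DIFFUSE_WHITE_NITS = 203.0  # diffuse white level quoted above

# Ratio of the brightest possible highlight to diffuse white
headroom_ratio = PQ_PEAK_NITS / DIFFUSE_WHITE_NITS
# The same headroom expressed in photographic stops (doublings)
headroom_stops = math.log2(headroom_ratio)

if __name__ == "__main__":
    print(f"highlight headroom: {headroom_ratio:.1f}x, or {headroom_stops:.1f} stops")
```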


HLG was designed to be a practical TV system. With TV, and on the web, 
we do not know what the display characteristics will be and what the 
ambient conditions will be. With TV and the web you certainly cannot 
assume you have a controlled, dark, environment. In a bright environment 
you cannot see very dark parts of the picture because they are swamped 
by ambient light. So, with HLG, we assume that brighter displays will be 
used to show pictures in brighter environments (not used for ever 
increasing highlights). The dynamic range allocated to HLG highlights 
remains constant and the whole picture is made brighter, on brighter 
screens, so that it can be seen in brighter environments (such as 
offices, on the move, in bedrooms or living rooms). We think this makes 
good sense for the TV and web use cases. The rendering for brighter 
displays is specified (backed up by published experimental results), and 
quantisation effects are not adversely affected by rendering for a 
brighter display.


PQ can produce stunning results in a controlled dark environment (and so 
can HLG). But HLG seems simpler and more appropriate for applications 
such as TV and the web when the viewing conditions are not known.


Best regards,
Tim


Selan, J. 2012. “Cinematic color: from your monitor to the big screen.” In 
ACM SIGGRAPH 2012 Courses (SIGGRAPH '12). ACM, New York, NY, USA, 
Article 9, 54 pages. DOI=http://dx.doi.org/10.1145/2343483.2343492. 
Also available as a Visual Effects Society Technology Committee White 
Paper from http://cinematiccolor.com/

Tim Borer, "Display of high dynamic range images under varying viewing 
conditions", Proc. SPIE 10396, Applications of Digital Image Processing 
XL, 103960H (19 September 2017); doi: 10.1117/12.2274253; 
http://dx.doi.org/10.1117/12.2274253


Thompson, S. “Evaluation of Required Adjustments for HDR Displays under 
Domestic Ambient Conditions”. 7th IEEE International Conference on 
Consumer Electronics in Berlin (“ICCE-Berlin”). 3rd-6th September 
(2017).

Dr Tim Borer MA, MSc, PhD, CEng, MIET, SMIEEE, Fellow SMPTE
Lead Engineer
Immersive & Interactive Content
BBC Research & Development
BBC Centre House, 56 Wood Lane, London  W12 7SB

On 21/11/2017 00:00, Todd, Craig wrote:
>
> This group was brought to my attention today. Nice discussion going on.
>
> I’ve a few points and comments re the discussion of mapping/converting 
> SDR into PQ HDR.
>
> Fredrik Hubinette wrote:
>
> “SDR should be mapped into HDR as when played in a reference environment,
>
> the composited result should then be adapted to the viewing environment.”
>
> I agree.
>
> First, backing up a bit, there is the issue as to how PQ should be 
> displayed, and the comments (that I find nonsensical) that there is 
> some fundamental difference between how PQ (and its “absolute” mapping 
> of code values to pixel brightness) and SDR work in practice.
>
> Consider SDR: the EOTF was (belatedly) documented in BT.1886 
> <http://www.itu.int/rec/R-REC-BT.1886-0-201103-I/en>, with reference 
> brightness documented (also belatedly) in BT.2035 
> <http://www.itu.int/rec/R-REC-BT.2035-0-201307-I/en>. Two documents 
> that basically say use gamma 2.4 with 100% at 100 nits. The Scope of 
> BT.2035 states:
>
> “This Recommendation prescribes a method allowing HDTV producers or 
> broadcasters to establish a *reference viewing condition* for 
> evaluation of HDTV program material or completed programmes that can 
> provide repeatable results from one facility to another when viewing 
> the same material. This includes the display device and the 
> surrounding environment.” (my emphasis).
>
> BT.2100 <http://www.itu.int/rec/R-REC-BT.2100-1-201706-I/en> defines a 
> reference viewing environment, as well as reference PQ EOTF. The 
> intention is that they go together. There is no intention expressed 
> that the reference PQ EOTF should be used in a non-reference viewing 
> environment.
>
> So what happens in practice with SDR? People watch SDR video in 
> brighter environments than specified in BT.2035, and they use brighter 
> displays than specified in BT.2035.
>
> So what should happen in practice with PQ HDR? People will watch in 
> brighter environments than specified in BT.2100, and they will use 
> brighter pixels than specified in BT.2100. To my mind, there is no 
> fundamental difference in philosophy or practice between SDR and PQ 
> HDR. The only difference is that with SDR the curve and the scaling of 
> the curve were specified in separate documents. We have learned how to 
> write better specifications; with BT.2100 we tried to define a 
> complete HDR TV system in a single document.
>
> In the limited amount of PQ content to date, we have seen mid-tone 
> brightness similar to that of SDR. See: 
> https://www.dolby.com/us/en/technologies/dolby-vision/operational-guidelines-for-pq.pdf. 
> A version of this study is included in the new Report ITU-R BT.2408 
> <http://www.itu.int/pub/R-REP-BT.2408-2017> (see Annex 1).
>
> If full-scale 100 nit SDR white is mapped into PQ at 300 nits, what 
> will happen? (Hint: on a /PQ reference display/ it will look like a 
> /living room SDR display/.) Typically the 100 nit (on an SDR reference 
> display) SDR white will display at say 300 nits on a consumer display 
> intended for use in a bright room. If an HDR display is set in a mode 
> for best results in a bright room, it may likewise display a 100 nit 
> PQ pixel at 300 nits. For an SDRx3 => PQ mapped image, the SDR white 
> would then display at 900 nits on the living room PQ display, making 
> SDR whites come out 3x brighter than HDR content. More important than 
> white level would be mid-tone levels, especially the level of faces. 
> We certainly don’t want SDR faces displaying 3x brighter than HDR 
> faces. So far, we’ve seen faces in PQ at levels close to those for SDR 
> (using reference monitors for both). It is possible that over time, 
> with more experience and familiarity with HDR images, HDR levels may 
> creep up. It is also apparent that broadcast content often employs 
> higher levels than movie or dramatic content.
>
> There is an approved (but as yet unpublished) update to Report 
> BT.2390-2 <http://www.itu.int/pub/R-REP-BT.2390-2-2017> that has an 
> added section on SDR->HDR mapping. An excerpt is below (note that this 
> language was very carefully crafted).
>
> *“10.1.1  Display referred mapping of SDR into PQ*
>
> The following procedure may be followed to achieve consistent mid-tone 
> luminance levels when mapping standard dynamic range content into PQ.
>
> Standard dynamic range ITU-R BT.2020 content should be mapped to PQ by 
> applying the ITU-R BT.1886 display EOTF and then applying the PQ 
> EOTF^-1 . For unity mapping the peak signal of standard dynamic range 
> content should be set to 100 cd/m^2 or 51% PQ.
>
> Unity mapping does not change the display of the SDR content (it will 
> display on the PQ HDR reference monitor the same as it displayed on 
> the reference SDR monitor). Thus, no OOTF adjustment of the SDR 
> display light signal is necessary.
>
> If the SDR content is being inserted into HDR programming, and there 
> is desire to more closely match the brightness of the HDR content, and 
> that brightness is known, scaling can be done to bring up the 
> brightness of the mapped SDR content. Scaling should be performed with 
> care lest scaled SDR content, in particular skin tones, becomes 
> brighter than in the HDR content.”
>
> Hope the above helps with understanding.
>
> Best regards,
>
> Craig Todd
> Sr. VP and CTO
> Dolby Laboratories, Inc.
> 1275 Market St.
> San Francisco, CA 94103, USA
> T  415-558-0221  M 415 672-0221
> www.dolby.com <http://www.dolby.com/> |ct@dolby.com
>

Received on Monday, 27 November 2017 10:16:30 UTC