W3C home > Mailing lists > Public > public-colorweb@w3.org > March 2019

RE: High Dynamic Range imaging and ICC colour management

From: Craig Revie <Craig.Revie@ffei.co.uk>
Date: Fri, 29 Mar 2019 15:06:52 +0000
To: Craig Revie <Craig.Revie@ffei.co.uk>, Joe Drago <jdrago@netflix.com>
CC: Mark Watson <watsonm@netflix.com>, "public-colorweb@w3.org" <public-colorweb@w3.org>
Message-ID: <AM0PR08MB30288C041BB32D4868EE8EDAD95A0@AM0PR08MB3028.eurprd08.prod.outlook.com>
As promised, here is a summary of our discussion (Joe and Ben, feel free to add anything I have missed).

Summary of discussion with Netflix
Craig Revie, Joe Drago, Ben Nason

Use of ICC Profiles: Netflix's approach is to use a Matrix TRC ICC Profile as image metadata that provides a mapping of pixels to PCS XYZ values. Pixel values and PCS values are in the range [0, 1]. The luminance corresponding to pixel encoding RGB(1, 1, 1) (mapped to PCS XYZ(1, 1, 1)) is given in the ‘lumi’ tag of the profile (peak content luminance). This allows the absolute luminance of content pixels to be determined. For the simplest case, where the peak display luminance is equal to the peak content luminance, standard ICC colour management (e.g. LittleCMS) can be used along with an ICC Profile for the display to map images directly to the display. In this way absolute luminance is preserved.
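
The mapping described here can be sketched as follows. The gamma, matrix row, and peak value below are illustrative assumptions (an sRGB-like encoding and a 1000-nit ‘lumi’ value), not the contents of Netflix's actual profiles:

```python
def pixel_to_absolute_luminance(rgb, gamma=2.2, lumi_nits=1000.0):
    """Map an encoded [0, 1] RGB pixel to absolute luminance in cd/m2.

    'gamma' stands in for the profile's TRC curves; 'lumi_nits' plays the
    role of the profile's 'lumi' tag (the luminance of RGB(1, 1, 1)).
    """
    # TRC: decode each channel to linear light
    r, g, b = (c ** gamma for c in rgb)
    # Second row of a BT.709/sRGB RGB->XYZ matrix: relative luminance Y
    y_rel = 0.2126 * r + 0.7152 * g + 0.0722 * b
    # The lumi tag pins RGB(1, 1, 1) to lumi_nits, so scale Y by it
    return y_rel * lumi_nits

peak = pixel_to_absolute_luminance((1.0, 1.0, 1.0))  # ~1000 nits, the lumi value
```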

Graphics white: Netflix's fallback assumption is that graphics white will be around 300 cd/m2. It is very likely that content diffuse white will vary substantially from this value for any given image sequence (e.g. snow scene to cave). On the MS Windows platform, the user can set the level for graphics white. Netflix does not currently support the Windows platform but would look to adjust its rendering based on this setting. This would also apply to display of HDR content using web browsers. [Note from Joe: On the PS4, we don't know anything about the panel; we only know the peak luminance of the final rendering surface (10000 nits). We scale all content we render into that 10000-nit space and happen to render graphics white at 300 nits into that surface.]
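
For the PS4 case in Joe's note, writing graphics white into a 10000-nit PQ surface comes down to the SMPTE ST 2084 inverse EOTF. A minimal sketch; the constants are from the PQ specification and the 300-nit target is the fallback value above:

```python
# SMPTE ST 2084 (PQ) constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_encode(nits):
    """Absolute luminance in cd/m2 -> PQ signal value in [0, 1]."""
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

graphics_white = pq_encode(300.0)  # ~0.62: 300 nits sits well up the PQ signal range
```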

Different display peak luminance: where the display peak luminance is higher than that of the content, absolute luminance mapping is applied. Where the display peak luminance is slightly lower than the peak content luminance, content is clipped. Where the difference is large, mapping is performed using a Reinhard mapping function [2]. This mapping function is a global mapping that modifies the entire luminance range and so does not attempt to preserve the luminance range below diffuse white (no ‘knee’). This is done as an intermediate step between the source and destination ICC Profiles.
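A minimal sketch of such a global operator (an extended Reinhard curve, normalized so content peak maps exactly to display peak; the parameterization is illustrative, not Netflix's exact implementation):

```python
def reinhard_global(nits, content_peak, display_peak):
    """Compress [0, content_peak] nits into [0, display_peak] nits.

    Global mapping: every luminance level is modified (no knee preserving
    the range below diffuse white), but dark pixels change very little.
    """
    l = nits / display_peak
    l_white = content_peak / display_peak
    out = l * (1 + l / (l_white * l_white)) / (1 + l)
    return out * display_peak

# Content peak 4000 nits onto a 1000-nit display: highlights compress,
# while a 10-nit shadow stays within about 1% of its original luminance.
```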

Independent source and destination: in some cases, for example for a desktop computer, the content provider has no control over the way in which content is rendered to the display. Communicating the intended mapping between the content provider and the (independent) display driver algorithm will require further discussion between various stakeholders. Content that assumes PQ may need to be mapped differently from HLG content, and SDR content will be different again. Metadata could be provided in the form of an ICC Profile or by another means; perhaps defining both options would be useful. A discussion of this topic between stakeholders would be very valuable, and some form of best practice document could be agreed. This may be a role that the ICC could usefully play.

Web browsers: it may be that ICC Profiles are too large for some web content and providing metadata in a more compact form may be desirable. For this and other applications, as a minimum, it would be helpful to extend the set of parametric curves to include PQ and HLG.

PQ and HLG: it would be helpful to have an exchange of ideas between experts who have experience with PQ and those who have experience of HLG workflows, not necessarily about ICC but more generally about tone and colour mapping expectations.

[2] Reinhard, E. and Devlin, K., "Dynamic Range Reduction Inspired by Photoreceptor Physiology", IEEE Transactions on Visualization and Computer Graphics, 2005. See http://erikreinhard.com/papers/tvcg2005.pdf

From: Craig Revie <Craig.Revie@ffei.co.uk>
Sent: 11 March 2019 07:42
To: Joe Drago <jdrago@netflix.com>
Cc: Mark Watson <watsonm@netflix.com>; public-colorweb@w3.org
Subject: RE: High Dynamic Range imaging and ICC colour management

Hi Joe,

I agree that this is really in the weeds; why don't we take this off-line and then make a summary for the group afterwards? I am happy to set up a call to discuss with you and anyone else on this list who has the patience.

Best regards,

From: Joe Drago <jdrago@netflix.com>
Sent: 08 March 2019 20:34
To: Craig Revie <Craig.Revie@ffei.co.uk>
Cc: Mark Watson <watsonm@netflix.com>; public-colorweb@w3.org
Subject: Re: High Dynamic Range imaging and ICC colour management

Ah, your third paragraph might help explain our disconnect.

Your LED example only makes sense if you think I mean maxCLL when I say "max luminance", which is not what I want, at all. We don't actually leverage maxCLL or maxFALL in our pipeline; we simply want to know how bright a pixel *would be* if all 3 RGB channels were at max. It doesn't require any pixels in that image to actually reach that lumi ceiling. To offer a (similarly contrived) example, you could offer a completely black image with a lumi tag set at 10000, and that is still a valid image. To come back to your LED example, if the lumi tag was set to the same value in both images, all of the raw channel values (except where the LED is enabled) would not change at all.

To start over on my usage of the lumi tag: the value we store is "how bright is (1.0, 1.0, 1.0)", not "how bright is the brightest pixel". For example, if I were to create a regular 1 pixel pure-red sRGB image and store a lumi value of 100 nits, the brightness of that image/pixel is only 21.26 nits, as that is the ceiling for pure red in BT.709's gamut. If *all* of those channels were full though, it'd be 100 nits in luminance.
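
The red-pixel arithmetic above is just the BT.709 luminance coefficients scaled by the lumi value; a quick sketch (an illustration, not colorist's code):

```python
BT709_LUMA = (0.2126, 0.7152, 0.0722)  # BT.709 luminance coefficients

def luminance_nits(rgb_linear, lumi_nits):
    """Luminance in nits of a linear-light RGB triple under a lumi tag."""
    return lumi_nits * sum(c * w for c, w in zip(rgb_linear, BT709_LUMA))

red = luminance_nits((1.0, 0.0, 0.0), 100.0)    # ~21.26 nits: pure red's ceiling
white = luminance_nits((1.0, 1.0, 1.0), 100.0)  # ~100 nits: the lumi value itself
```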

Your first paragraph suggests that your tonemap operator takes into account where diffuse white is and avoids adjusting the pixels under that threshold (and so avoids any discontinuities that adjusting them could make?). In our pipeline, if we find that the image we must render to the output/panel exceeds the max luminance of the output, we tonemap the entire image (including anything that would be under diffuse white) the same way, currently just using a basic Reinhard operator. Darker pixels are barely affected in this case anyway.

To revisit the general notion of maxCLL, there is a possible maxCLL variant that could be leveraged in our lumi tag, one that colorist uses if autograding (-a) is enabled:

When encoding an image, if you don't know which max luminance to use (the value for the lumi tag), you can walk the image looking for the *largest RGB channel* (NOT the brightest nits), and then choose the max luminance that would be achieved if you used that max channel *in all 3 channels*. For example, imagine you're ingesting a 10bpc BT.2020 PQ (10000 nits) source image for compression to a simpler curve (say gamma 2.2) and a new max luminance, and it is a fairly dark scene. You inspect all of the pixels, and the pixel with the largest single channel is rgb10(0,0,520), which is only a meager 5.9 nits, because blue's max luminance is really low. This isn't the brightest pixel in the scene by any means, to be clear. There might be another pixel in the scene that is rgb10(280,280,280), which is 7.1 nits in brightness, but since 520 is a bigger raw channel, we use 520 for the next step.

If you then evaluate the brightness of a theoretical color that has max channel in all channels rgb10(520,520,520) in the source space, you come up with a max luminance of ~100 nits, which is what you use in the destination image's lumi tag. This value isn't stating that there exists a pixel in the file that is 100 nits (thus being maxCLL). Instead, it explains what the brightness of (dstMaxChannel, dstMaxChannel, dstMaxChannel) is, even if none of the pixels hit that ceiling. If you then inspect that blue pixel (from before) in the destination image, it should be rgb10(0,0,dstMaxChannel), as you've left precisely enough room in the range to accommodate the biggest channel seen in the source image. Its brightness will still be 5.9 nits.
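
The two steps above can be sketched with the ST 2084 EOTF and the BT.2020 luminance coefficients. This is an illustration that checks the numbers in the example, not colorist's -a implementation:

```python
# SMPTE ST 2084 (PQ) constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_eotf(signal):
    """PQ signal in [0, 1] -> absolute luminance in nits."""
    p = signal ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)

def bt2020_pq_luminance(rgb10):
    """Luminance in nits of a 10-bit BT.2020 PQ pixel."""
    weights = (0.2627, 0.6780, 0.0593)  # BT.2020 luminance coefficients
    return sum(w * pq_eotf(c / 1023) for w, c in zip(weights, rgb10))

def lumi_from_max_channel(pixels):
    """Step 1: find the largest raw channel anywhere in the image.
    Step 2: return the luminance of a theoretical grey pixel with that
    value in all three channels -- the candidate lumi tag value."""
    biggest = max(max(p) for p in pixels)
    return bt2020_pq_luminance((biggest, biggest, biggest))

pixels = [(0, 0, 520), (280, 280, 280)]
# bt2020_pq_luminance((0, 0, 520))     -> ~5.9 nits (dim, but biggest channel)
# bt2020_pq_luminance((280, 280, 280)) -> ~7.1 nits (the brighter pixel)
# lumi_from_max_channel(pixels)        -> ~100 nits, the value for the lumi tag
```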

I'm not suggesting we must dial in precisely the max luminance like this for every image we encode. I think picking a reasonable common upper bound given a transfer function and bit depth is easier for people to get comfortable with. That said, I figured the thought exercise above might clarify exactly how I'm using the lumi tag now, and why we don't store or leverage a per-image diffuse white.

(Aside for the mailing list: I'm happy to continue this conversation here, but if it is getting too far into the weeds for others, we can discuss offline in a meeting or otherwise. I could talk about this all day.)

On Fri, Mar 8, 2019 at 7:24 AM Craig Revie <Craig.Revie@ffei.co.uk> wrote:
Hi Joe,

I guess this may be unfamiliarity on my part and I apologise if this is obvious.

If you create an image and tell me only the peak luminance, how should I display such an image? In the general ‘broadcast model’ I was presenting, there are two independent elements: an image or content creator and a display controller. The content creator produces content intended to be displayed on any conforming display and so knows nothing about the actual display’s peak white or the level that the display controller has chosen to display ‘diffuse white’.

It seems that at least for cases where the display has a lower peak luminance than the content, some mapping must be done to present the image on the display without significant loss of quality. Traditionally this has been achieved by applying tone compression to values greater than diffuse white and to do this effectively, it seems that the display controller needs to know both. Similar issues arise for displays with power-limiting circuits where the metadata associated with some (most?) PQ systems can be used by the display controller to present the image optimally on the display.
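
The traditional approach described here can be sketched as a knee function; the numbers are illustrative, and real broadcast knees use smoother curves:

```python
def knee_compress(nits, diffuse_white, content_peak, display_peak):
    """Leave luminance at or below diffuse white untouched; linearly
    compress the highlight range above it into what the display has left.
    The display controller needs to know both diffuse white and the peaks."""
    if content_peak <= display_peak or nits <= diffuse_white:
        return min(nits, display_peak)
    t = (nits - diffuse_white) / (content_peak - diffuse_white)
    return diffuse_white + t * (display_peak - diffuse_white)
```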

How about another (slightly contrived) case where a content creator wishes to make two images of a scene which has a small, very bright LED. On the first day, the LED is off and the second day it is on. Let’s assume that the LED does not change the average picture light level significantly. Let’s also assume that on the first day, the creator does not know about the LED. How should the content creator select the peak luminance for both cases to ensure that the display does the right thing?

What am I missing?

Best regards,

From: Joe Drago <jdrago@netflix.com>
Sent: 05 March 2019 15:59
To: Craig Revie <Craig.Revie@ffei.co.uk>
Cc: Mark Watson <watsonm@netflix.com>; public-colorweb@w3.org
Subject: Re: High Dynamic Range imaging and ICC colour management

It depends on your goals, and who your "masters" are.

Top-of-head, I see a handful of possibilities for these luminance values:

* None specified at all. In this case, we implicitly consider peak white == diffuse white == our global diffuse white. This is how we treat all SDR content currently.

* Only peak white specified. This is when your master is "creative intent", and if the author of the image wants the white in that image to be really drab or really fierce, "I said what I said", and you must honor it. In this case, diffuse white be damned, try your best to honor the exact luminance offered in the file.

* Both diffuse and peak white are implicit in the transfer function (such as HLG). This buys you the scene-referred environment I believe you're hinting at.

* A diffuse white and either a unorm scale factor or a "value of diffuse white" (in [0-1]) are provided (which is different from offering peak white). The latter would be a way to simulate overranging, as all of the file formats Netflix uses are internally unorm. Either way, diffuse white becomes the master here.

Perhaps to your point, the value of diffuse white should be consistent, and it is really only known at runtime. For example, MS Windows offers a slider which lets you pick how bright diffuse white is in their display settings. That said, if the image author wants authoritative control over how their pixels are rendered (diffuse white be damned) and wants to preserve perfect creative intent, the only value necessary is what their image's range of unorm values represent in luminance (peak white). On the other hand, if the author wants to guarantee that their diffuse whites look correct and are comfortable with *all other pixel values in their image* changing in luminance when someone manipulates diffuse white's value (such as an enduser tweaking that slider), that can be offered as well. This type of metadata would make sense (say) for sprite atlases full of UI affordances.
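
The two "masters" can be contrasted in a small sketch (numbers illustrative): with peak-white metadata a pixel's luminance is fixed, while with diffuse-white-relative metadata every pixel rescales when the user moves the slider:

```python
def render_peak_white(value, peak_white_nits):
    """Creative-intent master: the file's absolute luminance is honored,
    regardless of where the user sets diffuse white."""
    return value * peak_white_nits

def render_diffuse_relative(value, diffuse_value, slider_nits):
    """Diffuse-white master: the code value representing diffuse white
    always lands on the user's chosen level; everything scales with it."""
    return value / diffuse_value * slider_nits

# Moving the slider from 300 to 600 nits doubles every pixel in the
# diffuse-relative case but changes nothing in the peak-white case.
```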

Right now, Netflix's pipeline and engines support the first two bullets, and I could imagine us wanting at least one of the other two for specific UI reasons, but right now the UI is happy with bullet #1 for those types, and the creatives are happy with #2 because they can dial in exactly whatever pixels they want.

As a final aside, one other possible abuse for the lumi tag would be to instead leverage the X and Z values (which are required in the spec to be 0), and legislate one of the above bulleted scenarios (amongst others I haven't thought of) depending on which are nonzero. This might be a safer way to maintain backwards compatibility.

On Tue, Mar 5, 2019 at 7:18 AM Craig Revie <Craig.Revie@ffei.co.uk> wrote:
Thanks. Probably for interoperability both peak and diffuse white should be included?

From: Joe Drago <jdrago@netflix.com>
Sent: 05 March 2019 15:16
To: Craig Revie <Craig.Revie@ffei.co.uk>
Cc: Mark Watson <watsonm@netflix.com>; public-colorweb@w3.org
Subject: Re: High Dynamic Range imaging and ICC colour management

Peak white; Diffuse white is chosen globally at app startup based on a handful of factors, such as output mode (HDR10, DVLL, etc).

On Tue, Mar 5, 2019, 6:43 AM Craig Revie <Craig.Revie@ffei.co.uk> wrote:
Hi Joe,

Thanks for your reply and a good overview of the work being done at Netflix. It is not my intent to offer any solutions at this stage in the discussion. However, it seems that you are in a similar place to others working in this area, some of whom are members of this group, in that you have adopted ICC and (for perfectly good reason) given it a proprietary flavour. It seems to me that there may be scope for a suitable group of stakeholders in this area to work with the ICC to develop an open framework providing interoperability.

As well as Netflix, I am aware of work in this area by a number of other companies and it would seem that working together as a group may produce a much better solution for all concerned.

One question about your use of the luminance tag. Are you using this for peak white or diffuse white? Either way, it seems that at least for PQ image encoding both values are needed.

Best regards,

From: Joe Drago <jdrago@netflix.com>
Sent: 05 March 2019 00:38
To: Mark Watson <watsonm@netflix.com>
Cc: Craig Revie <Craig.Revie@ffei.co.uk>; public-colorweb@w3.org
Subject: Re: High Dynamic Range imaging and ICC colour management

Hello Craig (and all other colorweb folks) --

In the meantime, please let me know if this is something you would like to work on or if you are already aware of other solutions or proposed solutions to this problem.

I figured I'd chime in here as I was responsible for the implementation of HDR image packaging here at Netflix, and my approach leverages (if not abuses!) ICC profiles. Here are some jump-off points which I'll reference in my explanation:



Just as a clarification for these links: The first link is our Netflix Blog post where we discuss the strategy we employed, and the second link is the open source tool (colorist) I created to implement that strategy. Seeing as you're all experts in ICC and I'm squarely in the newcomer group, I'll spare any explanations on ICC's guts and skip to my approach to the situation and various learnings/conclusions I came to while tackling this problem. I'll recap my (ab)use of the ICC profile from the colorist site's Overview block in this email below.

My mandate here at Netflix was to come up with a way to pack an HDR image coming from a source video that was HDR10 (BT.2020 PQ, 10000 nits display-referred) into any file format I deemed capable, and then see where we could leverage preexisting standards as much as possible. Seeing as our source data and our output sink both operated in absolute luminance (display-referred with a range of [0-10000] nits), I based my solution around that.

I learned early on that ICC profiles could perfectly describe our color primaries, and after being disappointed that there wasn't a parametricCurveType for PQ, I realized that if I was willing to use more bits per channel (16 instead of 10), a curve as severe as PQ wasn't necessary for the transfer function, and I could instead stick with a friendlier curve that maintains enduser blending expectations (BT.1886 or even just a plain 2.2 gamma). This solved the insufficient precision in the curv LUTs cited in your slide deck, as (for example) a Type1 'para' with a value of 2.2 is quite friendly and packs nice and tiny for image file embedding. However, to fully realize a display-referred image with a standard ICC profile, I needed one more piece of information: the absolute luminance range.

I noticed that ICC profiles have a 'lumi' tag, but despite it being populated in some very common ICC profile chunks out there (including on color.org), it was merely informative, and modern image editing software / CMMs appeared to ignore the value completely. However, if I were to honor & leverage this value during image creation and conversion, I could maintain a proper display-referred image file, which I could then convert on the fly when rendering on game consoles, and thus maintain creative intent exactly.

In summary: I'm storing the "correct" value in the lumi tag based on the spec, but I'm actually *using* it to derive a luminance scale when converting and rendering, thus making the ICC profile display-referred.
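
A sketch of that lumi-derived luminance scale during conversion; the gammas and lumi values below are illustrative assumptions, not actual profile contents:

```python
def convert_channel(v, src_gamma, src_lumi, dst_gamma, dst_lumi):
    """Convert one [0, 1] channel value between two display-referred
    encodings: decode to absolute nits with the source curve and lumi,
    re-express against the destination lumi (clipping if it is dimmer),
    and re-encode with the destination curve."""
    nits = (v ** src_gamma) * src_lumi
    dst_linear = min(nits / dst_lumi, 1.0)
    return dst_linear ** (1 / dst_gamma)

# A 100-nit-peak image re-encoded into a 10000-nit space keeps its
# absolute luminance: full white becomes a dim code value, not full white.
```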

To be clear, I'm not trying to make the case that this is the one-true-way forward for HDR or anything; I recognize the conveniences that HLG provides, and that using absolute luminance signs decoders up for luminance scaling and tonemapping just to interpret the image in the first place. That said, I do think that absolute luminance in image files is the most accurate way to maintain creative intent, and for any BMFF-derived image file (HEIF, MIAF, AVIF) using an 'nclx' type colr box, there is already a means to specify a PQ transfer_characteristics value which implies absolute luminance anyway, so this is something we need to honor in the future regardless of my ICC lumi tag abuse.

As a wish list for ICC profiles in this very specific, display-referred image file world, here are some things I wish I had:

* I wish there existed a tag (such as the lumi tag) which indicated that the associated pixel data was display-referred, and provided a real luminance range that must be honored. While I'm using the lumi tag currently, there are 30 years of files with embedded ICC profiles that may or may not contain "interesting" lumi tag values, so ideally it'd be a new tag type, or perhaps co-opting a lesser-used tag (viewingConditionsTag?).

* A new official parametricCurveType type value for PQ, so CMMs could do proper precision PQ math without making ICC profile chunks really large.

My disclaimer for the second wish is that I only think PQ is necessary for HDR at 10 and 12 bpc. Once you have 16bpc, I think you can get away with murder and just use a simple 2.2 gamma; everything looks fantastic and trivially blends in the 'incorrect' way people are used to. Our final deliverables in production for our HDR images ended up being .JP2 files encoded with 16bpc BT.2020 2.2 gamma, display-referred at [0-10000] nits. I still think a PQ para type should exist.
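
The 16bpc claim is easy to sanity-check: even with a plain 2.2 gamma over [0-10000] nits, the worst-case luminance step between adjacent codes near black is tiny (a sketch, not a formal precision analysis):

```python
def code_to_nits(code, bits=16, gamma=2.2, peak_nits=10000.0):
    """Decode an integer code value to absolute luminance in nits."""
    return ((code / (2 ** bits - 1)) ** gamma) * peak_nits

# Step size just above black, where gamma curves are most starved:
step = code_to_nits(2) - code_to_nits(1)  # well under 1e-5 nits at 16bpc
```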

Anyway, this is where I landed on packaging HDR images. You can see my full implementation for conversion and reporting in the colorist repo linked above, including my (new and incomplete) method for detecting special ICC profile combinations that can be repackaged as compliant and ICC-free BT.2100 nclx colr boxes in AVIF files. I'm confident that I'm maintaining the fidelity of these stored images and rendering them correctly, and I'm also confident that I'm ever-so-slightly abusing ICC profiles to achieve it.

On Mon, Mar 4, 2019 at 2:04 PM Mark Watson <watsonm@netflix.com> wrote:
Hi Craig,

This is very interesting and something we have been looking at in the context of applications running on Games Consoles and Set Top Boxes where the output is locked to HDR (when the TV supports it), to avoid the problems associated with HDR / SDR switching on the HDMI link.

I'd like to introduce my colleague Joe Drago (who joined this list after your post, which is the main reason for this message). He has some comments / questions.


On Mon, Mar 4, 2019 at 6:07 AM Craig Revie <Craig.Revie@ffei.co.uk> wrote:
I have been asked to post a presentation that I made in last month’s ICC meeting to this group. The links below are to my presentation and a recording of the meeting including discussion.


This presentation was the result of discussion with several experts from different areas, some of whom are members of this group. The problem that we are trying to address is that of effective presentation of High Dynamic Range (HDR) and Standard Dynamic Range (SDR) images or video sequences on the same display. This is particularly relevant where the content creator is not able to control the way in which the content is rendered to the display or to broadcast content appropriate to a specific display.

An effective solution to this problem requires an architecture similar to that of the ICC where an input transform maps colour content to a well-defined ‘connection space’ and an output transform maps from this connection space to a device (usually a display or printer). Although the presentation is from an ICC perspective, other models could be used.

This is a very basic outline of one possible solution to this problem and at this stage I am interested to identify others (from this group and elsewhere) who would be interested in working further on this topic. The response from the ICC was very positive and I believe that the ICC would be willing to host a forum for this discussion, especially to explore whether the current ICC model (with minor extensions) would be useful.

I would be happy to present these ideas in more detail if anyone is interested.

In the meantime, please let me know if this is something you would like to work on or if you are already aware of other solutions or proposed solutions to this problem.

Best regards,
Craig Revie
Received on Friday, 29 March 2019 15:22:53 UTC
