- From: Dirk Schulze <dschulze@adobe.com>
- Date: Tue, 23 Jul 2013 22:53:14 -0700
- To: Rik Cabanier <cabanier@gmail.com>
- CC: "robert@ocallahan.org" <robert@ocallahan.org>, "public-fx@w3.org" <public-fx@w3.org>
On Jul 24, 2013, at 7:06 AM, Rik Cabanier <cabanier@gmail.com> wrote:

> On Tue, Jul 23, 2013 at 4:41 PM, Dirk Schulze <dschulze@adobe.com> wrote:
> > On Jul 23, 2013, at 7:00 PM, Rik Cabanier <cabanier@gmail.com> wrote:
> > > On Mon, Jul 22, 2013 at 11:51 PM, Dirk Schulze <dschulze@adobe.com> wrote:
> > > > On Jul 23, 2013, at 8:30 AM, Robert O'Callahan <robert@ocallahan.org> wrote:
> > > > > On Tue, Jul 23, 2013 at 5:58 PM, Dirk Schulze <dschulze@adobe.com> wrote:
> > > > > > On Jul 23, 2013, at 7:41 AM, Robert O'Callahan <robert@ocallahan.org> wrote:
> > > > > > > I'm afraid that your proposed change may be rather complex, though. I'd like to see the details.
> > > > > >
> > > > > > grayscale, sepia, saturate, hue-rotate, invert, opacity, brightness and contrast are all filter operations that can be represented by a color matrix. Let's take a look at the following example:
> > > > >
> > > > > I understand all that. I'm just saying that you'll have to define exactly how filter primitives are combined into groups for clamping, and you'll be forcing implementations to do it that way.
> > > > >
> > > > > Also, note that in your example it seems no clamping is actually necessary, since the primitives you chose should not send any values out of the 0-255 range. (Except hue-rotate, maybe?)
> > > >
> > > > As long as the values are between 0% and 100%, most of the shorthand functions should be fine. However, this is not always the case for filters like brightness, saturate or contrast, where you often want to go beyond 100%. Finding the ranges for hue-rotate takes a bit more math. I have not done the math yet, but looking at the matrix multiplication (and especially the summation) involved, I expect smaller ranges within which you can assume that you don't need to clamp.
> > > >
> > > > It would indeed require a more detailed description in the spec and, as you described, a definition of how filter primitives are grouped.
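[As an editorial aside, the color-matrix folding Dirk describes can be sketched as follows. The saturate() coefficients follow the feColorMatrix definition in the Filter Effects spec; brightness() is a uniform channel scale; the helper names are illustrative, not from any browser implementation.]

```python
# Fold a chain of color-matrix filters into one matrix, so the image data
# is traversed once instead of once per primitive. Illustrative sketch only.

def saturate_matrix(s):
    # Luminance-weighted saturation matrix (3x3, RGB only),
    # per the feColorMatrix "saturate" definition.
    return [
        [0.213 + 0.787 * s, 0.715 - 0.715 * s, 0.072 - 0.072 * s],
        [0.213 - 0.213 * s, 0.715 + 0.285 * s, 0.072 - 0.072 * s],
        [0.213 - 0.213 * s, 0.715 - 0.715 * s, 0.072 + 0.928 * s],
    ]

def brightness_matrix(b):
    # brightness() simply scales every channel by b.
    return [[b, 0, 0], [0, b, 0], [0, 0, b]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, px):
    return tuple(sum(m[i][k] * px[k] for k in range(3)) for i in range(3))

pixel = (0.4, 0.8, 0.2)  # RGB in [0, 1]

# n passes over the pixel data...
sequential = apply(saturate_matrix(1.5), apply(brightness_matrix(1.2), pixel))

# ...versus one pass with the pre-multiplied matrix.
combined = matmul(saturate_matrix(1.5), brightness_matrix(1.2))
fused = apply(combined, pixel)

assert all(abs(a - b) < 1e-9 for a, b in zip(sequential, fused))
```

[Without an intermediate clamp, the two orderings agree to floating-point precision; the dispute below is about what happens when the spec forces a clamp between the two steps.]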
> > > > On the other hand, the spec requires clamping at the moment, which also forces implementations to do it one certain way, and in this case a less efficient way.
> > >
> > > Not necessarily. As you mentioned, certain filter operations always produce clamped values, so it would be OK for the implementation to optimize those.
> > >
> > > I think you should remove the note and leave it up to the implementors whether they want to optimize certain code paths.
> >
> > I am not talking about rare use cases. Values over 100% are very likely for most of these filter primitives, as is using more than one filter primitive in a chain. Again, we are talking about going from linear complexity to constant in many cases. I think that fact makes it worth questioning the "always clamp" requirement. Even with better hardware and GPU acceleration in the future, this would still be a win.
>
> Maybe you're missing my point. The spec should say that you should always clamp. However, an implementation could look at the filter chain and even at the used values. Then it could decide that clamping is not needed and chain the filters together.

No, I am not. I totally get your point. It is the same as what roc says: leave it up to the browser vendor to optimize filters when possible. My point is that these optimizations are extremely limited if we force implementations to clamp after each filter primitive.

> > To your previous argument about the cache: quite often the images are big enough to cause cache misses. And even if not, the speedup of one run in comparison to n runs over each pixel is huge.
>
> There would be no cache misses if you apply the n runs at the same time on the same pixels. You can do this easily for filters such as sepia or contrast that don't look at surrounding pixels. For blurring and drop shadows, things are more complex, but it can still be done.
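[A small sketch of why the mandated per-primitive clamp matters (values and helper names are illustrative, not from the spec): once an intermediate result leaves the [0, 1] range, clamping after every primitive and clamping once after a fused pass give different answers, so an implementation may only fuse when it can prove the intermediate values stay in range.]

```python
# Clamp-per-primitive vs. fuse-then-clamp-once: they disagree as soon as
# an intermediate value overflows the channel range. Illustrative only.

def clamp(v):
    return max(0.0, min(1.0, v))

def brightness(px, b):
    return tuple(c * b for c in px)

pixel = (0.8, 0.8, 0.8)

# Spec behavior: clamp after every primitive.
step = tuple(clamp(c) for c in brightness(pixel, 2.0))          # (1.0, 1.0, 1.0)
clamped_chain = tuple(clamp(c) for c in brightness(step, 0.5))  # (0.5, 0.5, 0.5)

# Fused behavior: combine the factors first (2.0 * 0.5), clamp once.
fused = tuple(clamp(c * 2.0 * 0.5) for c in pixel)              # (0.8, 0.8, 0.8)

assert clamped_chain != fused
```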
> So, the large CPU cache is set to a mode to read from one image (refresh-ahead) and write to another image efficiently (write-behind) while the actual filter is in the instruction cache.
>
> Things would be easier on the GPU :-)

You would not have different cache misses, indeed. But you would still have a linear operation expense in comparison to a constant cost of one pass.

Greetings,
Dirk
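[The single-traversal idea for point operations discussed above can be sketched like this (toy coefficients and names, not from any implementation): each pixel is visited once and the whole chain runs on it while it is hot in cache, instead of writing out one intermediate image per primitive.]

```python
# Fusing point operations: n passes with intermediate buffers vs. one pass
# running the whole chain per pixel. Illustrative sketch only.

def sepia_ish(px):
    # A point operation: depends only on the current pixel (toy coefficients).
    r, g, b = px
    return (min(1.0, 0.393 * r + 0.769 * g + 0.189 * b),
            min(1.0, 0.349 * r + 0.686 * g + 0.168 * b),
            min(1.0, 0.272 * r + 0.534 * g + 0.131 * b))

def contrast(px, c):
    # contrast() pivots each channel around 0.5, clamped to [0, 1].
    return tuple(max(0.0, min(1.0, (v - 0.5) * c + 0.5)) for v in px)

image = [(0.2, 0.4, 0.6), (0.9, 0.1, 0.5)]  # tiny "image" of RGB pixels

# n passes: one full traversal and intermediate buffer per primitive.
tmp = [sepia_ish(p) for p in image]
multi_pass = [contrast(p, 1.2) for p in tmp]

# One pass: the chain is applied per pixel, no intermediate buffer.
single_pass = [contrast(sepia_ish(p), 1.2) for p in image]

assert multi_pass == single_pass
```

[For point operations the fusion is exact; blurs and drop shadows read neighboring pixels, which is why Rik notes they are more complex to fuse.]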
Received on Wednesday, 24 July 2013 05:53:47 UTC