[css2-transforms] Re: comments on Matrix from Rik Cabanier on 2013-03-20 (public-fx@w3.org from January to March 2013)

From: Rik Cabanier <cabanier@gmail.com>
Date: Wed, 20 Mar 2013 10:39:19 -0700
To: Benoit Jacob <jacob.benoit.1@gmail.com>
Cc: Dirk Schulze <dschulze@adobe.com>, "public-fx@w3.org" <public-fx@w3.org>
Message-ID: <CAGN7qDDshrpd7jZScDT=0PT350O3a47u6SNm6-a_vROWFVkaTg@mail.gmail.com>
On Wed, Mar 20, 2013 at 7:29 AM, Benoit Jacob <jacob.benoit.1@gmail.com>wrote:

>
>
> 2013/3/20 Dirk Schulze <dschulze@adobe.com>
>
>> It is easier to answer to the answers entirely, even if we can discuss
>> details in separate threats later.
>>
>> The specification describes a unified way to exchange matrices across
>> other specifications. This is a very reasonable approach for me.
>
>
> It's reasonable to exchange matrices across APIs but we don't need a
> Matrix class for that, we can exchange raw arrays (say Typed Arrays) as is
> done in WebGL. We'd just need to agree once and for all on a storage order
> (say column-major as in WebGL).
>
> If we do add a Matrix interface for the purpose of exchanging data, then
> at least it does not need to offer any computational features.
>
>
>> We already have matrix definitions in SVG (SVGMatrix). And even if
>> SVGMatrix is less specified than with this specifications, we have a huge
>> amount of compatible implementations, including all major browsers and even
>> more SVG viewers. I am much less concerned about the specification than you
>> are. In fact, there is a need for an exchange format of transformation
>> descriptions. Currently, HTML Canvas relies on SVGMatrix to describe a CTM.
>>
>
>> The primary goal of the specification is interoperability and backwards
>> compatibility. As mentioned before, SVG described SVGMatrix. This
>> specification replaces SVGMatrix with the requirement to be as much
>> backwards compatible as possible. This requires to follow the naming schema
>> chosen in the specification.
>
>
> That SVG has a SVGMatrix doesn't imply that other Web APIs should have a
> matrix class. Maybe SVG had a good reason to have a matrix interface, which
> I don't know, but I don't understand how that would generalize enough to
> have a Web-wide matrix interface,
>

No, CSS, Canvas and SVG currently all have different matrices. People
shouldn't have to learn 3 different ways to do the same thing.
A matrix class is needed to bridge the gaps.


> when, as I said above, arrays are enough to exchange matrices, and even if
> we really wanted a matrix interface for data exchange, that still wouldn't
> justify putting computational features in it.
>

There is a need for those transformations. Are you suggesting that everyone
should roll their own or rely on JaveScript libraries?


>
>
>> The interoperability is not limited to SVGMatrix. There is a strong
>> relation to CSS Transforms. The decomposing code is entirely based on the
>> matrix decomposing for CSS Transforms.
>
>
> It's good to know where bad things (this matrix decomposition) come from,
> but that doesn't provide justification to spread them even wider!
>

The reason we use this particular decomposition is that Apple uses it in
their Core Animation framework which is used to do animated transforms.
If a user wants to simulate a transition or animation of a transform, he
needs to use the exact same algorithm that the browser used.


> Since, as you seem to concede below, there is no other documentation for
> decompose() than looking at its pseudocode, let's do that.
>

The code comes from the book 'graphics gems' and is widely accepted as a
reasonable way to decompose a matrix.
You can see WebKit's implementation of pseudo code here:
https://github.com/WebKit/webkit/blob/master/Source/WebCore/platform/graphics/transforms/TransformationMatrix.cpp
The function starts on line 298


>
> The key part is where the upper-left 3x3 block is decomposed into
> "rotation", "scale" and "skew". I use double quotes here because the
> geometric terms here, "rotation" and "scale", are clearly being abused, as
> we're going to see:
>
> // Now get scale and shear. 'row' is a 3 element array of 3 component vectors
> for (i = 0; i < 3; i++)
>     row[i][0] = matrix[i][0]
>     row[i][1] = matrix[i][1]
>     row[i][2] = matrix[i][2]
>
> // Compute X scale factor and normalize first row.
> scale[0] = length(row[0])
> row[0] = normalize(row[0])
>
> // Compute XY shear factor and make 2nd row orthogonal to 1st.
> skew[0] = dot(row[0], row[1])
> row[1] = combine(row[1], row[0], 1.0, -skew[0])
>
> // Now, compute Y scale and normalize 2nd row.
> scale[1] = length(row[1])
> row[1] = normalize(row[1])
> skew[0] /= scale[1];
>
> // Compute XZ and YZ shears, orthogonalize 3rd row
> skew[1] = dot(row[0], row[2])
> row[2] = combine(row[2], row[0], 1.0, -skew[1])
> skew[2] = dot(row[1], row[2])
> row[2] = combine(row[2], row[1], 1.0, -skew[2])
>
> // Next, get Z scale and normalize 3rd row.
> scale[2] = length(row[2])
> row[2] = normalize(row[2])
> skew[1] /= scale[2]
> skew[2] /= scale[2]
>
> // At this point, the matrix (in rows) is orthonormal.
> // Check for a coordinate system flip.  If the determinant
> // is -1, then negate the matrix and the scaling factors.
> pdum3 = cross(row[1], row[2])
> if (dot(row[0], pdum3) < 0)
>     for (i = 0; i < 3; i++)
>         scale[0] *= -1;
>         row[i][0] *= -1
>         row[i][1] *= -1
>         row[i][2] *= -1
>
> So that's just plain old Gram-Schmidt orthonormalization. So what's being
> called "rotation" here is just the orthonormalized rows of the original
> matrix, and what's being called "scale" here is just the lengths of the
> rows of the original matrix. In other words, this decomposition is a QR
> decomposition.
>
> Sure enough, if the input matrix is of a very special form, like
>
>     DiagonalMatrix * Rotation
>
> Then this decomposition will recover the expected scaling and rotation.
> But that's about it (it works in slightly more generality thanks to the
> skew factors, but not much more generality).
>
> For an arbitrary matrix, this decomposition will return "scaling" and
> "rotation" components that aren't what one would naturally expect.
>

The point is that it matches the internal logic.


>
> Here's an example. Let's reason with 2D transforms for simplicity. Take
> the following 2x2 matrix:
>
>   1  3
>   3  1
>
> It's a non-uniform scaling along orthogonal axes: it scales by a factor of
> 4 along the axis generated by the (1,1) vector, and it scales by a factor
> of -2 along the axis generated by the (1,-1) vector.
>

> So if a decomposition method is going to claim to be able to recover
> "scaling" and "rotation" and "skew" components from this, it should be able
> to find scaling factors of (4, -2), the rotation by 45 degrees bringing the
> above (1, 1), (1, -1) axes onto the X and Y axes, and no skew.
>
> But if you apply the spec's decompose() to this matrix, you won't find
> that. The scaling factors, being computed as the lengths of the rows, will
> be claimed to be equal to sqrt(1*1+3*3) == sqrt(10) and -sqrt(10). That's
> not useful. That sqrt(10) number is not an interesting thing associated
> with this geometric transformation, it's just incidental to how this
> transformation is represented in this particular basis. Moreover, giving
> both scaling factors as equal to sqrt(10) in absolute value misses the fact
> that this is a non-uniform scaling (the geometric scaling factors here are
> -2 and 4). The resulting "rotation" matrix will be equally useless from a
> geometric perspective.
>
> I'm not saying that the computation performed by decompose() is generally
> useless. It's Gram-Schmidt orthonormalization, and it's one of the most
> basic, and most useful algorithms in linear algebra. But it's not giving
> what is implied by the names of "rotation" and "scaling", and as a matrix
> decomposition, it amounts to a QR decomposition, which is useful for linear
> solving but is not revealing geometric components of the transformation at
> hand here.
>
> What this should all have been, is a polar decomposition
>
> http://en.wikipedia.org/wiki/Polar_decomposition
>
> whereby an arbitrary matrix (even singular) can be decomposed as
>
>     rotation * scaling
>
> or
>
>     scaling * rotation
>
> provided that the scaling is allowed to be along arbitrary orthogonal
> axes, not just the X/Y/Z axes. That is the right approach because it has
> meaning at the level of the geometric transformation itself, not just for
> its matrix in a particular basis. In other words, you can use another basis
> and still get the same rotation and scaling.
>
> The skew factors are a poor way to compensate for its inability to account
> for other axes than X/Y/Z.
>
> The requirement that the top-left 3x3 block be non-singular is another sad
> aspect. It's an artifact of using non-pivoting QR here. It could be solved
> by pivoting or, better, by abandoning QR completely and switching to a
> polar decomposition, which doesn't have this limitation. Anyway, in its
> current form, the algorithm suffers from the same issue as I outlined in my
> first email for inverse(): it's really bad to suddenly stop working on
> "singular" matrices. (See first email).
>

I agree that the chosen decomposition method was not the most optimal but
it's not an unreasonable. (Flash for instance offers the same decomposition
logic)


>
> At this point I have NO confidence in the W3C working groups to spec
> anything doing nontrivial matrix computations. This decomposition should
> have been turned down.
>

We did not 'choose' it. We simply documented what's implemented by the
browsers.
You are free to choose another algorithm but it won't match what browsers
do.


> I had a suspiscion that this working group was outstretching its domain of
> competence by speccing matrix APIs, now I'm convinced of it.
>

This is just a proposal, not an edict. What would you like to see that
would put you at ease?

Thinking about this some more, I agree that we should redesign the
interface.
The quaternion/QR decomposition is useful for smooth transitions between 2
different matrices, but it does not produce intuitive output for a
decomposition.

Maybe it should be replaced with an 'interpolate' method:

Matrix interpolate(Matrix endMatrix, float progress)

This will give you an interpolated matrix calculated with the pseudocode in
the spec.

If we still want a reasonable decomposition method, could you provide the
pseudo code for that?


> It was a request from SVG WG members to provide one decomposing operation.
>> To use the same as CSS Transforms is a logical way. The composing and
>> decomposing are described by the pseudo code at the end of the
>> specification (will be removed to avoid duplication with CSS Transforms). I
>> understand your concerns about additional decomposing algorithms, but this
>> is the reasoning for this specific algorithm.
>>
>> To point 6. This is not a matrix library. The spec provides a simple set
>> of functions to do basic operations. It does not aim to allow full linear
>> algebra.
>
>
> As we just discussed, this offers a QR decomposition method (part of
> decompose()) even if it's hidden under misleading geometric names.
>

Yes. I think there are errors there. Let's fix them!


> This also offers matrix products, and various geometric transformation
> helpers. In my book, this _is_ a matrix library; regardless of naming, this
> is plenty complex enough to be very hard to optimize fully.
>

I agree. It is a library.


>
> Even an API offering only, say, translate() and scale() and skew() and
> transpose() would already have hard problems to solve. First, as these are
> cheap operations, the overhead of a DOM API call would be dominant, so
> browser developers would be scratching their heads about whether to add
> special JS-engine-level shortcuts to avoid the overhead of DOM calls there.
> That may sound overengineering until you realize that if a benchmark
> stresses these operations, having such shortcuts will allow to get faster
> by easily TWO orders of magnitude there. Now suppose that a browser engine
> has such shortcuts. The next problem as I mentioned in my first email is
> temporaries removal. Indeed if a benchmark (or a real application, for
> that's a real use case) does .translate().scale().skew()... then avoiding
> copying the intermediate results to/from temporary matrices will allow > 2x
> speedups. In short, as soon as you have_any_ computational feature in a
> matrix library, it's a tough job to optimize and maintain.
>

I'm unsure if I agree. Once you have the matrix object, all the variables
live on the C++ side. JS will just have 1 object that it has to manage.
There are also calls to modify the matrix in-place that can be used to
avoid temporaries.

I believe WebKit already has this matrix internally to speed things up...


>
>
>
>> It just specifies what is necessary to fulfill the goal as an common
>> exchange format for transformation matrices. You are mentioning benchmarks
>> for browsers. I actually hope that browsers will optimize for performance
>> as well. This brings the question of precision over performance. Either you
>> make a compromise or decide for one or the other. Again, for me this is not
>> the priority. I can live with one or the other direction.
>>
>> I hope this answers some of your questions.
>>
>
> Unfortunately, it doesn't.
>
> Benoit
>
>
>>
>> Greetings,
>> Dirk
>>
>> >
>> > Benoit
>>
>>
>
Received on Wednesday, 20 March 2013 17:39:47 UTC