Re: comments on Matrix from Benoit Jacob on 2013-03-20 (public-fx@w3.org from January to March 2013)

From: Benoit Jacob <jacob.benoit.1@gmail.com>
Date: Wed, 20 Mar 2013 10:29:59 -0400
To: Dirk Schulze <dschulze@adobe.com>
Cc: "public-fx@w3.org" <public-fx@w3.org>
Message-ID: <CAJTmd9oBOPs8aNiuCkVwvbSmWkR13ZbMWoTygRSKJhdxYgBPNQ@mail.gmail.com>
2013/3/20 Dirk Schulze <dschulze@adobe.com>

> It is easier to answer to the answers entirely, even if we can discuss
> details in separate threats later.
>
> The specification describes a unified way to exchange matrices across
> other specifications. This is a very reasonable approach for me.


It's reasonable to exchange matrices across APIs but we don't need a Matrix
class for that, we can exchange raw arrays (say Typed Arrays) as is done in
WebGL. We'd just need to agree once and for all on a storage order (say
column-major as in WebGL).

If we do add a Matrix interface for the purpose of exchanging data, then at
least it does not need to offer any computational features.


> We already have matrix definitions in SVG (SVGMatrix). And even if
> SVGMatrix is less specified than with this specifications, we have a huge
> amount of compatible implementations, including all major browsers and even
> more SVG viewers. I am much less concerned about the specification than you
> are. In fact, there is a need for an exchange format of transformation
> descriptions. Currently, HTML Canvas relies on SVGMatrix to describe a CTM.
>

> The primary goal of the specification is interoperability and backwards
> compatibility. As mentioned before, SVG described SVGMatrix. This
> specification replaces SVGMatrix with the requirement to be as much
> backwards compatible as possible. This requires to follow the naming schema
> chosen in the specification.


That SVG has a SVGMatrix doesn't imply that other Web APIs should have a
matrix class. Maybe SVG had a good reason to have a matrix interface, which
I don't know, but I don't understand how that would generalize enough to
have a Web-wide matrix interface, when, as I said above, arrays are enough
to exchange matrices, and even if we really wanted a matrix interface for
data exchange, that still wouldn't justify putting computational features
in it.


> The interoperability is not limited to SVGMatrix. There is a strong
> relation to CSS Transforms. The decomposing code is entirely based on the
> matrix decomposing for CSS Transforms.


It's good to know where bad things (this matrix decomposition) come from,
but that doesn't provide justification to spread them even wider!

Since, as you seem to concede below, there is no other documentation for
decompose() than looking at its pseudocode, let's do that.

The key part is where the upper-left 3x3 block is decomposed into
"rotation", "scale" and "skew". I use double quotes here because the
geometric terms here, "rotation" and "scale", are clearly being abused, as
we're going to see:

// Now get scale and shear. 'row' is a 3 element array of 3 component vectors
for (i = 0; i < 3; i++)
    row[i][0] = matrix[i][0]
    row[i][1] = matrix[i][1]
    row[i][2] = matrix[i][2]

// Compute X scale factor and normalize first row.
scale[0] = length(row[0])
row[0] = normalize(row[0])

// Compute XY shear factor and make 2nd row orthogonal to 1st.
skew[0] = dot(row[0], row[1])
row[1] = combine(row[1], row[0], 1.0, -skew[0])

// Now, compute Y scale and normalize 2nd row.
scale[1] = length(row[1])
row[1] = normalize(row[1])
skew[0] /= scale[1];

// Compute XZ and YZ shears, orthogonalize 3rd row
skew[1] = dot(row[0], row[2])
row[2] = combine(row[2], row[0], 1.0, -skew[1])
skew[2] = dot(row[1], row[2])
row[2] = combine(row[2], row[1], 1.0, -skew[2])

// Next, get Z scale and normalize 3rd row.
scale[2] = length(row[2])
row[2] = normalize(row[2])
skew[1] /= scale[2]
skew[2] /= scale[2]

// At this point, the matrix (in rows) is orthonormal.
// Check for a coordinate system flip.  If the determinant
// is -1, then negate the matrix and the scaling factors.
pdum3 = cross(row[1], row[2])
if (dot(row[0], pdum3) < 0)
    for (i = 0; i < 3; i++)
        scale[0] *= -1;
        row[i][0] *= -1
        row[i][1] *= -1
        row[i][2] *= -1

So that's just plain old Gram-Schmidt orthonormalization. So what's being
called "rotation" here is just the orthonormalized rows of the original
matrix, and what's being called "scale" here is just the lengths of the
rows of the original matrix. In other words, this decomposition is a QR
decomposition.

Sure enough, if the input matrix is of a very special form, like

    DiagonalMatrix * Rotation

Then this decomposition will recover the expected scaling and rotation. But
that's about it (it works in slightly more generality thanks to the skew
factors, but not much more generality).

For an arbitrary matrix, this decomposition will return "scaling" and
"rotation" components that aren't what one would naturally expect.

Here's an example. Let's reason with 2D transforms for simplicity. Take the
following 2x2 matrix:

  1  3
  3  1

It's a non-uniform scaling along orthogonal axes: it scales by a factor of
4 along the axis generated by the (1,1) vector, and it scales by a factor
of -2 along the axis generated by the (1,-1) vector.

So if a decomposition method is going to claim to be able to recover
"scaling" and "rotation" and "skew" components from this, it should be able
to find scaling factors of (4, -2), the rotation by 45 degrees bringing the
above (1, 1), (1, -1) axes onto the X and Y axes, and no skew.

But if you apply the spec's decompose() to this matrix, you won't find
that. The scaling factors, being computed as the lengths of the rows, will
be claimed to be equal to sqrt(1*1+3*3) == sqrt(10) and -sqrt(10). That's
not useful. That sqrt(10) number is not an interesting thing associated
with this geometric transformation, it's just incidental to how this
transformation is represented in this particular basis. Moreover, giving
both scaling factors as equal to sqrt(10) in absolute value misses the fact
that this is a non-uniform scaling (the geometric scaling factors here are
-2 and 4). The resulting "rotation" matrix will be equally useless from a
geometric perspective.

I'm not saying that the computation performed by decompose() is generally
useless. It's Gram-Schmidt orthonormalization, and it's one of the most
basic, and most useful algorithms in linear algebra. But it's not giving
what is implied by the names of "rotation" and "scaling", and as a matrix
decomposition, it amounts to a QR decomposition, which is useful for linear
solving but is not revealing geometric components of the transformation at
hand here.

What this should all have been, is a polar decomposition

http://en.wikipedia.org/wiki/Polar_decomposition

whereby an arbitrary matrix (even singular) can be decomposed as

    rotation * scaling

or

    scaling * rotation

provided that the scaling is allowed to be along arbitrary orthogonal axes,
not just the X/Y/Z axes. That is the right approach because it has meaning
at the level of the geometric transformation itself, not just for its
matrix in a particular basis. In other words, you can use another basis and
still get the same rotation and scaling.

The skew factors are a poor way to compensate for its inability to account
for other axes than X/Y/Z.

The requirement that the top-left 3x3 block be non-singular is another sad
aspect. It's an artifact of using non-pivoting QR here. It could be solved
by pivoting or, better, by abandoning QR completely and switching to a
polar decomposition, which doesn't have this limitation. Anyway, in its
current form, the algorithm suffers from the same issue as I outlined in my
first email for inverse(): it's really bad to suddenly stop working on
"singular" matrices. (See first email).

At this point I have NO confidence in the W3C working groups to spec
anything doing nontrivial matrix computations. This decomposition should
have been turned down. I had a suspiscion that this working group was
outstretching its domain of competence by speccing matrix APIs, now I'm
convinced of it.



> It was a request from SVG WG members to provide one decomposing operation.
> To use the same as CSS Transforms is a logical way. The composing and
> decomposing are described by the pseudo code at the end of the
> specification (will be removed to avoid duplication with CSS Transforms). I
> understand your concerns about additional decomposing algorithms, but this
> is the reasoning for this specific algorithm.
>
> To point 6. This is not a matrix library. The spec provides a simple set
> of functions to do basic operations. It does not aim to allow full linear
> algebra.


As we just discussed, this offers a QR decomposition method (part of
decompose()) even if it's hidden under misleading geometric names. This
also offers matrix products, and various geometric transformation helpers.
In my book, this _is_ a matrix library; regardless of naming, this is
plenty complex enough to be very hard to optimize fully.

Even an API offering only, say, translate() and scale() and skew() and
transpose() would already have hard problems to solve. First, as these are
cheap operations, the overhead of a DOM API call would be dominant, so
browser developers would be scratching their heads about whether to add
special JS-engine-level shortcuts to avoid the overhead of DOM calls there.
That may sound overengineering until you realize that if a benchmark
stresses these operations, having such shortcuts will allow to get faster
by easily TWO orders of magnitude there. Now suppose that a browser engine
has such shortcuts. The next problem as I mentioned in my first email is
temporaries removal. Indeed if a benchmark (or a real application, for
that's a real use case) does .translate().scale().skew()... then avoiding
copying the intermediate results to/from temporary matrices will allow > 2x
speedups. In short, as soon as you have_any_ computational feature in a
matrix library, it's a tough job to optimize and maintain.



> It just specifies what is necessary to fulfill the goal as an common
> exchange format for transformation matrices. You are mentioning benchmarks
> for browsers. I actually hope that browsers will optimize for performance
> as well. This brings the question of precision over performance. Either you
> make a compromise or decide for one or the other. Again, for me this is not
> the priority. I can live with one or the other direction.
>
> I hope this answers some of your questions.
>

Unfortunately, it doesn't.

Benoit


>
> Greetings,
> Dirk
>
> >
> > Benoit
>
>
Received on Wednesday, 20 March 2013 14:30:27 UTC