Re: [css2-transforms] Re: comments on Matrix from Rik Cabanier on 2013-03-20 (public-fx@w3.org from January to March 2013)

From: Rik Cabanier <cabanier@gmail.com>
Date: Wed, 20 Mar 2013 14:52:03 -0700
To: Benoit Jacob <jacob.benoit.1@gmail.com>
Cc: Dirk Schulze <dschulze@adobe.com>, "public-fx@w3.org" <public-fx@w3.org>
Message-ID: <CAGN7qDBomdt1QFyM_VsydiY=Vo-PcyX9T7ZSHb8raEA=-Vfvjg@mail.gmail.com>
On Wed, Mar 20, 2013 at 12:10 PM, Benoit Jacob <jacob.benoit.1@gmail.com>wrote:

>
>
> 2013/3/20 Rik Cabanier <cabanier@gmail.com>
>
>>
>>
>> On Wed, Mar 20, 2013 at 7:29 AM, Benoit Jacob <jacob.benoit.1@gmail.com>wrote:
>>
>>>
>>>
>>> 2013/3/20 Dirk Schulze <dschulze@adobe.com>
>>>
>>>> It is easier to answer to the answers entirely, even if we can discuss
>>>> details in separate threats later.
>>>>
>>>> The specification describes a unified way to exchange matrices across
>>>> other specifications. This is a very reasonable approach for me.
>>>
>>>
>>> It's reasonable to exchange matrices across APIs but we don't need a
>>> Matrix class for that, we can exchange raw arrays (say Typed Arrays) as is
>>> done in WebGL. We'd just need to agree once and for all on a storage order
>>> (say column-major as in WebGL).
>>>
>>> If we do add a Matrix interface for the purpose of exchanging data, then
>>> at least it does not need to offer any computational features.
>>>
>>>
>>>> We already have matrix definitions in SVG (SVGMatrix). And even if
>>>> SVGMatrix is less specified than with this specifications, we have a huge
>>>> amount of compatible implementations, including all major browsers and even
>>>> more SVG viewers. I am much less concerned about the specification than you
>>>> are. In fact, there is a need for an exchange format of transformation
>>>> descriptions. Currently, HTML Canvas relies on SVGMatrix to describe a CTM.
>>>>
>>>
>>>> The primary goal of the specification is interoperability and backwards
>>>> compatibility. As mentioned before, SVG described SVGMatrix. This
>>>> specification replaces SVGMatrix with the requirement to be as much
>>>> backwards compatible as possible. This requires to follow the naming schema
>>>> chosen in the specification.
>>>
>>>
>>> That SVG has a SVGMatrix doesn't imply that other Web APIs should have a
>>> matrix class. Maybe SVG had a good reason to have a matrix interface, which
>>> I don't know, but I don't understand how that would generalize enough to
>>> have a Web-wide matrix interface,
>>>
>>
>> No, CSS, Canvas and SVG currently all have different matrices. People
>> shouldn't have to learn 3 different ways to do the same thing.
>> A matrix class is needed to bridge the gaps.
>>
>
> In my ideal world these Web APIs wouldn't impose their own matrices, they
> would just take raw arrays. The user would be free to use whatever JS
> matrix library he wants. The user would get more flexibility, no mismatched
> matrix APIs, and better performance too going forward.
>
> In the non-ideal world where matrix APIs have already been allowed to
> sneak into Web APIs, first of all I don't really see how adding yet another
> matrix API fixes anything, since we can't really take anything out from
> existing Web APIs (we don't break Web compatibility) so we're stuck with
> any matrix API we have for a long time anyway. If really we agree to try to
> at least standardize on one matrix interface going forward then I'm at
> least arguing that it should focus solely on data exchange and NOT on
> anything computational.
>

I agree that data exchange is the most important aspect.
However, since this API needs to match SVG's matrix, it HAS to have
computational functions since they're in SVGMatrix too.


>
>
>>
>>
>>> when, as I said above, arrays are enough to exchange matrices, and even
>>> if we really wanted a matrix interface for data exchange, that still
>>> wouldn't justify putting computational features in it.
>>>
>>
>> There is a need for those transformations. Are you suggesting that
>> everyone should roll their own or rely on JaveScript libraries?
>>
>
> Yes, exactly that.
>
> Two years ago it made sense to want to have browser-level support for
> that. Now that JS performance has improved much, and has a clear path to
> close in on native performance (see asm.js within a factor of two of native
> code here: https://bugzilla.mozilla.org/show_bug.cgi?id=840282#c0 ) there
> is little hope that browser-supported matrices could be as fast as JS
> matrices. Again, (as I explained above) JS matrices are at a large
> theoretical performance advantage thanks to letting the JS compiler
> optimize everything. Keep in mind that matrix libraries is _the_ area where
> any matrix library that wants to achieve high performance on nested
> arithmetic expressions (even just a+b+c for three vectors) has to behave
> like a compiler, which is why matrix libraries are about the only area
> where expression templates are widely used in c++.
>

OK. Let's say that we find that JS is so fast that there's no need to
implement the computation on the C++ side. In that case, let's implement it
all in JavaScript.
However, even asm.js which runs in a restricted environment is still twice
as slow. Having all to access all those matrix members in JS to do the
manipulation will also be slow (unless you code VERY carefully)

It's still very useful that this class is available to the user without
having to download external files.


>
>
>>
>>
>>>
>>>
>>>> The interoperability is not limited to SVGMatrix. There is a strong
>>>> relation to CSS Transforms. The decomposing code is entirely based on the
>>>> matrix decomposing for CSS Transforms.
>>>
>>>
>>> It's good to know where bad things (this matrix decomposition) come
>>> from, but that doesn't provide justification to spread them even wider!
>>>
>>
>> The reason we use this particular decomposition is that Apple uses it in
>> their Core Animation framework which is used to do animated transforms.
>> If a user wants to simulate a transition or animation of a transform, he
>> needs to use the exact same algorithm that the browser used.
>>
>
> If Apple found that that decomposition helped them achieve whatever
> particular animations they were interested in doing on iOS or Mac OS ---
> good for them. There is no way that I could be convinced that this
> decomposition is the right one for general enough animations i.e. with
> arbitrary rotations and scalings along arbitrary (not just X, Y, Z) axes.
> That's because interpolating geometric transformations in a sane way should
> give you the same result no matter what basis (what coordinate system) is
> used, and the present decomposition is not covariant as I tried to explain
> in my previous email. If you want to interpolate arbitrary linear (i.e. no
> translation -- this is easy to interpolate separately) transforms in a sane
> way, first do a polar decomposition of the two endpoint transforms,
>
>     transform = rotation1 * scaling
>
> where scaling is an arbitrary symmetric matrix; then diagonalize the
> scaling into
>
>     scaling = rotation2 * diagonal * rotation2^T
>
> then interpolate the rotation1's, the diagonal coefficients, and the
> rotation2's separately. Note that these two decompositions can be achieved
> with a single SVD, but that viewpoint is probably better because if you
> matrices are already rotations you definitely don't want to replace them by
> a product of two rotations, as a direct SVD would do. That interpolation
> will look very natural geometrically because you'll really be interpolating
> singular values and left/right singular vectors i.e. things that have a
> coordinate-free geometric significance. Using a different coordinate system
> wouldn't make any difference.
>
> If you're not convinced yet I'm OK to work out an example where QR-based
> interpolation would give weird results for reasonable endpoints.
>

It's not about 'convincing' me (you already have), it's about what is
already implemented TODAY and shipping in your browser.
We can fix bugs in the pseudo code, but we can't switch to something better
but that will give different results.


>
> After writing that I googled "matrix interpolation" and found that,
>
> http://stackoverflow.com/questions/3093455/3d-geometry-how-to-interpolate-a-matrix
> which links to a paper titled Matrix Animation and Polar Decomposition<https://docs.google.com/viewer?url=http://www.cs.wisc.edu/graphics/Courses/838-s2002/Papers/polar-decomp.pdf>which seems to recommend roughly the same as discussed above although I
> haven't read it.
>
>
>
>>
>>
>>> Since, as you seem to concede below, there is no other documentation for
>>> decompose() than looking at its pseudocode, let's do that.
>>>
>>
>> The code comes from the book 'graphics gems' and is widely accepted as a
>> reasonable way to decompose a matrix.
>> You can see WebKit's implementation of pseudo code here:
>> https://github.com/WebKit/webkit/blob/master/Source/WebCore/platform/graphics/transforms/TransformationMatrix.cpp
>> The function starts on line 298
>>
>
> Neither "Apple does it" nor "it's in Graphics Gems" imply that that's the
> right way to do it. Let's stay away from authority arguments.
>
>
>>
>>
>>>
>>> The key part is where the upper-left 3x3 block is decomposed into
>>> "rotation", "scale" and "skew". I use double quotes here because the
>>> geometric terms here, "rotation" and "scale", are clearly being abused, as
>>> we're going to see:
>>>
>>> // Now get scale and shear. 'row' is a 3 element array of 3 component vectors
>>> for (i = 0; i < 3; i++)
>>>     row[i][0] = matrix[i][0]
>>>     row[i][1] = matrix[i][1]
>>>     row[i][2] = matrix[i][2]
>>>
>>> // Compute X scale factor and normalize first row.
>>> scale[0] = length(row[0])
>>> row[0] = normalize(row[0])
>>>
>>> // Compute XY shear factor and make 2nd row orthogonal to 1st.
>>> skew[0] = dot(row[0], row[1])
>>> row[1] = combine(row[1], row[0], 1.0, -skew[0])
>>>
>>> // Now, compute Y scale and normalize 2nd row.
>>> scale[1] = length(row[1])
>>> row[1] = normalize(row[1])
>>> skew[0] /= scale[1];
>>>
>>> // Compute XZ and YZ shears, orthogonalize 3rd row
>>> skew[1] = dot(row[0], row[2])
>>> row[2] = combine(row[2], row[0], 1.0, -skew[1])
>>> skew[2] = dot(row[1], row[2])
>>> row[2] = combine(row[2], row[1], 1.0, -skew[2])
>>>
>>> // Next, get Z scale and normalize 3rd row.
>>> scale[2] = length(row[2])
>>> row[2] = normalize(row[2])
>>> skew[1] /= scale[2]
>>> skew[2] /= scale[2]
>>>
>>> // At this point, the matrix (in rows) is orthonormal.
>>> // Check for a coordinate system flip.  If the determinant
>>> // is -1, then negate the matrix and the scaling factors.
>>> pdum3 = cross(row[1], row[2])
>>> if (dot(row[0], pdum3) < 0)
>>>     for (i = 0; i < 3; i++)
>>>         scale[0] *= -1;
>>>         row[i][0] *= -1
>>>         row[i][1] *= -1
>>>         row[i][2] *= -1
>>>
>>> So that's just plain old Gram-Schmidt orthonormalization. So what's
>>> being called "rotation" here is just the orthonormalized rows of the
>>> original matrix, and what's being called "scale" here is just the lengths
>>> of the rows of the original matrix. In other words, this decomposition is a
>>> QR decomposition.
>>>
>>> Sure enough, if the input matrix is of a very special form, like
>>>
>>>     DiagonalMatrix * Rotation
>>>
>>> Then this decomposition will recover the expected scaling and rotation.
>>> But that's about it (it works in slightly more generality thanks to the
>>> skew factors, but not much more generality).
>>>
>>> For an arbitrary matrix, this decomposition will return "scaling" and
>>> "rotation" components that aren't what one would naturally expect.
>>>
>>
>> The point is that it matches the internal logic.
>>
>
> No, it doesn't. The internal logic is a Gram-Schmidt orthonormalization
> process. I've never thought of Gram-Schmidt as decomposing a matrix into
> "rotation" "scaling" and "skew". Alternatively this is a QR decomposition
> and sure enough, Q is a rotation but I've never seen it considered as "the
> rotation part of the original matrix". It's rather, "the rotation you can
> obtain by applying row operations on the original matrix". Likewise R is,
> in my book, just the triangular matrix encoding this series of row
> operations, I've never seen its diagonal coeffs seen as "the scaling part
> of the matrix" and it's off-diagonal part as "the skew part". That "skew"
> part doesn't make any geometric sense by the way, it's purely an artifact
> of the coordinate system.
>

Yes, the matrix spec is wrong.
Let's cut out the 'decompose' function and replace it with an 'interpolate'
function.

If people want, we can provide a decompose function (possibly with
different algorithmic options).


>>> Here's an example. Let's reason with 2D transforms for simplicity. Take
>>> the following 2x2 matrix:
>>>
>>>   1  3
>>>   3  1
>>>
>>> It's a non-uniform scaling along orthogonal axes: it scales by a factor
>>> of 4 along the axis generated by the (1,1) vector, and it scales by a
>>> factor of -2 along the axis generated by the (1,-1) vector.
>>>
>>
>>> So if a decomposition method is going to claim to be able to recover
>>> "scaling" and "rotation" and "skew" components from this, it should be able
>>> to find scaling factors of (4, -2), the rotation by 45 degrees bringing the
>>> above (1, 1), (1, -1) axes onto the X and Y axes, and no skew.
>>>
>>> But if you apply the spec's decompose() to this matrix, you won't find
>>> that. The scaling factors, being computed as the lengths of the rows, will
>>> be claimed to be equal to sqrt(1*1+3*3) == sqrt(10) and -sqrt(10). That's
>>> not useful. That sqrt(10) number is not an interesting thing associated
>>> with this geometric transformation, it's just incidental to how this
>>> transformation is represented in this particular basis. Moreover, giving
>>> both scaling factors as equal to sqrt(10) in absolute value misses the fact
>>> that this is a non-uniform scaling (the geometric scaling factors here are
>>> -2 and 4). The resulting "rotation" matrix will be equally useless from a
>>> geometric perspective.
>>>
>>> I'm not saying that the computation performed by decompose() is
>>> generally useless. It's Gram-Schmidt orthonormalization, and it's one of
>>> the most basic, and most useful algorithms in linear algebra. But it's not
>>> giving what is implied by the names of "rotation" and "scaling", and as a
>>> matrix decomposition, it amounts to a QR decomposition, which is useful for
>>> linear solving but is not revealing geometric components of the
>>> transformation at hand here.
>>>
>>> What this should all have been, is a polar decomposition
>>>
>>> http://en.wikipedia.org/wiki/Polar_decomposition
>>>
>>> whereby an arbitrary matrix (even singular) can be decomposed as
>>>
>>>     rotation * scaling
>>>
>>> or
>>>
>>>     scaling * rotation
>>>
>>> provided that the scaling is allowed to be along arbitrary orthogonal
>>> axes, not just the X/Y/Z axes. That is the right approach because it has
>>> meaning at the level of the geometric transformation itself, not just for
>>> its matrix in a particular basis. In other words, you can use another basis
>>> and still get the same rotation and scaling.
>>>
>>> The skew factors are a poor way to compensate for its inability to
>>> account for other axes than X/Y/Z.
>>>
>>> The requirement that the top-left 3x3 block be non-singular is another
>>> sad aspect. It's an artifact of using non-pivoting QR here. It could be
>>> solved by pivoting or, better, by abandoning QR completely and switching to
>>> a polar decomposition, which doesn't have this limitation. Anyway, in its
>>> current form, the algorithm suffers from the same issue as I outlined in my
>>> first email for inverse(): it's really bad to suddenly stop working on
>>> "singular" matrices. (See first email).
>>>
>>
>> I agree that the chosen decomposition method was not the most optimal but
>> it's not an unreasonable. (Flash for instance offers the same decomposition
>> logic)
>>
>
> As I tried to explain above, I think that the strongest sense in which you
> could say it's "not unreasonable" is it may work well enough on particular
> cases.
>
>
>
>>
>>
>>>
>>> At this point I have NO confidence in the W3C working groups to spec
>>> anything doing nontrivial matrix computations. This decomposition should
>>> have been turned down.
>>>
>>
>> We did not 'choose' it. We simply documented what's implemented by the
>> browsers.
>> You are free to choose another algorithm but it won't match what browsers
>> do.
>>
>
> Not sure what you mean. I understand that this decomposition preexists at
> least in CSS transforms. I'm saying that CSS transforms should have been
> turned down in their present form because of that.
>

Too late. It's shipping and lots of people rely on it...


>
>
>
>>
>>
>>>  I had a suspiscion that this working group was outstretching its
>>> domain of competence by speccing matrix APIs, now I'm convinced of it.
>>>
>>
>> This is just a proposal, not an edict. What would you like to see that
>> would put you at ease?
>>
>
> Again, I think the wind has turned and there is no basis anymore to expect
> that a browser-supported matrix library would outperform JS ones in the
> long run. Given that I believe that we should consider any existing
> Web-facing matrix APIs purely legacy and from now on focus on letting users
> use their own JS matrices and let new Web APIs just communicate matrices as
> typed arrays. Or if for whatever reason a matrix interface is deemed to be
> still useful then at least let it focus solely on data exchange and not on
> computation.
>

>
>>
>> Thinking about this some more, I agree that we should redesign the
>> interface.
>> The quaternion/QR decomposition is useful for smooth transitions between
>> 2 different matrices, but it does not produce intuitive output for a
>> decomposition.
>>
>> Maybe it should be replaced with an 'interpolate' method:
>>
>> Matrix interpolate(Matrix endMatrix, float progress)
>>
>> This will give you an interpolated matrix calculated with the pseudocode
>> in the spec.
>
>
> With such a signature, this method would be very slow because it would
> have to redecompose matrices everytime it's called. You'd have to introduce
> an object interface to store your internal decompositions to avoid
> recomputing them on every interpolation step.
>

You could do that internally. Nothing is stopping an implementation from
caching a decomposition.


>
> But people can already do that very well in JS and while that would give
> significantly slower code with current JS engines, saving the cost of a DOM
> API call on every interpolation step might well be enough already to offset
> the performance difference. In the long run, I don't expect that a native
> implementation could keep compete performance-wise with a JS one, except if
> it gets special shortcuts to avoid the cost of DOM API calls, but as a
> browser developer that's what I don't want to happen.
>
>
>>
>> If we still want a reasonable decomposition method, could you provide the
>> pseudo code for that?
>>
>
> The one I mentioned above is best implemented by doing a SVD, so you can
> use any pseudo-code for that --- once you have the SVD you are 99% done as
> I explain below. For simple 3x3 SVD (as we need here) you could find quick
> closed-form implementations though I don't have one ready for you here.
> That SVD will give you
>
>    matrix = U * diagonal * V^T
>
> where U and V are rotations, ^T means transpose. Then letting:
>
>   rotation1 = U * V^T
>   rotation2 = V
>
> gives you the form I discussed above.
>
> So the polar decomposition form discussed above is matrix = rotation1 *
> scaling where scaling is V * diagonal * V^T.
>
> But again I'm not advocating that we add browser-level support for doing
> that.
>

Thanks!
If people want a decomposition method, we could propose this as the default
or one of the possible ones.


>
>
>
>>
>>
>>> It was a request from SVG WG members to provide one decomposing
>>>> operation. To use the same as CSS Transforms is a logical way. The
>>>> composing and decomposing are described by the pseudo code at the end of
>>>> the specification (will be removed to avoid duplication with CSS
>>>> Transforms). I understand your concerns about additional decomposing
>>>> algorithms, but this is the reasoning for this specific algorithm.
>>>>
>>>> To point 6. This is not a matrix library. The spec provides a simple
>>>> set of functions to do basic operations. It does not aim to allow full
>>>> linear algebra.
>>>
>>>
>>> As we just discussed, this offers a QR decomposition method (part of
>>> decompose()) even if it's hidden under misleading geometric names.
>>>
>>
>> Yes. I think there are errors there. Let's fix them!
>>
>>
>>>  This also offers matrix products, and various geometric transformation
>>> helpers. In my book, this _is_ a matrix library; regardless of naming, this
>>> is plenty complex enough to be very hard to optimize fully.
>>>
>>
>> I agree. It is a library.
>>
>>
>>>
>>> Even an API offering only, say, translate() and scale() and skew() and
>>> transpose() would already have hard problems to solve. First, as these are
>>> cheap operations, the overhead of a DOM API call would be dominant, so
>>> browser developers would be scratching their heads about whether to add
>>> special JS-engine-level shortcuts to avoid the overhead of DOM calls there.
>>> That may sound overengineering until you realize that if a benchmark
>>> stresses these operations, having such shortcuts will allow to get faster
>>> by easily TWO orders of magnitude there. Now suppose that a browser engine
>>> has such shortcuts. The next problem as I mentioned in my first email is
>>> temporaries removal. Indeed if a benchmark (or a real application, for
>>> that's a real use case) does .translate().scale().skew()... then avoiding
>>> copying the intermediate results to/from temporary matrices will allow > 2x
>>> speedups. In short, as soon as you have_any_ computational feature in a
>>> matrix library, it's a tough job to optimize and maintain.
>>>
>>
>> I'm unsure if I agree. Once you have the matrix object, all the variables
>> live on the C++ side. JS will just have 1 object that it has to manage.
>>
>
> The problem is most 4x4 matrix operations are cheap enough that the cost
> of DOM API calls would be dominant.
>

Sure. Let's measure and either implement it natively or in js.


>
>
>> There are also calls to modify the matrix in-place that can be used to
>> avoid temporaries.
>>
>
> That's true but it only solves part of the problem. First, users would
> still use the non-in-place calls, so we'd still have to optimize them.
> Second, even so, that only brings part of the advantages of removing
> temporaries. Another big advantage, that that doesn't give us, is
> minimizing the number of traversals of arrays. If you do e.g.
>
> matrix.translateInPlace(...);
> matrix.scaleInPlace(...);
>
> you've traversed the matrix twice while if you had done this with a JS
> library on a good JS compiler it could have reordered instructions to
> traverse the array only once, generating smaller (hence better) code.
>

There are ways to optimize this (ie deferring the calculation until you ask
for it)


>
>
>
>>
>> I believe WebKit already has this matrix internally to speed things up...
>>
>>
>>>
>>>
>>>
>>>> It just specifies what is necessary to fulfill the goal as an common
>>>> exchange format for transformation matrices. You are mentioning benchmarks
>>>> for browsers. I actually hope that browsers will optimize for performance
>>>> as well. This brings the question of precision over performance. Either you
>>>> make a compromise or decide for one or the other. Again, for me this is not
>>>> the priority. I can live with one or the other direction.
>>>>
>>>> I hope this answers some of your questions.
>>>>
>>>
>>> Unfortunately, it doesn't.
>>>
>>> Benoit
>>>
>>>
>>>>
>>>> Greetings,
>>>> Dirk
>>>>
>>>> >
>>>> > Benoit
>>>>
>>>>
>>>
>>
>
Received on Wednesday, 20 March 2013 21:52:31 UTC