Re: Matrix4x4 proposal from Chris Marrin on 2012-01-24 (public-declarative3d@w3.org from January 2012)

From: Chris Marrin <cmarrin@apple.com>
Date: Tue, 24 Jan 2012 15:57:35 -0800
To: Gregg Tavares (勤) <gman@google.com>
Cc: Timm Drevensek <timm.drevensek@igd.fraunhofer.de>, dino@apple.com, igor.oliveira@openbossa.org, public-fx@w3.org, public-d3d mlist <public-declarative3d@w3.org>
Message-id: <80535DF3-9F13-48ED-8375-387E244D0717@apple.com>

On Jan 24, 2012, at 10:52 AM, Gregg Tavares (勤) wrote:

> 
> 
> ...I'm curious what the need is for these Matrix classes. It seems to me the only need is speed. Otherwise just use JavaScript. If the need is speed then the design should do what is needed for speed, not decide "well this is slower but I think it's fast enough". It's never fast enough.

But if the overhead is 0.1% then you'll never see the performance boost of avoiding the copy. That's what I'm saying.

> 
> I don't see what the downside is for basing them on ArrayBufferView and it would let you use them for skinning characters without having to manually upload 15-70 matrices per character per frame.

You always have to upload the matrices. Even with a native matrix class you still need to send the uniform to the GPU. 

> 
> Consider.
> 
>   // at init time
>   var boneArrayLocation = gl.getUniformLocation(program, "u_bones");
>   var boneArray = new Float32Array(numBones * 16);
>   var boneMatrices = [];
>   for (var ii = 0; ii < numBones; ++ii) {
>      boneMatrices.push(new Matrix4x4(boneArray, ii * 16));
>   }
> 
>   // at render time
>   gl.uniformMatrix4fv(boneArrayLocation, false, boneArray);
> 
> vs. a non-ArrayBufferView-based version
> 
>   // at init time
>   var boneLocations = [];
>   var boneMatrices = [];
>   for (var ii = 0; ii < numBones; ++ii) {
>      boneMatrices.push(new Matrix4x4());
>      boneLocations.push(gl.getUniformLocation(program, "u_bones[" + ii + "]"));
>   }
> 
>   // at render time
>   for (var ii = 0; ii < numBones; ++ii) {
>      gl.uniformMatrix4fv(boneLocations[ii], false, boneMatrices[ii]);
>   }

In these examples you're assuming that a single uniformMatrix4fv call is more efficient than multiples. If that's true, you can still do that with the second approach:

  // at init time
  var boneArrayLocation = gl.getUniformLocation(program, "u_bones");
  var boneArray = new ArrayBuffer(numBones * 16 * 4);
  // uniformMatrix4fv takes a Float32Array, not a raw ArrayBuffer,
  // so keep one view over the whole buffer for the upload
  var boneFloats = new Float32Array(boneArray);
  var boneMatrices = [ ];
  var boneArrays = [ ];
  for (var ii = 0; ii < numBones; ++ii) {
     boneMatrices.push(new Matrix4x4());
     // 16-float view at byte offset ii * 16 * 4 into the shared buffer
     boneArrays.push(new Float32Array(boneArray, ii * 16 * 4, 16));
  }

  // at render time

  // *** This piece is missing from your examples
  // Update the matrices
  for (var ii = 0; ii < numBones; ++ii) {
    // ... do some matrix math on boneMatrices[ii] ...
    boneMatrices[ii].copyIntoFloat32Array(boneArrays[ii]);
  }
  // ***

  gl.uniformMatrix4fv(boneArrayLocation, false, boneFloats);

Here I've created a single ArrayBuffer with enough space for all the bone matrices. Then I create an array in parallel to the boneMatrices array which holds Float32Array views into this ArrayBuffer. The part I added between the asterisks is missing from your examples. Whenever you need to change the matrices with either approach you need to access them to do the math. The only line added with the current approach is the copyIntoFloat32Array, which I'm saying will have insignificant overhead. You can still do a single gl.uniformMatrix4fv call, just like in your approach.
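
To make the buffer layout described above concrete, here is a minimal sketch that runs without a GL context. The Matrix4x4 class in it is a hypothetical stand-in written only for this illustration (its setTranslation method and internal array are assumptions, not the proposed API); only copyIntoFloat32Array comes from the discussion. The name allBones is likewise invented here for a view over the whole buffer.

```javascript
// Hypothetical stand-in for the proposed Matrix4x4 class; only the pieces
// needed for this sketch are implemented.
class Matrix4x4 {
  constructor() {
    // Column-major identity, held inside the object.
    this._m = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
  }
  // Assumed helper: set the translation part (elements 12..14).
  setTranslation(x, y, z) {
    this._m[12] = x;
    this._m[13] = y;
    this._m[14] = z;
  }
  // Copy the 16 elements into a caller-supplied Float32Array.
  copyIntoFloat32Array(dst) {
    dst.set(this._m);
  }
}

// at init time
var numBones = 3;
var boneBuffer = new ArrayBuffer(numBones * 16 * 4);
var allBones = new Float32Array(boneBuffer); // view over the whole buffer
var boneMatrices = [];
var boneArrays = [];
for (var ii = 0; ii < numBones; ++ii) {
  boneMatrices.push(new Matrix4x4());
  // Each view covers 16 floats starting at byte offset ii * 16 * 4.
  boneArrays.push(new Float32Array(boneBuffer, ii * 16 * 4, 16));
}

// at render time: do the math, then copy each result into its slot
for (var ii = 0; ii < numBones; ++ii) {
  boneMatrices[ii].setTranslation(ii, 0, 0);
  boneMatrices[ii].copyIntoFloat32Array(boneArrays[ii]);
}
```

After the loop, allBones holds all the matrices back to back, because every per-bone view aliases the same memory. In a real page that single array is what would be handed to one gl.uniformMatrix4fv call.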

The downside of directly mapping an ArrayBuffer into a Matrix class is that it takes away the ability of the Matrix class to hide its data. Matrix might have flags that tell it the matrix is identity, or affine, or other things that allow it to optimize the operations. Mapping an ArrayBuffer means the matrix data can be changed from the outside, which would invalidate these flags. At the least that would make the operations less efficient, and at worst it could require extra checks to avoid things like divide by zero.
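
As a rough illustration of the kind of flag meant here, the sketch below keeps the matrix data private and tracks whether the matrix is still the identity. Everything in it (the FlaggedMatrix name, its set/get methods, the fastPathCount counter) is hypothetical and invented for this example; it is not the proposed Matrix4x4 API, just one way such an optimization could look.

```javascript
// Hypothetical sketch: a matrix that hides its data and tracks whether
// it is still the identity.
class FlaggedMatrix {
  #m = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
  #isIdentity = true;

  // All writes go through the class, so the flag can be kept accurate.
  set(index, value) {
    this.#m[index] = value;
    this.#isIdentity = false; // conservative: any write clears the flag
  }

  get(index) { return this.#m[index]; }

  get isIdentity() { return this.#isIdentity; }

  // Returns this * other (column-major), skipping the arithmetic when
  // one operand is known to be the identity.
  multiply(other) {
    var result = new FlaggedMatrix();
    if (this.#isIdentity || other.#isIdentity) {
      var src = this.#isIdentity ? other : this;
      result.#m = src.#m.slice();
      result.#isIdentity = src.#isIdentity;
      FlaggedMatrix.fastPathCount++;
      return result;
    }
    for (var col = 0; col < 4; ++col) {
      for (var row = 0; row < 4; ++row) {
        var sum = 0;
        for (var k = 0; k < 4; ++k) {
          sum += this.#m[k * 4 + row] * other.#m[col * 4 + k];
        }
        result.#m[col * 4 + row] = sum;
      }
    }
    result.#isIdentity = false;
    return result;
  }
}
FlaggedMatrix.fastPathCount = 0; // counts how often the shortcut was taken

var a = new FlaggedMatrix();   // identity
var t = new FlaggedMatrix();
t.set(12, 5);                  // translation of 5 along x
var p = a.multiply(t);         // shortcut: a is known to be identity
var q = t.multiply(t);         // full multiply: the translations add up
```

If the 16 floats lived in an externally visible ArrayBuffer, code outside the class could overwrite an element without going through set(), and the identity flag would then be stale. The class would either have to drop the shortcut or re-check the data on every operation.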

-----
~Chris
cmarrin@apple.com

Received on Tuesday, 24 January 2012 23:58:32 UTC