Re: The Structured Clone Wars

On Thu, Jul 14, 2011 at 12:46 PM, Mark S. Miller <erights@google.com> wrote:

> At the thread "LazyReadCopy experiment and invariant checking for
> [[Extensible]]=false" on es-discuss,
> On Wed, Jul 13, 2011 at 10:29 AM, David Bruant <david.bruant@labri.fr> wrote:
>
>> Hi,
>>
>> Recently, I've been thinking about the structured clone algorithm used in
>> postMessage
>>
>
> Along with Dave Herman <
> http://blog.mozilla.com/dherman/2011/05/25/im-worried-about-structured-clone/>,
> I'm worried about structured clone <
> http://www.w3.org/TR/html5/common-dom-interfaces.html#safe-passing-of-structured-data>.
> In order to understand it better before criticizing it, I tried implementing
> it in ES5 + WeakMaps. My code appears below. In writing it, I noticed some
> ambiguities in the spec, so I implemented my best guess about what the spec
> intended.
>
> Aside: Coding this so that it is successfully defensive against changes to
> primordial bindings proved surprisingly tricky, and the resulting coding
> patterns quite unpleasant. See the explanatory comment early in the code
> below. Separately, we should think about how to better support defensive
> programming for code that must operate in the face of mutable primordials.
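
To illustrate the pattern the aside refers to, here is a minimal sketch.
It captures a primordial once, up front, and then calls it through an
"uncurried" form made with call.bind, the same trick sclone.js uses. The
tampering step is hypothetical and is there only to show the effect.

```javascript
// Capture the primordial early and turn it into forEach(array, fn).
var call = Function.prototype.call;
var forEach = call.bind(Array.prototype.forEach);

// Hypothetical later tampering by other code on the page.
var originalForEach = Array.prototype.forEach;
Array.prototype.forEach = function () { throw new Error('tampered'); };

// The captured form still reaches the original function.
var seen = [];
forEach(['a', 'b'], function (x) { seen.push(x); });

// The ordinary method call now hits the replacement.
var tamperedThrows = false;
try {
  ['a', 'b'].forEach(function () {});
} catch (e) {
  tamperedThrows = true;
}

// Restore the primordial so nothing else is affected.
Array.prototype.forEach = originalForEach;
```

The cost is that every use of a built-in method has to be written as
forEach(array, fn) rather than array.forEach(fn), which is a large part
of why the resulting code reads so unpleasantly.
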
>
> Ambiguities:
>
> 1) When the spec says "If input is an Object object", I assumed it meant 'if the
> input's [[Class]] is "Object" '.
> 2) By "for each enumerable property in input" combined with "Note: This
> does not walk the prototype chain.", I assume it meant "for each enumerable
> own property of input".
> 3) By "the value of the property" combined with "Property descriptors,
> setters, getters, and analogous features are not copied in this process.", I
> assume it meant "the result of calling the [[Get]] internal method of input
> with the property name", even if the enumerable own property is an accessor
> property.
> 4) By "corresponding to the same underlying data", I assume it meant to
> imply direct sharing of read/write access, leading to shared state
> concurrency between otherwise shared-nothing event loops.
>

5) By "add a new property to output having the same name" combined with "in
the output it would just have the default state (typically read-write,
though that could depend on the scripting environment).", I assume it meant
to define the property as writable: true, enumerable: true, configurable:
true, rather than to call the internal [[Put]] method, in order to avoid
inherited setters.
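
A small sketch of why the distinction in #5 matters, assuming my reading
is right. The setter on Object.prototype is a hypothetical hostile setup,
and the property name 'secret' is made up for the example.

```javascript
// Hypothetical: some code has installed a setter on Object.prototype.
var intercepted = [];
Object.defineProperty(Object.prototype, 'secret', {
  set: function (v) { intercepted.push(v); },
  configurable: true
});

// [[Put]]-style assignment finds the inherited setter and calls it.
// No own property is created on the object.
var viaPut = {};
viaPut.secret = 42;

// defineProperty creates an own data property and never calls the setter.
var viaDefine = {};
Object.defineProperty(viaDefine, 'secret', {
  value: 42, writable: true, enumerable: true, configurable: true
});

var hasOwn = Object.prototype.hasOwnProperty;
var putHasOwn = hasOwn.call(viaPut, 'secret');
var defineHasOwn = hasOwn.call(viaDefine, 'secret');

// Remove the hypothetical setter again.
delete Object.prototype.secret;
```

With assignment, the cloned value is handed to the setter and the output
object ends up without the property. With defineProperty, the output gets
its own data property and the setter never sees the value.
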



>
> Are the above interpretations correct?
>
> Given the access to shared mutability implied by #4, I'm wondering why
> MessagePorts are passed separately, rather than simply being another
> special case, like File, in the structured clone algorithm.
>
> I've been advising people to avoid the structured clone algorithm, and send
> only JSON serializations + MessagePorts through postMessage. It's unclear to
> me why structured clone wasn't instead defined to be more equivalent to
> JSON, or to a well chosen subset of JSON. Given that they're going to
> co-exist, it behooves us to understand their differences better, so that we
> know when to advise JSON serialization/unserialization around postMessage
> vs. just using structured clone directly.
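
As a starting point for that comparison, here is a sketch of what a plain
JSON round trip does to a few inputs that the sclone.js code in this
message handles differently (it recreates Dates and uses its memory map
to preserve sharing and cycles). The helper name jsonRoundTrip is just
for illustration.

```javascript
function jsonRoundTrip(value) {
  return JSON.parse(JSON.stringify(value));
}

var shared = { n: 1 };
var input = { when: new Date(0), a: shared, b: shared, missing: undefined };
var copy = jsonRoundTrip(input);

// A Date is serialized through toJSON and comes back as a string.
var dateBecameString = typeof copy.when === 'string';

// One object referenced twice comes back as two separate objects.
var sharingLost = copy.a !== copy.b;

// A property whose value is undefined is omitted entirely.
var undefinedDropped = !('missing' in copy);

// A cyclic structure cannot be serialized at all.
var cyclic = {};
cyclic.self = cyclic;
var cycleRejected = false;
try {
  jsonRoundTrip(cyclic);
} catch (e) {
  cycleRejected = e instanceof TypeError;
}
```
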
>
> There is a fixed set of data types recognized as special cases by
> this algorithm. Unlike JSON, there are no extension points for a
> user-defined abstraction to cause its own instances to effectively be
> cloned, with behavior, across the boundary. But neither do we gain the
> advantage of avoiding calls to user code interleaved with the structured
> clone algorithm, if my resolution of #3 is correct, since these [[Get]]
> calls can call getters.
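
To make the getter point concrete, here is a minimal copy loop written
under interpretations #2 and #3 (enumerable own properties, each read
with an ordinary [[Get]]). The name shallowCopy is hypothetical and the
loop is only a one-level stand-in for the full algorithm.

```javascript
function shallowCopy(input) {
  var output = {};
  Object.keys(input).forEach(function (name) {
    Object.defineProperty(output, name, {
      value: input[name],  // an ordinary [[Get]], so accessors run here
      writable: true, enumerable: true, configurable: true
    });
  });
  return output;
}

var calls = 0;
var input = {
  get counter() { calls += 1; return calls; },
  plain: 'x'
};

var copy = shallowCopy(input);
var desc = Object.getOwnPropertyDescriptor(copy, 'counter');
```

After the copy, calls is 1, so user code ran in the middle of the walk,
and desc describes a plain data property holding the getter's result
rather than an accessor.
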
>
> In ES6 we intend to reform [[Class]]. Allen's ES6 draft <
> http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts> makes
> a valiant start at this. How would we revise structured clone to account for
> [[Class]] reform?
>
> And finally there's the issue raised by David on the es-discuss thread:
> What should the structured clone algorithm do when encountering a proxy? The
> algorithm as coded below will successfully "clone" proxies, for some meaning
> of clone. Is that the clone behavior we wish for proxies?
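
A sketch of what "cloning" a proxy amounts to in practice. Note the
assumption: this uses the Proxy and Reflect APIs as they were later
standardized, which differ from the proposal under discussion in this
thread. The walk is the same keys-then-get pattern used in sclone.js.

```javascript
var log = [];
var target = { x: 1, y: 2 };
var proxy = new Proxy(target, {
  ownKeys: function (t) {
    log.push('ownKeys');
    return Reflect.ownKeys(t);
  },
  get: function (t, name, receiver) {
    log.push('get:' + String(name));
    return Reflect.get(t, name, receiver);
  }
});

// Same shape as the property walk in sclone.js, one level deep.
var copy = {};
Object.keys(proxy).forEach(function (name) {
  copy[name] = proxy[name];
});
```

The copy ends up as an ordinary object holding whatever the handler chose
to report, and the handler gets to run code at each step. Whether that
counts as a clone of the proxy is exactly the open question.
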
>
>
> ------------- sclone.js ------------------------
>
> var sclone;
>
> (function () {
>    "use strict";
>
>    // The following initializations are assumed to capture initial
>    // bindings, so that sclone is insensitive to changes to these
>    // bindings between the creation of the sclone function and calls
>    // to it. Note that {@code call.bind} is only called here during
>    // initialization, so we are insensitive to whether this changes to
>    // something other than the original Function.prototype.bind after
>    // initialization.
>
>    var Obj = Object;
>    var WM = WeakMap;
>    var Bool = Boolean;
>    var Num = Number;
>    var Str = String;
>    var Dat = Date;
>    var RE = RegExp;
>    var Err = Error;
>    var TypeErr = TypeError;
>
>    var call = Function.prototype.call;
>
>    var getValue = call.bind(WeakMap.prototype.get);
>    var setValue = call.bind(WeakMap.prototype.set);
>
>    var getClassRE = (/\[object (.*)\]/);
>    var exec = call.bind(RegExp.prototype.exec);
>    var toClassString = call.bind(Object.prototype.toString);
>    function getClass(obj) {
>      return exec(getClassRE, toClassString(obj))[1];
>    }
>
>    var valueOfBoolean = call.bind(Boolean.prototype.valueOf);
>    var valueOfNumber = call.bind(Number.prototype.valueOf);
>    var valueOfString = call.bind(String.prototype.valueOf);
>    var valueOfDate = call.bind(Date.prototype.valueOf);
>
>    var keys = Object.keys;
>    var forEach = call.bind(Array.prototype.forEach);
>
>    var defProp = Object.defineProperty;
>
>    // Below this line, we should no longer be sensitive to the current
>    // bindings of built-in services we rely on.
>
>    sclone = function(input) {
>
>      function recur(input, memory) {
>        if (input !== Obj(input)) { return input; }
>        var output = getValue(memory, input);
>        if (output) { return output; }
>
>        var klass = getClass(input);
>        switch (klass) {
>          case 'Boolean': {
>            output = new Bool(valueOfBoolean(input));
>            break;
>          }
>          case 'Number': {
>            output = new Num(valueOfNumber(input));
>            break;
>          }
>          case 'String': {
>            output = new Str(valueOfString(input));
>            break;
>          }
>          case 'Date': {
>            output = new Dat(valueOfDate(input));
>            break;
>          }
>          case 'RegExp': {
>            var flags = (input.global ? 'g' : '') +
>                        (input.ignoreCase ? 'i' : '') +
>                        (input.multiline ? 'm' : '');
>            output = new RE(input.source, flags);
>            break;
>          }
>          case 'ImageData':
>          case 'File':
>          case 'Blob':
>          case 'FileList': {
>            // TODO: implement
>            throw new Err('not yet implemented');
>            break;
>          }
>          case 'Array': {
>            output = [];
>            break;
>          }
>          case 'Object': {
>            output = {};
>            break;
>          }
>          default: {
>            throw new TypeErr('Should be DOMException(DATA_CLONE_ERR)');
>            break;
>          }
>        }
>        setValue(memory, input, output);
>
>        if (klass === 'Object' || klass === 'Array') {
>          forEach(keys(input), function(name) {
>            defProp(output, name, {
>              value: recur(input[name], memory),
>              writable: true,
>              enumerable: true,
>              configurable: true
>            });
>          });
>        }
>        return output;
>      }
>
>      return recur(input, new WM());
>    };
>  })();
>


-- 
    Cheers,
    --MarkM

Received on Thursday, 14 July 2011 19:51:57 UTC