
Re: The Structured Clone Wars

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 14 Jul 2011 16:28:11 -0700
Message-ID: <CA+c2ei8-GnVpCJ4WLiqn6eXEWTuH9Fud+ALk-djCszo9eT3UmA@mail.gmail.com>
To: "Mark S. Miller" <erights@google.com>
Cc: Allen Wirfs-Brock <allen@wirfs-brock.com>, public-script-coord@w3.org, es-discuss <es-discuss@mozilla.org>
I think we should remove all the requirements around special handling
of functions that can run script. It can definitely cause headaches
for implementors, but as long as *any* script can run during the
structured cloning then this is a problem we have to deal with anyway.

The fact that scripts can cause us to enter infinite loops, by for
example creating a getter which creates a new object with a getter
which creates a new object etc, is really no worse than dealing with
something like |while(1) {}| appearing anywhere in a page.
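
To make the getter case concrete, here is a minimal sketch. The names
makeEndless and naiveClone are hypothetical, and naiveClone is only a
stand-in for a cloner that reads property values with [[Get]]; the
budget cap is an assumption added so the demonstration terminates.

```javascript
// Hypothetical helper: every read of `next` returns a brand-new object
// that itself has a `next` getter, so the object graph never ends.
function makeEndless() {
  return {
    get next() { return makeEndless(); }
  };
}

// Naive stand-in for a cloner that obtains values through [[Get]].
// The `budget` parameter is an assumption, added only so this stops.
function naiveClone(input, budget) {
  if (input !== Object(input)) { return input; }
  if (budget.remaining-- <= 0) {
    throw new RangeError('clone budget exhausted');
  }
  var output = {};
  Object.keys(input).forEach(function (name) {
    // Reading input[name] runs the getter, which allocates a new object.
    output[name] = naiveClone(input[name], budget);
  });
  return output;
}

var budget = { remaining: 1000 };
var outcome;
try {
  naiveClone(makeEndless(), budget);
  outcome = 'finished';
} catch (e) {
  outcome = e.message;
}
console.log(outcome);
```

Memoizing already-visited objects would not help here, since every
object the getter returns is new.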

However, when I talked to Dave Herman about his concerns about
structured clones, he had an entirely different concern. The fact that
things like prototype chains disappear (and any behavior that went
along with them), any getters (and you just get a snapshot of what
they returned), any setters (and any side effects that they implement),
etc. meant that the clone risks producing something very different from
what you started with.

His concerns, and my rebuttal, can be read at
http://blog.mozilla.com/dherman/2011/05/25/im-worried-about-structured-clone/

Also, I should say that this is my interpretation of Dave's concerns.
Please don't attribute my words to him. And Dave, if you see this,
feel free to speak up :)

One possible solution would be to throw if any of the objects have
getters/setters or prototypes != Object.prototype. This is obviously a
pretty harsh change though.
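
A rough sketch of what such a check might look like, read literally.
assertPlainData is a hypothetical name, not anything from the spec, and
taken at face value the prototype test also rejects arrays, dates and
functions. It relies on Object.getOwnPropertyDescriptor, so it cannot
see script that host objects or proxies run.

```javascript
// Hypothetical strictness check: throws if `value` contains accessor
// properties or any object whose prototype is not Object.prototype.
function assertPlainData(value, seen) {
  if (value !== Object(value)) { return; }      // primitives always pass
  seen = seen || [];
  if (seen.indexOf(value) !== -1) { return; }   // already checked (cycles)
  seen.push(value);
  if (Object.getPrototypeOf(value) !== Object.prototype) {
    throw new TypeError('unexpected prototype');
  }
  Object.getOwnPropertyNames(value).forEach(function (name) {
    var desc = Object.getOwnPropertyDescriptor(value, name);
    if ('get' in desc || 'set' in desc) {
      throw new TypeError('accessor property: ' + name);
    }
    assertPlainData(desc.value, seen);
  });
}

// Small wrapper so the outcomes can be listed side by side.
function check(value) {
  try { assertPlainData(value); return 'ok'; } catch (e) { return e.message; }
}

function Point() { this.x = 1; }

var results = [
  check({ a: 1, b: { c: 'two' } }),
  check({ get a() { return 1; } }),
  check(new Point())
];
console.log(results);
```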

This also happens with the JSON encoder for what it's worth.
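
A small illustration of the JSON case, using only standard JSON and
Object functions. Counter and stamp are made-up names for the example.

```javascript
// An object with behavior on its prototype and an own accessor property.
function Counter() { this.count = 0; }
Counter.prototype.increment = function () { this.count += 1; };

var original = new Counter();
var reads = 0;
Object.defineProperty(original, 'stamp', {
  enumerable: true,
  get: function () { reads += 1; return reads; }
});

// Round-trip through JSON: the getter is run and its result snapshotted.
var copy = JSON.parse(JSON.stringify(original));

// The copy is a plain object: no Counter.prototype, no increment method.
var copyIsPlain = Object.getPrototypeOf(copy) === Object.prototype;
var incrementType = typeof copy.increment;

// `stamp` is now an ordinary data property holding the snapshot value.
var stampDesc = Object.getOwnPropertyDescriptor(copy, 'stamp');
var stampIsData = 'value' in stampDesc;

console.log(copyIsPlain, incrementType, stampIsData, copy.stamp);
```

Reading original.stamp again keeps producing new values, while
copy.stamp stays at whatever the getter returned during serialization.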

My personal belief is that while this isn't ideal, it's better than
the alternatives. But others might disagree.

/ Jonas

On Thu, Jul 14, 2011 at 2:06 PM, Mark S. Miller <erights@google.com> wrote:
> Hmmm. This revision includes "except if obtaining the value of the property
> involved executing script". Now imagine writing a predicate for that in JS.
> Among native-only objects in ES5.1, you can do it by using the original
> Object.getOwnPropertyDescriptor, to first check if the property is an
> accessor property. But even in ES5.1, that does not work for non-native
> (host) objects, since host objects are free to override [[GetOwnProperty]]
> in ways that don't violate 8.6.2. Running user code
> during [[GetOwnProperty]] does not violate 8.6.2.
> Since ES6 proxies are intended only to uphold invariants that apply to both
> native and non-native objects (as recently discussed on es-discuss), proxies
> may also run user code in response to Object.getOwnPropertyDescriptor. And,
> by design (also as recently discussed on es-discuss), there's no way to test
> whether an object is a proxy.
> Do we really want a structured clone operation that *cannot* be implemented
> in JS? This seems bad.
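
A sketch of the predicate described above and of how a proxy slips
past it. isOwnAccessor is a hypothetical name, and the example assumes
the Proxy constructor as later standardized (new Proxy(target, handler)),
not the proposal API under discussion in this thread.

```javascript
// Capture the original binding before anything can replace it.
var gopd = Object.getOwnPropertyDescriptor;

// Hypothetical predicate: on ordinary native objects, reports whether
// reading the own property `name` would run a getter.
function isOwnAccessor(obj, name) {
  var desc = gopd(obj, name);
  return desc !== undefined && !('value' in desc);
}

var plain = { data: 1, get computed() { return 2; } };

// A proxy whose descriptor trap runs user code, then reports an honest
// data descriptor for the target's property.
var trapRuns = 0;
var proxied = new Proxy({ data: 1 }, {
  getOwnPropertyDescriptor: function (target, name) {
    trapRuns += 1;               // script executes during the query itself
    return gopd(target, name);
  }
});

var answers = [
  isOwnAccessor(plain, 'data'),
  isOwnAccessor(plain, 'computed'),
  isOwnAccessor(proxied, 'data')
];
console.log(answers, trapRuns);
```

The predicate reports that proxied.data is a plain data property, yet
user code has already run by the time it answers.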
>
>
> On Thu, Jul 14, 2011 at 1:47 PM, Allen Wirfs-Brock <allen@wirfs-brock.com>
> wrote:
>>
>> Also note that the current editor's
>> draft http://dev.w3.org/html5/spec/Overview.html#safe-passing-of-structured-data has
>> some changes.  Also there is some controversy about some of
>> them http://www.w3.org/Bugs/Public/show_bug.cgi?id=12101
>> Something that isn't clear to me is which primordials are used to set the
>> [[Prototype]] of the generated objects.  It isn't covered in the
>> internal structured cloning algorithm.  Perhaps it is, where structured
>> clone is invoked.
>> Allen
>>
>> On Jul 14, 2011, at 12:46 PM, Mark S. Miller wrote:
>>
>> At the thread "LazyReadCopy experiment and invariant checking for
>> [[Extensible]]=false" on es-discuss,
>> On Wed, Jul 13, 2011 at 10:29 AM, David Bruant <david.bruant@labri.fr>
>> wrote:
>>>
>>> Hi,
>>>
>>> Recently, I've been thinking about the structured clone algorithm used in
>>> postMessage
>>
>> Along with Dave Herman
>> <http://blog.mozilla.com/dherman/2011/05/25/im-worried-about-structured-clone/>,
>> I'm worried about structured clone
>> <http://www.w3.org/TR/html5/common-dom-interfaces.html#safe-passing-of-structured-data>.
>> In order to understand it better before criticizing it, I tried implementing
>> it in ES5 + WeakMaps. My code appears below. In writing it, I noticed some
>> ambiguities in the spec, so I implemented my best guess about what the spec
>> intended.
>> Aside: Coding this so that it is successfully defensive against changes to
>> primordial bindings proved surprisingly tricky, and the resulting coding
>> patterns quite unpleasant. See the explanatory comment early in the code
>> below. Separately, we should think about how to better support defensive
>> programming for code that must operate in the face of mutable primordials.
>> Ambiguities:
>> 1) When the spec says "If input is an Object object", I assumed it meant 'if
>> the input's [[Class]] is "Object" '.
>> 2) By "for each enumerable property in input" combined with "Note: This
>> does not walk the prototype chain.", I assume it meant "for each enumerable
>> own property of input".
>> 3) By "the value of the property" combined with "Property descriptors,
>> setters, getters, and analogous features are not copied in this process.", I
>> assume it meant "the result of calling the [[Get]] internal method of input
>> with the property name", even if the enumerable own property is an accessor
>> property.
>> 4) By "corresponding to the same underlying data", I assume it meant to
>> imply direct sharing of read/write access, leading to shared state
>> concurrency between otherwise shared-nothing event loops.
>> Are the above interpretations correct?
>> Given the access to shared mutability implied by #4, I'm wondering why
>> MessagePorts are passed separately, rather than simply being another
>> special case like File in the structured clone algorithm.
>> I've been advising people to avoid the structured clone algorithm, and
>> send only JSON serializations + MessagePorts through postMessage. It's
>> unclear to me why structured clone wasn't instead defined to be more
>> equivalent to JSON, or to a well chosen subset of JSON. Given that they're
>> going to co-exist, it behooves us to understand their differences better, so
>> that we know when to advise JSON serialization/unserialization around
>> postMessage vs. just using structured clone directly.
>> There is a fixed set of data types recognized as special cases by
>> this algorithm. Unlike JSON, there are no extension points for a
>> user-defined abstraction to cause its own instances to effectively be
>> cloned, with behavior, across the boundary. But neither do we gain the
>> advantage of avoiding calls to user code interleaved with the structured
>> clone algorithm, if my resolution of #3 is correct, since these [[Get]]
>> calls can call getters.
>> In ES6 we intend to reform [[Class]]. Allen's ES6 draft
>> <http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts> makes
>> a valiant start at this. How would we revise structured clone to account for
>> [[Class]] reform?
>> And finally there's the issue raised by David on the es-discuss thread:
>> What should the structured clone algorithm do when encountering a proxy? The
>> algorithm as coded below will successfully "clone" proxies, for some meaning
>> of clone. Is that the clone behavior we wish for proxies?
>>
>> ------------- sclone.js ------------------------
>> var sclone;
>> (function () {
>>    "use strict";
>>    // The following initializations are assumed to capture initial
>>    // bindings, so that sclone is insensitive to changes to these
>>    // bindings between the creation of the sclone function and calls
>>    // to it. Note that {@code call.bind} is only called here during
>>    // initialization, so we are insensitive to whether this changes to
>>    // something other than the original Function.prototype.bind after
>>    // initialization.
>>    var Obj = Object;
>>    var WM = WeakMap;
>>    var Bool = Boolean;
>>    var Num = Number;
>>    var Str = String;
>>    var Dat = Date;
>>    var RE = RegExp;
>>    var Err = Error;
>>    var TypeErr = TypeError;
>>    var call = Function.prototype.call;
>>    var getValue = call.bind(WeakMap.prototype.get);
>>    var setValue = call.bind(WeakMap.prototype.set);
>>    var getClassRE = (/\[object (.*)\]/);
>>    var exec = call.bind(RegExp.prototype.exec);
>>    var toClassString = call.bind(Object.prototype.toString);
>>    function getClass(obj) {
>>      return exec(getClassRE, toClassString(obj))[1];
>>    }
>>    var valueOfBoolean = call.bind(Boolean.prototype.valueOf);
>>    var valueOfNumber = call.bind(Number.prototype.valueOf);
>>    var valueOfString = call.bind(String.prototype.valueOf);
>>    var valueOfDate = call.bind(Date.prototype.valueOf);
>>    var keys = Object.keys;
>>    var forEach = call.bind(Array.prototype.forEach);
>>    var defProp = Object.defineProperty;
>>    // Below this line, we should no longer be sensitive to the current
>>    // bindings of built-in services we rely on.
>>    sclone = function(input) {
>>      function recur(input, memory) {
>>        if (input !== Obj(input)) { return input; }
>>        var output = getValue(memory, input);
>>        if (output) { return output; }
>>        var klass = getClass(input);
>>        switch (klass) {
>>          case 'Boolean': {
>>            output = new Bool(valueOfBoolean(input));
>>            break;
>>          }
>>          case 'Number': {
>>            output = new Num(valueOfNumber(input));
>>            break;
>>          }
>>          case 'String': {
>>            output = new Str(valueOfString(input));
>>            break;
>>          }
>>          case 'Date': {
>>            output = new Dat(valueOfDate(input));
>>            break;
>>          }
>>          case 'RegExp': {
>>            var flags = (input.global ? 'g' : '') +
>>                        (input.ignoreCase ? 'i' : '') +
>>                        (input.multiline ? 'm' : '');
>>            output = new RE(input.source, flags);
>>            break;
>>          }
>>          case 'ImageData':
>>          case 'File':
>>          case 'Blob':
>>          case 'FileList': {
>>            // TODO: implement
>>            throw new Err('not yet implemented');
>>            break;
>>          }
>>          case 'Array': {
>>            output = [];
>>            break;
>>          }
>>          case 'Object': {
>>            output = {};
>>            break;
>>          }
>>          default: {
>>            throw new TypeErr('Should be DOMException(DATA_CLONE_ERR)');
>>            break;
>>          }
>>        }
>>        setValue(memory, input, output);
>>        if (klass === 'Object' || klass === 'Array') {
>>          forEach(keys(input), function(name) {
>>            defProp(output, name, {
>>              value: recur(input[name], memory),
>>              writable: true,
>>              enumerable: true,
>>              configurable: true
>>            });
>>          });
>>        }
>>        return output;
>>      }
>>      return recur(input, new WM());
>>    };
>>  })();
>>
>>
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss@mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>>
>
>
>
> --
>     Cheers,
>     --MarkM
>
>
>
Received on Thursday, 14 July 2011 23:29:12 UTC
