W3C home > Mailing lists > Public > public-script-coord@w3.org > July to September 2013

Re: Unordered setsmaps, for when ordering is hard/expensive/unwanted?

From: Brendan Eich <brendan@secure.meer.net>
Date: Mon, 02 Sep 2013 09:50:35 -0700
Message-ID: <5224C1DB.4010908@secure.meer.net>
To: Boris Zbarsky <bzbarsky@MIT.EDU>
CC: public-script-coord@w3.org
The proposal is to enumerate properties in a canonical order, "insertion 
order" by some deterministic algorithm that inserts properties into the 
object in question (gCS's result, e.g.). There's no need to 
lexicographically sort by key strings in general. What am I missing?

/be

> Boris Zbarsky <mailto:bzbarsky@MIT.EDU>
> September 2, 2013 7:39 AM
>
>
> What other deterministic sort on arbitrary Unicode strings (custom 
> properties) can you do?
>
> -Boris
>
> Brendan Eich <mailto:brendan@secure.meer.net>
> September 2, 2013 1:03 AM
>> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
>> September 2, 2013 12:37 AM
>>
>> The problem is that "insertion order" simply does not exist for some
>> types of things, or if it does, it's an artifact of individual
>> implementation ordering, and thus not reproducible across browsers.
>
> That could just be an interop bug to fix by specifying a normative 
> order, equivalent via observables to insertion order. My point was 
> that it's totally unclear other than by saying "because 
> implementations differ" why we can't so fix.
>>
>> I gave two examples of this:
>>
>> 1. The property name->value map returned by getComputedStyle(). (As I
>> said, for normal CSS properties it's actually an object, not a Map,
>> for legacy reasons, but the same point holds, since there's still an
>> observable order.) There's an ordering for the individual properties
>> that enter the CSS cascade process, but not for the output of the
>> cascade, which is what gCS returns. For custom properties
>> specifically, defining an alternative order is expensive - we have to
>> define and invoke unicode-aware sorting solely to provide a
>> predictable order for the properties, which seems distasteful.
>
> Why do we have to do any lexicographic sort? CSS may underspecify the 
> output of the cascade as reflected in the order of name/value pairs in 
> the returned object (didn't know this) but if it's the same in 
> browsers, or could be with small work, why not fix the 
> underspecification?
>>
>> 2. I plan to define a FontSet API for exposing the loaded fonts
>> available to a given document/worker. This is a mix of UA-loaded
>> fonts, from @font-face rules in CSS, and author-provided fonts, from
>> explicit constructor calls (either referring to a font by url, or
>> constructing it straight from a TypedArray, for pdf.js use-cases).
>> It's possible to define an analog of "insertion order" at first as the
>> stylesheet-order of @font-face rules, but as you change what
>> stylesheets are loaded and load fonts manually, the ordering becomes
>> harder to define and racy.
>
> How racy? You wrote "available to a given document/worker", not shared 
> mutable among workers -- right?
>
>> For example, what if you disable a
>> stylesheet, then re-enable it? Should the fonts generated by
>> @font-face rules in that sheet move to the end of the Set's ordering,
>> since they were "removed" and then re-added? This sort of ordering is
>> annoying to maintain, and of seemingly no value.
>
> Then don't do it, but taking either choice, there's no race here.
>
>> If you load a
>> stylesheet and a manually-constructed font, do you rely on execution
>> order of the load start to dictate the Set order, or race the network
>> loads against each other and insert them whenever the loads
>> individually complete?
>
> Obviously load start. Are you really planning to expose data races to JS?
>
>> If we have off-main-thread parsing, or you're
>> in an isolated-process worker, can you get a manual load in the middle
>> of a stylesheet's parsing?
>
> Again, are you really planning to expose data races to JS?
>
> Bjoern's point stands: interop testing tends to forge de-facto 
> standards, so adding a second underspecified order with Go-like random 
> start is probably just going to fail to avoid a second 
> de-facto-specified order. Then we'll have two standard orders, plus 
> the random starting thing. Better to have one order.
>>
>> These two cases just recently came up in my work; I doubt they'll be 
>> the last.
>
> I don't buy it. Either you are hiding a canonical order that could be 
> specified, or talking about data races, which we do not want in any 
> event.
>
> /be
>>
>> ~TJ
>>
>> Brendan Eich <mailto:brendan@secure.meer.net>
>> September 1, 2013 2:25 PM
>>> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
>>> September 1, 2013 11:38 AM
>>> [...]
>>>
>>> Private correspondence with Mark Miller revealed that I may have not
>>> been quite as clear as I wanted in describing what I needed.
>>
>> I've been corresponding with Mark too ;-).
>>>
>>> There's nothing wrong with the current "insertion-order" semantics for
>>> author-level Maps and Sets. Those are fine. However, "insertion
>>> order" is meaningless for some *UA-provided* Maps and Sets that we're
>>> producing now or will in future APIs.
>>
>> It doesn't matter who wrote a map or set in what language. If a spec 
>> defineds a map or set, we (Mark first, I'm right behind him) want 
>> determinism. Quoting Mark:
>>
>> "For spec'ed abstraction X that inserts into a visible table M, if we 
>> want the spec of abstraction X to be deterministic we need to specify 
>> in what order it inserts into table M, since that order itself is 
>> visible."
>>
>>> We thus have to define an
>>> alternative ordering,
>>
>> That does not follow. We have to define an ordering. Best if it can 
>> be defined as if the map or set were self-hosted, so insertion order 
>> suffices.
>>
>>> and for some types of content, doing is
>>> difficult or expensive.
>>
>> Which types of content?
>>
>> /be
>> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
>> September 1, 2013 11:38 AM
>> [...]
>>
>> Private correspondence with Mark Miller revealed that I may have not
>> been quite as clear as I wanted in describing what I needed.
>>
>> There's nothing wrong with the current "insertion-order" semantics for
>> author-level Maps and Sets. Those are fine. However, "insertion
>> order" is meaningless for some *UA-provided* Maps and Sets that we're
>> producing now or will in future APIs. We thus have to define an
>> alternative ordering, and for some types of content, doing is
>> difficult or expensive. Finding a cheap way to reduce authors'
>> reliance on ordering, such by adopting Go-ordering, would allow us to
>> let "implementation-defined order" exist in a way that isn't painful.
>>
>> ~TJ
>>
>> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
>> August 30, 2013 4:34 PM
>> Plain JS Sets and Maps are intrinsically ordered - they remember the
>> order that things were added to them, and reflect that when you
>> iterate over them.
>>
>> When returning a SetMap (or [SetClass]/[MapClass] interface),
>> "insertion order" doesn't always make sense, but often there is still
>> some reasonable ordering. For example, when a CSSStyleDeclaration
>> (which is defined with named getters rather than [MapClass] for legacy
>> reasons, but the point stands) is returned from getComputedStyle, it
>> defines the iteration order of its properties to be alphabetical.
>>
>> This is easy to do with CSSStyleDeclaration, because its keys are all
>> ASCII (CSS properties are by convention restricted to the ASCII
>> range). However, CSSVariablesMap can also be returned by
>> getComputedStyle (as the value of the "var" property on
>> CSSStyleDeclaration), and its keys are arbitrary unicode characters.
>> This *can* be sorted, but one of the reasons we made some of the
>> choices we did with CSS Variables was specifically to avoid dealing
>> with collation/etc issues.
>>
>> So, ideally, CSSVariablesMap could be defined in this case to return
>> "unordered" keys. Of course, that's not actually doable right now,
>> because that just implies "implemented-defined order", which means we
>> end up defining one of the impl-orders as the definitive order in a
>> few years anyway.
>>
>> (I have other use-cases for unordered sets - I'm playing around with
>> the Font Load Events API right now, and want to expose a Set of all
>> the fonts available to the window/worker. These come both from CSS and
>> from explicit author action, and both sources can add and remove
>> things at arbitrary points in the document lifecycle. I'd rather not
>> have to define and expose an ordering for this.)
>>
>> I know this idea was tossed around earlier in the development of Maps
>> and Sets. For example, one way to have unordered Maps/Sets would be
>> to randomly order the keys whenever you iterate. I think that was
>> rejected as an unnecessarily large expenditure of entropy and time.
>>
>> The Go language, on the other hand, gets nearly the same benefit out
>> of its "unordered" maps/sets by declaring that the iteration is in a
>> stable, implemention-defined order, but every time you iterate you
>> start from a random index (looping around when you reach the end).
>> This prevents authors from accidentally depending on starting at a
>> particular index, and while they *can* still accidentally depend on
>> "nearby" things remaining near each other (the assumption will only
>> fail when the starting index falls between them), that seems like a
>> much less likely source of fragility.
>>
>> Would it be appropriate to define what Go does, and use it in IDL for
>> Maps/Sets that we don't want to define an ordering for? Is it
>> something that JS would want to take up natively?
>>
>> ~TJ
>>
> Brendan Eich <mailto:brendan@secure.meer.net>
> September 1, 2013 2:25 PM
>> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
>> September 1, 2013 11:38 AM
>> [...]
>>
>> Private correspondence with Mark Miller revealed that I may have not
>> been quite as clear as I wanted in describing what I needed.
>
> I've been corresponding with Mark too ;-).
>>
>> There's nothing wrong with the current "insertion-order" semantics for
>> author-level Maps and Sets. Those are fine. However, "insertion
>> order" is meaningless for some *UA-provided* Maps and Sets that we're
>> producing now or will in future APIs.
>
> It doesn't matter who wrote a map or set in what language. If a spec 
> defineds a map or set, we (Mark first, I'm right behind him) want 
> determinism. Quoting Mark:
>
> "For spec'ed abstraction X that inserts into a visible table M, if we 
> want the spec of abstraction X to be deterministic we need to specify 
> in what order it inserts into table M, since that order itself is 
> visible."
>
>> We thus have to define an
>> alternative ordering,
>
> That does not follow. We have to define an ordering. Best if it can be 
> defined as if the map or set were self-hosted, so insertion order 
> suffices.
>
>> and for some types of content, doing is
>> difficult or expensive.
>
> Which types of content?
>
> /be
> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
> September 1, 2013 11:38 AM
> [...]
>
> Private correspondence with Mark Miller revealed that I may have not
> been quite as clear as I wanted in describing what I needed.
>
> There's nothing wrong with the current "insertion-order" semantics for
> author-level Maps and Sets. Those are fine. However, "insertion
> order" is meaningless for some *UA-provided* Maps and Sets that we're
> producing now or will in future APIs. We thus have to define an
> alternative ordering, and for some types of content, doing is
> difficult or expensive. Finding a cheap way to reduce authors'
> reliance on ordering, such by adopting Go-ordering, would allow us to
> let "implementation-defined order" exist in a way that isn't painful.
>
> ~TJ
>
> Tab Atkins Jr. <mailto:jackalmage@gmail.com>
> August 30, 2013 4:34 PM
> Plain JS Sets and Maps are intrinsically ordered - they remember the
> order that things were added to them, and reflect that when you
> iterate over them.
>
> When returning a SetMap (or [SetClass]/[MapClass] interface),
> "insertion order" doesn't always make sense, but often there is still
> some reasonable ordering. For example, when a CSSStyleDeclaration
> (which is defined with named getters rather than [MapClass] for legacy
> reasons, but the point stands) is returned from getComputedStyle, it
> defines the iteration order of its properties to be alphabetical.
>
> This is easy to do with CSSStyleDeclaration, because its keys are all
> ASCII (CSS properties are by convention restricted to the ASCII
> range). However, CSSVariablesMap can also be returned by
> getComputedStyle (as the value of the "var" property on
> CSSStyleDeclaration), and its keys are arbitrary unicode characters.
> This *can* be sorted, but one of the reasons we made some of the
> choices we did with CSS Variables was specifically to avoid dealing
> with collation/etc issues.
>
> So, ideally, CSSVariablesMap could be defined in this case to return
> "unordered" keys. Of course, that's not actually doable right now,
> because that just implies "implemented-defined order", which means we
> end up defining one of the impl-orders as the definitive order in a
> few years anyway.
>
> (I have other use-cases for unordered sets - I'm playing around with
> the Font Load Events API right now, and want to expose a Set of all
> the fonts available to the window/worker. These come both from CSS and
> from explicit author action, and both sources can add and remove
> things at arbitrary points in the document lifecycle. I'd rather not
> have to define and expose an ordering for this.)
>
> I know this idea was tossed around earlier in the development of Maps
> and Sets. For example, one way to have unordered Maps/Sets would be
> to randomly order the keys whenever you iterate. I think that was
> rejected as an unnecessarily large expenditure of entropy and time.
>
> The Go language, on the other hand, gets nearly the same benefit out
> of its "unordered" maps/sets by declaring that the iteration is in a
> stable, implemention-defined order, but every time you iterate you
> start from a random index (looping around when you reach the end).
> This prevents authors from accidentally depending on starting at a
> particular index, and while they *can* still accidentally depend on
> "nearby" things remaining near each other (the assumption will only
> fail when the starting index falls between them), that seems like a
> much less likely source of fragility.
>
> Would it be appropriate to define what Go does, and use it in IDL for
> Maps/Sets that we don't want to define an ordering for? Is it
> something that JS would want to take up natively?
>
> ~TJ
>
Received on Monday, 2 September 2013 16:51:04 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 21:37:50 UTC