Re: IDL: number types from Boris Zbarsky on 2013-03-20 (public-script-coord@w3.org from January to March 2013)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Wed, 20 Mar 2013 16:54:53 -0400
To: Allen Wirfs-Brock <allen@wirfs-brock.com>
CC: Marcos Caceres <w3c@marcosc.com>, Yehuda Katz <wycats@gmail.com>, Anne van Kesteren <annevk@annevk.nl>, public-script-coord@w3.org
Message-ID: <514A221D.9030806@mit.edu>
On 3/20/13 2:40 PM, Allen Wirfs-Brock wrote:
> First, given the semantics of ES, it is probably a mistake to think about the various WebIDL numeric types as C++/Java/C#/etc. style nominal types.

Indeed.  They're not.

> It is probably better to think of them as constraints/assertions about the expected ranges of parameters/results.

Yes.

> Such range annotations are useful documentation for both implementers and consumers of the APIs when they accurately describe the real requirements.  However, saying "unsigned short" or "octet" doesn't help anyone when the real constraint is "Range(0,4)".

It _can_ help in the sense that it forces a particular coercion to 
integer.  It doesn't help if that coercion is not what's desired here.

> A specification issue with all such constraints is what happens when a parameter constraint isn't satisfied.  There seldom is one right answer.  For example, in ES the violation of an Uint32 range constraint might in different situations: truncate to 32-bits, apply a modulo 32 transformation, or throw an exception.

For what it's worth, in WebIDL those three options would be expressed as 
"[Clamp] unsigned long", "unsigned long" and "[EnforceRange] unsigned 
long" respectively.

I agree that this on its face favors the "apply a modulo 32 
transformation" option, largely because that one is already somewhat 
special in ES (it's ToUint32).
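
To make the three behaviors concrete, here is a rough sketch of what each 
conversion does for "unsigned long".  The helper names are hypothetical, 
and the [Clamp] version is simplified: it uses Math.round, whereas WebIDL 
rounds ties to the even integer.

```javascript
// Hypothetical helper names; simplified from the WebIDL conversions.
const MAX_U32 = 4294967295; // 2^32 - 1

// Plain "unsigned long": wrap modulo 2^32, which is what ES ToUint32 does.
function toUnsignedLong(v) {
  return v >>> 0;
}

// "[Clamp] unsigned long": pin out-of-range values to the nearest bound.
// Simplification: Math.round rounds ties up; WebIDL rounds ties to even.
function toUnsignedLongClamp(v) {
  const n = Number(v);
  if (Number.isNaN(n)) return 0;
  return Math.min(Math.max(Math.round(n), 0), MAX_U32);
}

// "[EnforceRange] unsigned long": throw instead of silently adjusting.
function toUnsignedLongEnforceRange(v) {
  const n = Number(v);
  if (!Number.isFinite(n)) throw new TypeError("value is not finite");
  const t = Math.trunc(n);
  if (t < 0 || t > MAX_U32) throw new TypeError("value is out of range");
  return t + 0; // normalize -0 to +0
}

toUnsignedLong(-1);             // 4294967295
toUnsignedLongClamp(-1);        // 0
// toUnsignedLongEnforceRange(-1) throws a TypeError
```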

> Which one is most appropriate is situational and may be mandated by legacy.

Indeed.

> Another issue is when and if to perform coercion among Number/String/Boolean/Object values.  ES provides both automatic and manual coercions among these primitive "types". Regardless of what a WebIDL description says, ES is going to pass as an argument value whatever the caller provided and when the callee function is coded in ES the corresponding parameter values will generally be automatically coerced when operated upon in any manner that requires the coercion.

While true, to achieve interop it's important to specify exactly when 
the coercions happen and how they're ordered with respect to each other, 
as you note below.  WebIDL generally aims to make this easy for the 
specification writer in common cases.

> (A given value might even be coerced to multiple different types).

This is in fact not something WebIDL does a great job of right now (but 
see below).

> The important concept is that from an ES programmer's perspective the following generally are all valid ways to pass a numeric argument to a function f:
>      f(42)
>      f("42")
>      f({valueOf: function() {return 42}})

Indeed, and a WebIDL function defined as taking "long" accepts all of 
these and coerces them all to the number 42 at argument processing time.
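
As an illustration only, the following sketch approximates the "long" 
conversion with `| 0` (which applies ES ToInt32).  The names toLong and f 
are hypothetical stand-ins, not anything WebIDL defines.

```javascript
// Hypothetical stand-in for the WebIDL "long" conversion: ES ToInt32.
function toLong(v) {
  return v | 0;
}

// Hypothetical operation declared in IDL as taking a single "long".
function f(arg) {
  const n = toLong(arg); // coercion happens once, at argument processing time
  return n;
}

f(42);                                      // 42
f("42");                                    // 42
f({ valueOf: function () { return 42; } }); // 42
```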

> In the ES spec. we take care of these issues by having a set of specification level coercion "abstract operations" that a spec. writer is expected to explicitly apply at the appropriate sequence points in the pseudo-code specification.  Everything is explicit in the pseudo-code.

Indeed.

> WebIDL tries to automate the coercions by bundling all of the parameter coercions into an extremely complex "overload resolution algorithm"


WebIDL bundles the parameter coercions into the various "how to convert 
an ECMAScript value to a WebIDL value" algorithms.

The overload resolution algorithm then describes how to decide which 
WebIDL values to convert to.  It's really only relevant if you actually 
have overloads.  If you don't have overloads, you just convert your 
arguments to the single relevant set of WebIDL values, in left-to-right 
order.
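
The left-to-right ordering is observable from script, which is why it 
needs to be specified.  A small sketch, using a hypothetical 
non-overloaded operation add(long, long) and a hypothetical probe helper 
that records when its valueOf is called:

```javascript
const log = [];

// Returns an object that records its name whenever it is coerced.
function probe(name, value) {
  return {
    valueOf: function () {
      log.push(name);
      return value;
    },
  };
}

// Hypothetical operation taking (long, long); `| 0` approximates "long".
function add(a, b) {
  const x = a | 0; // first argument is converted first
  const y = b | 0; // then the second
  return x + y;
}

add(probe("first", 1), probe("second", 2)); // 3
// log is now ["first", "second"]
```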

> which essentially stands between the actual ES call site and entry into the actual target function.

Agreed that either way the conversion to "WebIDL value" sits between the 
caller and the body of the callee.

This has both pluses and minuses.  A significant plus is that it's easy 
to specify, and the coercions can be implemented automatically based on 
the declarative description in IDL.  This reduces complexity of 
implementation enormously.

> While this is a fair reflection of how web APIs have been historically implemented (using C++ or some other lower level implementation language)

Note that nothing in the above actually requires the callee to be 
implemented in any particular language...

> it is a terrible match to ES and makes it very difficult to actually implement web APIs using ES code.

I have to ask: why?

> The reason is the complication of interposing the overload resolution algorithm between an ES call site and the body of the target ES function.

This seems like something that can be done automatically, honestly.

> Using ES5 level semantics there is only one way to do it.  The ES programmer must execute an ES implementation of the overload resolution algorithm as the very first step of each function.

Or the ES programmer assumes in his code that the overload resolution 
algorithm has run, and the functions exposed on the prototype run the 
coercions and then call the code the ES programmer has written.  These 
functions on the prototype can then be autogenerated from the IDL.

Or the function objects on the prototype can be hand-coded to run the 
coercions in some generic way but close over the actual function to be 
called and then invoke it in the right spot.
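
A minimal sketch of the closure approach, assuming a hypothetical 
wrapWithCoercions helper and a made-up Counter interface.  This is not 
the Mozilla glue code, just the general shape:

```javascript
// Hypothetical generic wrapper: `converters` holds one coercion per
// declared parameter; `impl` is the function the ES programmer wrote.
function wrapWithCoercions(converters, impl) {
  return function () {
    const args = [];
    for (let i = 0; i < converters.length; i++) {
      args.push(converters[i](arguments[i])); // coerce left to right
    }
    return impl.apply(this, args); // then run the real body
  };
}

// Approximates the WebIDL "long" conversion (ES ToInt32).
function toLong(v) {
  return v | 0;
}

// Made-up interface; the method body assumes `n` is already a number.
function Counter() {
  this.total = 0;
}
Counter.prototype.add = wrapWithCoercions([toLong], function (n) {
  this.total += n;
  return this.total;
});

const c = new Counter();
c.add("5");                                     // 5
c.add({ valueOf: function () { return 2; } });  // 7
```

The list of converters is the only per-method input here, so it could 
just as well be produced by a tool reading the IDL.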

> In practice, I'm sure nobody does this

In practice no one writes ES implementations of WebIDL yet.

Though we're actually doing this in Mozilla now, with an autogenerated 
glue layer that implements the WebIDL coercions (and some other minor 
bits like information hiding that are desirable in our setup but not 
provided in ES out of the box).

> In ES6, a Proxy over functions might be used to factor the overload resolution algorithm into one common place.

This can already be done in ES5 using the closure approach I describe above.

> But requiring every ES implemented web API to use a WebIDL Proxy will preclude use of important language features such as defining methods using the new class declaration syntax.

OK, this is a serious concern.  I would be interested in figuring out 
ways to make it easy to specify things that can be implemented in ES 
using classes _and_ easy to implement them using classes.

> So, my solution would be to require all new Web APIs to be specified using ES code (which in some cases may wrap pseudo-code)

So we're proposing a declarative description but using a Turing-complete 
language?  I suspect that this will be significantly more difficult for 
both specification authors and implementors....

Specifically, a benefit of IDL for implementors is that a simple 
declarative description can be used to automatically generate code that 
ensures constraints about the input such that most implementors never 
have to explicitly worry about those constraints.  It's also possible to 
use the IDL to inform JIT compilation (e.g. by statically informing type 
inference, having the JIT optimize out unnecessary coercions, etc).

While such a setup is theoretically possible if the API specification is 
written in ES directly, it seems like it would be much more difficult to 
do in practice because an ES description would be so much more free-form.

> 1.  Web APIs should be designed first for ES.  The best way to do this is to actually do an initial ES implementation and check its usability.

They should be designed to be _used_ from ES.  Whether they should be 
designed to be _implemented_ from ES is a separate question.

I agree that field-testing an API is very useful, so it's good to have 
easy ways to implement WebIDL APIs in ES.  But that doesn't make ES the 
ideal specification language for them; in my opinion it gives you too 
much rope in the simple cases (and see below about complex cases).

> 3.  API designer won't specify things (in particular overloads) that are unnatural or hard to implement using ES.

Overloads are commonly used by ES libraries, for what it's worth....

> 3.  By using ES as the specification language, coercions will occur in a natural ES manner

I think the big danger with this approach is that they will occur 
unintentionally without the algorithm writer bothering to think about 
them very much.
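
A small example of the kind of thing I mean, with hypothetical function 
names.  Neither body mentions coercion, yet each picks a different one 
depending on the operator used:

```javascript
// Hypothetical API bodies written directly in ES, with no up-front
// conversion of the argument.
function nextIndex(i) {
  return i + 1; // `+` concatenates when i is a string
}

function prevIndex(i) {
  return i - 1; // `-` always converts i to a number
}

nextIndex(42);   // 43
nextIndex("42"); // "421"
prevIndex("42"); // 41
```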

One thing I should note: nothing stops an API designer from doing what 
you described right now: just use "any" for everything and then define 
it all in prose as you describe.  The only APIs that do this are ones 
that are trying to do complicated and weird things (e.g. treat 42 
different from "42"!).  Maybe that's because simpler APIs are more 
willing to accept the arbitrary constraints WebIDL imposes, though.

So here's a question.  We're speaking in a lot of generalities here, but 
it might be interesting to look at a specific API where we feel like 
WebIDL (as opposed to legacy constraints) is causing it to be a bad API 
and think about specification approaches that might make it better.  Do 
we have a specific candidate proposed for such a case study?

-Boris
Received on Wednesday, 20 March 2013 20:55:26 UTC