Re: IDL: number types

On Mar 20, 2013, at 1:54 PM, Boris Zbarsky wrote:

> On 3/20/13 2:40 PM, Allen Wirfs-Brock wrote:
> 
...snipped assorted mutual agreement.

>> WebIDL tries to automate the coercions by bundling all of the parameter coercions into an extremely complex "overload resolution algorithm"
> 
> 
> WebIDL bundles the parameter coercions into the various "how to convert an ECMAScript value to a WebIDL value" algorithms.
> 
> The overload resolution algorithm then describes how to decide which WebIDL values to convert to.  It's really only relevant if you actually have overloads.  If you don't have overloads, you just convert your arguments to the single relevant set of WebIDL values, in left-to-right order.
> 
>> which essentially stands between the actual ES call site and entry into the actual target function.
> 
> Agreed that either way the conversion to "WebIDL value" sits between the caller and the body of the callee.
> 
> This has both pluses and minuses.  A significant plus is that it's easy to specify, and the implementation of the coercions can be automatically generated based on the declarative description in IDL.  This reduces complexity of implementation enormously.
> 
>> While this is a fair reflection of how web APIs have been historically implemented (using C++ or some other lower level implementation language)
> 
> Note that nothing in the above actually requires the callee to be implemented in any particular language...
> 
>> it is a terrible match to ES and makes it very difficult to actually implement web APIs using ES code.
> 
> I have to ask: why?
> 
>> The reason is the complication of interposing the overload resolution algorithm between an ES call site and the body of the target ES function.
> 
> This seems like something that can be done automatically, honestly.

Yes, it can be done automatically: either by generating a function stub that includes the validation code, by calling, in the first line of each function, a validation engine that is parameterized by the WebIDL text (or some other description) and the actual arguments, or by wrapping the actual implementation function with a validating wrapper as you described below.

However, whatever technique is used, it doesn't look like idiomatic JavaScript code.  If this style of pessimistic validation as part of a prologue to every function is a good thing, why don't we see the technique widely used in human-written JS libraries?
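For concreteness, here is a minimal sketch of the third technique (a validating wrapper around the implementation function).  The `toUint16` coercion is a hypothetical stand-in for WebIDL's "unsigned short" conversion, which by default wraps modulo 2^16 rather than throwing:

```javascript
// Hypothetical coercion standing in for WebIDL "unsigned short":
// ToNumber, then wrap modulo 2^16 (the default, non-[EnforceRange] behavior).
function toUint16(v) {
  const n = Number(v);
  if (!Number.isFinite(n)) return 0;
  return ((Math.trunc(n) % 65536) + 65536) % 65536;
}

// Wrap an implementation function so every call runs the coercions first.
function withCoercions(coercers, impl) {
  return function (...args) {
    const coerced = coercers.map((c, i) => c(args[i]));
    return impl.apply(this, coerced);
  };
}

// The actual implementation can then assume its argument is already valid.
const setPort = withCoercions([toUint16], function (port) {
  return port; // pretend this stores the port somewhere
});
```

Callers then get IDL-style coercions transparently: `setPort("8080")` yields `8080`, `setPort(65536)` wraps to `0`.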

> 
>> Using ES5 level semantics there is only one way to do it.  The ES programmer must execute an ES implementation of the overload resolution algorithm as the very first step of each function.
> 
> Or the ES programmer assumes in his code that the overload resolution algorithm has run, and the functions exposed on the prototype run the coercions and then call the code the ES programmer has written.  These functions on the prototype can then be autogenerated from the IDL.
> 
> Or the function objects on the prototype can be hand-coded to run the coercions in some generic way but close over the actual function to be called and then invoke it in the right spot.

Right, they all amount to the same thing: statically generated validation is done on every function entry.  This is a loss in a dynamically typed language.  Imagine a Uint16 value that gets passed as a parameter through four levels of function calls before it is actually used in some manner.  In a statically typed language that parameter is passed with no runtime checking, because the compiler can guarantee at each level that it will hold a Uint16 value at runtime.  In a dynamic language, the check at each level costs something.

So assume we pass the integer 4096 to the first function.  In JS the value is passed as a Number (a float64).  If the four functions each perform WebIDL argument validation then, in four different places, the validation is going to check the same value to see if it is in the range 0-65535.  At least three of those checks are redundant.  In dynamic-language applications I've analyzed in the past, this sort of redundant validation proved to be a real performance bottleneck.
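A hypothetical illustration of the redundancy (the function names and check counter are invented for the example): when each layer pessimistically re-validates on entry, the same range check runs four times for a single real use of the value.

```javascript
let checks = 0;

// Each layer pessimistically validates its argument on entry.
function assertUint16(v) {
  checks++;
  if (!Number.isInteger(v) || v < 0 || v > 65535) {
    throw new TypeError("expected an unsigned 16-bit integer");
  }
}

const level4 = (n) => { assertUint16(n); return n * 2; }; // the only real use
const level3 = (n) => { assertUint16(n); return level4(n); };
const level2 = (n) => { assertUint16(n); return level3(n); };
const level1 = (n) => { assertUint16(n); return level2(n); };

level1(4096); // the check runs 4 times; 3 of those runs prove nothing new
```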

Here's another way to look at the issue.  Static languages can do pessimistic type checking because the checking is all done prior to application deployment; it's free at run time.  Dynamic languages should do optimistic type checking, because the type checks all occur at runtime and redundant checks are a waste.  Optimistic checking means that a value is assumed to be of the right type (or to meet other checkable preconditions) up until an actual use of the value that requires a guarantee that the value meets the constraints.  That's the point where a check is performed.  A function that simply passes a parameter value along as an argument to another function usually shouldn't do any validity checks on that value.
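Restated as code (again with invented names), optimistic checking means the intermediate layers pass the value through untouched, and the range check happens only at the point where the guarantee actually matters, such as a store into typed memory:

```javascript
const store = new Uint16Array(1);

// The point of use: storing into a Uint16Array is where the uint16
// constraint actually matters, so validate only here.
function writePort(n) {
  if (!Number.isInteger(n) || n < 0 || n > 65535) {
    throw new RangeError("port must be an unsigned 16-bit integer");
  }
  store[0] = n;
  return store[0];
}

// Pass-through layers do no validation of their own.
const forward = (n) => writePort(n);
const entry = (n) => forward(n);
```

One check per use, regardless of how many call levels the value traverses.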

> 
>> In practice, I'm sure nobody does this
> 
> In practice no one writes ES implementations of WebIDL yet.

Yes, but people now extensively write mission-critical application code and libraries in ES, and we don't see these techniques being widely used.  What is it about the libraries described using WebIDL that makes them unique in requiring this sort of auto-generated glue code?

> 
> Though we're actually doing this in Mozilla now, with an autogenerated glue layer that implements the WebIDL coercions (and some other minor bits like information hiding that are desirable in our setup but not provided in ES out of the box).

Right, but have we done any analysis yet of the systemic cost of those auto-generated coercions?


> 
>> In ES6, a Proxy over functions might be used to factor the overload resolution algorithm into one common place.
> 
> This can already be done in ES5 using the closure approach I describe above.
> 
>> But requiring every ES implemented web API to use a WebIDL Proxy will preclude use of important language features such as defining methods using the new class declaration syntax.
> 
> OK, this is a serious concern.  I would be interested in figuring out ways to make it easy to specify things that can be implemented in ES using classes _and_ easy to implement them using classes.

You would probably have to generate a stub for the entire class, including all of the prototype methods.
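A sketch of what such a generated class stub might do, assuming a simple per-method table of coercers (all names here are hypothetical): the hand-written class keeps the ES class declaration syntax, and the stub copies its prototype methods into validating wrappers.

```javascript
// Hypothetical coercer, standing in for a WebIDL integer conversion.
const toUint16 = (v) =>
  ((Math.trunc(Number(v) || 0) % 65536) + 65536) % 65536;

// Wrap every listed prototype method of a hand-written class with
// argument coercions, producing the stub class that callers actually see.
function generateStub(Impl, signatures) {
  class Stub extends Impl {}
  for (const [name, coercers] of Object.entries(signatures)) {
    const original = Impl.prototype[name];
    Stub.prototype[name] = function (...args) {
      return original.apply(this, coercers.map((c, i) => c(args[i])));
    };
  }
  return Stub;
}

// Hand-written implementation, using class declaration syntax.
class SocketImpl {
  setPort(port) { this.port = port; return this.port; }
}

const Socket = generateStub(SocketImpl, { setPort: [toUint16] });
```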


> 
>> So, my solution would be to require all new Web APIs to be specified using ES code (which in some cases may wrap pseudo-code)
> 
> So we're proposing a declarative description but using a Turing-complete language?  I suspect that this will be significantly more difficult for both specification authors and implementors....

Turing complete pseudo-code or prose is needed today to fully specify all of these Web APIs.  The WebIDL signature is presumably only a small part of the specification of most functions.

> 
> Specifically, a benefit of IDL for implementors is that a simple declarative description can be used to automatically generate code that ensures constraints about the input such that most implementors never have to explicitly worry about those constraints.  It's also possible to use the IDL to inform JIT compilation (e.g. by statically informing type inference, having the JIT optimize out unnecessary coercions, etc).
> 
> While such a setup is theoretically possible if the API specification is written in ES directly, it seems like it would be much more difficult to do in practice because an ES description would be so much more free-form.

I know I'm getting redundant, but I have to ask again: what is so special about most of the web APIs that will be specified using WebIDL?  If non-WebIDL ES is adequate for complex applications, why won't it be adequate for web APIs?  Conversely, if there are expressiveness, reliability, or performance issues with using ES for implementing web APIs, those same issues probably exist for other applications and should probably be addressed in a more general manner.

> 
>> 1.  Web APIs should be designed first for ES.  The best way to do this is to actually do an initial ES implementation and check its usability.
> 
> They should be designed to be _used_ from ES.  Whether they should be designed to be _implemented_ from ES is a separate question.
> 
> I agree that field-testing an API is very useful, so it's good to have easy ways to implement WebIDL APIs in ES.  But that doesn't make ES the ideal specification language for them; in my opinion it gives you too much rope in the simple cases (and see below about complex cases).
> 
>> 3.  API designers won't specify things (in particular overloads) that are unnatural or hard to implement using ES.
> 
> Overloads are commonly used by ES libraries, for what it's worth....

Do you know of JS libraries that expose the forms of overloads that are expressible in WebIDL?  They are awkward to express in pure JS.
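For example, a WebIDL-style overload that distinguishes `(Point)` from `(double x, double y)` forces hand-written JS into explicit runtime dispatch on argument count and type; the API shape below is invented purely to illustrate the pattern:

```javascript
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
}

// Hypothetical API with WebIDL-style overloads:
//   moveTo(Point point)
//   moveTo(double x, double y)
// In plain JS, "overload resolution" becomes manual dispatch in the body.
function moveTo(...args) {
  if (args.length === 1 && args[0] instanceof Point) {
    return `moved to (${args[0].x}, ${args[0].y})`;
  }
  if (args.length === 2) {
    const [x, y] = args.map(Number);
    return `moved to (${x}, ${y})`;
  }
  throw new TypeError("no matching overload for moveTo");
}
```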

> 
>> 3.  By using ES as the specification language, coercions will occur in a natural ES manner
> 
> I think the big danger with this approach is that they will occur unintentionally without the algorithm writer bothering to think about them very much.

In contrast to WebIDL, where the algorithm writer probably doesn't bother to think about the cost of the coercions that are automatically performed.


> 
> One thing I should note: nothing stops an API designer from doing what you described right now: just use "any" for everything and then define it all in prose as you describe.  The only APIs that do this are ones that are trying to do complicated and weird things (e.g. treat 42 differently from "42"!).  Maybe that's because simpler APIs are more willing to accept the arbitrary constraints WebIDL imposes, though.
> 
> So here's a question.  We're speaking in a lot of generalities here, but it might be interesting to look at a specific API where we feel like WebIDL (as opposed to legacy constraints) is causing it to be a bad API and think about specification approaches that might make it better.  Do we have a specific candidate proposed for such a case study?

I agree, this would be a good exercise.

Allen

Received on Thursday, 21 March 2013 03:03:11 UTC