Re: IDL: number types from Allen Wirfs-Brock on 2013-03-22 (public-script-coord@w3.org from January to March 2013)

From: Allen Wirfs-Brock <allen@wirfs-brock.com>
Date: Thu, 21 Mar 2013 19:35:36 -0700
To: Boris Zbarsky <bzbarsky@MIT.EDU>
Cc: Marcos Caceres <w3c@marcosc.com>, Yehuda Katz <wycats@gmail.com>, Anne van Kesteren <annevk@annevk.nl>, public-script-coord@w3.org
Message-Id: <FFB22CBE-EBCC-4B7C-A8C3-83DE0F5E3264@wirfs-brock.com>
Boris,
I'm not going to try to do a point-by-point rebuttal, as I think we understand each other's positions.

As you know, I think there are deeper architectural issues behind these WebIDL concerns, so I'll just point to http://www.wirfs-brock.com/allen/posts/379 where I tried a while ago to cast some light on them.

Regarding overloading: I took a look at the jQuery routines you cited and I'll argue that they prove my point.  They do select or refine their behavior based upon analysis of some argument "types" and values (including looking at specific string elements), but largely in ways that I think would be hard or impossible to express via WebIDL overloads. Also note that their argument analysis is spread over the bodies of the functions and is not just a prefix that is evaluated before any of the function's specific logic.  Those routines are fine examples of what JS programmers do instead of expressing static overloads of the sort WebIDL supports.
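
A rough sketch of the style of argument analysis described above may help. It is not jQuery's actual code: the function name `select` and the result shapes are hypothetical, chosen only to show decisions that depend on string contents and duck typing and that are interleaved with the function's own logic.

```javascript
// Hypothetical jQuery-style entry point; NOT jQuery's real implementation.
function select(selector, context) {
  // Missing argument: return an empty result.
  if (selector === undefined || selector === null) {
    return { kind: "empty", items: [] };
  }
  if (typeof selector === "string") {
    // The branch taken depends on the *contents* of the string, which a
    // purely type-based overload set cannot distinguish.
    const trimmed = selector.trim();
    if (trimmed.length >= 3 && trimmed.startsWith("<") && trimmed.endsWith(">")) {
      return { kind: "markup", items: [trimmed] };
    }
    // The optional second argument is only examined here, deep inside
    // the string branch, not in a prolog.
    const scope = context === undefined ? "document" : context;
    return { kind: "query", items: [trimmed], scope: scope };
  }
  if (typeof selector === "function") {
    return { kind: "ready-callback", items: [selector] };
  }
  // Array-likes are detected by duck typing, not by a declared type.
  if (typeof selector === "object" && typeof selector.length === "number") {
    return { kind: "collection", items: Array.prototype.slice.call(selector) };
  }
  return { kind: "single", items: [selector] };
}
```

A call such as `select("<div>")` takes the markup branch while `select("div")` takes the query branch, even though both arguments have the same type.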

Allen

On Mar 20, 2013, at 9:03 PM, Boris Zbarsky wrote:

> On 3/20/13 11:02 PM, Allen Wirfs-Brock wrote:
>> However whatever technique is used, it doesn't look like idiomatic
>> JavaScript code.  If this style of pessimistic validation as part of a
>> prolog to every function is a good thing, why don't we see that
>> technique widely used in human written JS libraries?
> 
> Several reasons:
> 
> 1)  Human-written JS libraries don't tend to worry about edge-case interop.  For example, they feel free to change the order they do coercions in, as far as I can tell.
> 
> 2)  Human-written JS libraries generally don't tend to think too much about edge cases, again from what I can tell.  They figure if the caller passes in a "weird" object and that causes something to break, that's the caller's fault.
> 
> This comes back and bites those same human-written JS libraries every so often.
> 
> But most importantly:
> 
> 3)  Human-written JS libraries don't have to assume that the caller is hostile (because they run with the caller's permissions anyway, so they can't do something the caller can't do).  Unfortunately, WebIDL implementations most definitely do NOT have this luxury.
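
The point in (1) about coercion order can be made concrete with a small sketch. The helper names below (`makeProbe`, `addLeftFirst`, `addRightFirst`) are hypothetical and not taken from any library; the example only assumes that `Number(obj)` invokes the object's `valueOf`, which makes the order of coercions observable to the caller.

```javascript
// Build an object whose coercion to a number has a visible side effect.
function makeProbe(label, value, log) {
  return {
    valueOf() {
      log.push(label); // record when this argument was coerced
      return value;
    },
  };
}

// Two implementations that return the same sum but coerce their
// arguments in different orders.
function addLeftFirst(a, b) {
  const x = Number(a);
  const y = Number(b);
  return x + y;
}

function addRightFirst(a, b) {
  const y = Number(b);
  const x = Number(a);
  return x + y;
}

const leftLog = [];
const rightLog = [];
const leftSum = addLeftFirst(makeProbe("a", 1, leftLog), makeProbe("b", 2, leftLog));
const rightSum = addRightFirst(makeProbe("a", 1, rightLog), makeProbe("b", 2, rightLog));
```

Both calls produce the same sum, but `leftLog` and `rightLog` record the coercions in opposite orders, which is the kind of edge-case difference a specification has to pin down for interop.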
> 
>> Right, they all amount to the same thing.  Statically generated
>> validation is done on every function entry.  This is a loss in a
>> dynamically typed language.  Imagine a Uint16 value that gets passed as
>> a parameter through 4 levels of function calls before it is actually
>> used in some manner.  In a statically typed language that parameter is
>> passed with no runtime checking, because the compiler can guarantee at
>> each level that it will have a Uint16 value at runtime. In a dynamic
>> language the check at each level costs something.
> 
> Or gets optimized away by the JIT's range analysis, as the case may be.
> 
> But yes, I agree that programming defensively in this way is a performance drag in general, and the JIT won't always save you.  All I can say is that if I were implementing such a system and had to write such defensive code for it, I would make versions of the deeper-in callees that do not do validation on arguments and then call them internally, while only exposing APIs that perform validation at the trust boundary.
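
A minimal sketch of the arrangement described above, with validation only at the trust boundary and unchecked internal callees. All names here (`toUint16`, `scaleUnchecked`, `storeUnchecked`, `publicStore`) are hypothetical, and the coercion is only loosely modeled on WebIDL's conversion to `unsigned short` (ToNumber, non-finite values become 0, truncate, then reduce modulo 2^16); treat it as an assumption rather than a faithful binding implementation.

```javascript
// Coerce an arbitrary JS value to an integer in [0, 65535].
// Assumption: approximates WebIDL's plain "unsigned short" conversion.
function toUint16(value) {
  const n = Number(value);
  if (!Number.isFinite(n)) return 0;
  return ((Math.trunc(n) % 65536) + 65536) % 65536;
}

// Internal callees: they assume their arguments are already valid and
// perform no checks of their own.
function scaleUnchecked(n) {
  return (n * 2) & 0xffff;
}

function storeUnchecked(buffer, index, n) {
  buffer[index] = scaleUnchecked(n);
}

// The only exposed API: coerce and validate once, then call inward.
function publicStore(buffer, index, value) {
  const i = toUint16(index);
  const n = toUint16(value);
  if (i >= buffer.length) {
    throw new RangeError("index out of range");
  }
  storeUnchecked(buffer, i, n);
}

const buf = new Uint16Array(4);
publicStore(buf, 1, 4096);    // number argument
publicStore(buf, 0, "40000"); // string argument is coerced at the boundary
```

The cost of coercion is paid once per external call, however many internal levels the value then passes through.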
> 
>> So assume we pass the integer 4096 to the first function. In JS the
>> value is passed as a "Number" (a float64).
> 
> What it's actually passed as in cases where performance matters (i.e. after the JIT has kicked in) depends on what the JIT has inferred about that value and how it will be used, for what it's worth.  Maybe it's being passed as a float64, maybe an int32.
> 
>> In dynamic-language-based
>> applications I've analyzed in the past, this sort of redundant
>> validation proved to be a real performance bottleneck.
> 
> I agree that it can be, for sure.  I wish I had a better answer here...
> 
>> Dynamic languages should do optimistic type checking because the type checks all occur at
>> runtime and redundant checks are a waste. Optimistic checking means that
>> a value is assumed to be of the right type (or meets other checkable
>> preconditions) up until an actual use of the value that requires a
>> guarantee that the value meets the constraints.
> 
> This general claim has tons of caveats to it.  For example, detecting that your duck is actually an elephant with a duckbill mask while in the middle of mutating some data structures involves undoing the mutations back to a consistent state... or checking before you start mutating.
> 
> Now obviously this is something that can be decided on a case-by-case basis if you sit down and take the time to analyze all the cases carefully and are competent to do so.  Or you can do checking up front and not have to do the complex and failure-prone analysis.
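
The trade-off between checking at the point of use and checking up front can be sketched as follows. The inventory example and the names `applyOptimistic`, `applyChecked`, and `isValidDelta` are hypothetical illustrations, not code from the thread or from any WebIDL implementation.

```javascript
function isValidDelta(delta) {
  return typeof delta === "number" && !Number.isNaN(delta);
}

// Optimistic: each update is checked only when it is used, so a bad
// update found halfway through leaves `inventory` partially modified.
function applyOptimistic(inventory, updates) {
  for (const update of updates) {
    if (!isValidDelta(update.delta)) {
      throw new TypeError("bad delta for " + update.name);
    }
    inventory.set(update.name, (inventory.get(update.name) || 0) + update.delta);
  }
}

// Up-front: read and validate every update before the first mutation.
// Values are copied during validation so a caller-supplied getter
// cannot return something different on a second read.
function applyChecked(inventory, updates) {
  const validated = [];
  for (const update of updates) {
    const name = String(update.name);
    const delta = update.delta;
    if (!isValidDelta(delta)) {
      throw new TypeError("bad delta for " + name);
    }
    validated.push({ name: name, delta: delta });
  }
  for (const { name, delta } of validated) {
    inventory.set(name, (inventory.get(name) || 0) + delta);
  }
}

const updates = [
  { name: "a", delta: 1 },
  { name: "b", delta: "oops" }, // invalid: not a number
];

const optimisticInventory = new Map();
let optimisticThrew = false;
try {
  applyOptimistic(optimisticInventory, updates);
} catch (e) {
  optimisticThrew = true; // "a" has already been applied at this point
}

const checkedInventory = new Map();
let checkedThrew = false;
try {
  applyChecked(checkedInventory, updates);
} catch (e) {
  checkedThrew = true; // nothing has been applied
}
```

Both versions reject the bad input, but only the up-front version leaves the data structure untouched when it does so.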
> 
>> A function that simply passes a parameter value
>> along as an argument to another function usually shouldn't do any
>> validity checks on such a value.
> 
> Such functions are very rare in WebIDL.  The only case I can think of in which a WebIDL method/getter/setter invokes another WebIDL method/getter/setter (as opposed to some internal operation that can just assume its arguments are sane) is [PutForwards].
> 
> So as I see it, in the current WebIDL setup there is trusted implementation code and untrusted consumer code and argument verification happens at the trust boundary only.  Once behind the trust boundary, you just operate on things without doing type checking or coercion except as needed, because you control them all and you know that they don't do insane things.
> 
>> Yes, but people now extensively write mission-critical application
>> code and libraries in ES and we don't see these techniques being widely
>> used.  What is it about the libraries described using WebIDL that makes
>> them unique in requiring this sort of auto generated glue code?
> 
> See above.  But to expand on that, applications written in ES control their own code and sanitize incoming data when it comes in, while libraries tend to just punt on "weird" cases.
> 
> Put another way, I'm 100% sure that I can pass arguments to jQuery that will corrupt its internal state in interesting ways.  But the jQuery authors frankly don't care if I do, because the only consequence is that other scripts on that same page won't work right.
> 
>> Right, have we done any analysis yet of the systemic cost of those
>> auto-generated coercions?
> 
> None of the things for which we plan to use JS-implemented WebIDL are critical to performance (in the sense of there being lots of calls across the boundary).
> 
>> Turing complete pseudo-code or prose is needed today to fully specify
>> all of these Web APIs.  The WebIDL signature is presumably only a small
>> part of the specification of most functions.
> 
> Actually, that presumption is somewhat false.  I've been converting a lot of things to Gecko's new WebIDL bindings recently, and the WebIDL signature is in fact a huge part of the specification of many of them. There's tons of stuff in the web platform (especially for elements) that just returns or sets a member variable, for example.
> 
>> I know I'm getting redundant, but I have to ask again, what is so
>> special about most of the web APIs that will be specified using WebIDL?
>> If non-WebIDL ES is adequate for complex applications why won't it be
>> adequate for web APIs?
> 
> See above.
> 
>> Do you know of JS libraries that expose the forms of overloads that are
>> expressible in WebIDL?  They are awkward to express in pure JS.
> 
> jQuery.prototype.init.
> 
> jQuery's parseHTML (see the 'context' argument in jQuery 1.9.1).
> 
> jQuery's each (see the "arraylike or not" overloading).
> 
> jQuery's makeArray (overloads a string and an array, treating a string as a single-element array internally).
> 
> I'm about 10% into jQuery, and I'm sure I've missed a few above that point, and I'm also sure there are tons more further on.
> 
> I agree that the resulting code is somewhat awkward, of course.  But if that's the API you want to expose....
> 
>>> I think the big danger with this approach is that they will occur
>>> unintentionally without the algorithm writer bothering to think about
>>> them very much.
>> 
>> in contrast to WebIDL, where the algorithm writer probably doesn't
>> bother to think about the cost of the coercions that are automatically
>> performed.
> 
> Yes, but there's a difference, to me, between the severity of a small performance hit and an exception thrown while data structures that affect the behavior of privileged code are in an inconsistent state...
> 
> -Boris
>
Received on Friday, 22 March 2013 02:36:12 UTC