Re: what to do with invalid (or improper) mime-type resources from Aryeh Gregor on 2010-12-20 (public-html@w3.org from December 2010)

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Mon, 20 Dec 2010 11:13:46 -0500
To: Kyle Simpson <getify@gmail.com>
Cc: public-html@w3.org
Message-ID: <AANLkTikduRKBBndf1ycyObEja95v2d0+-ZzHe1qreRpJ@mail.gmail.com>
On Sun, Dec 19, 2010 at 11:29 PM, Kyle Simpson <getify@gmail.com> wrote:
> But I would argue that such freedoms should be restricted to elements that
> are parser-inserted (in the markup). When we're talking about JavaScript
> that programmatically creates a <object>, `Image`, <script>, or <link>
> elements, for the express purpose of loading (or preloading) an external
> resource, I think the author should be able to access a very well defined
> and predictable set of behavior.

Yes, you could have some programmatic APIs saying "cache this
resource" that are implemented reliably in major browsers by default.
What you initially suggested is very different, though -- you
suggested that caching behavior for various existing declarative APIs
be standardized.  I don't think that is feasible.

> The "no matter what the author thinks" part is highly disturbing. How can I
> do anything effectively if the browser is always second-guessing what I'm
> directly telling it to do?

Well, in some cases you can't.  That's one of the key things that
makes the web platform different from native apps: websites are more
constrained in the demands they can make.  But I'll point out that
this can be true even for native apps.  If you request access to a
resource from the OS, there's no guarantee you'll get it right away --
if a higher-priority process is using the resource, you might even be
blocked indefinitely from getting it.

To pick a random example from Unix, fsync() is supposed to guarantee
that file contents are synced to disk.  But even leaving aside cases
where this isn't implemented due to lying hardware or kernel bugs,
there's been discussion in the Linux world of a possible "laptop
mode", in which the kernel would just ignore fsync() so as not to
spend power spinning up the disks.  This is a case where the OS could
be configured to ignore the demands of programs because, in fact, the
OS or the user knows better -- someone has decided in this case that
reducing power usage trumps the risk of a kernel crash losing writes,
and applications are in no position to overrule them.
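
To make the fsync() point concrete, here is a minimal Python sketch of
a program asking for a sync.  The helper name write_durably and the
file name are made up for illustration.  Note that nothing in it can
confirm the bytes reached the disk: the call returns, and the program
has to take the OS at its word.

```python
import os
import tempfile

def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())   # request a sync to disk; the OS may still defer it

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.dat")  # hypothetical file name
    write_durably(target, b"important bytes")
    with open(target, "rb") as f:
        readback = f.read()
    # Reading the data back succeeds whether or not the sync really
    # happened, so this check cannot detect an ignored fsync().
    print(readback)
```

The read-back is served from the page cache either way, which is why
an OS that ignores the request is indistinguishable to the program
unless the machine actually crashes.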

> A "hint" is completely unuseful in the use-case being discussed. A
> script/css loader needs determinate and predictable behavior, or it's
> completely useless.

"Hints" are still useful if they're widely observed in practice.  The
point of calling them a hint is to make it clear that browsers can
ignore the hint if they have good reason to.  <video preload> is
theoretically a hint, but all browsers plan to implement it and
respect it (although not all do yet).  But they might not respect it
if they know they're on a metered Internet connection, etc.  What I
mean by a "hint" is something like "user agents are expected to
request and cache the resource immediately if practical, but can delay
if they have some specific good reason to".

> Can you imagine if the XHR facility had been specified using this
> wishy-washy "hint" type language? "XHR.send() will tell the browser that
> you'd like an Ajax request to go out at some point in the near future, if
> the browser thinks it's a good idea, and there's not much else going on. And
> the response will probably come back soon, but the browser is free to delay
> handing you back the response if it feels like you're better off without it
> at that moment."

This is basically what happens, actually.  Browsers are not obligated
to send XHR requests immediately.  For instance, browsers will
typically open only a few simultaneous HTTP connections to each domain,
so if all of those connections are in use, the XHR might be delayed
until one is free.  The point is that this is not black-box
detectable, because you can't rely on the speed at which a site will
respond anyway -- maybe the user just has a slow connection.  So
browsers are free to innovate here for the sake of prioritizing one
site over another, one type of request over another, etc.
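
A toy model of that queueing, in Python.  This is only a sketch: the
limit of 2 per host, the host names, and the durations are all
assumptions picked for illustration, not what any particular browser
does.

```python
import heapq

def schedule(requests, max_per_host=2):
    """Return {name: start_time} for requests given as (name, host, duration).

    All requests are issued at time 0.  Each host allows at most
    max_per_host transfers at once; the rest wait, in order, for a slot.
    """
    busy = {}    # host -> min-heap of finish times of in-flight transfers
    starts = {}
    for name, host, duration in requests:
        inflight = busy.setdefault(host, [])
        if len(inflight) < max_per_host:
            start = 0.0                      # a slot is free right away
        else:
            start = heapq.heappop(inflight)  # wait for the earliest slot to free up
        heapq.heappush(inflight, start + duration)
        starts[name] = start
    return starts

starts = schedule([
    ("img1", "example.com", 3.0),
    ("img2", "example.com", 5.0),
    ("xhr", "example.com", 1.0),    # issued while both slots are busy
    ("other", "cdn.example", 1.0),  # different host, so no waiting
])
print(starts)
```

From the page's point of view the delayed XHR just looks like a slow
response, which is why the delay is not something a script can rely on
detecting.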

> It seems like the uneasy tension here is that the spec wants to let browsers
> innovate in features, which (by virtue of the unpredictability of
> "undefined" and experimentation cross-browser) handcuffs script developers
> that want to create more intelligence/innovation in the same areas. Can't
> there be a better balance where the needs of script-based resource loaders
> get some favor from the spec process instead of leaving everything so open
> only in favor of browsers (which most of us don't have much influence over)?

This should be accomplished by a specific API that requests that a
resource be cached, not by attempting to specify in detail the caching
behavior of existing functionality.

On Mon, Dec 20, 2010 at 8:58 AM, Kyle Simpson <getify@gmail.com> wrote:
> Another thing that’s curious to me about your assertion that the spec should
> stay out of such details like "loading":
>
> 4.3.1 The `script` element
> http://www.w3.org/TR/html5/scripting-1.html#running-a-script
>
> "6. If the user agent does not support the scripting language given by the
> script block's type for this script element, then the user agent must abort
> these steps at this point. The script is not executed."
>
> The context of this statement is that it's step 6, and it says to abort the
> steps if the `type` value is invalid. Step 12 is where the browser can fetch
> the resource, so this step 6 tells the browser effectively *not* to
> fetch/load a resource if the type is invalid. Other wording in this section
> seems to speak to the intent, which is that loading of the resource is
> specifically being prevented (not just a hint by the spec, but an
> algorithmic requirement), and that the idea is that this prevents an
> unnecessary load of the resource.

This is reliably black-box detectable, unlike preloading and caching.
The question there is not *when* to load the resource, but *whether*
to load it.  There is obviously a black-box detectable difference
between a UA that fetches and runs the script, and one that does not.
Either the script has been run or not.  Timing, on the other hand, is
not reliably black-box detectable, because you cannot reliably tell
whether a request has been deliberately delayed by the browser or
inadvertently delayed by factors beyond the browser's control (network
congestion, slow response by the site, etc.).
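
Here is a rough Python model of the "whether" case, to show why it is
observable.  It is not the spec's algorithm: the set of supported
types and the names prepare_script and fake_fetch are invented for the
sketch, and the fetched list stands in for a server's request log.

```python
SUPPORTED_TYPES = {"", "text/javascript", "application/javascript"}  # assumed set

def prepare_script(script_type, src, fetch, execute):
    """Toy model: return True if the script was fetched and run."""
    if script_type.strip().lower() not in SUPPORTED_TYPES:
        return False       # abort early: the fetch step is never reached
    body = fetch(src)      # the later step, where the resource is fetched
    execute(body)
    return True

fetched = []   # stands in for the server's request log
ran = []       # records every script body that was executed

def fake_fetch(url):
    fetched.append(url)
    return "/* script body */"

prepare_script("text/javascript", "a.js", fake_fetch, ran.append)
prepare_script("text/x-unknown", "b.js", fake_fetch, ran.append)
print(fetched, ran)
```

Only a.js shows up in either list.  Both the request and the execution
leave a record that can be checked afterwards, and no amount of
waiting changes the answer, which is what separates this from a
question about timing.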

Of course, nothing stops the browser from fetching the resource and
then *not* executing it.  But that would be fairly pointless, wouldn't
it?
Received on Monday, 20 December 2010 16:14:47 UTC