Re: Streams and Blobs from Glenn Maynard on 2013-03-09 (public-webapps@w3.org from January to March 2013)

From: Glenn Maynard <glenn@zewt.org>
Date: Sat, 9 Mar 2013 10:03:32 -0600
To: Jonas Sicking <jonas@sicking.cc>
Cc: Anne van Kesteren <annevk@annevk.nl>, WebApps WG <public-webapps@w3.org>
Message-ID: <CABirCh8v46kL2zjrKpOQCgUdW8v9zPAb_r=iHDRy3sPAaz_iiA@mail.gmail.com>
On Fri, Mar 8, 2013 at 10:40 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> > But what about the issues I mentioned (you snipped them)?  We would be
> > introducing overlap between XHR and every consumer of URLs
> > (HTMLImageElement, HTMLVideoElement, CSS loads, CSS subresources, other
> > XHRs), which could each mean all kinds of potential script-visible
> > interop subtleties.
>
> As long as we define the order between when data is going into the
> Stream, and when the events are fired on the XHR object, I think that
> takes care of these issues.
>

This isn't just between XHR and Stream, it's between XHR/Stream and the API
receiving the data, such as HTMLImageElement and everything else that takes
a URL.

For example, EventSource queues tasks to fire onmessage in the remote event
task source.  This means that the order of XHR events and EventSource
onmessage events will be unspecified.  Some browsers might always send
those onmessage events before sending XHR events, which guarantees that
onmessage will never be received after XHR's onload.  Other browsers might
prioritize XHR's tasks, causing XHR's onload to happen first, so it
wouldn't have that guarantee.
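
As a rough illustration of that ordering problem, here is a toy model of an
event loop with two task sources. It is a sketch, not how any browser is
implemented: runLoop, the source names and the "preferred order" policy are
all made up for the example. The only thing it assumes from the specs is that
order is fixed within one task source, while the choice of which source to
service next is left to the user agent.

```javascript
// Toy event loop: each task source has its own FIFO queue, and the
// "browser" decides which non-empty source to service next.
// All names here are hypothetical.
function runLoop(sources, preferredOrder) {
  const log = [];
  // Copy the queues so the same input can be replayed under another policy.
  const queues = {};
  for (const name of Object.keys(sources)) {
    queues[name] = sources[name].slice();
  }
  for (;;) {
    // Service the first non-empty source in this browser's preferred order.
    const name = preferredOrder.find((n) => queues[n].length > 0);
    if (name === undefined) break;
    log.push(queues[name].shift());
  }
  return log;
}

// One pending task on each source, queued at about the same time.
const pending = {
  remoteEvent: ["EventSource.onmessage"],
  xhr: ["XHR.onload"],
};

const browserA = runLoop(pending, ["remoteEvent", "xhr"]);
const browserB = runLoop(pending, ["xhr", "remoteEvent"]);

console.log(browserA.join(" -> "));
console.log(browserB.join(" -> "));
```

Both orders are consistent with per-source ordering, so a page that relies on
either one is relying on unspecified behaviour.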

(I have no idea how many cases like this exist, and that's what worries me.)

This problem doesn't exist with the finish-and-return approach, since
Stream itself doesn't have any events.


> > - What happens if you do a sync XHR?  It would block forever, since you'll
> > never see the Stream in time to hook it up to a consumer.  You don't want
> > to just disallow this, since then you can't set up streams synchronously
> > at all.  With the "XHR finishes immediately" model, this is
> > straightforward: XHR returns as soon as the headers are finished, giving
> > you the Stream to do whatever you need with.
>
> Sync XHR already can't use .responseType, so there is no way for sync
> XHR to return a Stream object. We should put the same restriction on
> Sync XHR accepting a Stream as a request body.
>

What?  Of course sync XHR can use responseType.
https://zewt.org/~glenn/sync-responsetype.html

And again, you don't want to just disallow this, or you couldn't create
streams synchronously in workers.  We can create a Blob synchronously, then
read data out of the blob synchronously; we should also be able to create a
Stream synchronously, and parse data from it synchronously.

> > - What if you create an async XHR, then hook it up to a sync XHR?  Async
> > XHR only does work during the event loop, so this would deadlock (the
> > async XHR would never run to feed data to the sync one).
>
> Same as above.
>

It's not the same.  It's an async XHR returning a stream, and a plain
responseType="" sync XHR.

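// Async XHR that hands back a Stream.
var client = new XMLHttpRequest();
client.responseType = "stream";
client.open("GET", url);
client.send();
// "onsomething" stands for whatever event means the Stream is available.
client.onsomething = function() {
    var sync = new XMLHttpRequest();
    var url2 = URL.createObjectURL(client.response);
    // Sync request for the Stream's URL: it blocks inside this handler, so
    // the async XHR never gets back to the event loop to feed it.
    sync.open("GET", url2, false);
    sync.send();
};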

Any future synchronous worker APIs that could fetch data from a URL would
have the same problem with this design (there aren't any others yet that I
know of).
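
The deadlock in the example above can be modelled without a browser. This is
a minimal sketch under stated assumptions: the task queue, the stream buffer
and the asyncXhrSend/syncRead helpers are all hypothetical stand-ins, and the
spin limit only exists so that the sketch terminates where a real sync XHR
would wait forever.

```javascript
// Toy model of the async-producer / sync-consumer deadlock.
// All names are hypothetical.
const tasks = [];   // event-loop task queue
const stream = [];  // chunks buffered in the Stream

// The async XHR feeds the Stream only from tasks run by the event loop.
function asyncXhrSend(chunks) {
  for (const chunk of chunks) {
    tasks.push(() => stream.push(chunk));
  }
}

// A sync read blocks the thread, so no queued task can run while it waits.
// The spin limit stands in for "blocks forever".
function syncRead(spinLimit) {
  for (let i = 0; i < spinLimit; i++) {
    if (stream.length > 0) return stream.shift();
  }
  return null; // a real sync XHR would have deadlocked here
}

asyncXhrSend(["chunk-1", "chunk-2"]);

// Inside the async XHR's event handler: the producer cannot run yet.
const insideHandler = syncRead(1000);

// Only once the handler returns does the loop get to run the queued tasks.
while (tasks.length > 0) tasks.shift()();
const afterLoop = syncRead(1000);

console.log(insideHandler);
console.log(afterLoop);
```

The read inside the handler never sees data, because the tasks that would
supply it are stuck behind the handler itself.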

> > - You could set up an async XHR in one worker, then read it synchronously
> > with XHR in another worker.  This means the first worker could block the
> > second worker at will, eg. by running a blocking operation during an
> > onprogress event, to prevent returning to the event loop.  I'm sure we
> > don't want to allow that (at least without careful thought, eg. the
> > "synchronous messaging" idea).
>
> This is a good point. We probably shouldn't allow sync XHR in workers
> either to accept or produce Stream objects.
>

Now you can't stream synchronously, and now XHR suddenly cares about
whether an object URL comes from a Stream instead of being
protocol-agnostic (meaning black-box APIs that take a URL won't work
with them either).  This is adding weird restrictions to work around
problems with the design.

> > With the supply-the-stream-and-it's-done model, XHR follows the same model
> > it normally does: you start a request, XHR does some work, and onload is
> > fired once the result is ready for you to use.
>
> This is not correct. All of .response, .responseText and .responseXML
> are often available much before that.
>

Nope.  The only time any of these is available (per spec) before the DONE
state is in responseType = "text", making that the exception.  All the rest
are only available in DONE.  The most common pattern with XHR is to call
send(), then wait for onload (or one of the other redundant events fired at
that time), which this approach follows.
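
A small sketch of that rule, as a paraphrase of the spec's response getter
rather than a quote from it. The function name responseReadable is made up,
error states are ignored, and the default "" type is grouped with "text"
since it exposes partial text the same way.

```javascript
// XMLHttpRequest readyState values.
const LOADING = 3;
const DONE = 4;

// Returns true if xhr.response can hold data in the given state.
// Hypothetical helper; ignores errors and aborted requests.
function responseReadable(responseType, readyState) {
  if (responseType === "" || responseType === "text") {
    // Text responses expose the data received so far while loading.
    return readyState === LOADING || readyState === DONE;
  }
  // "arraybuffer", "blob", "document", "json": null until DONE.
  return readyState === DONE;
}

console.log(responseReadable("text", LOADING));
console.log(responseReadable("blob", LOADING));
console.log(responseReadable("blob", DONE));
```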

> > With the runs-for-the-duration-of-the-stream model, when is the .response
> > available?
>
> Ideally as soon as .send() is called. If that causes problem then
> maybe as soon as we enter readystate 3.
>

You need to know the MIME type to create a Stream, so you can't create it
until you've received headers.  This means that you'd have to wait for a
different condition to know when the response is ready to be used than just
about every other use of XHR.

-- 
Glenn Maynard
Received on Saturday, 9 March 2013 16:04:04 UTC