- From: Glenn Maynard <glenn@zewt.org>
- Date: Thu, 7 Mar 2013 18:42:19 -0600
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: Anne van Kesteren <annevk@annevk.nl>, WebApps WG <public-webapps@w3.org>
- Message-ID: <CABirCh8WrxMgwp=OfSiPZO1HKGd88OZrA_XPpcZu2fxiLh5fnQ@mail.gmail.com>
If we decide to do streaming with the Streams API, then the StreamReader API, at least, seems to need some work. In particular, it seems designed only with binary protocols that specify block sizes in mind; it can't handle textual protocols at all. For example, it couldn't be used to parse a keepalive HTTP stream, since you don't know the size of the headers in advance. You want to read the data as quickly as possible, whenever new data becomes available. Parsing EventSource has the same problem. Put in socket terms, StreamReader only lets you do a blocking (in the socket sense) read() of a fixed size; it doesn't let you set O_NONBLOCK, monitor the socket with select(), and read data as it becomes available.

StreamBuilder, used to source data from script to native, makes me nervous, since unless it's done carefully it seems like it would expose a lot of implementation details. For example, the I/O block size used by HTMLImageElement's network fetch could be exposed as a side effect of when "thresholdreached" is fired, which could lead to websites that only work with the block size of a particular browser.

Also, what happens if a synchronous API (e.g. sync XHR) reads from a URL that's sourced from a StreamBuilder? That would deadlock. It could also happen cross-thread, with two XHRs each reading a URL sourced from the other.

On Thu, Mar 7, 2013 at 3:37 AM, Jonas Sicking <jonas@sicking.cc> wrote:

> This seems awkward and not very future proof. Surely other things than
> HTTP requests can generate streams of data. For example the TCPSocket
> API seems like a good candidate of something that can generate a
> stream. Likewise WebSocket could do the same for large message frames.
>
> Other potential sources of data streams is obviously local file data,
> and simply generating content using javascript.
>
> So I think it makes a lot more sense to create the concept of Stream
> separate from the XHR object.

Most of the thread has been about whether to use the Stream API, which already exists (in spec form) and is used by XHR (also in spec form -- I don't think any of this is in production): https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm

If we do use that API, or one like it, another question that's been raised is how XHR and the Stream object should interact. I think the Stream object created by XHR should represent the actual stream, with the Stream itself being the data source; XHR is just a factory for the Stream object. This way, XHR gets out of the picture immediately and a lot of complexity goes away. For example, if you load an image from a Stream created by XHR, both XHR and HTMLImageElement fire onload. Which one is fired first? Are they fired synchronously, or can other things happen in between them? They're on different task sources, so how do you ensure ordering at all? What about onerror? What if you call .abort() on the XHR during its 100% .progress event, after the image has loaded the data but before it's fired onload? The same questions apply to various events in every API that might receive a Stream.

The alternative argument is that XHR should represent the data source, reading data from the network and pushing it to the Stream.

> An important difference between a stream and a blob is that you can
> read the contents of a Blob multiple times, while a stream is
> optimized for lower resource use by throwing out data as soon as it
> has been consumed. I think both are needed in the web platform.
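To make the distinction in that last quoted paragraph concrete, here's a minimal sketch. The Blob and FileReader calls are the real APIs; the stream reader object is a stand-in for illustration, not the draft StreamReader interface:

```js
// A Blob can be sliced and re-read any number of times (real APIs):
var blob = new Blob(["some response data"]);
var reader = new FileReader();
reader.onload = function() { console.log(reader.result.byteLength); };
reader.readAsArrayBuffer(blob.slice(0, 4));
// ...and nothing stops you from reading the same bytes again later.

// A stream, by contrast, hands each chunk out once and then drops it.
// `streamReader` below is a stand-in, not the draft StreamReader API:
var streamReader = {
  read: function(callback) { /* delivers the next chunk when it arrives */ }
};
streamReader.read(function(chunk) {
  // After this callback runs, the stream no longer buffers `chunk`;
  // there is no way to seek back and read those bytes a second time.
});
```

If a consumer needs to re-read stream data, something has to buffer it again (for example by keeping it around as a Blob), which is exactly the trade-off being discussed.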
For the streaming-to-script case (not to native), Blobs can be used as the API while still allowing the data to be discarded; just use multiple Blobs that happen to share a backing store. That's what my "clearResponse()" proposal does. It handles both the "expose the whole response incrementally" and the "expose data in chunks, discarding as you go" cases.

If we want the functionality of moz-blob and friends directly on XHR (even if we also have a Stream API and they're redundant with it), I think the clearResponse approach is clearer (the exact place where the response data is emptied is explicit), gives better usability (you can choose when and whether to clear it), and doesn't overload responseType with a second axis. (A rough usage sketch follows at the end of this message.)

> But this difference is important to consider with regards to
> connecting a Stream to a <video> or <audio> element. Users generally
> expect to be able to rewind a media element which would mean that if
> you can connect a Stream to a media element, the element would need to
> buffer the full stream.
>
> But this isn't an issue that we need to tackle right now. What I think
> the first thing to do is to create a Stream primitive, figure out
> what API would go directly on it, or what new interfaces needs to be
> created to allow getting the data out of it, and a way to get XHR to
> produce such a primitive.

Just to sum up the earlier discussion: if you only need to do simple fetches (a GET with no custom headers and so on), then you don't need any of this; all you need to do is hand the URL to the API, as with <video>. That's simple and robust: the receiving API can restart failed connections transparently (you can't do that if you're handed a stream from a POST, and it's not clear you can do it even for a GET) instead of pushing that onto developers, and it's possible to close and reopen the stream, seek around as needed, and so on. You only need a middleman like Stream for streams that are based on complex requests (e.g. a POST, or script-sourced data).
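For reference, here's roughly what the clearResponse() approach described above might look like from script. This is a sketch under assumptions: only the clearResponse() name comes from the proposal; the use of responseType = "blob", the idea that .response holds a Blob of the data received since the last clear, and the handleChunk() helper are illustrative, not spec text:

```js
// Sketch only: assumes .response is a Blob covering everything received
// since the last clear, and that clearResponse() empties that buffer.
var xhr = new XMLHttpRequest();
xhr.open("GET", "/event-stream");
xhr.responseType = "blob";
xhr.onprogress = function() {
  var chunk = xhr.response;   // Blob of data received since the last clear
  if (chunk && chunk.size) {
    handleChunk(chunk);       // e.g. hand it to a FileReader
    xhr.clearResponse();      // explicitly discard the buffered data;
                              // later .response Blobs start after this point
  }
};
xhr.send();

function handleChunk(blob) {
  var reader = new FileReader();
  reader.onload = function() { /* parse reader.result */ };
  reader.readAsText(blob);
}
```

The point of the sketch is that the place where buffered data is dropped is the explicit clearResponse() call, rather than being implied by a separate responseType mode.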
--
Glenn Maynard

Received on Friday, 8 March 2013 00:42:47 UTC