Re: Overlap between StreamReader and FileReader

I figured I should chime in with some ideas of my own because, well, why not :-)

First off, I definitely think the semantic model of a Stream shouldn't
be "a Blob without a size", but rather "a Blob without a size that you
can only read from once". I.e. the implementation should be able to
discard data as it passes it to a reader.

That said, many Stream APIs support the concept of a "T" (a tee). This
enables splitting a Stream into two Streams, which in turn allows
multiple consumers of the same data source. However, when a T is
created, it only returns the data that is still unread in the original
Stream. It does not return the data from the beginning of the stream,
since that would prevent streams from discarding data as soon as it
has been read.
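
To make the T semantics concrete, here is a minimal sketch; tee() is a
hypothetical method name I'm making up for illustration, not part of
any existing proposal:

// Hypothetical: stream.tee() returns a second Stream.
var copy = stream.tee();
// "copy" only sees data from this point onward, not data from the
// beginning of the stream, so the implementation can still discard
// data once both streams have consumed it.
consumerA(stream);
consumerB(copy);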

If we are going to have a StreamReader API, then I don't think we
should model it after FileReader. FileReader unfortunately followed
the model of XMLHttpRequest (based on requests from several
developers), but that is a pretty terrible API, and I believe we can
do much better. And obviously we should do something based on
Futures :-)

For File reading I would now instead do something like

partial interface Blob {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional BlobReadParams params);
  AbortableProgressFuture<DOMString> readText(optional BlobReadTextParams params);
  Stream readStream(optional BlobReadParams params);
};

dictionary BlobReadParams {
  long long start;
  long long length;
};

dictionary BlobReadTextParams : BlobReadParams {
  DOMString encoding;
};
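
As a rough illustration of how that would look in use (this assumes
AbortableProgressFuture exposes then() with success and error
callbacks, which is my reading of the Futures discussion rather than a
settled API):

var blob = new Blob(["hello world"]);

// Read a slice as text, decoded as UTF-8.
blob.readText({ start: 0, length: 1024, encoding: "utf-8" }).then(
  function(text) {
    console.log("got " + text.length + " characters");
  },
  function(err) {
    console.error("read failed", err);
  });

// Or hand the data off as a one-shot Stream.
var stream = blob.readStream({ start: 0, length: blob.size });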

For Stream reading, I think I would do something like the following:

interface Stream {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
  AbortableProgressFuture<DOMString> readText(optional unsigned long long size, optional DOMString encoding);
  AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);

  ChunkedData readBinaryChunked(optional unsigned long long size);
  ChunkedData readTextChunked(optional unsigned long long size);
};

interface ChunkedData : EventTarget {
  attribute EventHandler ondata;
  attribute EventHandler onload;
  attribute EventHandler onerror;
};

For all of the above functions, if a size is not passed, the rest of
the Stream is read.

The ChunkedData interface allows incremental reading of a stream: as
soon as data is available, a "data" event is fired on the ChunkedData
object, carrying the data received since the last "data" event. Once
we've reached the end of the stream, or the requested size, the "load"
event is fired on the ChunkedData object.

So the read* functions allow a consumer to pull data, whereas the
read*Chunked functions allow consumers to have the data pushed at
them. There are also other potential functions we could add which
allow hybrids, but that seems overly complex for now.
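
For completeness, a pull-style consumer would look roughly like this.
Note that readBinary resolving with a zero-length ArrayBuffer at end
of stream is my assumption here, not something specified above:

function pullAll(stream, onDone) {
  stream.readBinary(4096).then(function(chunk) {
    if (chunk.byteLength === 0) {
      // Assumed end-of-stream signal; see the "eof" question below.
      onDone();
      return;
    }
    handleChunk(chunk);
    pullAll(stream, onDone); // pull the next chunk when we're ready
  });
}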

Other functions we could add are peekText and peekBinary, which would
allow looking into the stream to determine whether you're able to
consume the data that's there, or whether you should pass the Stream
to some other consumer.
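
Hypothetically (assuming peekText returns a Future and takes a size,
neither of which is specified anywhere yet):

// peekText returns data without consuming it from the stream.
stream.peekText(4).then(function(magic) {
  if (magic === "PACK") {
    consumePackStream(stream);
  } else {
    passToFallbackConsumer(stream);
  }
});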

We might also want to add an "eof" flag to the Stream interface, as
well as an event which is fired when the end of the stream is reached
(or should that be modeled using a Future?).

/ Jonas

On Fri, May 17, 2013 at 5:02 AM, Takeshi Yoshino <tyoshino@google.com> wrote:
> On Fri, May 17, 2013 at 6:15 PM, Anne van Kesteren <annevk@annevk.nl> wrote:
>>
>> The main problem is that Stream per the Streams API is not what you expect
>> from an IO stream, but is more what Blob should've been (Blob
>> without synchronous size). What we want, I think, is a real IO stream.
>> Whether we also need Blob without synchronous size is less clear to me.
>
>
> Forgetting File API completely, for example, ... how about a simple
> socket-like interface?
>
> // Downloading big data
>
> var remaining;
> var type = null;
> var payload = '';
> function processData(data) {
>   var offset = 0;
>   while (offset < data.length) {
>     if (type === null) {
>       // First byte of a message identifies its type.
>       type = data.substr(offset, 1);
>       offset += 1;
>       remaining = payloadSize(type);
>     } else {
>       var consume = Math.min(remaining, data.length - offset);
>       payload += data.substr(offset, consume);
>       offset += consume;
>       remaining -= consume;
>     }
>     if (type !== null && remaining == 0) {
>       if (type == FOO) {
>         foo(payload);
>       } else if (type == BAR) {
>         bar(payload);
>       }
>       type = null;
>       payload = '';
>     }
>   }
> }
>
> var client = new XMLHttpRequest();
> client.onreadystatechange = function() {
>   if (this.readyState == this.LOADING) {
>     var responseStream = this.response;
>     responseStream.setBufferSize(1024);
>     responseStream.ondata = function(evt) {
>       processData(evt.data);
>       // Consumed data will be invalidated and the memory used for
>       // the data will be released.
>     };
>     responseStream.onclose = function() {
>       // Reached end of response body
>       ...
>     };
>     responseStream.start();
>     // Now responseStream starts forwarding events that happen on XHR
>     // to its callbacks.
>   }
> };
> client.open("GET", "/foobar");
> client.responseType = "stream";
> client.send();
>
> // Uploading big data
>
> var client = new XMLHttpRequest();
> client.open("POST", "/foobar");
>
> var requestStream = new WriteStream(1024);
>
> var producer = new Producer();
> producer.ondata = function(evt) {
>   requestStream.send(evt.data);
> };
> producer.onclose = function() {
>   requestStream.close();
> };
>
> client.send(requestStream);
>

Received on Saturday, 18 May 2013 04:39:14 UTC