Re: XMLHttpRequest.responseBlob

On Wed, Apr 28, 2010 at 12:45 PM, Darin Fisher <darin@chromium.org> wrote:
> On Wed, Apr 28, 2010 at 11:57 AM, Michael Nordman <michaeln@google.com>
> wrote:
>> On Wed, Apr 28, 2010 at 11:21 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>
>>> Ugh, sent this originally to just Darin. Resending to the list.
>>>
>>> On Wed, Apr 28, 2010 at 10:11 AM, Darin Fisher <darin@chromium.org>
>>> wrote:
>>> > On Tue, Apr 27, 2010 at 2:04 PM, Jonas Sicking <jonas@sicking.cc>
>>> > wrote:
>>> >>
>>> >> On Tue, Apr 27, 2010 at 1:59 PM, Darin Fisher <darin@chromium.org>
>>> >> wrote:
>>> >> > On Tue, Apr 27, 2010 at 1:33 PM, Jonas Sicking <jonas@sicking.cc>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Tue, Apr 27, 2010 at 1:26 PM, Darin Fisher <darin@chromium.org>
>>> >> >> wrote:
>>> >> >> >> It would be nice to be able to allow streaming such that every time a
>>> >> >> >> progress event is fired only the newly downloaded data is available.
>>> >> >> >> The UA is then free to throw away that data once the event is done
>>> >> >> >> firing. This would be useful in the cases when the page is able to do
>>> >> >> >> incremental parsing of the resulting document.
>>> >> >> >>
>>> >> >> >> If we add a 'load mode' flag on XMLHttpRequest, which can't be
>>> >> >> >> modified after send() is called, then streaming to a Blob could simply
>>> >> >> >> be another enum value for such a flag.
>>> >> >> >>
>>> >> >> >> There is still the problem of how the actual blob works. I.e. does
>>> >> >> >> .responseBlob return a new blob every time more data is returned? Or
>>> >> >> >> should the same Blob be constantly modifying? If modifying, what
>>> >> >> >> happens to any in-progress reads when the file is modified? Or do you
>>> >> >> >> just make the Blob available once the whole resource has been
>>> >> >> >> downloaded?
>>> >> >> >>
>>> >> >> >
>>> >> >> >
>>> >> >> > This is why I suggested using FileWriter.  FileWriter already has to
>>> >> >> > deal with most of the problems you mentioned above,
>>> >> >>
>>> >> >> Actually, as far as I can tell FileWriter is write-only so it doesn't
>>> >> >> deal with any of the problems above.
>>> >> >
>>> >> > When you use createWriter, you are creating a FileWriter to an existing
>>> >> > File.
>>> >> > The user could attempt to create a FileReader to the very same File while
>>> >> > a FileWriter is open to it.
>>> >> > It is true that for <input type=saveas> there is no way to get at the
>>> >> > underlying File object.  That is perhaps a good thing for the use case of
>>> >> > downloading to a location specified by the user.
>>> >>
>>> >> Ah. But as far as I can tell (and remember), it's still fairly
>>> >> undefined what happens when the OS file under a File/Blob object is
>>> >> mutated.
>>> >>
>>> >> / Jonas
>>> >
>>> > Agreed.  I don't see it as a big problem.  Do you?  The application
>>> > developer is in control.  They get to specify the output file (via
>>> > FileWriter) that XHR sends its output to, and they get to know when XHR
>>> > is done writing.  So, the application developer can avoid reading from
>>> > the file until XHR is done writing.
>>>
>>> Well, it seems like a bigger deal here since the file is being
>>> constantly modified as we're downloading data into it, no? So for
>>> example if you grab a File object after the first progress event, what
>>> does that File object contain after the second? Does it contain the
>>> whole file, including the newly downloaded data? Or does it contain
>>> only the data after the first progress event? Or is the File object
>>> now invalid and can't be used?
>>
>> What gears did about that was to provide a 'snapshot' of the
>> downloaded data each time responseBlob was called, with
>> the 'snapshot' being consistent with the progress events
>> having been seen by the caller. The 'snapshot' would remain
>> valid until discarded by the caller. Each snapshot just provided
>> a view onto the same data which maybe was in memory or
>> maybe had spilled over to disk unbeknownst to the caller.

That sounds like a good solution.

It does mean that if you want to read the whole file, you can't even
*start* reading until the whole file has been downloaded. But that
might be ok given that file reads should be relatively fast.
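
To make the snapshot semantics concrete, here is a rough sketch of how I
read the description above. It is only an illustration: DownloadBuffer,
append() and snapshot() are hypothetical names, not the Gears API or
anything from a spec, and it assumes chunks are never mutated once they
have been appended.

```javascript
// Hypothetical sketch; none of these names come from Gears or any spec.
class DownloadBuffer {
  constructor() {
    this.chunks = []; // received chunks, in arrival order
    this.length = 0;  // total bytes received so far
  }

  // Called as data arrives, e.g. once per progress event.
  append(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.length;
  }

  // Returns a snapshot fixed at the current length. Chunks are assumed
  // immutable after append, so the snapshot can share them without copying.
  snapshot() {
    const chunks = this.chunks.slice();
    const length = this.length;
    return {
      size: length,
      // Concatenates only the chunks that existed when the snapshot was taken.
      read() {
        const out = new Uint8Array(length);
        let offset = 0;
        for (const c of chunks) {
          out.set(c, offset);
          offset += c.length;
        }
        return out;
      },
    };
  }
}

const buffer = new DownloadBuffer();
buffer.append(Uint8Array.of(1, 2, 3)); // first progress event
const early = buffer.snapshot();
buffer.append(Uint8Array.of(4, 5));    // second progress event
const late = buffer.snapshot();
```

Here `early` keeps describing the first three bytes even after more data
arrives, while `late` covers all five. A caller that wants the whole
resource therefore has to wait and take its snapshot after the final
event.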

I have always liked the idea of a Blob that's backed by a network
request, but that's mostly convenience sugar over simply using XHR, so
not terribly important.

>>> Sure, if the use case is actually downloading and saving a file for
>>> the user to use, rather than for the page to use, then a FileWriter
>>> seems like it would work. I.e. if you want something like
>>> "Content-Disposition: attachment", but where you can specify request
>>> headers. Is that the use case?
>>
>> Mods to xhr to access the response more opaquely is a fairly general
>> feature request. One specific use case is to download a resource via xhr
>> and then save the results in a sandboxed file system. So "for the page to
>> use".
>
> ^^^ That is the use case I'm primarily interested in.

Ah, I see. I guess I don't care very much one way or another, mostly
because we don't have any immediate plans to implement the file system
spec. But the use case makes sense to me.

>> The notion of having a streaming interface on xhr is interesting. That with a
>> BlobBuilder capability could work. If a streaming xhr mode provided new
>> data in the form of 'blobs' where each blob was just the newly received data,
>> the caller could use a BlobBuilder instance to concatenate the set of received
>> data blobs. And then take blobBuilder.getBlob() and do what they will with it.
>>
>>   xhr.ondatareceived = function (data) { builder.appendBlob(data); }
>
> ^^^ I like that proposal for streaming.

The only thing I would say is that I'm not sure that in the streaming
case it really makes sense to provide a Blob that contains the data
received since the last progress event. I would imagine that in most
cases this won't be very much data, and usually the data will be in
memory anyway. So providing it in the form of a BinaryArray (or
whatever) seems better.

You could still use the same code as in your example above as the
BlobBuilder should be able to append a BinaryArray.
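
A rough sketch of what I mean, under stated assumptions: ChunkBuilder
is a hypothetical stand-in for BlobBuilder, onDataReceived stands in for
the proposed ondatareceived callback, and a Uint8Array plays the role of
the BinaryArray. None of these are real or proposed APIs.

```javascript
// Hypothetical sketch; ChunkBuilder stands in for BlobBuilder and
// onDataReceived for the proposed xhr.ondatareceived callback.
class ChunkBuilder {
  constructor() {
    this.parts = []; // appended chunks, in arrival order
    this.size = 0;   // total bytes appended so far
  }

  // Accepts an in-memory binary chunk. The chunk is copied so that later
  // reuse of the caller's buffer cannot change what was appended.
  append(bytes) {
    const copy = Uint8Array.from(bytes);
    this.parts.push(copy);
    this.size += copy.length;
  }

  // Concatenates everything appended so far into one array.
  getBytes() {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const part of this.parts) {
      out.set(part, offset);
      offset += part.length;
    }
    return out;
  }
}

const builder = new ChunkBuilder();
const onDataReceived = (data) => builder.append(data);

// Simulate two delivery events, each carrying only newly received bytes.
onDataReceived(Uint8Array.of(72, 105));
onDataReceived(Uint8Array.of(33));
const result = builder.getBytes();
```

The handler body is the same one-liner as in the example above; only the
type of the delivered chunk changes.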

/ Jonas

Received on Wednesday, 28 April 2010 20:39:25 UTC