Re: XMLHttpRequest.responseBlob

On Mon, Apr 26, 2010 at 11:03 PM, Darin Fisher <darin@chromium.org> wrote:
> On Mon, Apr 26, 2010 at 3:52 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Mon, Apr 26, 2010 at 3:39 PM, Darin Fisher <darin@chromium.org> wrote:
>> > On Mon, Apr 26, 2010 at 3:29 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >>
>> >> On Mon, Apr 26, 2010 at 3:21 PM, Darin Fisher <darin@chromium.org>
>> >> wrote:
>> >> > There is some interest from application developers at Google in
>> >> > being able to get a Blob corresponding to the response body of a
>> >> > XMLHttpRequest.
>> >> > The use case is to improve the efficiency of getting a Blob from a
>> >> > binary resource downloaded via XHR.
>> >> > The alternative is to play games with character encodings so that
>> >> > responseText can be used to fetch an image as a string, and then use
>> >> > BlobBuilder to reconstruct the image file, again being careful with
>> >> > the implicit character conversions.  All of this is very inefficient.
>> >> > Is there any appetite for adding a responseBlob getter on XHR?
>> >>
>> >> There has been talk about exposing a responseBody property which would
>> >> contain the binary response. However ECMAScript is still lacking a
>> >> binary type.
>> >>
>> >> Blob does fit the bill in that it represents binary data, however its
>> >> asynchronous nature is probably not ideal here, right?
>> >>
>> >> / Jonas
>> >
>> >
>> > I think there are applications that do not require direct access to the
>> > response data.
>> > For example,
>> > 1- Download a binary resource (e.g., an image) via XHR.
>> > 2- Load the resource using Blob.URN (assuming URN moves from File to
>> > Blob).
>> > It may be the case that providing direct access to the response data
>> > may be more expensive than just providing the application with a
>> > handle to the data.  Consider the case of large files.
>>
>> Ah, so you want the ability to have the XHR implementation stream to
>> disk and then use a Blob to read from there? If so, you need more
>> modifications as currently the XHR implementation is required to keep
>> the whole response in memory anyway in order to be able to implement
>> the .responseText property.
>>
>> So we'll need to add some way for the page to indicate to the
>> implementation "I don't care about the .responseText or .responseXML
>> properties, just responseBlob"
>
> I thought about this more, and I came to the same conclusion as you.  I
> agree that we wouldn't want to support .responseText or .responseXML if we
> were streaming directly to a file because the implied synchronous readback
> from disk would suck.
> I'm not sure how to add such a feature to XHR in a way that is not awkward.
>  Perhaps if there was a way to bind a FileWriter to an XMLHttpRequest object
> prior to calling send?

Hmm.. what would that look like? Can you give an example of an API?

I've been thinking for a while that we should add a 'streaming mode'
for XHR anyway. Right now XHR is very inefficient when loading large
amounts of data, for two reasons. First, the data needs to be
appended to an existing buffer constantly, resulting in O(n^2)
behavior. I.e. if you receive data in 1500-byte packets and download
1MB of data, you end up first allocating 1500 bytes, then 3000, 4500,
6000, etc., all the way up to the total file size. Second, as discussed
here, you have to store the whole resulting file in memory.
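
To put rough numbers on the first point, here is a minimal sketch that
only counts bytes. It assumes a simplified model in which every packet
forces a fresh allocation plus a copy of everything received so far;
real implementations may grow buffers differently. The function names
are made up for illustration.

```typescript
// Count how many bytes get copied while receiving `totalBytes` in packets
// of `packetSize`, assuming each packet forces a new, larger buffer and a
// copy of everything received so far (a simplified model, not the actual
// code of any browser).
function bytesCopiedByRegrowing(totalBytes: number, packetSize: number): number {
  let received = 0;
  let copied = 0;
  while (received < totalBytes) {
    const packet = Math.min(packetSize, totalBytes - received);
    copied += received; // old contents are copied into the larger buffer
    received += packet;
  }
  return copied;
}

// Alternative: keep each packet as its own chunk and join them once at the
// end, so every byte is copied exactly once.
function bytesCopiedByChunkList(totalBytes: number): number {
  return totalBytes;
}

const oneMegabyte = 1024 * 1024;
console.log(bytesCopiedByRegrowing(oneMegabyte, 1500)); // 366975000
console.log(bytesCopiedByChunkList(oneMegabyte));       // 1048576
```

Under that model a 1MB download copies roughly 367MB in total, which is
the quadratic growth described above.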

It would be nice to be able to allow streaming such that every time a
progress event is fired only the newly downloaded data is available.
The UA is then free to throw away that data once the event is done
firing. This would be useful in the cases when the page is able to do
incremental parsing of the resulting document.
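
As an illustration of the kind of incremental parsing this would
enable, here is a small sketch of a consumer that is handed only the
newly downloaded text each time and retains nothing but an incomplete
trailing line. The class name LineSplitter and the idea of calling
push() from a progress handler are assumptions for the example, not a
proposed API.

```typescript
// Hypothetical consumer for a streaming mode: each call to push() receives
// only the newly downloaded text, and only an incomplete trailing line is
// kept between calls.
class LineSplitter {
  private pending = "";

  // Returns the complete lines made available by this chunk.
  push(chunk: string): string[] {
    const parts = (this.pending + chunk).split("\n");
    const tail = parts.pop();
    this.pending = tail === undefined ? "" : tail;
    return parts;
  }

  // Returns whatever is left once the download has finished.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}

const splitter = new LineSplitter();
const first = splitter.push("alpha\nbe"); // one complete line: "alpha"
const second = splitter.push("ta\ngam");  // one complete line: "beta"
const rest = splitter.flush();            // leftover text: "gam"
console.log(first, second, rest);
```

Since the parser never looks back at earlier chunks, the UA would be
free to discard each one as soon as the handler returns.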

If we add a 'load mode' flag on XMLHttpRequest, which can't be
modified after send() is called, then streaming to a Blob could simply
be another enum value for such a flag.
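
A rough sketch of what I mean by such a flag, to make the "can't be
modified after send()" part concrete. Everything here is hypothetical:
LoadMode, SketchRequest, setLoadMode and the three mode names are
placeholders for the example and not part of XMLHttpRequest.

```typescript
// Hypothetical names throughout: LoadMode, SketchRequest and setLoadMode
// are illustrations only and are not part of XMLHttpRequest.
type LoadMode = "buffered" | "streaming" | "blob";

class SketchRequest {
  private mode: LoadMode = "buffered";
  private sent = false;

  // The mode may only be chosen before send() is called.
  setLoadMode(mode: LoadMode): void {
    if (this.sent) {
      throw new Error("load mode cannot be changed after send()");
    }
    this.mode = mode;
  }

  getLoadMode(): LoadMode {
    return this.mode;
  }

  // Network activity is omitted; only the locking behavior is modeled.
  send(): void {
    this.sent = true;
  }
}

const request = new SketchRequest();
request.setLoadMode("blob");
request.send();
// A later request.setLoadMode("streaming") would throw.
```

Fixing the mode up front is what would let the implementation decide,
before any data arrives, whether it needs to keep the response in
memory at all.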

There is still the problem of how the actual blob works. I.e. does
.responseBlob return a new blob every time more data is returned? Or
should the same Blob be constantly modified? If modified, what
happens to any in-progress reads when the file is modified? Or do you
just make the Blob available once the whole resource has been
downloaded?

/ Jonas

Received on Tuesday, 27 April 2010 18:02:24 UTC