Re: XHR responseArrayBuffer attribute: suggestion to replace "asBlob" with "responseType" from Darin Fisher on 2010-10-27 (public-webapps@w3.org from October to December 2010)

From: Darin Fisher <darin@chromium.org>
Date: Tue, 26 Oct 2010 23:39:10 -0700
To: Boris Zbarsky <bzbarsky@mit.edu>
Cc: Chris Rogers <crogers@google.com>, Web Applications Working Group WG <public-webapps@w3.org>, Anne van Kesteren <annevk@opera.com>, Eric Uhrhane <ericu@google.com>, michaeln@google.com, Alexey Proskuryakov <ap@webkit.org>, Chris Marrin <cmarrin@apple.com>, Geoffrey Garen <ggaren@apple.com>, jorlow@google.com, jamesr@chromium.org
Message-ID: <AANLkTi=ArOA5mQOOfuMUOotfX93Uvd6Z9U1KWE2Ls8O-@mail.gmail.com>

On Mon, Oct 25, 2010 at 3:33 PM, Boris Zbarsky <bzbarsky@mit.edu> wrote:

> On 10/25/10 6:21 PM, Chris Rogers wrote:
>
>>  People are concerned that it would require keeping two copies of the
>> data around (raw bytes, and unicode text version) since it's unknown
>> up-front whether "responseText", or "responseArrayBuffer" will be
>> accessed.
>>
>
> Note that Gecko does exactly that, and we've seen no problems with it...
>  It's very rare to have really large XHR bodies, for what it's worth.
>
>
>  This approach does seem a little strange because of the mutually
>> exclusive nature of the access.  However, it seems that it would be hard
>> to come up with a reasonable use case where both the raw bytes *and* the
>> text would be needed for the same XHR.
>>
>
> You could have a situation where a library dispatches the XHR for you and
> always looks at the text for some reason (libraries tend to do that sort of
> thing) while you actually want the bytes.
>
> In general, this approach seems really fragile and hostile to web
> developers.
>
>
>  2. Darin Fisher has suggested this approach: Add an "asArrayBuffer"
>> attribute (similar to "asBlob") which *must* be set before sending the
>> request if there will be a subsequent access of the
>> "responseArrayBuffer" attribute.
>>
>
> This makes it impossible to decide whether to look at the text or the bytes
> based on the content-type of the response (unless you allow setting the
> attribute in some early-enough onStateChange callback _and_ libraries expose
> XHRs in that early a state to consumers); having that ability seems like a
> desirable feature.

I think XMLHttpRequest is trying to be too much of a kitchen-sink.  It seems
pretty unfortunate that our networking API has XML parsing features, for
example.  That should be a separate component, but alas, we cannot change
history.

Ideally, there'd be separate components that operate on an ArrayBuffer and
produce a decoded string / XML document.  Then, for the use case you are
talking about, people could just ask for the response as an ArrayBuffer,
inspect the response headers, and then optionally invoke a text decoder
interface or an XML parser / DOM builder interface.
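
To make the idea concrete, here is a rough TypeScript sketch of that flow:
take the response as an ArrayBuffer, look at the Content-Type header, and
only then decide whether to run a text decoder. The names Decoded, charsetOf
and decodeByContentType are hypothetical, and the sketch assumes a standalone
decoder such as the Encoding API's TextDecoder is available; it is an
illustration, not a proposed interface.

```typescript
// Hypothetical result type: either decoded text or the untouched bytes.
type Decoded =
  | { kind: "text"; text: string }
  | { kind: "bytes"; bytes: ArrayBuffer };

// Extract the charset parameter from a Content-Type value.
// Assumption for this sketch: fall back to UTF-8 when none is given.
function charsetOf(contentType: string): string {
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
  return match ? match[1] : "utf-8";
}

// Decide after the headers arrive whether the body should be decoded.
function decodeByContentType(bytes: ArrayBuffer, contentType: string): Decoded {
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  const isTextual =
    mediaType.startsWith("text/") ||
    mediaType === "application/json" ||
    mediaType.endsWith("+xml");
  if (!isTextual) {
    // Binary content: hand the bytes back without creating a second copy.
    return { kind: "bytes", bytes };
  }
  // Textual content: decode on request using the separate decoder component.
  const text = new TextDecoder(charsetOf(contentType)).decode(bytes);
  return { kind: "text", text };
}
```

An XML parser / DOM builder would slot in the same way, as one more
consumer of the bytes chosen after the headers have been inspected.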

As for the performance discussion, we learned the hard way that it was
valuable to only keep one copy of the XHR's data.  There are some sites out
there that load large documents.  Sad, but true.  Maybe James Robinson or
someone else can dig up some examples.  I think we should try to design for
a future where we don't have to compromise performance for capabilities.

If we keep the ArrayBuffer up front and only decode on demand, then we will
be doing more work in the common case (in which someone only wants the
responseText).  That seems bad.
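
For reference, the "keep the bytes, decode on demand" strategy looks roughly
like the TypeScript sketch below. LazyResponseBody and its members are
hypothetical names, and TextDecoder stands in for whatever decoder a UA
would use internally. The decodeCount field exists only to make the extra
work visible: every first access to the text, and every access after the
cached string is dropped, pays for a full decode.

```typescript
// Hypothetical holder for a response body that keeps the raw bytes
// and produces the text form only when somebody asks for it.
class LazyResponseBody {
  private bytes: ArrayBuffer;
  private encoding: string;
  private cachedText: string | null = null;
  // Instrumentation for this sketch: how many decodes were performed.
  decodeCount = 0;

  constructor(bytes: ArrayBuffer, encoding: string = "utf-8") {
    this.bytes = bytes;
    this.encoding = encoding;
  }

  // The raw bytes are always available without extra work.
  get arrayBuffer(): ArrayBuffer {
    return this.bytes;
  }

  // The text is decoded on first access and cached afterwards.
  get text(): string {
    if (this.cachedText === null) {
      this.cachedText = new TextDecoder(this.encoding).decode(this.bytes);
      this.decodeCount++;
    }
    return this.cachedText;
  }

  // Drop the decoded copy (for example under memory pressure);
  // the next access to `text` regenerates it from the bytes.
  dropText(): void {
    this.cachedText = null;
  }
}
```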

-Darin

>
>  3. Get rid of the "asBlob" attribute and add a new "responseType"
>> attribute which could be:
>> "Text" <--- this is the default
>> "XML"
>> "Bytes"
>> ... other types yet to be defined
>>
>
> I'm not sure I follow this proposal.
>
>
>  I can accept any of the proposed solutions and would like to hear what
>> others think about these or other approaches.
>>
>
> How about:
>
> 4)  Make things easy to use for authors; that means supporting responseText
> and responseArrayBuffer, with access to both on the same XHR object without
> weird restrictions and without clairvoyance required on the part of the
> author.  If UAs feel that they don't want to keep both copies of the data in
> memory on a permanent basis, they can optimize this by dropping the
> responseText (on a timer or via a low-memory detection mechanism, or in
> various other ways) and regenerating it from the raw bytes as needed.
>
> ?
>
> The other suggestions really feel like a case of overoptimizing for
> simplicity and ease of creation of UA code at the expense of authors.
> Sometimes that's warranted, but in this case the UA code needed to produce
> sane behavior is really just not that complicated or hard....
>
> -Boris
>

Received on Wednesday, 27 October 2010 06:39:48 UTC