Re: XHR responseArrayBuffer attribute: suggestion to replace "asBlob" with "responseType" from Jonas Sicking on 2010-10-29 (public-webapps@w3.org from October to December 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 28 Oct 2010 20:29:28 -0700
To: Maciej Stachowiak <mjs@apple.com>
Cc: Boris Zbarsky <bzbarsky@mit.edu>, Geoffrey Garen <ggaren@apple.com>, Darin Fisher <darin@chromium.org>, Chris Rogers <crogers@google.com>, Web Applications Working Group WG <public-webapps@w3.org>, Anne van Kesteren <annevk@opera.com>, Eric Uhrhane <ericu@google.com>, michaeln@google.com, Alexey Proskuryakov <ap@webkit.org>, Chris Marrin <cmarrin@apple.com>, jorlow@google.com, jamesr@chromium.org
Message-ID: <AANLkTimOY2oYeAtyRt20miQnyGaAn2i-RvMdqGq9-RkT@mail.gmail.com>

On Thu, Oct 28, 2010 at 10:33 AM, Maciej Stachowiak <mjs@apple.com> wrote:
>
> On Oct 27, 2010, at 5:36 PM, Boris Zbarsky wrote:
>
>>
>>> But both approaches would reliably throw exceptions if a client got things wrong.
>>
>> See, there's the thing.  Neither approach is all that reliable (even to the point of throwing sometimes but not others for identical code), and access is more prone to issues where which code the exception is thrown in is not consistent (including being timing-dependent), if multiple listeners are involved.
>>
>> Do people really think that "action at a distance" situations where pulling slightly and innocuously on one bit of a system perturbs other parts of the system in fatal ways are acceptable for the web platform? They're the sort of things that one avoids as much as possible in other systems, but this thread is all about proposing such behaviors for the web platform...
>
> I don't think that kind of approach is good design. When designing APIs (especially for a platform as widely used as the Web), it's better to design them with fewer possible ways to use them wrong. Making a subtle mistake impossible by design is better than throwing an exception when you make that mistake.
>
> I realize memory use is a concern and it's definitely easy to use too much memory in all sorts of ways. But a fragile API is an even bigger problem, in my opinion.

Personally I like the proposed responseType solution.

I agree that it has a downside in that it doesn't allow figuring out
the type as data starts coming in. However I think this is a much less
common case than knowing the type before the request is made, both
when downloading from your own site and when downloading cross-origin.
It makes sense to me that this is the common case, since the author is
usually loading a particular set of information which is presented in
a particular format.

I do think that downloading something whose type you don't know is a
use-case that we need to support. But I don't think that the way to do
that is to have XHR parse things into every conceivable format at the
same time. I also am not a big fan of the
lazy-decode-into-whatever-format-users-want approach. It makes it much
too easy for a site to use up more memory and CPU than it needs. Maciej
pointed out one good example of when that can happen, with authors
using responseText.length to measure progress.

We already have the situation of too much memory use in a couple of
cases today. The simplest example is if someone uses XMLHttpRequest
the way the name actually encourages, downloading an XML file. In that
case we need to store both the parsed Document and the unparsed
string (or binary data) in memory until the XHR is GCed.

We also have the situation that data is continuously concatenated
to the end of the already downloaded data. For users that can handle
the downloaded content progressively, keeping all the
so-far-downloaded data in memory is pure waste. (To make matters
worse, we don't just keep all the data in memory, but for each
additional downloaded piece of content, we over and over reallocate an
ever-expanding block of memory. This is behavior that we can improve
but never make perfect.)
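
To make the reallocation cost concrete, here is a minimal sketch. It
is an illustration only, not how any browser actually implements XHR
buffering; the names appendByReallocating and ChunkCollector are
hypothetical.

```typescript
// Naive growth: every new chunk allocates a fresh buffer and copies
// everything received so far into it.
function appendByReallocating(soFar: Uint8Array, chunk: Uint8Array): Uint8Array {
  const grown = new Uint8Array(soFar.length + chunk.length);
  grown.set(soFar, 0);
  grown.set(chunk, soFar.length);
  return grown;
}

// Alternative: keep the chunks as they arrive and copy them only once,
// when (and if) a contiguous buffer is actually requested.
class ChunkCollector {
  private chunks: Uint8Array[] = [];
  private total = 0;

  push(chunk: Uint8Array): void {
    this.chunks.push(chunk);
    this.total += chunk.length;
  }

  // One allocation and one pass over the data.
  join(): Uint8Array {
    const out = new Uint8Array(this.total);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}
```

The second variant avoids the repeated copying, but it still holds all
of the data in memory, which is the waste described above for
consumers that could process the content progressively.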

Another problem we currently have, which Darin pointed out, is the
kitchen-sink issue. Currently XMLHttpRequest deals both with the
network request and with parsing the response.

Having a "mode" parameter, like responseType, which controls how to
treat the response has the potential to address many of these issues.
For the case when it's known what type of response is expected this
seems to work very well. So you could set .responseType = "document"
to have the result parsed into a document, .responseType = "text" to
parse into text, .responseType = "binary" to parse into an ArrayBuffer,
and .responseType = "blob" to stream to a blob. We can even add
.responseType = "stream" to have the result as a Stream object which
can be used for various streaming solutions.
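
A minimal sketch of declaring the expected type before sending the
request. Two things here are assumptions rather than part of the
proposal: the value names follow the XMLHttpRequest API as it later
shipped ("arraybuffer" where this mail says "binary"; "stream" is not
among the standard values), and XhrLike and load are hypothetical
names, with XhrLike standing in for the parts of XMLHttpRequest used
so the sketch type-checks without browser typings.

```typescript
type ResponseKind = "text" | "document" | "arraybuffer" | "blob";

// Hypothetical structural stand-in for the XMLHttpRequest members used below.
interface XhrLike {
  responseType: string;
  response: unknown;
  onload: ((...args: any[]) => any) | null;
  onerror: ((...args: any[]) => any) | null;
  open(method: string, url: string): void;
  send(): void;
}

function load(xhr: XhrLike, url: string, kind: ResponseKind): Promise<unknown> {
  return new Promise((resolve, reject) => {
    xhr.open("GET", url);
    // Declare the expected type up front, before the request is sent,
    // so only one representation of the response has to be produced.
    xhr.responseType = kind;
    xhr.onload = () => resolve(xhr.response);
    xhr.onerror = () => reject(new Error("request failed: " + url));
    xhr.send();
  });
}
```

The intent is that a browser XMLHttpRequest object is passed as the
first argument; the caller then receives whatever single
representation it asked for.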

I'm struggling a bit with what an ideal API looks like to support the
case of downloading something of an unknown type. For the case of the
Content-Type header containing enough information we could have an
event which is fired as soon as all header data is available, but
before any of the response body has been processed. At that point the
.responseType property could still be allowed to be modified.
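
A rough sketch of the Content-Type idea. The proposed "headers
available" event does not exist in this form, so the sketch assumes a
readystatechange notification at readyState 2 (HEADERS_RECEIVED) plays
that role. The names chooseMode, onReadyStateChange and
HeaderAwareXhr, and the particular MIME-type mapping, are hypothetical
choices made for illustration.

```typescript
type ResponseMode = "text" | "document" | "arraybuffer" | "blob";

// Hypothetical stand-in for the XMLHttpRequest members used below.
interface HeaderAwareXhr {
  readyState: number;
  responseType: string;
  getResponseHeader(name: string): string | null;
}

const HEADERS_RECEIVED = 2;

// Map a Content-Type header value to a response mode (illustrative mapping).
function chooseMode(contentType: string | null): ResponseMode {
  if (contentType === null) return "arraybuffer";
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const semicolon = contentType.indexOf(";");
  const raw = semicolon === -1 ? contentType : contentType.slice(0, semicolon);
  const mime = raw.trim().toLowerCase();
  if (mime === "text/html" || mime === "text/xml" ||
      mime === "application/xml" || mime.endsWith("+xml")) {
    return "document";
  }
  if (mime.startsWith("text/")) return "text";
  if (mime.startsWith("image/") || mime.startsWith("audio/") ||
      mime.startsWith("video/")) {
    return "blob";
  }
  // Unknown types fall back to raw bytes the caller can inspect.
  return "arraybuffer";
}

// Call this from the readystatechange handler.
function onReadyStateChange(xhr: HeaderAwareXhr): void {
  if (xhr.readyState === HEADERS_RECEIVED) {
    xhr.responseType = chooseMode(xhr.getResponseHeader("Content-Type"));
  }
}
```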

For the case when looking at the response body is required to
determine how the response should be handled I'm less sure. One
solution would be to say that people can just use .responseType =
"binary" or .responseType = "blob" and then do the processing
themselves. Another, somewhat hacky, solution is to say that it's
allowed to change .responseType from "binary" to any other value at
any point. At that point the XHR object would reparse the contents
using that type.

However I'd prefer to move parsing into various types out of the XHR
object, so that the ability to parse binary into XML, HTML, JSON,
text, etc. is separated from the XHR object. These parsers could then
be handed either a Stream or an ArrayBuffer and be reused for Files,
XHR, and JS-created ArrayBuffers.
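
One possible shape for such standalone parsers, as a sketch only. The
names Parser, parseText, parseJson and asciiBuffer are hypothetical,
only the ArrayBuffer input is covered (not Stream), and the text
decoder assumes one byte per character, which is correct only for
ASCII/Latin-1 data; a real parser would honour the declared charset.

```typescript
// A parser is just a function from raw bytes to a parsed value.
// It knows nothing about where the bytes came from.
type Parser<T> = (data: ArrayBuffer) => T;

// Simplified decoder: treats each byte as one character.
const parseText: Parser<string> = (data) => {
  let out = "";
  new Uint8Array(data).forEach((byte) => {
    out += String.fromCharCode(byte);
  });
  return out;
};

// Parsers compose: JSON parsing is text decoding followed by JSON.parse.
const parseJson: Parser<unknown> = (data) => JSON.parse(parseText(data));

// A JS-created ArrayBuffer, standing in for bytes that could equally
// have come from an XHR response or a File.
function asciiBuffer(s: string): ArrayBuffer {
  const buf = new ArrayBuffer(s.length);
  const bytes = new Uint8Array(buf);
  for (let i = 0; i < s.length; i++) {
    bytes[i] = s.charCodeAt(i);
  }
  return buf;
}

const parsed = parseJson(asciiBuffer('{"format": "json", "items": [1, 2, 3]}'));
```

Because the parsers take plain bytes, the same function serves every
source, and the network object no longer needs to know about formats.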

Somewhat ambitious, but I wanted to put some thoughts that have been
poking around in my head out there.

/ Jonas

Received on Friday, 29 October 2010 03:30:22 UTC