Re: [XHR] chunked from Charles Pritchard on 2011-12-01 (public-webapps@w3.org from October to December 2011)

From: Charles Pritchard <chuck@jumis.com>
Date: Wed, 30 Nov 2011 17:45:27 -0800
To: Jonas Sicking <jonas@sicking.cc>
CC: Anne van Kesteren <annevk@opera.com>, WebApps WG <public-webapps@w3.org>
Message-ID: <4ED6DC37.4000404@jumis.com>
On 11/30/11 3:44 PM, Jonas Sicking wrote:
> On Wed, Nov 30, 2011 at 1:03 PM, Charles Pritchard<chuck@jumis.com>  wrote:
>> On Nov 30, 2011, at 12:41 PM, Jonas Sicking<jonas@sicking.cc>  wrote:
>>
>>> On Wed, Nov 30, 2011 at 12:13 PM, Charles Pritchard<chuck@jumis.com>  wrote:
>>>> On Nov 30, 2011, at 11:32 AM, Jonas Sicking<jonas@sicking.cc>  wrote:
>>>>
>>>>>> Charles asked whether "chunked-text" was really needed (and whether we
>>>>>> should have "chunked" which implies ArrayBuffer instead). Nobody got back to
>>>>>> him on that.
>>>>> Any text based format would benefit from chunked-text. While the
>>>>> example above uses a binary format, it applies equally to text based
>>>>> formats. And given how much we in this group seem to prefer text based
>>>>> formats, (HTML, CSS, Javascript, EventSource, JSON) I think we should
>>>>> assume that other people at least use them, if not prefer them.
>>>> My thinking was that ArrayBuffer can easily be converted to String by authors. Even with text-based formats, I prefer to fetch data as blob and buffer.
>>> Why?
>> Because it helps with general methods to pass buffers around, and stream processing is buffer based. I don't need to worry about character sets or errant binary data. It's raw and easy to port, being a Transferable object.
> Why couldn't you use strings instead of buffers when dealing with
> textual data? And why don't you need to worry less about

I can; it just requires extra calls to String.fromCharCode and
charCodeAt. Whenever I'm doing heavy computation, buffers are going to
be faster, since I access them through Typed Arrays. I do use strings,
but for heavy processing and abstracted methods I use buffers and
typed arrays.
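
For illustration, a minimal sketch of the round trip those extra calls
imply. The helper names stringToBytes and bytesToString are made up
here, and the sketch assumes every code unit fits in one byte (a
"binary string"); it is not a charset conversion.

```javascript
// Hypothetical helper: copy each UTF-16 code unit into one byte.
// Assumption: all code units are in 0..255; anything above is truncated.
function stringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// Hypothetical helper: build a string from bytes, one code unit per byte.
// Works in slices so the argument list passed to fromCharCode stays small.
function bytesToString(bytes) {
  let out = "";
  const step = 0x8000;
  for (let i = 0; i < bytes.length; i += step) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + step));
  }
  return out;
}
```

Both directions walk the whole payload once, which is the overhead
that staying in buffers avoids.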

> characters sets if you are doing the charset conversion rather than
> the UA? I would imagine you'd have to worry more about character sets
> if you take on that burden rather than let the UA do it.

I don't worry about character sets at all; I treat the content as
fairly opaque. If there's an error in the character set, that's up to
some other part of the process to figure out. I'm rarely trying to
sanitize or validate data. I'm just looking for particular values in
it.
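
As a sketch of what "looking for particular values" can mean when the
data stays opaque: scan the bytes for a known marker without decoding
anything. The function name indexOfBytes is hypothetical.

```javascript
// Hypothetical helper: index of the first occurrence of `needle` in
// `haystack` (both Uint8Array) at or after `from`, or -1 if absent.
// No character decoding happens; bytes are compared as-is.
function indexOfBytes(haystack, needle, from = 0) {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Usage: find the ASCII bytes of "id=" inside a buffer that also
// contains bytes that would be invalid in most text encodings.
const data = Uint8Array.of(0xff, 0x00, 0x69, 0x64, 0x3d, 0x34);
const marker = Uint8Array.of(0x69, 0x64, 0x3d);
const position = indexOfBytes(data, marker);
```

Errant binary data around the marker does not matter to a scan like
this, which is the point of not decoding first.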

> I can see the argument of being a Transferable object. But why
> transfer the chunked data between threads? Just do the load on the
> thread that is going to interpret the data.

That may often be the case, but when I transfer processed data back to
the main thread, I may still want it as a buffer.
I've worked a lot on string processing, both in old JS and in JS with
typed arrays. I've been through the ins and outs and use all the
methods available.

>> Many of my encoding and decoding methods expect byte arrays.
> And these methods are dealing with textual data? Note that no-one is
> proposing the ability to do chunked-arraybuffer. The question is if we
> should have chunked-text as well.

It's a good question; I don't have an answer. I'd sure like an
ArrayBuffer-to-string method. Converting to a Blob and then using
FileReader is a bit of extra work, as is running fromCharCode over the
array to build a string. A conversion method would make
chunked-arraybuffer quite easy to convert into a string...

Your suggestion that chunked-text would always return complete
multi-byte codes was pretty good, as is the notion that whatever
chunked-text returns is DOMString-safe, so it'd always work with
something like localStorage.setItem. And those are perhaps features
that a conversion method would not handle as well.
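
To make the multi-byte point concrete, here is a rough sketch of the
bookkeeping a script would need if it converted raw chunks itself. It
assumes the stream is UTF-8 and otherwise well-formed; the function
name incompleteTailLength is invented for this example, and it is not
a validator.

```javascript
// Hypothetical helper: how many bytes at the end of `bytes` belong to
// a UTF-8 sequence whose remaining bytes have not arrived yet.
// Assumption: input is otherwise well-formed UTF-8.
function incompleteTailLength(bytes) {
  // A lead byte of an unfinished sequence is at most 3 bytes from the end.
  for (let back = 1; back <= 3 && back <= bytes.length; back++) {
    const b = bytes[bytes.length - back];
    if ((b & 0xc0) === 0x80) continue; // continuation byte, look further back
    let need;
    if (b < 0x80) need = 1;                 // ASCII
    else if ((b & 0xe0) === 0xc0) need = 2; // 2-byte sequence
    else if ((b & 0xf0) === 0xe0) need = 3; // 3-byte sequence
    else if ((b & 0xf8) === 0xf0) need = 4; // 4-byte sequence
    else return 0;                          // not a valid lead byte
    return need > back ? back : 0;
  }
  return 0;
}

// Usage: "a" followed by the first two of the three bytes of U+20AC.
const chunk = Uint8Array.of(0x61, 0xe2, 0x82);
const held = incompleteTailLength(chunk);
const ready = chunk.subarray(0, chunk.length - held);
```

The held-back bytes would be prepended to the next chunk before
converting, which is the work chunked-text, as described above, would
take off the author's hands.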

Sorry if I created confusion or chaos here. Just reporting on what I 
tend to do.


-Charles
Received on Thursday, 1 December 2011 01:45:52 UTC