Re: File API proposal - marrying two alternatives from Jonas Sicking on 2009-10-07 (public-webapps@w3.org from October to December 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 6 Oct 2009 23:11:27 -0700
To: Garrett Smith <dhtmlkitchen@gmail.com>
Cc: "Nikunj R. Mehta" <nikunj.mehta@oracle.com>, Web Applications Working Group WG <public-webapps@w3.org>, arun@mozilla.com
Message-ID: <63df84f0910062311l6850a4a4vfe9525534a72cf4c@mail.gmail.com>
On Tue, Oct 6, 2009 at 10:55 PM, Garrett Smith <dhtmlkitchen@gmail.com> wrote:
> On Tue, Oct 6, 2009 at 10:07 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> On Tue, Oct 6, 2009 at 9:58 PM, Garrett Smith <dhtmlkitchen@gmail.com> wrote:
>>> On Tue, Oct 6, 2009 at 7:32 PM, Nikunj R. Mehta <nikunj.mehta@oracle.com> wrote:
>>>> I figure I could rewrite Jonas' proposal to make it more palatable (at least
>>>> to me) and satisfy the use cases and priorities I mentioned in [1]. Here's
>>>> his proposal to combine with File and FileData from the current ED.
>>>> interface FileData {
>>>>   readonly attribute DOMString url;
>>>>   readonly attribute unsigned long long size;
>>>>   FileData slice(in long long offset, in long long length);
>>>> };
>>>> interface File : FileData {
>>>>   readonly attribute DOMString name;
>>>>   readonly attribute DOMString mediaType;
>>>> };
>>>> typedef sequence<File> FileList;
>>>> [Constructor, Implements=EventTarget]
>>>> interface FileRequest {
>>>>  readAsBinaryString(in FileData filedata);
>>>>  readAsText(in FileData filedata, [Optional] in DOMString encoding);
>>>>  readAsDataURL(in File file);
>>>>
>>>>  abort();
>>>>
>>>>  const unsigned short INITIAL = 0;
>>>>  const unsigned short LOADING = 1;
>>>>  const unsigned short DONE = 2;
>>>>  readonly attribute unsigned short readyState;
>>>>
>>>>  readonly attribute DOMString response;
>>>>  readonly attribute unsigned long status;
>>>>
>>>>  attribute Function onloadstart;
>>>>  attribute Function onprogress;
>>>>  attribute Function onload;
>>>>  attribute Function onabort;
>>>>  attribute Function onerror;
>>>>  attribute Function onloadend;
>>>> };
>>>> My main issues are the following:
>>>>
>>>> "File" interface is separate from FileData and that makes little sense at
>>>> this time. Can't the two be merged into "File"? (Use case 3 - all the
>>>> metadata)
>>>> "FileRequest" should be renamed as "FileReader" as Arun pointed out [2].
>>>> The attributes "response" and "status" from the "FileRequest" interface make
>>>> no sense. They are copy-pasted from XHR but their purpose is unclear. This
>>>> is why I said that plainly copying XHR as the template for FileReader is not
>>>> a good idea.
>>>
>>> Yes, I mentioned XHR for example of registering a callback for an
>>> asynchronous action. I did not mean to encourage copying the XHR API.
>>>
>>>> It'd be better to define the actual "FileRequest" separately from a factory
>>>> of "FileRequest" objects. Consider what would happen if a
>>>> single "FileRequest" object is used multiple times to read as the same or
>>>> different data types? What happens when I abort()? (Use case 2 - concurrent
>>>> access & priority 2)
>>>
>>> Yes, or if two reads are called, who gets the success callback? I
>>> already mentioned that, and again, in code comment below.
>>
>> Same as with XHR. The second read cancels the first one.
>>
>>>> What is the meaning of LOADING and DONE? Once I create the reader, it should
>>>> be in the LOADING state automatically. FileReader, unlike XHR, does not have
>>>> an explicit send step.
>>>>
>>>
>>> The analogous "send" step, for FileReader, is "read".
>>>
>>> var reader = getAReader();
>>> reader.onsuccess = handler;
>>>
>>> // Kick off "read" request.
>>> reader.read(aFile);
>>>
>>> Why must the reader be in LOADING state automatically?
>>
>> I guess I don't feel strongly about whether readyState should exist or not.
>
> Oh come on, we all know you know you love IE :-D.
>
>> I've never fully understood what makes it so useful, but it seems
>> popular. HTML5 adds it to a number of objects.
>>
> It is used for polling. Is there another usage for readyState
> besides polling?
>
> The callback firing would be more efficient, and result in simpler
> code than polling the readyState.

The use case I heard for implementing document.readyState was for
scripts that were loaded into a document (or handed a document) and
didn't know if the load event had fired yet. If it had fired, the
script could do some processing on the document. If it hadn't fired,
the script would register an event handler for the "load" event and do
the processing once that fired.

I guess the same could be done here.
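
A minimal sketch of that check-then-listen pattern, applied to a reader
rather than a document. The helper name runWhenLoaded and the stand-in
object are hypothetical; the INITIAL/LOADING/DONE constants mirror the
FileRequest proposal quoted above, and nothing here is a shipped API.

```javascript
// Hypothetical helper: run `callback` once `target` has finished loading.
// Assumes `target` exposes `readyState`, a `DONE` constant, and an
// `onload` handler slot, as in the FileRequest proposal quoted above.
function runWhenLoaded(target, callback) {
  if (target.readyState === target.DONE) {
    // The load event already fired, so process right away.
    callback(target);
    return;
  }
  // Not loaded yet: wait for the load event, preserving any prior handler.
  const previous = target.onload;
  target.onload = function (event) {
    if (typeof previous === "function") previous.call(target, event);
    callback(target);
  };
}

// Stand-in object for illustration only; not a real FileRequest.
const fakeRequest = { INITIAL: 0, LOADING: 1, DONE: 2, readyState: 1, onload: null };
const seen = [];
runWhenLoaded(fakeRequest, () => seen.push("late"));   // still loading: deferred
fakeRequest.readyState = fakeRequest.DONE;
fakeRequest.onload({ type: "load" });                  // simulate the load event
runWhenLoaded(fakeRequest, () => seen.push("after"));  // already done: runs now
```

In this sketch readyState is read once, at the moment the script starts
caring about the object, which is different from polling it repeatedly.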

I've never heard of, or seen, anyone using readyState for polling,
though I imagine that there's code out there that does (no matter how
wacky something is, there's always code out there that does it).

>>> Why not get a Reader as:-
>>>
>>>  var reader = FileReader.create(FileReader.BINARY);
>>>  reader.onload = handler;
>>>  reader.read(myFile);
>>
>> I don't see this proposal changing any of the questions you've raised
>> above. The only thing it's changed is that once you've created a
>> reader you can only read data in one format. I'm not sure I see the
>> advantage of that API.
>>
>
> Calling read() on the reader twice would result in an error. That
> error can be avoided by creating a new FileReader for each read. Same
> for either of our proposals.

It seems like that could be done equally well with either of
our proposals. I don't really feel strongly about whether calling
read/readAsX twice should result in an error or in the previous load
being cancelled. The main argument I can think of for cancelling is
that XHR allows reading multiple times, and consistency is always nice.
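
To make the two options concrete, here is a rough sketch that puts them
side by side. SketchReader, its "cancel" and "error" policy names, and
the complete() method are all hypothetical, invented only for this
comparison; neither proposal in the thread defines them.

```javascript
// Hypothetical reader used only to compare the two policies discussed above.
// "cancel": a second read aborts the first (the XHR-like behaviour).
// "error":  a second read while one is pending throws.
class SketchReader {
  constructor(policy) {
    this.policy = policy;      // "cancel" or "error"
    this.pending = null;       // the read currently in flight, if any
    this.onload = null;
    this.onabort = null;
  }

  read(source) {
    if (this.pending !== null) {
      if (this.policy === "error") {
        throw new Error("a read is already in progress");
      }
      // "cancel" policy: drop the earlier read and notify.
      const cancelled = this.pending;
      this.pending = null;
      if (this.onabort) this.onabort(cancelled);
    }
    this.pending = source;
  }

  // Simulates the pending read finishing; a real reader would do this
  // asynchronously once the data is available.
  complete() {
    if (this.pending === null) return;
    const result = this.pending;
    this.pending = null;
    if (this.onload) this.onload(result);
  }
}

const log = [];
const cancelling = new SketchReader("cancel");
cancelling.onabort = (s) => log.push("abort:" + s);
cancelling.onload = (s) => log.push("load:" + s);
cancelling.read("first");
cancelling.read("second");   // cancels "first"
cancelling.complete();       // only "second" reports load

const strict = new SketchReader("error");
strict.read("first");
let threw = false;
try {
  strict.read("second");     // rejected under the "error" policy
} catch (e) {
  threw = true;
}
```

Under either policy, a caller who wants two reads in flight at once
would create one reader per read.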

> A bit of a side-topic, but how is a large file (2 MB, etc.) to be read?

I think the reading part is ok. I believe the main problem is that
strings weren't designed to hold such large amounts of data, but so
far it's the only data type we have to work with. The ideal would be
to have a ByteArray type which implementations could, under the hood,
store as several disconnected segments if desired.
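
As an illustration of what "several disconnected segments" could look
like, here is a small sketch. SegmentedBytes and its append() and
byteAt() methods are hypothetical names, not part of any proposal, and
Uint8Array is used only as a convenient chunk container.

```javascript
// Hypothetical ByteArray-like type; the name and methods are assumptions.
// Data lives in several separate Uint8Array segments instead of one buffer.
class SegmentedBytes {
  constructor() {
    this.segments = [];
    this.length = 0;
  }

  // Add a chunk without copying or reallocating earlier chunks.
  append(chunk) {
    this.segments.push(chunk);
    this.length += chunk.length;
  }

  // Return the byte at a logical index by walking the segments.
  byteAt(index) {
    if (index < 0 || index >= this.length) {
      throw new RangeError("index out of range");
    }
    let remaining = index;
    for (const segment of this.segments) {
      if (remaining < segment.length) return segment[remaining];
      remaining -= segment.length;
    }
    // Not reached while `length` matches the stored segments.
    throw new RangeError("index out of range");
  }
}

const bytes = new SegmentedBytes();
bytes.append(Uint8Array.of(1, 2, 3));
bytes.append(Uint8Array.of(4, 5));
const fourth = bytes.byteAt(3); // first byte of the second segment
```

The trade-off shows up in byteAt(): appending is cheap, but indexed
access has to walk the segment list instead of reading one buffer.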

Technically an implementation could do that for strings too; however, I
believe the performance/complexity costs would be too high. In Firefox
we switched our string implementation from allowing segmented strings
to requiring single-buffer strings a few years ago, and it reduced
complexity and improved performance significantly.

/ Jonas
Received on Wednesday, 7 October 2009 06:12:22 UTC