Re: Alternative File API

Nikunj,

>>>
>> While the above API does have the advantages that we agree come with 
>> a model that stems from EventTarget and events, I'm concerned that 
>> we've complicated the API for an edge case.  I *do* agree that 
>> progress events are desirable, especially given leaky abstractions 
>> for file systems (e.g. the user plugs in a networked drive, which is 
>> surfaced from the input type="file" picker) which could behave 
>> slowly.  But this seems like a desirable edge case which we should 
>> find another solution for, and not overhaul the entire API.
>
> Do we need asynchronous APIs if files are local and file system access 
> is fast? If we do, then why do we not also need progress events? 
Note that I *do* agree that progress feedback is useful.
However, I'm not convinced that it will always be used, since in
*most* cases file access should be fast.  The discussion here is about
*how best* to integrate progress feedback, not *if* we should integrate
it.  I describe progress feedback for making file data available to read
as an edge case because I think that in most cases we won't need it.
But again, I accept that it is useful.
> It seems that the whole WebApps WG has accepted the "desirable edge 
> case" of dealing with system delays such as in SQL databases and file 
> systems through the use of asynchronous APIs.
>
Yes, but that is different than giving feedback about progress!  
Asynchronous APIs are desirable in general, so as not to block in the 
main thread.  The existing file read mechanism in Firefox today (which 
is non-standard!) is, in fact, synchronous [1].  Standardizing this kind 
of an API was a non-starter (but it is used), although the present "TR" 
of the File specification (which I think should be obsolete) still 
stipulates synchronous reads [2].
>> In fact, progress events for file APIs seem pretty sugary in general; 
>> many of the other platforms I've looked at for File APIs don't have 
>> them.
>
> Please cite the platforms you have researched.
Check out:

1. Silverlight OpenFileDialog API (API reference with SDK download): 
http://www.microsoft.com/video/en/us/details/4f14da66-e263-4ef2-8d42-f90dc4c00384

"File reading" doesn't generate its own progress events, but you could 
use 'bytes read' for progress feedback, especially over network upload 
scenarios, and use asynchronous callbacks.  Again, this isn't the same 
as a dedicated Progress Event; Silverlight developers should correct me 
if I'm off here.  To get the API reference, you may have to download the 
SDK (Windows only AFAICT).  I'd like to get more feedback from 
developers who use Silverlight about what they'd like from a File API 
for the web.

Google Gears:

2. http://code.google.com/apis/gears/api_desktop.html#File
3. http://code.google.com/apis/gears/api_blob.html

Stuff like desktop.openFiles(callback) follows callback mechanisms 
similar to what's in the existing API, but without dedicated 
asynchronous accessors.  You can get the File's data as a Blob.  If
you want ProgressEvent, you can get it using HttpRequest
(http://code.google.com/apis/gears/api_httprequest.html and 
http://code.google.com/apis/gears/api_httprequest.html#ProgressEvent), 
but only for upload to a server.   I'd like to get more feedback from 
developers who use Gears regularly.

Then, there's Java File I/O, which has been modified by JSRs at least
twice (JSR 51 and JSR 203).  Each of these added new capabilities,
including asynchronous I/O and polling (non-blocking) reads.  In
particular, check out:

4. http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html
5. http://java.sun.com/j2se/1.4.2/docs/api/java/io/FileInputStream.html
6. http://java.sun.com/javase/6/docs/api/javax/swing/ProgressMonitorInputStream.html

Here, you *can* get progress updates using ProgressMonitorInputStream 
(6. above) with the FileInputStream (5. above); this is relatively 
easy.  Or, you could monitor byte updates on the FileChannel.  This 
platform gives you optional progress feedback, e.g. for large files.
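
The "bytes read" approach mentioned for Silverlight and Java needs no dedicated progress event type. What follows is only a minimal sketch in TypeScript, not code from any of the platforms above: `readInChunks`, `chunkSize`, and `onBytesRead` are hypothetical names, and a plain number array stands in for file contents.

```typescript
// Hypothetical helper: reads `data` in fixed-size chunks and reports the
// running byte count after each chunk. No progress event type is involved;
// a caller that wants a percentage derives it from bytesRead / total.
function readInChunks(
  data: number[],
  chunkSize: number,
  onBytesRead?: (bytesRead: number, total: number) => void
): number[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const result: number[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    const chunk = data.slice(offset, offset + chunkSize);
    for (const b of chunk) {
      result.push(b);
    }
    // Progress feedback is opt-in: skipped when no callback was supplied.
    if (onBytesRead) {
      onBytesRead(result.length, data.length);
    }
  }
  return result;
}

// Usage: ten "bytes" read four at a time, recording the running count.
const fileBytes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const bytesSeen: number[] = [];
const copiedBytes = readInChunks(fileBytes, 4, (bytesRead) => {
  bytesSeen.push(bytesRead);
});
```

A caller that does not care about progress simply omits the third argument, which is the "optional" quality described above.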

But the most compelling case is Flash (Flex AS3 ref):

7. http://livedocs.adobe.com/flex/3/langref/
8. In particular: 
http://livedocs.adobe.com/flex/3/langref/flash/filesystem/FileStream.html
9. Also: 
http://livedocs.adobe.com/flex/3/langref/flash/filesystem/FileStream.html#event:progress

FileStream uses progress events, and you can listen for these (doesn't 
bubble) when dealing with large files (bytesLoaded, bytesTotal, etc. -- 
see 9. above).  We've discussed the fact that GMail uses Flash for file 
access scenarios on this listserv and *falls back* to <input 
type="file"> if Flash is disabled or simply not on the system [3].  In 
that discussion [3], the use case for progress events is: "file upload 
progress" which we get with XHR and the File API, even as currently 
written.  We can also slice( ) files.  Again, I'm in favor of progress
feedback, but I stick to my guns when I assert that it isn't that
important for *file reads* :-)

Flash has progress events for File reads, however, and so should the 
web.  Again, to date, this wasn't cited as the reason why Flash is used 
[3], at least for GMail.  This discussion isn't about whether or not we 
should have progress detection ability for file reads.  In general, I'd 
like to get more feedback than what we got with [3] from developers who 
use Flash about what they'd like from a File API for the web.

Finally, there's Adobe's JavaScript API for Flash extensions on the 
authoring platform:

10. http://www.adobe.com/devnet/flash/articles/jsapi.html

This has no progress detection capability AFAICT, but I think you can 
add event handling using other mechanisms.

So to summarize: I am amenable to progress feedback on reads in the
File API.  When we discussed which mechanism is best for this, the sense
I got is that the "alternative API" proposal [4] (or something like it)
was deemed "a more correct API" [5] than adding a callback to the
existing draft.  The draft should absolutely change to be "more
correct", but my concerns about simplicity aren't going away :-)
>
>>  That's not to say that the web shouldn't have it -- I'm just 
>> pointing out that I think most users of the API will simply call the 
>> API to get the file.  And, I think that in the lion's share of use 
>> cases, things will behave rapidly enough to not warrant the use of 
>> progress events (except during the networked/plugged in scenarios).
>
> or for that matter asynchronous callbacks? Why do you think users will 
> want asynchronous callbacks in the "lion's share of use cases"?
I assert that the lion's share of use cases will simply want to get the 
file (without progress events), do something, and then upload the file 
as efficiently as possible (with progress events).  This is what I mean 
by the lion's share of use cases.  I accept that network drives and 
plugged in devices are a good use case for progress events, but do not 
think they constitute a majority use case.  Do you disagree?

 >>Honestly, I don't like to use events for file access.

We *could* also have something like FileStream (as Flash does) or 
FileInputStream (as Java does).  One reason to not do that currently is 
we lack primitives for bytes or byte arrays in JavaScript.  This could 
change over the course of time with subsequent versions of the 
ECMAScript standard.  It's been pointed out before that asynchronous 
callbacks on the event loop or event callbacks both lead to asynchronous 
access to a file's contents.  One advantage of the alternative API [4] 
is that it resembles what XHR does.
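
To make the XHR resemblance concrete, here is a rough TypeScript sketch of the event-style shape. It is not the API proposed in [4]: `SketchReader`, `SketchProgress`, `onprogress`, `onload`, and `readAsText` are hypothetical names chosen for illustration, and chunks are delivered synchronously from an in-memory string so that the example stays self-contained, where a real implementation would dispatch from the event loop.

```typescript
// Hypothetical progress payload, loosely modelled on XHR's loaded/total.
interface SketchProgress {
  loaded: number;
  total: number;
}

// Hypothetical reader object, kept separate from the file itself.
class SketchReader {
  onprogress: ((p: SketchProgress) => void) | null = null;
  onload: ((text: string) => void) | null = null;

  // Simulated read: walks the content in chunks and notifies handlers.
  readAsText(content: string, chunkSize: number = 4): void {
    if (chunkSize <= 0) {
      throw new Error("chunkSize must be positive");
    }
    let loaded = 0;
    while (loaded < content.length) {
      loaded = Math.min(loaded + chunkSize, content.length);
      if (this.onprogress) {
        this.onprogress({ loaded: loaded, total: content.length });
      }
    }
    if (this.onload) {
      this.onload(content);
    }
  }
}

// Common case: just get the data; no progress handler is attached.
const simpleReader = new SketchReader();
let simpleResult = "";
simpleReader.onload = (text) => { simpleResult = text; };
simpleReader.readAsText("hello world");

// Slow-drive case: the same reader, with progress opted in.
const slowReader = new SketchReader();
const progressLog: string[] = [];
slowReader.onprogress = (p) => { progressLog.push(p.loaded + "/" + p.total); };
slowReader.onload = () => { progressLog.push("done"); };
slowReader.readAsText("hello world");
```

In this shape the simple caller sets one handler and ignores progress entirely, while the networked-drive caller adds a second handler on the same object; whether that is simple enough is the open question in this thread.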

 >>I don't know of another programming library that does [use events for 
file access].

This is generally what I found, yes.  Aaron points out that the web
platform is inconsistent anyway [6], which I agree with :)  My initial
draft did not use events for reads.
 
 >>However, there needs to be a way to separate the reading of a file 
from the file itself. Properties of a file such as its length as well as 
a temporary URI belong on the file.

I think the alternative API [4] reflects this separation of reading a 
file from the file itself; I think "size" is an attribute that should be 
on Data (or FileData).   File, which inherits from Data (or FileData), 
should have the temporary URL as an attribute.
 
-- A*

[1] https://developer.mozilla.org/En/NsIDOMFile
[2] http://www.w3.org/TR/file-upload/
[3] http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/1110.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0565.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0664.html
[6] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0685.html

Received on Wednesday, 19 August 2009 01:05:09 UTC