Re: File API to separate reading from files from イアンフェッティ on 2009-09-01 (public-webapps@w3.org from July to September 2009)

From: イアンフェッティ <ifette@google.com>
Date: Mon, 31 Aug 2009 18:35:27 -0700
To: Garrett Smith <dhtmlkitchen@gmail.com>
Cc: "Nikunj R. Mehta" <nikunj.mehta@oracle.com>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <bbeaa26f0908311835r5bb04187v2e8ecbcb83d415f3@mail.gmail.com>
I would like to make another plug for
http://dev.w3.org/2006/webapi/fileio/fileIO.htm
This had the notion of writing files, file streams, directories, and
being able to integrate into the host filesystem. All of these are
important for reasons I outlined in
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022388.html
and subsequent replies.

Quoting that email:
"I would much rather have a well thought-out local filesystem
proposal, than continued creep of the existing File and Local Storage
proposal. These proposals are both designed from the perspective of "I
want to take some existing data and either put it into the cloud or
make it available offline". They don't really handle the use case of
"I want to create new data and save it to the local filesystem", or "I
want to modify existing data on the filesystem", or "I want to
maintain a virtual filesystem for my application, and potentially map
in the existing filesystem" (e.g. if I'm flickr and I want to be able
to read the user's "My Photos" folder, send those up, but also make
thumbnails that I want to save locally and don't care if they get
uploaded, maintain an index file with image metadata / thumbnails /
... locally, save off some intermediate files, ...). For this, I would
really like to see us take another look at
http://dev.w3.org/2006/webapi/fileio/fileIO.htm (I don't think this
spec is exactly what we need, but I like the general approach of
"origins get a virtual filesystem tucked away that they can use, they
can fread/fwrite/fseek, and optionally if they want to interact with
the host FS they can request that and then get some sub-set of that
(e.g. "my documents" or "my photos") mapped in)."
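
To make the general approach concrete, here is a minimal in-memory sketch of an origin-scoped virtual filesystem with seek/read/write and an opt-in host folder mapping. Every name in it (VirtualFile, OriginFileSystem, mountHostFolder, the example origin) is hypothetical and made up for illustration; none of it is taken from the fileIO draft, and the userApproved flag only stands in for whatever user-mediated permission step a real design would need.

```javascript
// Hypothetical model of a per-origin virtual filesystem.
// These names are illustrative assumptions, not part of any spec.
class VirtualFile {
  constructor() {
    this.bytes = [];   // file contents, one number (0-255) per byte
    this.position = 0; // current offset used by read() and write()
  }
  seek(offset) {
    if (offset < 0 || offset > this.bytes.length) {
      throw new RangeError("seek offset out of range");
    }
    this.position = offset;
  }
  write(data) {
    // Overwrites or appends starting at the current position.
    for (const byte of data) {
      this.bytes[this.position++] = byte & 0xff;
    }
  }
  read(length) {
    const chunk = this.bytes.slice(this.position, this.position + length);
    this.position += chunk.length;
    return chunk;
  }
}

class OriginFileSystem {
  constructor(origin) {
    this.origin = origin;
    this.files = new Map();  // path -> VirtualFile, private to this origin
    this.mounts = new Map(); // mount point -> host folder the user granted
  }
  open(path) {
    if (!this.files.has(path)) this.files.set(path, new VirtualFile());
    return this.files.get(path);
  }
  // userApproved is a stand-in for a user-mediated permission prompt.
  mountHostFolder(mountPoint, hostFolder, userApproved) {
    if (!userApproved) throw new Error("host filesystem access denied");
    this.mounts.set(mountPoint, hostFolder);
  }
}

const originFs = new OriginFileSystem("https://photos.example");
const index = originFs.open("/index.dat");
index.write([1, 2, 3, 4]);
index.seek(1);
const middle = index.read(2); // the bytes at offsets 1 and 2
originFs.mountHostFolder("/photos", "My Photos", true);
```

The point of the sketch is only the shape: the origin's own files need no prompt, while anything from the host filesystem appears solely through an explicit, user-approved mapping.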


2009/8/31 Garrett Smith <dhtmlkitchen@gmail.com>
>
> On Wed, Aug 19, 2009 at 11:47 AM, Nikunj R. Mehta <nikunj.mehta@oracle.com> wrote:
> > Here's an alternative, more easily extensible, proposal for reading files.
> > It provides applications a way to read small amounts of data at a time. It
> > also allows applications to concurrently read the same file.
> I agree.
>
> [snip]
>
>
> [snip example]
>
>
> > Secondly, a list of files can be obtained using some UI.
> > typedef sequence<File> FileList;
>
> Agree.
>
> > Thirdly, an abstract interface is an input stream that is not limited to
> > files. It works at the level of bytes that files are made of. The read()
> > operation can specify the extent that is required. If an application wishes
> > to read small increments, it can thus specify those increments. Of course,
> > the File interface identifies its size, so the application can suitably
> > choose increments. Processing of blocks read from the file occurs in
> > callbacks. XHR could also consider taking an InputStream parameter during
> > the send() operation.
>
> Would it be possible to have a reader handle creating the input stream
> and making the decision based on what type of Reader it is, passing
> byte offset lengths to the input stream -- essentially hiding those
> details?
>
> [snip example]
>
> > Fifthly, a file can be used for reading an input stream by specifying the
> > name of a file when constructing the stream
> > [Constructor(in File toOpen)]
> > interface FileInputStream : InputStream {
> > }
> > Sixthly, one can create various kinds of derived readers such as text
> > reader, binary string reader, and data URL reader. By inheriting from
> > InputStream, the basic mechanisms such as abort and onerror are inherited.
> > Moreover, the base read behavior is altered by the subclass although it
> > behaves in a similar manner, except that the data seen outside is different.
> > [Constructor(in InputStream base)]
> > interface BinaryStringInputStream : InputStream {
> >   read(in StringDataHandler, [optional in] long long offset, [optional in]
> > long long length);
> > }
> > The callback is provided a DOMString. The String's length is expected to
> > match the increment requested.
> > [CallBack=FunctionOnly]
> > interface StringDataHandler {
> > handle(in DOMString data);
> > }
> > For text reading, encoding is optionally specified.
> > [Constructor(in InputStream base, [optional in] DOMString encoding)]
> > interface TextInputStream : InputStream {
> >   read(in StringDataHandler, [optional in] long long offset, [optional in]
> > long long length);
> > }
> >
> > A file can be alternatively read as a dataURL using a similar kind of
> > handler as above.
> > [Constructor(in InputStream base)]
> > interface FileDataURL: InputStream {
> >   read(in StringDataHandler, [optional in] long long offset, [optional in]
> > long long length);
> > }
> > This API has the advantage that it can cleanly be extended to deal with both
> > writing use cases and binary data. Furthermore, it can also support
> > extensions that perform cryptographic, compression, or coding on top of the
> > basic interfaces.
> > To compare with the editor's draft, here's a typical programming case in
> > JavaScript:
> > var fileList = ...
> > // There is a mistake in the example provided in Section 3 where it does
> > // fileList.files[0]
> > var myFile = fileList[0];
>
> That's odd.
>
> > // *According to my proposal*
> > var stream = new TextInputStream(new FileInputStream(myFile), "UTF-16");
> > stream.read(handleDataAsText);
>
> // don't you need to add the onerror before "read()"?
>
> > stream.onerror = errorHandler;
> > function handleDataAsText(fileContent) {
> > }
> > function errorHandler(error) {
> > }
> > Note the two differences:
> > 1. Error handling is separated from file reading
>
> Right. Method handleDataAsText does one thing only, as does the error handler.
>
> You seem to have misplaced the "onerror". Shouldn't that, as
> commented, be assigned before - read - is called? Could read() raise
> an exception immediately?
>
> Why put the callback as an argument? What is wrong with having a
> success callback?
>
> A generic "read" method puts the type of reading on the stream, as you
> would have it. Read just sends a message: "read", but does not specify
> the details.
>
> > 2. Two extra objects are needed to read text data out of the file. However,
> > the composability of input streams enables a far richer library to operate.
>
> I don't see why this is important.
>
> For the purpose of the goals of this specification, is the
> complexity justified? I had the "Reader" idea and that was deemed too
> complex, but what I see you proposing sounds, well, flexible, but more
> involved. There's more busywork just to read a file.
>
> > This API matches more closely the Java API for IO.
>
> That is not necessarily ideal.
>
> Design decisions made a decade ago in a different language, for different
> contexts, might not be the best decisions for this context.
>
> I feel a bit odd about giving an API critique to someone who seems to
> be a lot more knowledgeable and experienced. But anyway, this proposal
> is extensible. It does not paint itself into a corner like the other.
>
> Is it possible to simplify the interface a little bit? I'm not married
> to the Reader idea, but it was a simpler API.
>
> Regards,
>
> Garrett
>
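
On the question in the quoted message of whether onerror needs to be assigned before read() is called: the answer depends on whether read() can report a failure before it returns. The sketch below uses a hypothetical MockStream (not part of either proposal) that always fails, either during read() or in a later task, with a manual queue standing in for the event loop. It is only an illustration of the ordering issue under those assumptions.

```javascript
// Hypothetical mock; MockStream and failSynchronously are illustrative
// assumptions, not part of any proposed API.
class MockStream {
  constructor({ failSynchronously }) {
    this.failSynchronously = failSynchronously;
    this.onerror = null;
    this.pending = []; // stands in for the event loop's task queue
  }
  read(onData) {
    // This mock never delivers data; it always reports a failure.
    const fail = () => {
      if (this.onerror) this.onerror(new Error("read failed"));
    };
    if (this.failSynchronously) fail(); // error reported during read()
    else this.pending.push(fail);       // error reported in a later task
  }
  runPending() {
    while (this.pending.length > 0) this.pending.shift()();
  }
}

// Returns how many errors the handler observed.
function errorsSeen(stream, assignHandlerFirst) {
  const seen = [];
  const handler = (error) => seen.push(error.message);
  if (assignHandlerFirst) stream.onerror = handler;
  stream.read(() => {});
  if (!assignHandlerFirst) stream.onerror = handler;
  stream.runPending();
  return seen.length;
}

// Handler assigned after read(), failure during read(): error is missed.
const lateSync = errorsSeen(new MockStream({ failSynchronously: true }), false);
// Handler assigned before read(), failure during read(): error is seen.
const earlySync = errorsSeen(new MockStream({ failSynchronously: true }), true);
// Handler assigned after read(), failure in a later task: error is seen.
const lateAsync = errorsSeen(new MockStream({ failSynchronously: false }), false);
```

If errors are only ever dispatched in a later task, assigning onerror after read() works; assigning it first is safe in both cases.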
Received on Tuesday, 1 September 2009 01:38:17 UTC