
Re: [File API: FileSystem] Path restrictions and case-sensitivity

From: Glenn Maynard <glenn@zewt.org>
Date: Sun, 8 May 2011 20:32:11 -0400
Message-ID: <BANLkTi=tM6Bk2OTFisQuYgpdOxMVjL1pjQ@mail.gmail.com>
To: timeless <timeless@gmail.com>
Cc: Eric U <ericu@google.com>, Web Applications Working Group WG <public-webapps@w3.org>, Charles Pritchard <chuck@jumis.com>, Kinuko Yasuda <kinuko@google.com>
A detail which somehow hasn't been mentioned: Unicode case folding is
locale-independent, with the Turkish dotted/dotless I as a special case (
http://unicode.org/Public/UNIDATA/CaseFolding.txt) which is ordinarily
disabled.
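As an aside, the locale dependence is visible from JavaScript itself: in current engines, toLocaleLowerCase() applies the Turkish mapping when asked, while the default lowercasing is locale-independent (an illustration only; the email predates the locale parameter being standardized):

```javascript
// Default lowercasing is locale-independent:
console.log("I".toLowerCase());           // "i"
// A Turkish locale enables the dotless-i mapping instead:
console.log("I".toLocaleLowerCase("tr")); // "ı" (U+0131, dotless i)
```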

Just the same, having an interop dependency on Unicode's definition of case
folding is uncomfortable.  The case folding table is over 1000 rows and will
doubtless grow as characters are added, so not every implementation will
match (and the table may even change over time as the browser is updated,
which could cause some strange situations).

It would also require case folding to be exposed directly to scripts
(string.foldCaseFull() or string.foldCaseSimple()), so scripts are able to
compare filenames to see if they will refer to the same file.  That would
probably be nice to have anyway, but here it would be a new source of
obscure bugs, because many (probably "most") developers would use
string.toLowerCase() for this purpose, which would work often enough for
developers to not notice the problems.
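For instance (an illustration with today's JavaScript; the foldCase methods above are hypothetical), full case folding maps "ß" to "ss", so "STRASSE" and "straße" fold equal, but toLowerCase() misses it:

```javascript
// toLowerCase() is not Unicode case folding: full folding maps "ß" -> "ss".
const a = "STRASSE";
const b = "straße";
console.log(a.toLowerCase()); // "strasse"
console.log(b.toLowerCase()); // "straße" ("ß" is already lowercase)
console.log(a.toLowerCase() === b.toLowerCase()); // false, though the two
// names would compare equal under full case folding
```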

Again, if the API is case-sensitive, applications can hide this fact from
users if wanted, without the API forcing it on them.  For example, when a
user enters a filename to save a file, the application can do its own
case-insensitive search for files with the same filename.  It can then
normalize the new filename to match the existing one, and presumably at that
point show its "overwrite file?" prompt.
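A minimal sketch of that flow (findExistingName is a hypothetical helper; which folding to use for the comparison is the application's own choice, and "names" stands in for a directory listing the app has already read):

```javascript
// Before saving, search existing names case-insensitively and, on a hit,
// normalize the new name to the existing one.
function findExistingName(names, candidate) {
  const target = candidate.toLowerCase(); // whatever folding the app picks
  return names.find((n) => n.toLowerCase() === target) || null;
}

const existing = findExistingName(["Readme.txt", "save.dat"], "README.TXT");
// existing === "Readme.txt": reuse that exact name and show the
// "overwrite file?" prompt before writing.
```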

> Were the IndexedDB spec in a more mature state, it would be much easier
> from a developer perspective, to write a FileSystem API layer on top of it,
> and use all of its goodness including arbitrary file names. The layer is
> reasonably small, very easy to write in JS.

You can't implement the FileWriter API this way, though, and that's a lot of
File-API use cases.

I wonder if Blob and IndexedDB implementations will mature enough to
efficiently handle downloading and saving large blocks of data.  For
example, a game installer should be able to download arbitrarily large game
data files.  In principle this can be done efficiently: just download the
file into a Blob and pass it to IndexedDB.  The browser should scratch large
Blobs to disk transparently.  However, the second part is harder to make
efficient: storing the Blob in IndexedDB without a second on-disk copy
(possibly totalling several GB) being made as the data moves from the Blob
scratch space into IndexedDB's storage.
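A sketch of that happy path, using today's API shapes (browser-only; the "files" store name and database setup are assumptions, and whether the put() avoids a second on-disk copy is exactly the open question):

```javascript
// Download a large file straight into a Blob; the browser may scratch
// large responses to disk rather than holding them in memory.
function downloadToBlob(url, onDone) {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", url);
  xhr.responseType = "blob";
  xhr.onload = () => onDone(xhr.response); // xhr.response is a Blob
  xhr.send();
}

// Hand the Blob to IndexedDB in a single put(); ideally the engine stores
// it by reference to the scratch file instead of copying the bytes again.
function storeBlob(db, name, blob, onDone) {
  const tx = db.transaction("files", "readwrite");
  tx.objectStore("files").put(blob, name);
  tx.oncomplete = onDone;
}
```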

Another issue that comes to mind: a game installation page (which handles
downloading and storing game data) would want to write data incrementally.
If it's downloading a 100 MB video file, and the download is interrupted, it
will want to resume where it left off.  With FileWriter and Filesystem API
that's straightforward, but with Blobs and IndexedDB it's much trickier.  I
suppose, in principle, you could store the file in 1MB slices to the
database as it's downloading, and combine them within the database when it
completes.  This seems hard or impossible for implementations to handle
efficiently, though.
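The slicing itself is easy to express; the combine at the end is where the efficiency problem lives. A sketch, with a plain array standing in for the object store rows:

```javascript
// Store each downloaded chunk as its own Blob record, then build one
// combined Blob when the download completes. In the real scenario each
// record would be a row in IndexedDB.
const records = [];

function appendChunk(chunk) {
  records.push(new Blob([chunk])); // one record per downloaded slice
}

function combineChunks() {
  // new Blob(parts) concatenates; whether the engine can do this without
  // rewriting the data on disk is the efficiency question raised above.
  return new Blob(records);
}

appendChunk("chunk-one ");
appendChunk("chunk-two");
console.log(combineChunks().size); // 19
```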

It'll be great if IndexedDB/Blob implementations are pushed this far, but
I'm not holding my breath.

-- 
Glenn Maynard
Received on Monday, 9 May 2011 00:32:39 GMT
