Re: New filesystem/directory API proposal

On Fri, Jan 29, 2010 at 2:02 PM, Arve Bersvendsen <arveb@opera.com> wrote:
> On Fri, 29 Jan 2010 22:32:11 +0100, Eric Uhrhane <ericu@google.com> wrote:
>
>> What the app sees should be a
>> case-insensitive, case-preserving filesystem.  That's easy to implement on
>> top of the most-commonly-used filesystems, and not hard to emulate on
>> case-sensitive systems.  Problems due to interactions with applications
>> running outside the browser should be extremely rare and minor.
>
> On case-sensitive file systems, this would leave an application unable to
> correctly resolve and distinguish a file system entry named 'foo' where
> there is another entry named 'Foo'. Without even attempting to speak for
> anyone else, I've had this situation occur on multiple instances when
> re-digitizing my old music collection: The older rip would simply have
> different capitalization from the newer one.

Yes, this is the kind of rare case I was talking about.  It wouldn't
happen from inside a web app; you'd have to do it from a client-side
app, on a case-sensitive filesystem, have it create this kind of
situation, not clean up the extra copies, and then expose that to the
browser.  There's no reason that the UA couldn't provide a way to view
both files anyway; Windows has for many years dealt with filename
collisions by generating new aliases for the affected files.  Do you
think it's worth speccing something like that?  Possibly I should just
add some non-normative discussion about it.
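
To make the collision case concrete, here is a minimal sketch of a case-insensitive, case-preserving lookup layered over a case-sensitive directory listing. The function names (find_case_collisions, resolve) are hypothetical and not part of any proposal, and using Python's str.casefold() as the folding rule is an assumption; real filesystems each apply their own case-folding tables.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of entry names that differ only by case.

    Keys are the case-folded names; values are the stored names, sorted.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


def resolve(names, requested):
    """Look up `requested` case-insensitively and return the stored name.

    Returns None when nothing matches. Raises LookupError when several
    stored names match, which is the 'foo' versus 'Foo' situation.
    """
    wanted = requested.casefold()
    matches = [name for name in names if name.casefold() == wanted]
    if not matches:
        return None
    if len(matches) > 1:
        raise LookupError(f"ambiguous name {requested!r}: {sorted(matches)}")
    return matches[0]


print(find_case_collisions(["foo", "Foo", "bar"]))
print(resolve(["Readme.txt", "bar"], "README.TXT"))
```

A UA that hits the ambiguous case could surface both entries under generated aliases instead of raising, which is the kind of behavior a non-normative note might describe.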

>> No file or directory may be named any of [CON, PRN, AUX, NUL, COM1, COM2,
>> COM3,  COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5,
>> LPT6, LPT7, LPT8, LPT9], nor may a file or directory's name begin with any
>> of those strings followed by a period.
>
> Note: These are perfectly legal on many filesystems, but will specifically
> fail with Windows.
>
>> Filenames may not end in period or space.
>
> These are perfectly legal filenames on many systems.
>
>> Paths can't contain any character in the set [<>:"/\|?*] or whose
>> representation
>> in UTF-8 is in [0-31].
>
> This is the NTFS limitation, right?

IIRC the FAT family of filesystems also doesn't like those characters,
but I'd have to check.  I pulled that list from a document I wrote up
a while back; if you want the sources, I'll dig them up.  We should
also talk about filename and path length restrictions, and the number
of files a directory can hold, areas where I believe NTFS actually
does better than many others.
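
As a rough illustration, the naming rules quoted above could be checked for a single path component along these lines. This is a sketch only: is_portable_name is a hypothetical helper, rejecting the empty name is my own assumption, and comparing reserved names case-insensitively is an assumption that follows from the case-insensitive model proposed earlier. Length limits are not covered.

```python
# Reserved device names from the quoted list.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

# Characters from the quoted set [<>:"/\|?*].
FORBIDDEN_CHARS = set('<>:"/\\|?*')


def is_portable_name(name):
    """Return True if `name` (one path component) passes the quoted rules."""
    if not name:
        return False  # assumption: the empty name is not allowed
    if name[-1] in ". ":
        return False  # may not end in a period or a space
    if any(ch in FORBIDDEN_CHARS or ord(ch) < 32 for ch in name):
        return False  # forbidden punctuation or control characters 0-31
    # A reserved name on its own, or followed by a period, is rejected.
    stem = name.split(".", 1)[0]
    if stem.upper() in RESERVED:
        return False
    return True


for candidate in ["notes.txt", "CON", "com1.log", "COM10", "a:b", "trailing."]:
    print(candidate, is_portable_name(candidate))
```

Note that "COM10" passes, since only COM1 through COM9 appear in the list.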

> It's at this stage I would like to note that file system limitations, if
> they are to be made into a common subset of common Unix filesystems, HFS/OS
> X and FAT/NTFS (Windows), we are looking at a rather restrictive subset, and
> we would possibly be left in the situation where a web application will be
> unable to address files already present on the user's file system.

Yes, generally Windows is the most restrictive environment w.r.t.
filenames.  However, I disagree that the subset is all that
restrictive.  I very rarely come across files that would run afoul of
the above limits.  Add in the requirement that I'd need to access
those files from a web app...how likely is that?  Examples of likely
problem cases would be most appreciated.  While this spec is designed
to cover sandbox access from inside the browser, I certainly do
anticipate the browser sharing files with client apps.

On the plus side, with these restrictions, anything that runs when you
test it on one platform should run exactly the same on other
platforms.  I'd really like to avoid passing the buck on filenames to
the app developer, since that would make it easy to write apps that
only work on the platform on which they're testing.  That means either
specifying the lowest common denominator or specifying something more
powerful.  How would
UAs emulate case-sensitive filesystems on top of case-insensitive
ones, without obfuscating the filenames such that they didn't look
right to external apps?
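
To show why this is awkward, here is one hypothetical way to store case-sensitive names on a case-insensitive filesystem: escape every uppercase ASCII letter with a marker character. The marker "^" and the encode/decode helpers are assumptions made up for this sketch, and it only handles ASCII letters. It works, but the stored names are exactly the kind of obfuscation that looks wrong to external apps.

```python
import string

MARK = "^"  # hypothetical escape character, chosen only for illustration


def encode(name):
    """Map a case-sensitive name to a stored name with no uppercase ASCII."""
    out = []
    for ch in name:
        if ch == MARK:
            out.append(MARK + MARK)        # escape the marker itself
        elif ch in string.ascii_uppercase:
            out.append(MARK + ch.lower())  # 'F' becomes '^f'
        else:
            out.append(ch)
    return "".join(out)


def decode(stored):
    """Invert encode(); assumes `stored` was produced by encode()."""
    out = []
    i = 0
    while i < len(stored):
        ch = stored[i]
        if ch == MARK:
            i += 1
            nxt = stored[i]
            out.append(MARK if nxt == MARK else nxt.upper())
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(encode("Foo"))  # ^foo
print(encode("foo"))  # foo
```

The two entries no longer collide on the underlying filesystem, but a client-side app browsing the directory would see "^foo" rather than "Foo".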

> While we might enforce these rules for creation of new file handles, we
> can't enforce them on reading

Hmm...that's an interesting idea.  It would mean that you could grab a
filename from a file, try to make a copy of it in another directory, and
fail for the illegal filename...but only on e.g. Linux.  That seems
like a poor user experience, but I could be convinced otherwise.
Given that I believe that any of these problems are rare, I can't
really object to this all that much ;'>.

> Any text for file name system limitations should therefore be informational
> in a spec, not normative.  (In general, I do support discouraging creating
> non-portable filenames)

I disagree; I think that if we want web apps that work all over the
place, we should do our best to restrict behavior that would get in
the way.  Let's make it very easy to write portable code.

     Eric

Received on Friday, 29 January 2010 22:47:48 UTC