- From: Eric U <ericu@google.com>
- Date: Wed, 11 May 2011 17:13:46 -0700
- To: Glenn Maynard <glenn@zewt.org>
- Cc: Jonas Sicking <jonas@sicking.cc>, timeless <timeless@gmail.com>, Web Applications Working Group WG <public-webapps@w3.org>, Charles Pritchard <chuck@jumis.com>, Kinuko Yasuda <kinuko@google.com>
On Wed, May 11, 2011 at 4:52 PM, Glenn Maynard <glenn@zewt.org> wrote:
> On Wed, May 11, 2011 at 7:08 PM, Eric U <ericu@google.com> wrote:
>>
>> > *everywhere*, both on Turkish and on English systems. Things could
>> > only be case sensitive when serialized to a real file system outside
>> > of the API. I'm not proposing a case insensitive system which is
>> > locale aware, i'm proposing one which always folds.
>>
>> > no, if the api is case insensitive, then it's case insensitive
>> You're proposing not just a case-insensitive system, but one that forces
>> e.g. an English locale on all users, even those in a Turkish locale.
>> I don't think that's an acceptable solution.
>>
>> I also don't think having code that works in one locale and not another
>> [Glenn's "image.jpg" example] is fantastic. It was what we were stuck
>> with when I was trying to allow implementers the choice of a pass-through
>> implementation, but given that that's fallen to the realities of path
>> lengths on Windows, I feel like we should try to do better.
>
> To clarify something which I wasn't aware of before digging into this
> deeper: Unicode case folding is *not* locale-sensitive. Unlike lowercasing,
> it uses the same rules in all locales, except Turkish. Turkish isn't just
> an easy-to-explain example of one of many differences (as it is with Unicode
> lowercasing); it is, as far as I see, the *only* exception. Unicode's case
> folding rules have a special flag to enable Turkish in case folding, which
> we can safely ignore here--nobody uses it for filenames. (Windows filenames
> don't honor that special case on Turkish systems, so those users are already
> accustomed to that.)

So it's not locale-sensitive unless it is, but nobody does that anyway,
so don't worry about it? I'm a bit uneasy about that in general, but
Windows not supporting it is a good point. Anyone know about Mac or
Linux systems?
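
A minimal sketch of the folding behaviour under discussion, assuming Python's built-in `str.casefold()` as a stand-in for Unicode default case folding (it does not apply the optional Turkic mappings). The helper name `fold_name` is hypothetical and is not part of any filesystem API.

```python
def fold_name(name: str) -> str:
    """Fold a filename using Unicode default (non-Turkic) case folding."""
    return name.casefold()


# ASCII names that differ only in case fold to the same key in every locale.
same_key = fold_name("IMAGE.JPG") == fold_name("image.jpg")

# Under default folding, capital "I" always folds to dotted "i".
capital_i = fold_name("I")

# Dotless small i (U+0131) has no default folding, so it stays distinct from
# "i". Only the Turkic flag would make "I" and U+0131 fold together.
dotless_i = fold_name("\u0131")

# Capital I with dot above (U+0130) folds to "i" plus a combining dot (U+0307).
dotted_capital_i = fold_name("\u0130")

print(same_key, repr(capital_i), repr(dotless_i), repr(dotted_capital_i))
```

With these rules a user in a Turkish locale would see "FILE.TXT" and "file.txt" treated as the same name, while a name spelled with U+0131 stays separate, which is the Windows-like behaviour described in the quoted text.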

> That said, it's still uncomfortable having a dependency on the Unicode
> folding table here: if it ever changes, it'll cause both interop problems
> and data consistency problems (two files which used to be distinct filenames
> turning into two files with the same filenames due to a browser update
> updating its Unicode data). Granted, either case would probably be
> vanishingly rare in practice at this point.

Agreed [both in the discomfort and the rarity], but I think it's a very
ugly dependency anyway.

> All that aside, I think a much stronger argument for case-sensitive
> filenames is the ability to import files from essentially any environment;
> this API's filename rules are almost entirely a superset of all other
> filesystems and file containers. For example, sites can allow importing
> (once the needed APIs are in place) directories of data into the sandbox,
> without having to modify any filenames to make it fit a more constrained
> API. Similarly, sites can extract tarballs directly into the sandbox.
> (I've seen tars containing both "Makefile" and "makefile"; maybe people only
> do that to confound Windows users, but they exist.)

I've actually ended up in that situation on Linux, with tools that
autogenerated makefiles, but were run from Makefiles. It's not a
situation I really wanted to be in, but it was nice that it actually
worked without me having to hack around it.

> I'm not liking the backslash exception. It's the only thing that prevents
> this API from being a complete superset, as far as I can see, of all
> production filesystems. Can we drop that rule? It might be a little
> surprising to developers who have only worked in Windows, but they'll be
> surprised anyway, and it shouldn't lead to latent bugs.

It can't be a complete superset of all filesystems in that it doesn't
allow forward slash in filenames either. However, I see your point.
You could certainly have a filename with a backslash in it on a
Linux/ext2 system.
Does anyone else have an opinion on whether it's worth the confusion
potential?

>> Glenn:
>> > This can be solved at the application layer in applications that want
>> > it, without baking it into the filesystem API.
>>
>> This is mostly true; you'd have to make sure that all alterations to the
>> filesystem went through a single choke-point or you'd have the potential
>> for race conditions [or you'd need to store the original-case filenames
>> yourself, and send the folded case down to the filesystem API].
>
> Yeah, it's not necessarily easy to get right, particularly if you have
> multiple threads running...
>
> (The rest was Charles, by the way.)

Ah, sorry Glenn and Charles.

>> > A virtual FS as the backing for the filesystem API does not resolve that
>> > core issue. It makes sense to encourage authors to gracefully handle
>> > errors thrown by creating files and directories. Such a need has already
>> > been introduced via Google Chrome's unfortunate limitation of a 255 byte
>> > max path length.
>
> --
> Glenn Maynard
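
The application-layer approach mentioned above (keep the original-case names yourself, send folded names down, and route every change through one choke-point) might look roughly like the sketch below. Everything here is an assumption for illustration: `CaseInsensitiveStore` is a hypothetical name, and a plain dict stands in for the case-sensitive sandboxed filesystem.

```python
import threading


class CaseInsensitiveStore:
    """Hypothetical case-insensitive layer over a case-sensitive backend."""

    def __init__(self):
        self._lock = threading.Lock()  # the single choke-point for all changes
        self._files = {}               # folded name -> contents (backend stand-in)
        self._display = {}             # folded name -> original-case name

    def create(self, name, data=b""):
        key = name.casefold()
        with self._lock:
            # Check and insert under one lock so two threads cannot both
            # create names that differ only in case.
            if key in self._files:
                raise FileExistsError(self._display[key])
            self._files[key] = data
            self._display[key] = name

    def read(self, name):
        with self._lock:
            return self._files[name.casefold()]

    def listdir(self):
        with self._lock:
            return sorted(self._display.values())


store = CaseInsensitiveStore()
store.create("Makefile", b"all:\n")
try:
    store.create("makefile")
except FileExistsError as exc:
    print("collides with existing name:", exc)
print(store.listdir())
```

The lock only protects callers that go through this wrapper; anything that writes to the backend directly bypasses it, which is the race-condition risk raised in the quoted reply.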
Received on Thursday, 12 May 2011 00:16:46 UTC