
Long File Names and Widgets

From: Marcos Caceres <marcosscaceres@gmail.com>
Date: Mon, 19 Nov 2007 16:08:33 +1000
Message-ID: <b21a10670711182208k444f7143xf5dacf9bca861737@mail.gmail.com>
To: "public-appformats@w3.org" <public-appformats@w3.org>
Hi,
At the TPAC Face-to-Face meeting in Boston we started discussing what should
be the maximum allowed length for file and folder names in a widget archive.
I've been running some tests in Windows XP and discovered that when you create a
file on the Windows desktop, the name can only be 218 characters long and
folders can only be 208 characters. This seemed very strange, as it is
generally assumed that long file names can be 255 characters long. So, why
was it that Windows XP was not allowing me to create a 255 character long
file name on the desktop? Initially I thought it must be some kind of issue
in XP, so I also tried it out on Windows Vista: instead of 218 characters, I
was only allowed to create a 235 character long file name (essentially the
same problem).

I started doing a bit of reading on MSDN about naming files in Windows [1],
and it turns out that the maximum path length for Windows is 260 characters:


"In the Windows API, the maximum length for a path is MAX_PATH, which is
defined as 260 characters. A path is structured in the following order:
drive letter, colon, backslash, components separated by backslashes, and a
null-terminating character, for example, the maximum path on the D drive is
"D:\<256 chars>NUL"."

So it's not that a file name can be 255 characters! It can be 256
characters, but ONLY if you create it at the root of a drive. This explains
why in Windows XP I can only create a file name on the desktop that is 218
characters long. Consider that the path to my Windows XP desktop is:

C:\Documents and Settings\Marcos\Desktop\ (41 characters)
218 + 41 (+1 NUL terminator) = 260

Windows Vista uses:
C:\Users\Marcos\Desktop\ (24 characters)
235 + 24 (+1 NUL terminator) = 260
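
The arithmetic above can be written as a small sketch. This assumes the
260-character MAX_PATH figure quoted from MSDN and one character reserved for
the NUL terminator; the function name max_name_length is made up for
illustration and is not a Windows API.

```python
MAX_PATH = 260  # Win32 limit quoted from MSDN; includes the NUL terminator


def max_name_length(parent_dir: str) -> int:
    """Longest file name that still fits under parent_dir within MAX_PATH."""
    # Make sure the separator before the file name is counted exactly once.
    if not parent_dir.endswith("\\"):
        parent_dir += "\\"
    # One character is reserved for the NUL terminator.
    return MAX_PATH - 1 - len(parent_dir)


print(max_name_length("C:\\Documents and Settings\\Marcos\\Desktop\\"))  # 218
print(max_name_length("C:\\Users\\Marcos\\Desktop\\"))                   # 235
print(max_name_length("C:\\"))                                          # 256
```

The three results match the limits I observed on the XP desktop, the Vista
desktop, and the root of a drive.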

Upon further testing, it turns out that the maximum allowed length of a
folder name is 244 characters. According to the MSDN article, this limit is
imposed so the folder can contain at least one file with an 8.3 name (12
characters). In my tests, though, the longest name that fits is 11
characters (e.g. hello12.txt), which gives a complete path that is 259
chars long + a NUL terminator. For example:
C:\bbbbbbbbbbaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaacccccaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\hello12.txt
(The funny thing is that Windows breaks its own rules here, because one
cannot then insert a file into that directory that uses the full 8.3
convention: hello12.txt is only 7.3 (1234567.123)! According to the MSDN
article on file names, one should be able to... anyway.)

The implications for widget packages are significant in some instances.
Consider the case of Yahoo! Widgets. When one instantiates a widget on the
Yahoo! Widget Engine, the widget's unzipped representation is stored at:

C:\Documents and Settings\[username]\Local Settings\Application
Data\Yahoo\Widget Engine\Unzipped\[widgetname].widget\ (118 characters)

Where [username] is the name of the logged-in user and [widgetname] is the
name of the widget; both are variable in length. In effect, the length of
any path inside the Zip file cannot exceed 256-118 = 138 characters (less
for longer user or widget names), or the widget engine is unable to address
resources required by the widget. I did some tests to confirm this.  (Should it be
considered a security risk that widget resources are decompressed and left
exposed to modification by other programs and widgets on the operating
system? Having unprotected decompressed widgets on the hard disk makes them
susceptible to modification.)
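
A rough sketch of how a widget engine (or an author) could check an archive
before unzipping it to disk. It uses Python's standard zipfile module and the
260-character MAX_PATH figure from MSDN; the function name and the extraction
root in the demo are hypothetical, not anything Yahoo! actually uses.

```python
import io
import zipfile

MAX_PATH = 260  # Win32 limit quoted from MSDN; includes the NUL terminator


def entries_exceeding_max_path(archive, extract_root):
    """Return the archive entries whose extracted path would not fit in MAX_PATH."""
    root = extract_root if extract_root.endswith("\\") else extract_root + "\\"
    with zipfile.ZipFile(archive) as zf:
        # Zip entry names use "/" as separator; swapping it for "\" does not
        # change the length, so the entry name can be counted directly.
        return [name for name in zf.namelist()
                if len(root) + len(name) + 1 > MAX_PATH]


# Demo: build a tiny archive in memory with one short and one very long path.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("config.xml", "")
    zf.writestr("images/" + "a" * 200 + ".png", "")
buf.seek(0)

# Hypothetical extraction root, for illustration only.
root = "C:\\Documents and Settings\\Marcos\\Local Settings\\Unzipped\\demo.widget\\"
print(entries_exceeding_max_path(buf, root))
```

With this root, only the long images/ entry is reported; config.xml fits
comfortably.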

This limitation does not apply to widget engines that don't decompress files
to disk (or that use other techniques to store uncompressed data), for
example Opera Widgets. I'm going to assume that Opera is storing the
decompressed widget data in RAM (however, I have no idea what Opera
actually does when it decompresses widgets).

For the record, I did some tests on a Mac (Mac OS X 10.4) too. On a Mac you
can create a file or folder name with 255 characters anywhere. I had to
create over one hundred nested folders in order to get Mac OS X to start
choking. However, one cannot create more than three folders with
255-character names before Zipping fails to work on a Mac. (Another thing I
discovered is that the Mac supports Unicode file names. How it stores
Unicode names in a Zip archive is a mystery, and Unicode file names still
get mangled when you decompress a Mac-created Zip file on Windows and vice
versa.)

The question now remains: what is the maximum file name/path length and
folder length that Zip can handle? According to the Zip spec, the file name
(including the relative path) is variable in length, but its length field is
only 2 bytes, so a name can be at most 65,535 bytes, and the combined length
of any directory record should not exceed 65,535 bytes. From my testing,
when 65,535 bytes are exceeded on a Mac, Mac OS X fails to create an archive.
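
To illustrate the 2-byte limit, here is a minimal sketch. Note that the limit
is in bytes, not characters, so non-ASCII names run out sooner. The function
name is made up, and UTF-8 is only an assumption for the example; as noted
above, I don't know how any given Zip tool actually encodes Unicode names.

```python
import struct

ZIP_NAME_FIELD_MAX = 0xFFFF  # largest value a 2-byte unsigned field can hold


def fits_zip_name_field(relative_path: str, encoding: str = "utf-8") -> bool:
    """True if the encoded path length fits in a 2-byte length field."""
    return len(relative_path.encode(encoding)) <= ZIP_NAME_FIELD_MAX


# A 2-byte little-endian unsigned field tops out at 65,535.
print(struct.pack("<H", ZIP_NAME_FIELD_MAX))   # b'\xff\xff'

print(fits_zip_name_field("a" * 65535))        # True
print(fits_zip_name_field("a" * 65536))        # False
# 40,000 two-byte characters encode to 80,000 bytes in UTF-8.
print(fits_zip_name_field("\u00e9" * 40000))   # False
```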

Some options for standardization are:
1. Mandate no maximum folder and file name lengths and no limit on relative
path length, and warn implementers of potential issues, but only in
non-normative wording:
    Pro: sidesteps the problem by making it Zip's problem.
    Cons: potentially breaks existing implementations that decompress data
to disk on Windows (e.g. Yahoo! Widgets). Potentially increases
interoperability issues because folder and path lengths may be too long.
Makes it an implementer's problem, which is probably bad.
2. Mandate Windows (Win32) restrictions: a maximum path length of 255
characters, with a maximum folder name length of 243 characters; recommend
that authors stick to short paths of no more than about 100 characters:
    Pro: compatibility with Windows XP/Vista and other Win32-based Zip
implementations.
    Cons: potentially breaks a very tiny number of Mac OS X-based widgets.
Imposes Windows limitations on Mac OS X/Linux-based platforms.

My feeling is to go with option 2, but to warn implementers that they should
be prepared to deal with path lengths that are longer than those allowed by
the file system.

Thoughts?

Kind regards,
Marcos

[1] http://msdn2.microsoft.com/en-us/library/aa365247.aspx

-- 
Marcos Caceres
http://datadriven.com.au
Received on Monday, 19 November 2007 06:08:50 UTC
