- From: Marcos Caceres <marcosscaceres@gmail.com>
- Date: Mon, 19 Nov 2007 16:08:33 +1000
- To: "public-appformats@w3.org" <public-appformats@w3.org>
- Message-ID: <b21a10670711182208k444f7143xf5dacf9bca861737@mail.gmail.com>
Hi,

At the TPAC Face-to-Face meeting in Boston we started discussing what the maximum allowed length for file and folder names in a widget archive should be. I've been doing some tests in Windows XP and discovered that when you create a file on the Windows desktop, the name can only be 218 characters long, and folder names can only be 208 characters. This seemed very strange, as it is generally assumed that long file names can be 255 characters long. So why was Windows XP not allowing me to create a 255-character file name on the desktop?

Initially I thought it must be some kind of issue in XP, so I also tried it out on Windows Vista: instead of 218 characters, I was only allowed to create a 235-character file name (essentially the same problem).

I started doing a bit of reading on MSDN about naming files in Windows [1], and it turns out that the maximum path length for Windows is 260 characters:

"In the Windows API, the maximum length for a path is MAX_PATH, which is defined as 260 characters. A path is structured in the following order: drive letter, colon, backslash, components separated by backslashes, and a null-terminating character, for example, the maximum path on the D drive is "D:\<256 chars>NUL"."

So it's not that file names can be 255 characters! They can be 256 characters, but ONLY if you create them at the root of a drive. This explains why in Windows XP I can only create a file name on the desktop that is 218 characters long. Consider that the path to my Windows XP desktop is:

C:\Documents and Settings\Marcos\Desktop\ (41 characters)
218 + 41 (+1 NUL terminator) = 260

Windows Vista uses:

C:\Users\Marcos\Desktop\ (24 characters)
235 + 24 (+1 NUL terminator) = 260

Upon further testing, it turns out that the maximum allowed length of a folder name is 244 characters. According to the MSDN article, this limit is imposed so the folder can contain at least one file that has a name that is 12 characters (eg. hello12.txt).
So, you get a complete path that is 259 chars long + a NUL terminator. For example:

C:\bbbbbbbbbbaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaacccccaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\hello12.txt

(The funny thing is that Windows breaks its own rules here, because one cannot then insert a file into that directory that follows the 8.3 convention: hello12.txt = 1234567.123! According to the MSDN article on filenames, one should be able to... anyways.)

The implications for widget packages are significant in some instances. Consider the case of Yahoo! Widgets. When one instantiates a widget on the Yahoo! Widget Engine, the widget's unzipped representation is stored at:

C:\Documents and Settings\[username]\Local Settings\Application Data\Yahoo\Widget Engine\Unzipped\[widgetname].widget\ (118 characters)

Where [username] is the name of the logged-in user, and is variable in length. And [widgetname] is the name of the widget, which is also variable in length. In effect, the length of any path inside the Zip file cannot exceed 256 - 118 = 138 characters, or the widget engine is unable to address resources required by the widget. I did some tests to confirm this.

(Should it be considered a security risk that widget resources are decompressed and left exposed to modification by other programs and widgets on the operating system? Having unprotected decompressed widgets on the hard disk makes them susceptible to modification.)

This limitation does not apply to widget engines that don't decompress files to disk (or that use other techniques to store uncompressed data), for example Opera Widgets. I'm going to assume that Opera is storing the decompressed widget data in RAM (however, I have no idea what Opera actually does with regard to decompression of widgets).

For the record, I did some tests on a Mac (MacOS 10.4) too.
On a Mac you can create a file or folder name with 255 characters anywhere. I had to create over one hundred nested folders in order to get MacOSX to start choking. However, one cannot create more than three nested folders with 255-character names before Zipping fails to work on a Mac. (Another thing I discovered is that the Mac supports Unicode file names. How it stores Unicode names in a Zip archive is a mystery, and Unicode file names still get mangled when you decompress a Mac-created Zip file on Windows and vice versa.)

The question now remains: what is the maximum file name/path length and folder length that Zip can handle? According to the Zip spec, the file name (including the relative path) is variable in length, but its length field is only 2 bytes (so at most 65,535), and the combined length of any directory record should not exceed 65,535 bytes. From my testing, when 65,535 bytes are exceeded on a Mac, MacOS fails to create an archive.

Some options for standardization are:

1. Mandate no maximum folder and filename lengths, with limitless relative path length, and warn implementers of potential issues, but not in normative words:

   Pro: sidesteps the problem by making it Zip's problem.

   Cons: potentially breaks existing implementations that decompress data to disk on Windows (eg. Yahoo! Widgets). Potentially increases interoperability issues because folder and path lengths may be too long. Makes it an implementer's problem, which is probably bad.

2. Mandate Windows (Win32) restrictions: a maximum path length of 255 characters, with a maximum folder size of 243 characters; recommend that authors stick to short paths of no more than around 100 characters in length:

   Pro: compatibility with Windows XP/Vista and other Win32-based Zip implementations.

   Cons: potentially breaks a very tiny number of MacOSX-based widgets. Imposes Windows limitations on MacOS/Linux-based platforms.
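
To make the Windows arithmetic above concrete, here is a minimal Python sketch of how an engine that unzips to disk might check an archive before extracting it. It is only an illustration under stated assumptions: the function names path_budget and entries_over_budget are made up for this example, lengths are counted in characters (as in the desktop tests above), and the only limit modelled is the MAX_PATH value of 260 quoted from MSDN.

```python
import io
import zipfile

# Win32 limit quoted from MSDN; it includes the trailing NUL terminator.
MAX_PATH = 260


def path_budget(extract_root):
    """Characters left for a relative path below extract_root (hypothetical helper).

    extract_root is expected to end with a backslash, as in the examples above.
    """
    return MAX_PATH - 1 - len(extract_root)


def entries_over_budget(archive, extract_root):
    """Return the Zip entry names that would not fit under extract_root."""
    budget = path_budget(extract_root)
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if len(name) > budget]


if __name__ == "__main__":
    # Build a small in-memory archive with one short and one long entry name.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("short.txt", "x")
        zf.writestr("a" * 230 + ".txt", "x")

    xp_desktop = "C:\\Documents and Settings\\Marcos\\Desktop\\"
    print(path_budget(xp_desktop))
    print(entries_over_budget(buf, xp_desktop))
```

With the XP desktop path as the root, the budget works out to the 218 characters observed in the tests, so the 234-character entry is reported as too long. The same check could be run by an authoring tool rather than by the engine; which side should do it is left open here.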

My feeling is to go with option 2, but warn implementers that they should be prepared to deal with path lengths that are longer than those allowed by the file system.

Thoughts?

Kind regards,
Marcos

[1] http://msdn2.microsoft.com/en-us/library/aa365247.aspx

--
Marcos Caceres
http://datadriven.com.au
Received on Monday, 19 November 2007 06:08:50 UTC