- From: Marcos Caceres <marcosscaceres@gmail.com>
- Date: Thu, 22 Nov 2007 17:02:44 +1000
- To: "public-appformats@w3.org" <public-appformats@w3.org>, "Arve Bersvendsen" <arveb@opera.com>
- Message-ID: <b21a10670711212302p385b55eet702177f7a0963ee6@mail.gmail.com>
Hi all,

I've drafted some initial text for file and folder naming restrictions for widgets. I would really appreciate any feedback:

=File and folder names=

For the purpose of this specification, a zip relative path is the variable-length string value of the file name field of a local file header of a Zip archive (see [Zip] for definitions and details of the file name field and local file header). Each file stored in the Zip archive is assigned its own local file header [Zip].

A zip relative path is said to be "relative" as it stores the string that represents file and folder names relative to where the zip archive was created on a file system (e.g. images/bg.png), as opposed to storing an absolute path on the file system (e.g. c:\temp\images\bg.png). The value of a zip relative path will generally match the string value of a name of the file or folder(s) on the device on which the zip archive was created.

The zip relative path will represent one of:

* the name of a file (e.g. index.html),
* the name of a folder (e.g. logs/),
* the name of a folder within a hierarchy of folders (e.g. styles/sounds/),
* or the name of a file within a hierarchy of folders (e.g. styles/images/background.png).

For each file name field in a Zip archive, the zip relative path must be encoded as either US-ASCII or UTF-8. Other encodings must not be used and, if encountered, a widget user agent must treat the zip archive as an invalid Zip archive. For interoperability, and where possible, encoding in US-ASCII is preferred.

In a Zip archive, when general purpose bit 11 of a local file header is set to 0, the zip relative path must be processed as US-ASCII in accordance with the rules for validating US-ASCII paths (below). When general purpose bit 11 of a local file header is set to 1, the zip relative path must be processed as UTF-8 in accordance with the rules for validating UTF-8 paths.

Irrespective of encoding, a zip relative path must be treated as case insensitive.
As such, if a Zip archive contains two or more file names in the same folder that map to the same string following normalization on caseless matching as described in [Unicode Case Mapping], then the widget user agent must treat the zip archive as being an invalid Zip archive.

==Rules for validating US-ASCII paths==

Unless otherwise stated, any violation of the following conformance statements means that the Zip archive is non-conforming and a widget user agent must treat it as an invalid Zip archive.

A US-ASCII relative path is the string derived from the zip relative path that matches the production for ascii-rel-path in the following ABNF and conforms to the conformance clauses that follow in this section:

    ascii-rel-path     = ( *folder-name [ filename ] )
    folder-name        = 1*243allowed-characters delimiter
    delimiter          = "/"
    filename           = 1*255( *basename [file-extension] )
    basename           = allowed-characters
    file-extension     = "." 1*allowed-characters
    allowed-characters = ALPHA / DIGIT / SP / "$" / "%" / "'" / "-" / "_" /
                         "@" / "~" / "`" / "!" / "(" / ")" / "^" / "#" /
                         "&" / "+" / "," / "." / ";" / "=" / "[" / "]" /
                         %x80-FF

ALPHA, DIGIT, and SP are defined in the [ABNF] specification, but essentially represent alphabetic characters, digits, and the space (x20) character.

The first and last characters of a US-ASCII relative path must not be space characters.

A US-ASCII relative path must not be an empty string, meaning that widget resources must not be created by storing or compressing data from standard output straight into the zip archive.

The last character of a US-ASCII relative path must not be a "." (x2E).
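To illustrate the bit 11 encoding rule and the case-insensitivity rule described earlier, here is a minimal Python sketch. It is not part of the draft text: the helper names path_encoding, find_case_collisions and check_archive are hypothetical, Python's zipfile module reads the flags from the central directory (assumed here to mirror the local file header), and str.casefold() only approximates the caseless matching of [Unicode Case Mapping] because it applies no Unicode normalization.

```python
import io
import zipfile

# General purpose bit 11 (mask 0x0800) marks a UTF-8 encoded file name.
UTF8_FLAG = 0x0800


def path_encoding(info):
    """Hypothetical helper: name the encoding the draft assigns to an entry."""
    # zipfile fills flag_bits from the central directory record, which is
    # assumed here to carry the same flags as the local file header.
    return "UTF-8" if info.flag_bits & UTF8_FLAG else "US-ASCII"


def find_case_collisions(paths):
    """Hypothetical helper: group paths that match when case is ignored."""
    groups = {}
    for path in paths:
        # casefold() approximates Unicode caseless matching; it does not
        # apply any Unicode normalization form.
        groups.setdefault(path.casefold(), []).append(path)
    return [group for group in groups.values() if len(group) > 1]


def check_archive(data):
    """Return (encodings, collisions) for a zip archive given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = archive.infolist()
    encodings = {info.filename: path_encoding(info) for info in infos}
    collisions = find_case_collisions([info.filename for info in infos])
    return encodings, collisions
```

Under this sketch, a widget user agent following the draft would treat the archive as invalid whenever find_case_collisions returns a non-empty list. Comparing whole paths after case folding also catches the case where the folder names themselves differ only in case.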
The following forbidden characters must not appear anywhere in a US-ASCII relative path:

* < (0x3C)
* > (0x3E)
* : (0x3A)
* " (0x22)
* \ (0x5C)
* | (0x7C)
* ? (0x3F)
* * (0x2A)
* / (0x2F)
* control characters (0x00-0x1F)

In addition, the following reserved words must not appear as either a folder or a basename in a US-ASCII relative path: CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9.

For example, the following file and folder names are allowed: "CON-tact.txt", "LPT11/", "DCOM1.pdf". The following names are not allowed: "com3.txt", "Lpt1/", "COM9.gif".

For interoperability, it is preferred that the total number of characters in a US-ASCII relative path does not exceed 255 characters.

===

Kind regards,
--
Marcos Caceres
http://datadriven.com.au
Received on Thursday, 22 November 2007 07:03:00 UTC