- From: Marcos Caceres <marcosscaceres@gmail.com>
- Date: Sun, 7 Oct 2007 16:02:01 +1000
- To: "Jon Ferraiolo" <jferrai@us.ibm.com>
- Cc: "public-appformats@w3.org" <public-appformats@w3.org>
- Message-ID: <b21a10670710062302s1ed7214ex93ed6a66ee571fe7@mail.gmail.com>
> The ideal approach from a standards perspective would be to separate out
> the ZIP writeup into a separate standalone spec (i.e., don't reference the
> OCF spec, just "repurpose" its technical approaches) so that it can be
> reused by other initiatives (W3C or otherwise). When I was involved in the
> OCF spec, the IDPF folks were amenable to updating their eBook specs to
> point to an official standard packaging standard from other standards
> bodies, where W3C and OASIS were the presumed likely choices, and W3C was
> the top preference. Maybe the next version of ODF would reference such a W3C
> standard. I am pretty sure they would conclude it's the right thing to do.

What you are proposing is a good idea; I'll remove the references to OCF and continue to repurpose the technical details. However, creating an independent Zip-based spec might be beyond the scope of WAF (although it would be nice if one day PKWARE contributed their spec to the W3C). In any case, you will have to ask our working group chair whether defining a distinct packaging spec can be part of the WAF charter (will someone at IBM be willing to edit it? ;-)).

Also, the W3C tried to standardize (XML) packaging in the past [1]. But, from what I gather, the working group was disbanded because of lack of industry interest/support.

> I haven't had time to think through the UTF-8 issues. A minor red flag is
> raised when I see the word "MAY". Are you saying it is OK to use
> platform-native encodings, Shift-JIS encoding or (showing my age) EBCDIC
> encodings? Maybe there is an encoding field in the ZIP spec. (If I ever knew
> about this field, I have forgotten it by now.) Remember, the goal of
> standards are to promote interoperability, and if file name encodings are a
> free-for-all, then interoperability might suffer.

My understanding of [2] (Appendix D) is that Zip allows either IBM Code Page 437, which is the default (general purpose bit 11 is off), or UTF-8 (general purpose bit 11 is on).
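To make the bit 11 rule concrete, here is a minimal sketch in Python that reports which file name encoding each entry of an archive declares. It uses the standard `zipfile` module; the names `UTF8_FLAG`, `filename_encodings`, and `build_sample` are hypothetical helpers invented for this illustration, and the sample entry names are made up. It only inspects the flag and makes no attempt to read the 0x0008 extra field.

```python
import io
import zipfile

# General purpose bit 11: when set, the entry's file name is UTF-8.
UTF8_FLAG = 1 << 11  # 0x800


def filename_encodings(zip_bytes):
    """Map each entry name to the encoding declared by its bit 11."""
    declared = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.flag_bits & UTF8_FLAG:
                declared[info.filename] = "utf-8"
            else:
                # Bit 11 off: the original ZIP encoding (IBM Code Page 437).
                declared[info.filename] = "cp437"
    return declared


def build_sample():
    """Build a small in-memory archive with one ASCII and one non-ASCII name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("config.xml", "<widget/>")
        # "\u00ed" is i-acute; Python's writer stores non-ASCII names as
        # UTF-8 and sets bit 11 for them.
        zf.writestr("\u00edcone.png", b"")
    return buf.getvalue()


if __name__ == "__main__":
    for name, encoding in filename_encodings(build_sample()).items():
        print(name, encoding)
```

Note that Python's writer leaves bit 11 off for pure-ASCII names, so a consumer that insists on the flag being set for every entry would reject archives that are otherwise harmless. That is an observation about this one library, not a statement about what other Zip tools do.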
However, the same appendix then says: "Applications may choose to supplement this file name storage through the use of the 0x0008 Extra Field....Examples of the intended usage for this field is to store whether "modified-UTF-8" (JAVA) is used, or UTF-8-MAC. Similarly, other commonly used character encoding (code page) designations can be indicated through this field. Formalized values for use of the 0x0008 record remain undefined at this time. The definition for the layout of the 0x0008 field will be published when available."

> Regarding the issue of proprietary extensions, how about just staying silent
> on the issue? Basically, the above OCF-like approach is a whitelisting
> approach which identifies the fields that producers and consumers must
> support. Other fields, whether define in the ZIP spec or extensions defined
> by vendors, can be ignored by the consumer. For example, there isn't a
> problem with MS (for example) extending ZIP to make the format do special
> magic on Windows so long as the resulting ZIP file will still open with
> non-MS software (e.g., WinZip) and continue to work on Mac and Linux
> systems.

Agreed... but something in the way OCF specifies the ZIP subset is not sitting right (e.g., it doesn't say which bits need to be turned on and off); that's why I went looking for alternatives like the one proposed by OOXML (OPC). Unless MS/ECMA are pulling a fast one on me (which may be likely from what I've been reading, e.g., [3, 4]), I'm not sure that OOXML packaging does any special Windows magic. (If anyone has any evidence to the contrary, please let me know.) And, although there is plenty of justified criticism against the XML formats defined by OOXML, I haven't yet encountered evidence to suggest that OPC's usage/definition of Zip is broken.

OOXML's competitor, ODF (ISO/IEC 26300), also defines a zip-based packaging format.
However, having read section 17 of ODF, which defines packaging, I found it to be underspecified (I imagine that the IDPF folks did too, given that OCF seems to be heavily based on ODF). On page 697, for example, it talks about a "standard zip file" yet gives no reference to or definition of what that is. Also, where the Zip specification is referenced, it points to a version of the Zip APPNote that "has been unofficially corrected and extended by Info-ZIP without explicit permission by PKWARE." Worse still, the Zip specification they reference is almost 11 years old and may be incompatible with current OS implementations (I'm still waiting to hear from Microsoft about which version of the Zip APPNote they actually implemented; does anyone know which version Apple implemented in OS X?). Anyway, IMO, that pretty much cancels out ODF as a potential reference for Widgets.

</rant>

Marcos

[1] http://www.w3.org/XML/2000/07/xml-packaging-charter
[2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT
[3] http://en.wikipedia.org/wiki/Office_Open_XML
[4] http://www.noooxml.org/

--
Marcos Caceres
http://datadriven.com.au
Received on Sunday, 7 October 2007 06:02:12 UTC