
Some notes about pxp:zip and pxp:unzip

From: <vojtech.toman@emc.com>
Date: Thu, 19 Apr 2012 06:16:53 -0400
To: <public-xml-processing-model-wg@w3.org>
Message-ID: <F3C7EBECE80AC346BE4D1C5A9BB4A41F2ED03DD69E@MX11A.corp.emc.com>
> * A-206-02: Jim to start drafting a note for p:zip/p:unzip

Perhaps Jim has already run into this, but I thought I should bring up some issues (those that I remember) with implementing the current EXProc pxp:zip/pxp:unzip steps.

pxp:zip:
- What about source files that are not included in the pxp:zip manifest? Is that an error or do they end up in the ZIP archive under their original base URI?
- Serialization. At the moment, pxp:zip provides no way to specify how XML documents are serialized in the ZIP archive. I ended up adding serialization options to pxp:zip itself; they are applied to every XML file and are therefore archive-global. It might be useful, though, to be able to specify different serialization options per file, but that would probably require putting the serialization options into the pxp:zip manifest somehow.
- Not sure about the compression level names "smallest" | "fastest" | "default" | "huffman" | "none". They are a direct lift from the Java java.util.zip.Deflater API. Plus, the "huffman" constant is not a compression level, but a compression strategy. I think it should not be in the list.
- The pxp:zip step returns a c:zipfile representation of the ZIP archive on the "result" port. While I understand that this might be useful, it is not consistent with existing standard steps that write output to an external location (p:store, p:xsl-formatter) and that return a URI reference to the external data. 
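
To illustrate the point about "huffman": the following is a minimal Java sketch, not taken from any existing implementation, of how the option values line up with java.util.zip.Deflater. The class name ZipLevels and both helper methods are hypothetical; only the Deflater constants and methods are real API. Four of the names map to level constants, while "huffman" can only be expressed through setStrategy() and is independent of the level.

```java
import java.util.zip.Deflater;

// Hypothetical helper class; the names below are illustrative only.
final class ZipLevels {

    // Maps the four genuine compression-level names to Deflater level constants.
    // "huffman" is deliberately rejected: it has no corresponding level.
    static int levelFor(String name) {
        switch (name) {
            case "smallest": return Deflater.BEST_COMPRESSION;
            case "fastest":  return Deflater.BEST_SPEED;
            case "default":  return Deflater.DEFAULT_COMPRESSION;
            case "none":     return Deflater.NO_COMPRESSION;
            default:
                throw new IllegalArgumentException("not a compression level: " + name);
        }
    }

    // Huffman-only coding is a strategy, set separately from (and in addition to) a level.
    static Deflater huffmanOnly(String levelName) {
        Deflater deflater = new Deflater(levelFor(levelName));
        deflater.setStrategy(Deflater.HUFFMAN_ONLY);
        return deflater;
    }
}
```

If Huffman-only coding is worth exposing at all, this suggests a separate option (say, a strategy option) rather than a fifth value of the compression level.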

pxp:unzip:
- I think for non-XML data, the step should behave as p:data or p:http-request. Right now, the pxp:unzip spec says: "If the content-type specified is not an XML content type, the file is base64 encoded and returned in a single c:data element." This obviously does not match the behavior of p:data with respect to text media types. The pxp:unzip step also does not insert the "content-type" and "encoding" attributes on the c:data wrapper.
- What happens if the file specified through the "file" option is not found in the archive (I assume a dynamic error)?
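
A rough Java sketch of the behavior suggested in the two points above, under several stated assumptions: the class UnzipData and its methods are hypothetical, any media type starting with "text/" is treated as text, text is decoded as UTF-8 (a real step would honor the charset parameter), and a missing entry raises an unchecked exception standing in for whatever dynamic error the spec would define.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.NoSuchElementException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class; not part of any pxp:unzip implementation.
final class UnzipData {
    static final String C_NS = "http://www.w3.org/ns/xproc-step";

    // Assumption: only "text/*" counts as a text media type here.
    static boolean isText(String contentType) {
        return contentType.trim().toLowerCase().startsWith("text/");
    }

    // Escapes the characters that are unsafe in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Wraps non-XML content in c:data: text is inlined, anything else is
    // base64 encoded and flagged with encoding="base64".
    static String wrap(byte[] bytes, String contentType) {
        String open = "<c:data xmlns:c=\"" + C_NS + "\" content-type=\""
                + escape(contentType) + "\"";
        if (isText(contentType)) {
            // Assumption: UTF-8; the charset parameter is ignored in this sketch.
            return open + ">" + escape(new String(bytes, StandardCharsets.UTF_8)) + "</c:data>";
        }
        return open + " encoding=\"base64\">"
                + Base64.getEncoder().encodeToString(bytes) + "</c:data>";
    }

    // Reads one entry; a missing entry is reported as an error, not as empty output.
    static byte[] readEntry(ZipFile zip, String name) throws IOException {
        ZipEntry entry = zip.getEntry(name); // null when the entry does not exist
        if (entry == null) {
            throw new NoSuchElementException("no such file in archive: " + name);
        }
        try (InputStream in = zip.getInputStream(entry)) {
            return in.readAllBytes();
        }
    }
}
```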

Regards,
Vojtech

--
Vojtech Toman
Consultant Software Engineer
EMC | Information Intelligence Group
vojtech.toman@emc.com
http://developer.emc.com/xmltech
Received on Thursday, 19 April 2012 10:17:49 GMT
