- From: Alex Miłowski <alex@milowski.com>
- Date: Tue, 3 Jun 2014 14:13:54 -0700
- To: XProc WG <public-xml-processing-model-wg@w3.org>
Here's a minimalist version that can address the needs of EPUB.

1. We have a zip step that just zips files and directories with control
   over compression.

   <p:declare-step type="p:zip">
     <p:input port="source" primary="true"/>
     <p:output port="result" primary="true"/>
     <p:option name="target"/>
     <p:option name="brief" select="'true'"/>
   </p:declare-step>

   The input is a c:archive element and the output is a c:archive.
   If the 'brief' option is true, only the c:archive element is output.
   Otherwise, the full list of every entry is provided on the output.
   If the 'target' option is not specified, the c:archive element must
   have an 'href' attribute.

2. We have an unzip step that:

   * can list a manifest of what is in the zip file
   * can extract the zip locally (e.g. on disk) with the location
     specified via an option.

   <p:declare-step type="p:unzip">
     <p:output port="result" primary="true"/>
     <p:option name="href" required="true"/>
     <p:option name="target"/>
     <p:option name="brief" select="'true'"/>
     <p:option name="manifest-only" select="'true'"/>
   </p:declare-step>

   The archive is specified via the 'href' option. The result is
   extracted to the target location specified by the 'target' option.
   If that option is not specified, the target is generated from the
   source. The output of the step is a c:directory element. If 'brief'
   is true, only the directory is listed. Otherwise, every file and
   subdirectory is listed in the output.

   Alternatively, if 'manifest-only' is true, the output is a c:archive
   element listing all the entries in the zip file. The 'target' and
   'brief' options are ignored when 'manifest-only' is true.

3. The manifest uses c:entry elements instead of files:

   element c:archive {
     attribute href { text }?,
     attribute base { text }?,
     c:entry*
   }

   element c:entry {
     attribute path { text },
     attribute modified { text },
     attribute size { text },
     attribute comment { text }?,
     attribute compressed { "true" | "false" }?,
     attribute directory { "true" | "false" }?
   }

4. A new step p:zip-extract extracts a single entry from a zip file as
   the output of the step:

   <p:declare-step type="p:zip-extract">
     <p:input port="source" primary="true"/>
     <p:output port="result" primary="true"/>
     <p:option name="href" required="true"/>
   </p:declare-step>

   The input is expected to be a single c:entry element.

   We could consider allowing a c:archive element to extract multiple
   files. We would need to provide a way to designate whether the
   results are outputs or written to local storage. We could consider
   allowing a 'target' option so that the entries are extracted to
   local storage.

5. In the future, a p:zip-modify step can handle updating or deleting
   entries as well as merging zip files.

6. In the future, we could consider allowing directory entries to have
   inclusion/exclusion patterns for handling file inclusion. This would
   allow one to zip only files of certain extensions within a directory.

Use cases:

1. Creating an EPUB file:

   <p:zip brief="false">
     <p:input port="source">
       <p:inline>
         <c:archive href="book.epub" base="book">
           <c:entry path="mimetype" compressed="false"/>
           <c:entry path="META-INF" directory="true"/>
           <c:entry path="content" directory="true"/>
         </c:archive>
       </p:inline>
     </p:input>
   </p:zip>

   produces (for example):

   <c:archive href="book.epub">
     <c:entry path="mimetype" compressed="false"/>
     <c:entry path="META-INF/" directory="true"/>
     <c:entry path="META-INF/container.xml" compressed="true"/>
     <c:entry path="content/" directory="true"/>
     <c:entry path="content/book.opf" compressed="true"/>
     <c:entry path="content/book.ncx" compressed="true"/>
     <c:entry path="content/book.xhtml" compressed="true"/>
   </c:archive>

2. Unpack an EPUB:

   <p:unzip href="book.epub" target="book" brief="false"/>

   produces (for example):

   <c:directory href="book/">
     <c:file href="mimetype"/>
     <c:directory href="book/META-INF/">
       <c:file href="book/META-INF/container.xml"/>
     </c:directory>
     <c:directory href="book/content/">
       <c:file href="book/content/book.opf"/>
       <c:file href="book/content/book.ncx"/>
       <c:file href="book/content/book.xhtml"/>
     </c:directory>
   </c:directory>

3. Getting content from an EPUB file:

   <p:zip-extract href="book.epub">
     <p:input port="source">
       <p:inline>
         <c:entry path="content/book.xhtml"/>
       </p:inline>
     </p:input>
   </p:zip-extract>

   produces (for example):

   <html xmlns="http://www.w3.org/1999/xhtml"> ... </html>

--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
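For anyone who wants to experiment with the semantics before an implementation exists, here is a rough Python sketch of what the EPUB use cases above might do underneath. It is not an XProc implementation and is not part of the proposal: the function names zip_tree and zip_extract, the 'stored' parameter, and the dictionary shape standing in for c:entry are all hypothetical. It assumes that entries marked compressed="false" are written with the ZIP "stored" method and placed first in the archive, which is what EPUB expects of the mimetype entry.

```python
import os
import zipfile


def zip_tree(base, target, stored=("mimetype",)):
    """Zip every file under `base` into `target`.

    Returns a list of dicts loosely mirroring c:entry (hypothetical shape).
    Paths listed in `stored` are written uncompressed and placed first.
    """
    paths = []
    for root, _dirs, files in os.walk(base):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, base).replace(os.sep, "/")
            paths.append((rel, full))
    # Uncompressed entries first, then the rest in path order.
    paths.sort(key=lambda p: (p[0] not in stored, p[0]))

    entries = []
    with zipfile.ZipFile(target, "w") as zf:
        for rel, full in paths:
            is_stored = rel in stored
            method = zipfile.ZIP_STORED if is_stored else zipfile.ZIP_DEFLATED
            zf.write(full, rel, compress_type=method)
            entries.append({"path": rel, "compressed": not is_stored})
    return entries


def zip_extract(href, path):
    """Return the bytes of a single entry, loosely like p:zip-extract."""
    with zipfile.ZipFile(href) as zf:
        return zf.read(path)
```

Calling zip_tree("book", "book.epub") corresponds roughly to use case 1 with brief="false", and zip_extract("book.epub", "content/book.xhtml") to use case 3. Directory entries are not emitted in this sketch.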
Received on Tuesday, 3 June 2014 21:14:22 UTC