Re: Unzipping large ZIP files

From: Kraetke, Martin, le-tex <martin.kraetke@le-tex.de>
Date: Thu, 26 Feb 2015 17:36:36 +0100
Message-ID: <54EF4B94.1070101@le-tex.de>
To: Jostein Austvik Jacobsen <josteinaj@gmail.com>
CC: XProc Dev <xproc-dev@w3.org>

Hi Jostein,

We developed a simple unzip step that extracts a ZIP file into an output
directory. You can either check out our entire Calabash repository via
SVN at the URL below or just adopt our Java class and XProc configuration.

https://subversion.le-tex.de/common/calabash/

The output of the step is an XML representation of the unzipped directory.
Here is an XProc example:

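<p:declare-step
     xmlns:p="http://www.w3.org/ns/xproc"
     xmlns:c="http://www.w3.org/ns/xproc-step"
     xmlns:letex="http://www.le-tex.de/namespace"
     version="1.0">

     <!-- No source document is needed; the step works on the file system. -->
     <p:input port="source">
         <p:empty/>
     </p:input>
     <!-- Receives the XML listing produced by letex:unzip. -->
     <p:output port="result"/>

     <!-- zip: the archive to extract; path: the directory to extract into. -->
     <p:option name="zip" required="true"/>
     <p:option name="path" required="true"/>

     <!-- Step library that is expected to declare letex:unzip. -->
     <p:import href="../ltx-lib.xpl"/>

     <letex:unzip name="unzip">
         <p:with-option name="zip" select="$zip"/>
         <p:with-option name="dest-dir" select="$path"/>
         <p:with-option name="overwrite" select="'yes'"/>
     </letex:unzip>

</p:declare-step>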
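
For anyone who would rather write their own step, the part that matters
for memory use can be sketched in plain Java as below. This is only an
illustration and not the le-tex class: the names StreamingUnzip and
extract are hypothetical, and the code relies solely on java.util.zip
and java.nio.file from the JDK. Each entry is copied straight from the
ZIP stream to a file, so no entry is held in memory as a whole.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper (not the le-tex class): streams a ZIP to disk entry by entry. */
final class StreamingUnzip {

    /** Extracts zipFile into destDir and returns the files written, in archive order. */
    static List<Path> extract(Path zipFile, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        List<Path> written = new ArrayList<>();

        try (InputStream in = Files.newInputStream(zipFile);
             ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Refuse entries such as "../x" that would land outside destDir.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    // Files.copy reads the current entry through a small buffer,
                    // so memory use does not grow with the size of the entry.
                    // REPLACE_EXISTING overwrites files that are already present.
                    Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                    written.add(target);
                }
            }
        }
        return written;
    }
}
```

Calling StreamingUnzip.extract(Paths.get("book.epub"), Paths.get("out"))
would unpack the archive into out/ (both paths are placeholders). An
XProc extension step would wrap a method like this and emit an XML
listing built from the returned paths.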

Kind regards,

Martin

-- 
Martin Kraetke
Lead Content Engineer
le-tex publishing services GmbH

Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 143, Fax +49 341 355356 543
martin.kraetke@le-tex.de, http://www.le-tex.de

Commercial Register: Amtsgericht Leipzig
Registration Number: HRB 24930

Managing Directors: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler

On 26.02.2015 at 16:36, Jostein Austvik Jacobsen wrote:
> I want to unzip an EPUB which contains a lot of big image files.
>
> The EXProc extension step pxp:unzip 
> <http://exproc.org/proposed/steps/other.html#unzip> seems to require 
> that I unzip the image files as base64-encoded c:data elements which I 
> then have to p:store. This loads all the images into memory.
>
> I want to basically unzip all the contents of a ZIP file to a 
> directory without loading all its contents into memory. Is there a way 
> to do this that I'm missing or do I have to implement a custom 
> extension step for this?
>
> If unzip gets standardized into XProc 2.0, could a "target" or 
> "output-dir" option maybe be added that allows unzipping directly to a 
> directory instead of a c:data element?
>
> Regards
> Jostein
Received on Thursday, 26 February 2015 16:37:08 UTC