
RE: Unzipping .bz2 ?

From: <vojtech.toman@emc.com>
Date: Wed, 9 Feb 2011 08:41:51 -0500
To: <xproc-dev@w3.org>
Message-ID: <3799D0FD120AD940B731A37E36DAF3FE32B2DF3D4B@MX20A.corp.emc.com>
I think this should work:
<p:with-option name="args" select="concat('-d -k ', $filename)"/>

bunzip2 cannot process the output of p:data because p:data base64-encodes the byte stream and wraps it in an XML wrapper element (c:data by default). From the bunzip2 perspective, that completely destroys the original byte stream.

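For reference, a minimal, untested sketch of how that line could fit into a complete pipeline. It assumes an option named filename declared with p:option (the name is only illustrative) and reuses /bin/bzip2 and the other p:exec settings from the pipeline quoted below. The point is that in XProc 1.0 the args="..." attribute shortcut is a literal string, so "$filename" or "{$filename}" is passed to bzip2 unchanged, whereas p:with-option evaluates its select attribute as XPath.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">

  <!-- Assumed option name; its value is supplied when the pipeline is run -->
  <p:option name="filename" required="true"/>

  <!-- p:exec returns the command's standard output wrapped in c:result -->
  <p:output port="result"/>

  <p:exec command="/bin/bzip2" result-is-xml="false" wrap-result-lines="true">
    <!-- Nothing is sent to stdin; bzip2 reads the file named in args -->
    <p:input port="source">
      <p:empty/>
    </p:input>
    <!-- args is computed with XPath, so $filename is expanded here -->
    <p:with-option name="args" select="concat('-d -k ', $filename)"/>
  </p:exec>

</p:declare-step>
```

With Calabash the value can typically be passed as filename=test.xml.bz2 after the pipeline name on the command line. Note that p:exec splits args on a single space by default, so a filename containing spaces would probably need the arg-separator option as well.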

Regards,
Vojtech


--
Vojtech Toman
Consultant Software Engineer
EMC | Information Intelligence Group
vojtech.toman@emc.com
http://developer.emc.com/xmltech



> -----Original Message-----
> From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On
> Behalf Of Stefanie Haupt
> Sent: Wednesday, February 09, 2011 2:24 PM
> To: xproc-dev@w3.org
> Subject: Re: Unzipping .bz2 ?
> 
> Hi list,
> 
> I've found the error in my attempt and thought I'd share:
> 
> This works (there's still one thing that bothers me):
> <?xml version="1.0" encoding="UTF-8"?>
> <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
> 
>   <p:exec command="/bin/bzip2" result-is-xml="false"
>     args="-d -k filename.xml.bz2" wrap-result-lines="true">
> 
>     <p:input port="source">
>       <p:empty/>
>     </p:input>
> 
>   </p:exec>
> 
>   <p:identity/>
> </p:pipeline>
> 
> You might notice that the filename is hardcoded inside the arguments.
> I have not been able to use bzip2 without doing this. I tried
> <p:data href="filename.xml.bz2"/> instead of p:empty, without luck.
> Reading the filename from the command line using p:option and replacing
> the args string with "-d -k $filename" or "-d -k {$filename}" did not
> work either. If you know how to solve this, I'd be happy to hear!
> Perhaps it's just a silly mistake I've made, but I don't see the
> answer, so any help is appreciated, thank you! Even if you simply tell
> me: You can't! Many thanks in advance and
> best regards
> Stefanie
> 
> 
> 
> On Wed, Feb 2, 2011 at 11:43 AM, Stefanie Haupt <st.haupt@gmail.com>
> wrote:
> > Hi Jostein,
> >
> > I did that, sorry, I should have mentioned it - it does not change the
> > error. I have the impression that the engine somehow chokes on
> > bzip2/bunzip2 (tried both variants) - I've never seen a *module with
> > no systemId* error message before and can't find anything helpful by
> > googling. And the error message would be different if the engine were
> > not able to access bzip2/bunzip2 at all.
> >
> > Kind Regards
> > Stefanie
> >
> > On Wed, Feb 2, 2011 at 11:31 AM, Jostein Austvik Jacobsen
> > <josteinaj@gmail.com> wrote:
> >> Are you sure that the result of the p:exec is valid XML? You could
> >> try result-is-xml="false" and see if that produces valid output...
> >> Regards
> >> Jostein
> >>
> >> 2011/2/2 Stefanie Haupt <st.haupt@gmail.com>
> >>>
> >>> Hello list,
> >>>
> >>> I'm trying to unzip a .bz2 file using XProc (using calabash
> >>> 0.9.32). Since .bz2 files are not handled by cx:unzip (the archive
> >>> is read as empty) I thought I'd write a p:exec step. But that fails
> >>> with a fatal error. Can you tell me what's wrong? I guess the most
> >>> interesting line of the error message would be this:
> >>> *module with no systemId*:1:java.io.IOException: Broken pipe
> >>> However, I've included the complete pipeline and error message below.
> >>>
> >>> Many thanks in advance and kind regards,
> >>> Stefanie
> >>>
> >>> This is the pipeline:
> >>> <?xml version="1.0" encoding="UTF-8"?>
> >>> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> >>> xmlns:c="http://www.w3.org/ns/xproc-step"
> >>>  xmlns:cx="http://xmlcalabash.com/ns/extensions" version="1.0">
> >>>
> >>>  <p:input port="source">
> >>>    <p:data href="test.xml.bz2"/>
> >>>  </p:input>
> >>>
> >>>  <p:exec command="/bin/bunzip2" source-is-xml="false"
> >>>    result-is-xml="true" wrap-result-lines="false" name="unzip">
> >>>    <p:with-option name="args"
> >>>      select="'--keep'" />
> >>>  </p:exec>
> >>>
> >>>  <p:store href="test-unzipped.xml"/>
> >>>
> >>> </p:declare-step>
> >>>
> >>>
> >>> Error-message:
> >>> calabash --debug unzip.xpl
> >>> 02.02.2011 10:08:40 com.xmlcalabash.util.DefaultXProcMessageListener info
> >>> INFO: Running pipeline !1
> >>> 02.02.2011 10:08:40 com.xmlcalabash.util.DefaultXProcMessageListener info
> >>> INFO: Running exec unzip
> >>> 02.02.2011 10:08:40 com.xmlcalabash.util.DefaultXProcMessageListener info
> >>> INFO: unzip.xpl:10:44:Exec: /bin/bunzip2 --keep
> >>> 02.02.2011 10:08:40 com.xmlcalabash.util.DefaultXProcMessageListener error
> >>> SCHWERWIEGEND: *module with no systemId*:1:java.io.IOException: Broken pipe
> >>> 02.02.2011 10:08:40 com.xmlcalabash.util.DefaultXProcMessageListener error
> >>> SCHWERWIEGEND: java.io.IOException: Broken pipe
> >>> 02.02.2011 10:08:40 com.xmlcalabash.drivers.Main error
> >>> SCHWERWIEGEND: Pipeline failed: net.sf.saxon.s9api.SaxonApiException: java.io.IOException: Broken pipe
> >>> 02.02.2011 10:08:40 com.xmlcalabash.drivers.Main error
> >>> SCHWERWIEGEND: Underlying exception: net.sf.saxon.trans.XPathException: java.io.IOException: Broken pipe
> >>> net.sf.saxon.s9api.SaxonApiException: java.io.IOException: Broken pipe
> >>>        at net.sf.saxon.s9api.XQueryEvaluator.run(XQueryEvaluator.java:303)
> >>>        at com.xmlcalabash.library.Exec.run(Unknown Source)
> >>>        at com.xmlcalabash.runtime.XAtomicStep.run(Unknown Source)
> >>>        at com.xmlcalabash.runtime.XPipeline.doRun(Unknown Source)
> >>>        at com.xmlcalabash.runtime.XPipeline.run(Unknown Source)
> >>>        at com.xmlcalabash.drivers.Main.run(Unknown Source)
> >>>        at com.xmlcalabash.drivers.Main.main(Unknown Source)
> >>> Caused by: net.sf.saxon.trans.XPathException: java.io.IOException: Broken pipe
> >>>        at net.sf.saxon.serialize.TEXTEmitter.characters(TEXTEmitter.java:101)
> >>>        at net.sf.saxon.event.ProxyReceiver.characters(ProxyReceiver.java:186)
> >>>        at net.sf.saxon.event.ComplexContentOutputter.characters(ComplexContentOutputter.java:165)
> >>>        at net.sf.saxon.tree.tiny.TinyTextImpl.copy(TinyTextImpl.java:76)
> >>>        at net.sf.saxon.event.ComplexContentOutputter.append(ComplexContentOutputter.java:521)
> >>>        at net.sf.saxon.expr.Expression.process(Expression.java:503)
> >>>        at net.sf.saxon.query.XQueryExpression.run(XQueryExpression.java:390)
> >>>        at net.sf.saxon.s9api.XQueryEvaluator.run(XQueryEvaluator.java:299)
> >>>        ... 6 more
> >>> Caused by: java.io.IOException: Broken pipe
> >>>        at java.io.FileOutputStream.writeBytes(Native Method)
> >>>        at java.io.FileOutputStream.write(FileOutputStream.java:297)
> >>>        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> >>>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
> >>>        at net.sf.saxon.serialize.UTF8Writer.write(UTF8Writer.java:286)
> >>>        at net.sf.saxon.serialize.UTF8Writer.write(UTF8Writer.java:253)
> >>>        at net.sf.saxon.serialize.TEXTEmitter.characters(TEXTEmitter.java:99)
> >>>        ... 13 more
> >>>
> >>>
> >>> --
> >>> Stefanie Haupt, M.A.
> >>>
> >>
> >>
> >
> >
> >
> > --
> > Stefanie Haupt, M.A.
> >
> 
> 
> 
> --
> Stefanie Haupt, M.A.
> 

Received on Wednesday, 9 February 2011 13:43:51 GMT
