
Re: Newbie question

From: Paul Tyson <phtyson@sbcglobal.net>
Date: Mon, 10 Aug 2015 21:31:53 -0500
To: Dan Vint <dvint@dvint.com>
Cc: xproc-dev@w3.org
Message-ID: <1439260313.2358.28.camel@aquinas.attlocal.net>
Take courage, Dan!

I'm several months past a few complex xproc jobs, during which I learned
quite a bit, starting from very little. I won't venture to offer many
specifics because I don't recall all the details, and haven't worked
with xproc since then. But a few general comments below.

On Mon, 2015-08-10 at 16:32 -0700, Dan Vint wrote:
> Ok, moving the serialization information to the 
> store makes better sense to me. I was also able 
> to work around  my issue with generating text 
> instead of XML. I set up the stylesheet to wrap 
> the text output in a dummy element. Then setting 
> the store output to be text then strips out the 
> new element, so I get the text the way I want it.
> 
> In working through these issues I think I may not 
> be using the correct tools. Is XPROC just meant 
> for passing XML from one task to another and not 
> a more general scripting tool that understands typical XML processes?
> 

XProc 1.0 is specifically focused on XML pipelines, and the spec only
expects XML to go between steps. (XProc 2 may relax this restriction.)
But as you've already discovered, all it takes to turn text into XML is
a pair of element start and end tags.
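
A minimal sketch of that wrapping trick, in case it helps later readers.
The element name "wrapper", the inlined stylesheet, and the file name
out.txt are placeholders for illustration, not taken from Dan's setup.
The stylesheet emits its text inside a single element so the step has a
well-formed result, and p:store with method="text" serializes only the
text content:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0" name="text-out">
  <p:input port="source"/>

  <!-- The stylesheet is inlined only to keep the sketch self-contained. -->
  <p:xslt name="make-text">
    <p:input port="stylesheet">
      <p:inline>
        <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
          <xsl:template match="/">
            <!-- Hypothetical wrapper element: gives the step an XML result. -->
            <wrapper>
              <xsl:text>echo generated from </xsl:text>
              <xsl:value-of select="name(/*)"/>
            </wrapper>
          </xsl:template>
        </xsl:stylesheet>
      </p:inline>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- method="text" writes only the text nodes, so <wrapper> never reaches the file. -->
  <p:store href="out.txt" method="text"/>
</p:declare-step>
```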

> Here is what I'm trying to do overall:
> 
> 1) Take a DITA Map file 
> and  process  the  structure and links to make a new flattened single file map.

No problem, that's in xproc's wheelhouse, as you've probably already
learned.

> 2) I want to store that flattened map to an XML 
> file that I will keep around, but I also want to 
> pass it to another stylesheet that creates a 
> batch file based upon the information in the XML. 
> I don't want to process this result anymore.

If you have the freedom to reimplement the batch file, you might be able
to recast it into xproc. 

When you talk about "keeping the flattened xml file around", you can of
course write it to file if needed, but you can also just reuse it in the
pipeline by piping the output to more than one subsequent step as
needed.

> 3) I want to execute the batch file.
> 4) I also want to process all the XML files in a 
> given directory with the following:
>    4a) Run an ant script against all of the files.
>     4b) Run the resulting files through an 
> additional stylesheet to process the XML file further

I don't have any experience calling ant scripts from xproc, but p:exec
should work. Or, if xproc can do what the ant scripts are doing, you
could reimplement the ant script. Are these scripts from the DITA-OT by
any chance?
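
For what it's worth, a hedged sketch of what a p:exec call to ant might
look like. Everything specific here is an assumption: Windows, ant on
the PATH, and a made-up build file (C:/work/build.xml) and target
(clean-topics). The p:exec options themselves (command, args,
result-is-xml, wrap-result-lines) are from the XProc 1.0 step library.
Note that result-is-xml defaults to true, so a command that prints plain
text needs it set to false:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0" name="run-ant">
  <p:output port="result"/>

  <!-- Assumptions: Windows, ant on the PATH, hypothetical build file and target.
       Going through cmd /c lets the shell resolve ant.bat. -->
  <p:exec name="ant" command="cmd"
          args="/c ant -f C:/work/build.xml clean-topics"
          result-is-xml="false"
          wrap-result-lines="true">
    <!-- Nothing is sent to the command's standard input. -->
    <p:input port="source">
      <p:empty/>
    </p:input>
  </p:exec>
</p:declare-step>
```

The args string is split on spaces by default, so a path containing
spaces would need the arg-separator option set to some other character.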

> 5) Zip all the resulting files together.

You want an extension step, pxp:zip
(http://xmlcalabash.com/docs/reference/pxp-zip.html).
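
A rough sketch of how pxp:zip is typically wired up in XML Calabash,
written from memory, so please check the option names against the
reference page above before relying on it. The archive location and the
two manifest entries are hypothetical:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:pxp="http://exproc.org/proposed/steps"
                version="1.0" name="zip-kit">
  <p:output port="result"/>

  <!-- Loads the declarations of XML Calabash's extension steps, including pxp:zip. -->
  <p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>

  <pxp:zip href="file:/C:/work/out/kit.zip" command="create">
    <!-- No in-pipeline documents; everything comes from files named in the manifest. -->
    <p:input port="source">
      <p:empty/>
    </p:input>
    <p:input port="manifest">
      <p:inline>
        <c:zip-manifest>
          <!-- name = path inside the archive, href = file to read (both hypothetical). -->
          <c:entry name="maps/flatmap.xml" href="file:/C:/work/out/flatmap.xml"/>
          <c:entry name="topics/intro.dita" href="file:/C:/work/out/topics/intro.dita"/>
        </c:zip-manifest>
      </p:inline>
    </p:input>
  </pxp:zip>
</p:declare-step>
```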

> 
> Below is the XPROC file I have. It works properly 
> now for step 1 and 2. When I add the processing 
> for step 3 I start getting new errors. I've also 
> noticed that it appears that XPROC wants the file 
> to exist before it executes, so I can't pass the 
> file that I store to the second stylesheet. Looks 
> like I have to somehow let the pipeline of the 
> XML generation just work and fork an 
> additional store/file create operation.

Looks like you have the right ideas, but you're not exploiting the
pipeline capability. Is step "store-flat" necessary, or can you just use
the output of "flatten-map" as input to "build-batfile"?

In general, it appears you are using shell script idioms instead of
pipeline idioms. In particular, it's never necessary to write a file
from one pipeline step and read it (using document()) from another step.
(Of course, if you need the file for other reasons, that's ok, but even
so, within the pipeline you can just connect the ports rather than read
the file in the consuming step). 
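
To make that concrete, here is an untested sketch of steps 1 and 2 using
your step names. The stylesheet locations are copied from your pipeline;
the output file names are shortened to relative placeholders. The key
point is that both "store-flat" and "build-batfile" name the "result"
port of "flatten-map" explicitly. My guess is that this is also why the
commented-out source binding gave you trouble: p:store has no primary
output, so the step after it has nothing to read by default.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0"
                name="Preprocess-translations">
  <p:input port="source"/>

  <p:xslt name="flatten-map">
    <p:input port="stylesheet">
      <p:document href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
    </p:input>
    <p:input port="parameters"><p:empty/></p:input>
  </p:xslt>

  <!-- Branch 1: keep the flattened map on disk (file name shortened for the sketch). -->
  <p:store name="store-flat" href="_flatmap2.xml" method="xml" indent="true">
    <p:input port="source">
      <p:pipe step="flatten-map" port="result"/>
    </p:input>
  </p:store>

  <!-- Branch 2: read the same document from the port, not from the stored file. -->
  <p:xslt name="build-batfile">
    <p:input port="source">
      <p:pipe step="flatten-map" port="result"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="file:/C:/work/svn/scripts/translations/fileorg.xsl"/>
    </p:input>
    <p:input port="parameters"><p:empty/></p:input>
  </p:xslt>

  <!-- Reads build-batfile's result by default, since it is the preceding step. -->
  <p:store name="store-batchfile" href="_copy-files.bat" method="text"/>
</p:declare-step>
```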

Depending on what your batch files are doing, xproc might be a natural
fit, or you might be better off just orchestrating a few xslt
conversions from a batch file. It took me a while to really "think in
pipelines", but it can lead to very elegant solutions for complex xml
processing.

Regards,
--Paul
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>      name="Preprocess-translations"  version="1.0"
>      >
> 
>      <p:documentation>
>          This XPROC script runs all the steps 
> required to take an Ixiasoft Localization kit
>          and modify the source files and organize 
> them into a directory structure and zip
>          that content to be sent on for localization.
>      </p:documentation>
> 
> 
> 
>      <p:input port="source">
>          <p:document 
> href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_mobile-dan1370798002064.ditamap"/>
>      </p:input>
>      <p:output port="result" sequence="true">
>              <!-- Write the result to "store-flat" -->
>          <p:pipe port="result" step="store-flat"/>
>          <p:pipe port="result" step="store-batchfile"/>
>      </p:output>
> 
> 
> 
>          <!-- Process the DITAMAP file source to 
> travel all the included maps and topics
>              with stylesheet to flatten the 
> structure and gather information for later processing. -->
>      <p:xslt name="flatten-map">
>          <p:input port="source"/>
>          <p:input port="stylesheet">
>              <p:document 
> href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
>          </p:input>
>          <p:input port="parameters" sequence="true">
>              <p:empty/>
>          </p:input>
> 
>      </p:xslt>
> 
>      <p:store 
> href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml" 
> 
>          name="store-flat"
>                encoding="utf-8"
>          method="xml"
>          indent="true"
>          omit-xml-declaration="false"/>
> 
>          <!-- Process the flattened map file to 
> create a batch file that copies topics/maps/images into a
>               new file structure for zipping to send to translators
>             -->
> 
>              <!-- XPROC seems to check for the 
> existence of the input file before the previous step creates it -->
>      <p:xslt name="build-batfile">
>          <p:input port="source">
> <!--            <p:document 
> href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml"/>
> -->        </p:input>
> 
> 
>          <p:input port="stylesheet">
>              <p:document 
> href="file:/C:/work/svn/scripts/translations/fileorg.xsl"/>
>          </p:input>
>          <p:input port="parameters" sequence="true">
>              <p:empty/>
>          </p:input>
> 
>      </p:xslt>
> 
>      <p:store 
> href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_copy-files.bat" 
> 
>          name="store-batchfile"
>          method="text"/>
> 
>           <!-- Run Ant script to remove the 
> oXy_delete processing instructions -->
> 
>           <!-- Process all the topics and maps in 
> the source folder and write to a new folder -->
> 
>           <!-- Run the copy.bat file against processed files -->
>          <p:exec name="execute-batchfile" 
> command="C:\Users\dan.vint\Desktop\DITAsource\fr-fr-loc-v1-source-maptesting\_copy-files.bat">
>          <p:input port="source">
>              <p:empty/>
>          </p:input>
>      </p:exec>
> 
>      <p:sink/>
> 
> 
> 
>           <!-- Zip the files for translation -->
> 
> 
> 
> </p:declare-step>
> 
> I'm thinking I need to switch to some other 
> scripting language and then possibly call a few 
> steps in XPROC. I started down this path because 
> the people using what I build will have oXygen 
> already installed. I was trying to avoid having 
> to configure a lot of other tools to make this 
> process work. So this is why I'm trying to stay in XPROC.
> 
> ..dan
> 
> 
> 
> At 10:44 AM 8/9/2015, Imsieke, Gerrit, le-tex wrote:
> >Serialization options in p:serialization refer to a port with the given
> >name. In your pipeline, there is a commented-out step that your
> >serialization settings should probably pertain to. You can add the same
> >serialization options (indent, etc.) to p:store as to p:serialization.
> >Instead of p:storing XML, you could output it on a named port with
> >individual p:serialization settings and store it to the respective files
> >in your invocation, e.g., -o flat=/path/to/flatmap2.xml
> >
> >Gerrit
> >
> >On 09.08.2015 19:35, Dan Vint wrote:
> > > thanks.
> > >
> > > The error I get in oXygen is:
> > >
> > > err:XS0039 : A p:serialization specifies a non-existant port. It is a
> > > static error if the port specified on the p:serialization is not the
> > > name of an output port on the pipeline in which it appears or if more
> > > than one p:serialization element is applied to the same port.
> > >
> > > Here is the code:
> > >
> > > <?xml version="1.0" encoding="UTF-8"?>
> > > <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> > >     name="Preprocess-translations"  version="1.0"
> > >     >
> > >
> > >     <p:documentation>
> > >         This XPROC script runs all the steps required to take an
> > > Ixiasoft Localization kit
> > >         and modify the source files and organize them into a directory
> > > structure and zip
> > >         that content to be sent on for localization.
> > >     </p:documentation>
> > >
> > >     <p:serialization
> > >         port="store-flat"
> > >         encoding="utf-8"
> > >         method="xml"
> > >         indent="true"
> > >         omit-xml-declaration="false"/>
> > >
> > >     <p:input port="source">
> > >         <p:document
> > > 
> > href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_mobile-dan1370798002064.ditamap"/>
> > >
> > >     </p:input>
> > >     <p:output port="result ">
> > >             <!-- Write the result to "store-flat" -->
> > > <!--        <p:pipe port="result" step="store-flat"></p:pipe>
> > > -->    </p:output>
> > >
> > >
> > >
> > >         <!-- Process the DITAMAP file source to travel all the included
> > > maps and topics
> > >             with stylesheet to flatten the structure and gather
> > > information for later processing. -->
> > >     <p:xslt name="flatten-map">
> > >         <p:input port="source"/>
> > >         <p:input port="stylesheet">
> > >             <p:document
> > > href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
> > >         </p:input>
> > >         <p:input port="parameters" sequence="true">
> > >             <p:empty/>
> > >         </p:input>
> > >
> > >     </p:xslt>
> > >
> > >  <!--   <p:store
> > > 
> > href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml"
> > > name="store-flat" /> -->
> > >
> > >
> > > </p:declare-step>
> > >
> > > Ultimately I want to write the output to a specific file. I took out the
> > > store information as I thought it might be contributing to the problem
> > >
> > >
> > >
> > >
> > > At 04:46 PM 8/7/2015, Imsieke, Gerrit, le-tex wrote:
> > >> I think it would help us help you if we saw the actual pipeline.
> > >>
> > >> On 08.08.2015 01:42, Dan Vint wrote:
> > >> > Thanks for the pointer. So I tried to make use of this based upon some
> > >> > examples I found and I'm getting a message about the port not being
> > >> > bound. I'm using the result port, but nothing seems to be making this
> > >> > work. I need to do more digging, thanks.
> > >> >
> > >> > .dan
> > >
> > > ---------------------------------------------------------------------------
> > > Danny Vint
> > >
> > > Panoramic Photography
> > > http://www.dvint.com
> > >
> > > voice: 619-647-5780
> > >
> > >
> >
> >--
> >Gerrit Imsieke
> >Geschäftsführer / Managing Director
> >le-tex publishing services GmbH
> >Weissenfelser Str. 84, 04229 Leipzig, Germany
> >Phone +49 341 355356 110, Fax +49 341 355356 510
> >gerrit.imsieke@le-tex.de, http://www.le-tex.de
> >
> >Registergericht / Commercial Register: Amtsgericht Leipzig
> >Registernummer / Registration Number: HRB 24930
> >
> >Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
> >Thomas Schmidt, Dr. Reinhard Vöckler
> 
> ---------------------------------------------------------------------------
> Danny Vint
> 
> Panoramic Photography
> http://www.dvint.com
> 
> voice: 619-647-5780
>      
> 
> 
Received on Tuesday, 11 August 2015 02:32:27 UTC
