
Re: Can one within a for-each loop wrap, output, sink a set of files and continue processing with remaining files?

From: Romain Deltour <rdeltour@gmail.com>
Date: Thu, 3 Jun 2010 20:43:44 +0200
Message-Id: <ED422C7B-84E2-4B50-A1F6-8E321E44CAA8@gmail.com>
To: Alex Muir <alex.g.muir@gmail.com>, xproc-dev@w3.org
Hi Alex,

If I understand your intent and your pipeline correctly, you should
instead use the @group-adjacent attribute of the p:wrap-sequence
step to pack 200 files at a time.

Explanation:
In your pipeline, almost everything happens in one big p:for-each that
iterates over the 1000 files. The p:choose subpipeline is executed
only for every 200th file, and the wrapper's input is a sequence
containing just that one file (the one whose position is a multiple
of 200).
So rather than grouping files into sets of 200, you ignore 199
files and wrap only the 200th in an element before processing it.

What I would do is:

p:for-each => iterate through the 1000 files and load the documents
(note that the result of this first p:for-each is a sequence of 1000
documents)
p:wrap-sequence[@group-adjacent] => split the sequence of 1000 into
sets of 200
p:for-each => iterate again over the 5 packs of 200 files, to
process one pack at a time
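
The three steps above could be sketched roughly as follows. This is an
untested outline, not a drop-in replacement: the folder names and
CreateHTML.xsl come from your pipeline, while the step names, the
output file naming and the grouping expression are my own assumptions.
In particular, it assumes that position() inside @group-adjacent gives
the position of each document in the input sequence, so that
floor((position() - 1) div 200) has the same value for documents 1-200,
the next value for 201-400, and so on.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="groupBy200" version="1.0">

    <p:directory-list path="completed/XML/"/>

    <!-- 1. Load every file; the result is a sequence of documents -->
    <p:for-each name="loadAll">
        <p:iteration-source select="//c:file"/>
        <p:load>
            <p:with-option name="href"
                select="concat('completed/XML/', /c:file/@name)"/>
        </p:load>
    </p:for-each>

    <!-- 2. Wrap each run of 200 adjacent documents in one WRAP element
            (assumes position() is the document's position in the sequence) -->
    <p:wrap-sequence wrapper="WRAP"
        group-adjacent="floor((position() - 1) div 200)"/>

    <!-- 3. Process one pack of 200 per iteration -->
    <p:for-each name="processPacks">
        <!-- hypothetical naming scheme: one HTML file per pack number -->
        <p:variable name="pack" select="p:iteration-position()"/>
        <p:xslt>
            <p:input port="stylesheet">
                <p:document href="CreateHTML.xsl"/>
            </p:input>
            <p:input port="parameters">
                <p:empty/>
            </p:input>
        </p:xslt>
        <p:store>
            <p:with-option name="href"
                select="concat('MDNA/MDNASections-', $pack, '.html')"/>
        </p:store>
    </p:for-each>

</p:declare-step>
```

Since p:store is the last step of the second p:for-each, that loop has
no primary output, so no explicit p:sink should be needed. Your
ExtractContent transformation could go right after the p:load in the
first loop.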

I hope this helps and I'm not missing your point...

BR
Romain.

On 3 June 2010, at 18:32, Alex Muir wrote:

> Hi,
>
> I'm trying to read ~10000 files within a for-each loop, wrap a  
> selection from each set of 200 files and process them to output 1  
> html file, sink the processed files and continue with the remaining  
> files processing 200 at a time.
>
> Is that possible in xproc?
>
> I've got something like the following which I can't get to work. I  
> think that wrapper cannot be used within a for-each, is that the case?
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>     xmlns:c="http://www.w3.org/ns/xproc-step"
>     xmlns:cx="http://xmlcalabash.com/ns/extensions"
>     name="wrapWithinForEach" version="1.0">
>
>     <p:input port="source">
>         <p:inline>
>             <xml/>
>         </p:inline>
>     </p:input>
>
>     <p:output port="result" sequence="true"/>
>
>     <p:declare-step type="cx:message" version="1.0">
>         <p:input port="source"/>
>         <p:output port="result"/>
>         <p:option name="message" required="true"/>
>     </p:declare-step>
>
>
>     <!-- ***** Starting and Ending File Numbers ***** -->
>     <p:variable name="startingFileNumber" select="'1'"/>
>     <p:variable name="endingFileNumber" select="'10000'"/>
>     <p:variable name="numberPerFile" select="'200'"/>
>
>     <!-- source and output folder variables -->
>     <p:variable name="source-folder" select="'completed/XML/'"/>
>     <p:variable name="output-folder" select="'MDNA/'"/>
>     <p:variable name="error-folder" select="'MDNA/error/'"/>
>     <p:variable name="exception-folder" select="'MDNA/exception/'"/>
>
>
>     <p:directory-list>
>         <p:with-option name="path" select="$source-folder">
>             <p:empty/>
>         </p:with-option>
>     </p:directory-list>
>
>
>     <p:for-each name="MDNA">
>
>
>         <p:iteration-source
>             select="//c:file[position() ge number($startingFileNumber)
>                     and position() le number($endingFileNumber)]"/>
>
>         <p:variable name="fileName" select="c:file/@name"/>
>         <p:variable name="startingIterationPosition"
>             select="number(p:iteration-position()) + number($startingFileNumber)-1"/>
>
>         <cx:message>
>             <p:with-option name="message"
>                 select="concat('-----------------------------', 'Iteration-position:','  ',
>                     $startingIterationPosition, '  File: ', $fileName,'-----------------------------')"
>             />
>         </cx:message>
>
>         <p:load>
>             <p:with-option name="href" select="concat($source-folder, $fileName)"/>
>         </p:load>
>
>         <cx:message>
>             <p:with-option name="message" select="'######    ExtractContent'"/>
>         </cx:message>
>         <p:xslt name="ExtractContent">
>             <p:input port="source"/>
>             <p:input port="stylesheet">
>                 <p:document href="ExtractContent.xsl"/>
>             </p:input>
>             <p:input port="parameters">
>                 <p:empty/>
>             </p:input>
>         </p:xslt>
>
>         <p:identity name="wrap"/>
>
>
>         <p:choose>
>             <p:when test="position() mod $numberPerFile eq 0">
>                 <p:wrap-sequence wrapper="WRAP" name="wrapper">
>                     <p:input port="source">
>                         <p:pipe port="result" step="wrap"/>
>                     </p:input>
>                 </p:wrap-sequence>
>
>
>                 <p:xslt name="CreateHTML">
>                     <p:input port="source"/>
>                     <p:input port="stylesheet">
>                         <p:document href="CreateHTML.xsl"/>
>                     </p:input>
>                     <p:input port="parameters">
>                         <p:empty/>
>                     </p:input>
>                 </p:xslt>
>
>
>                 <p:identity name="out_file"/>
>
>                 <p:store name="OUT">
>                     <p:with-option name="href"
>                         select="concat($output-folder, 'MDNASections','-',$startingFileNumber,'-' ,
>                             $endingFileNumber,'.html')">
>                         <p:pipe step="out_file" port="result"/>
>                     </p:with-option>
>                 </p:store>
>
>                 <p:sink name="sinkIt"/>
>
>             </p:when>
>         </p:choose>
>
>     </p:for-each>
>
>
> </p:declare-step>
>
>
>
>
> Regards
>
>
> -- 
> Alex
>
> An informal recording with one mic under a tree leads to some pretty  
> sweet acoustic sounds.
> https://sites.google.com/site/greigconteh/albums/diabarte-and-sons
Received on Thursday, 3 June 2010 18:44:21 GMT
