Re: sequence of filtered documents as input of p:xslt

From: Matthieu Ricaud-Dussarget <matthieu.ricaud@igs-cp.fr>
Date: Tue, 18 Oct 2011 10:43:38 +0200
Message-ID: <4E9D3C3A.1010307@igs-cp.fr>
To: Geert Josten <geert.josten@daidalos.nl>
CC: "xproc-dev@w3.org" <xproc-dev@w3.org>
Hi Geert,

Sorry for my late reply; I had other things to do than XProc... not such
exciting things, actually ;-)!
It is clear that p:load is needed here. I could actually have guessed it!

The spec is clear about this: p:viewport is just a scope on specific
elements selected by a match pattern; it is not meant to load any external
document:
"Each matching node in the source document is wrapped in a document
node, as necessary, and provided, one at a time, to the
viewport's subpipeline"

The goal of p:viewport seems to be to change the content of selected
elements in the source document:
"The result of the p:viewport is a copy of the original document where the
selected subtrees have been replaced by the results of applying the
subpipeline to them."
That is not really what I want to do here. I might be able to use it
for my purpose, but I think an iteration step is better suited.
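
To illustrate the spec wording above, here is a minimal sketch (untested;
the p:add-attribute step and the class value "processed" are placeholders
I chose, and the h prefix is assumed to be bound to the XHTML namespace):

    <p:viewport match="h:object">
      <!-- each matched h:object arrives here as its own small document -->
      <p:add-attribute match="/*" attribute-name="class"
                       attribute-value="processed"/>
    </p:viewport>

If I read the spec correctly, the output is still the whole XHTML document,
with every h:object replaced by its modified copy. No SVG file is ever
loaded.
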

The difference I see between p:viewport and p:for-each is in the way
elements are selected: p:viewport uses a match attribute, whereas
p:for-each uses a select attribute (on p:iteration-source). I suppose
match="h:object" is more efficient than select="//h:object" (on big
documents).

My documents are not so big, so for the moment I will keep using
p:for-each, but I think it is good to be aware of this (if my
interpretation is right).
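
For reference, here is a sketch of the p:for-each variant with p:load, as I
understand it (untested; the directory names SVG_IN/ and SVG_OUT/ are
placeholders, the $file variable and the step name load-svg are my own,
and tokenize() assumes an XPath 2.0 capable processor):

    <p:for-each name="for-each-html-object">
      <p:iteration-source select="//h:object">
        <p:pipe step="generateECF" port="result"/>
      </p:iteration-source>
      <!-- file name taken from the @data attribute of the current object -->
      <p:variable name="file" select="tokenize(/*/@data, '/')[last()]"/>
      <!-- load the referenced SVG so that p:xslt gets the SVG itself -->
      <p:load name="load-svg">
        <p:with-option name="href" select="concat('SVG_IN/', $file)"/>
      </p:load>
      <p:xslt name="PDFTronSVG2epubFixedSVG">
        <p:input port="stylesheet">
          <p:document href="../xslt/PDFTronSVG2epubFixedSVG.xsl"/>
        </p:input>
        <p:input port="parameters"><p:empty/></p:input>
      </p:xslt>
      <!-- reuse $file instead of relying on document-uri(/) -->
      <p:store encoding="UTF-8" omit-xml-declaration="false" indent="true">
        <p:with-option name="href" select="concat('SVG_OUT/', $file)"/>
      </p:store>
    </p:for-each>

This way the XSLT stays independent from the pipeline: it only ever sees
one SVG document as input.
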

Keep on learning,
Thanks for your help,

Best Regards,
Matthieu

On 14/10/2011 20:25, Geert Josten wrote:
> Hi Matthieu,
>
> Actually, you would need a p:load in front of the p:xslt step in my example as well (at least, that would be most logical). I overlooked it, sorry.. :P
>
> Kind regards,
> Geert
>
> -----Original message-----
> From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On behalf of Matthieu Ricaud-Dussarget
> Sent: Friday, 14 October 2011 16:33
> To: xproc-dev@w3.org
> Subject: Re: sequence of filtered documents as input of p:xslt
>
> Hi Geert,
>
> I finally found the time to test p:viewport, but I can't manage to get it
> to do what I want.
> As far as I understand, p:viewport splits my whole XHTML document into
> small ones, each containing only an <object> element.
> That means my XSLT has to read the @data attribute and then switch to
> the SVG file itself with the document() function.
> I would actually like to keep the XSLT simple and independent from the
> XProc pipeline (so I can run it directly on any SVG file on my system).
>
> This is the code I've tried :
> <p:viewport match="h:object">
>   <p:output port="result" primary="true"/>
>   <p:xslt name="PDFTronSVG2epubFixedSVG">
>     <p:input port="stylesheet">
>       <p:document href="../xslt/PDFTronSVG2epubFixedSVG.xsl"/>
>     </p:input>
>     <p:input port="parameters"><p:empty/></p:input>
>   </p:xslt>
>   <p:store encoding="UTF-8" omit-xml-declaration="false" indent="true">
>     <p:with-option name="href" select="concat('PAGES_SVG_01/',
>         tokenize(document-uri(/),'/')[last()])"></p:with-option>
>   </p:store>
>   <p:identity>
>     <p:input port="source">
>       <p:pipe step="PDFTronSVG2epubFixedSVG" port="result"/>
>     </p:input>
>   </p:identity>
> </p:viewport>
> The p:store doesn't work as expected because document-uri(/) doesn't refer
> to the SVG file as I expected. It seems document-uri(/) is an empty
> sequence when processing a "virtual" XProc document?
> A document called "PAGES_SVG_01" has been created on my file system,
> which is a copy of the whole XHTML doc I had as input.
>
> I hope my interpretation is correct. Please tell me if it is not!
>
> I think I'll turn to the p:load solution, as Romain suggests.
> Anyway, it's good to learn a new step which might be useful in another
> context.
>
> Thanks for your help,
>
> Regards,
> Matthieu
>
> On 13/10/2011 14:09, Geert Josten wrote:
>> Hi Matthieu,
>>
>> I did something similar a few times. I just used a p:viewport to do the trick. The p:viewport takes the XHTML with the objects as input. You give the viewport a match on object. That will cause the viewport to return the object elements as documents one by one. You can pass that straight-forward into p:xslt. Put a p:store behind p:xslt to store the output of the XSLT. Add an empty identity after p:store to make sure p:viewport has output.
>>
>> <p:viewport match="object">
>> 	<p:xslt .../>
>> 	<p:store .../>
>> 	<p:identity... />
>> </p:viewport>
>>
>> Note that this will remove the objects from the XHTML stream. You will need to reroute input here or there if you want to do more with them, or with the XHTML in whole..
>>
>> Kind regards,
>> Geert
>>
>> -----Original message-----
>> From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On behalf of Matthieu Ricaud-Dussarget
>> Sent: Thursday, 13 October 2011 13:32
>> To: XProc Dev
>> Subject: sequence of filtered documents as input of p:xslt
>>
>> Hi all,
>>
>> Continuing to discover XProc, I have a new problem today:
>>
>> One of the steps of my pipeline produces an XHTML document that
>> contains many
>> <object data="fileA.svg">, <object data="fileB.svg"> elements.
>>
>> Of course, each referenced "file{?}.svg" exists in a specific directory;
>> let's call it "SVGdir".
>>
>> I'd like to iterate over those file{?}.svg and apply an XSLT
>> transformation to each one (and store the result in a separate directory).
>>
>> I don't want to iterate over "SVGdir" directly because it can contain some
>> SVG files which I don't want to transform, since they are not referenced
>> in the XHTML document.
>>
>> I think there are 2 options:
>> - filtering the directory files according to the XHTML
>> - starting from the XHTML to iterate over the right SVG files in the directory
>> I think the 2nd option is better.
>>
>> (Reminder: I don't want the XSLT to be applied to the XHTML document
>> itself, because I'd like it to be independent: it takes one SVG as input
>> and produces one SVG as output.)
>>
>> The spec says that <p:xslt> can have a sequence of documents as input,
>> but I can't find a way to make that work:
>> - should I give an XPath collection() in p:xslt/p:input/@select? =>
>> XProc error
>> - or should I iterate over the <object> elements within the XHTML doc and
>> then get the SVG document as input:
>> <p:for-each name="for-each-html-object">
>>   <p:iteration-source select="//h:object">
>>     <p:pipe step="generateECF" port="result"/>
>>   </p:iteration-source>
>>   <p:xslt name="PDFTronSVG2epubFixedSVG">
>>     <p:input port="source" select="document(concat('PAGES_SVG_0',
>>         tokenize(@data,'/')[last()] ))">
>>       <p:pipe step="for-each-html-object" port="current"/>
>>     </p:input>
>>     [...]
>> </p:for-each>
>> =>   xproc err:XD0023:Invalid XPath expression
>>
>> I will continue to investigate other solutions, but if you have any
>> advice, it is welcome!
>>
>> Kind Regards,
>>
>> Matthieu.
>>
>


-- 
Matthieu Ricaud
IGS-CP
Service Livre numérique
Received on Tuesday, 18 October 2011 08:44:23 GMT