Re: Matching element pairs from two documents - variable value type problem from Romain Deltour on 2012-06-08 (xproc-dev@w3.org from June 2012)

From: Romain Deltour <rdeltour@gmail.com>
Date: Fri, 8 Jun 2012 18:03:43 +0200
To: Dr. Yves Forkl (SRZ) <Y.Forkl@srz.de>
Cc: XProc Dev <xproc-dev@w3.org>
Message-Id: <44535206-2F12-4590-9ED9-06F93882032D@gmail.com>
There are several ways to do this. Here are two:

A) Use p:viewport to filter the eX elements: store each element's id in a variable, then decide whether to keep or delete the element with a p:choose whose XPath context is set to the refdoc document:

<p:declare-step name="main" xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
    <p:input port="maindoc">
        <p:inline>
            <maindoc>
                <e1 id="a" xx="123"/>
                <e1 id="c" xx="456"/>
            </maindoc>
        </p:inline>
    </p:input>
    <p:input port="refdoc">
        <p:inline>
            <refdoc>
                <e2 id="a" yy="789"/>
                <e2 id="b" yy="012"/>
            </refdoc>
        </p:inline>
    </p:input>
    <p:output port="result" sequence="true"/>
    <p:viewport match="e1" name="maindoc-filtered">
        <p:viewport-source>
            <p:pipe port="maindoc" step="main"></p:pipe>
        </p:viewport-source>
        <p:variable name="id" select="//@id"/>
        <p:choose>
            <p:xpath-context>
                <p:pipe port="refdoc" step="main"/>
            </p:xpath-context>
            <p:when test="//e2[@id=$id]">
                <p:identity/>
            </p:when>
            <p:otherwise>
                <p:delete match="*"/>
            </p:otherwise>
        </p:choose>
    </p:viewport>
</p:declare-step>

B) Or, you wrap both documents in a single doc that you can filter and later split:

<p:declare-step name="main" xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
    <p:input port="main">
        <p:inline>
            <maindoc>
                <e1 id="a" xx="123"/>
                <e1 id="c" xx="456"/>
            </maindoc>
        </p:inline>
    </p:input>
    <p:input port="ref">
        <p:inline>
            <refdoc>
                <e2 id="a" yy="789"/>
                <e2 id="b" yy="012"/>
            </refdoc>
        </p:inline>
    </p:input>
    <p:output port="result" sequence="true"/>
    <p:wrap-sequence wrapper="wrapper">
        <p:input port="source">
            <p:pipe port="main" step="main"/>
            <p:pipe port="ref" step="main"/>
        </p:input>
    </p:wrap-sequence>
    <p:delete match="e1[not(@id=(//e2/@id))]"/>
    <p:delete match="e2[not(@id=(//e1/@id))]" name="lastdelete"/>
    <p:identity name="maindoc">
        <p:input port="source" select="*/maindoc">
            <p:pipe port="result" step="lastdelete" />
        </p:input>
    </p:identity>
    <!--<p:identity name="refdoc">
        <p:input port="source" select="*/refdoc">
            <p:pipe port="result" step="lastdelete" />
        </p:input>
    </p:identity>-->
</p:declare-step>
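
If you need both filtered documents, one option is to enable the second identity step and give the pipeline two output ports, each connected explicitly. This is only an untested sketch; the port names "main-result" and "ref-result" are made up for the example, and the inline default inputs are left out for brevity:

<p:declare-step name="main" xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
    <p:input port="main"/>
    <p:input port="ref"/>
    <!-- Hypothetical port names; each output reads from one identity step below -->
    <p:output port="main-result">
        <p:pipe port="result" step="maindoc"/>
    </p:output>
    <p:output port="ref-result">
        <p:pipe port="result" step="refdoc"/>
    </p:output>
    <!-- Wrap both documents in a single <wrapper> document -->
    <p:wrap-sequence wrapper="wrapper">
        <p:input port="source">
            <p:pipe port="main" step="main"/>
            <p:pipe port="ref" step="main"/>
        </p:input>
    </p:wrap-sequence>
    <!-- Drop the elements that have no counterpart in the other document -->
    <p:delete match="e1[not(@id=(//e2/@id))]"/>
    <p:delete match="e2[not(@id=(//e1/@id))]" name="lastdelete"/>
    <!-- Split the wrapper again: one identity step per original document -->
    <p:identity name="maindoc">
        <p:input port="source" select="*/maindoc">
            <p:pipe port="result" step="lastdelete"/>
        </p:input>
    </p:identity>
    <p:identity name="refdoc">
        <p:input port="source" select="*/refdoc">
            <p:pipe port="result" step="lastdelete"/>
        </p:input>
    </p:identity>
</p:declare-step>

With the sample documents, "main-result" should carry <maindoc> with only the e1 whose id is "a", and "ref-result" should carry <refdoc> with only the matching e2.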


In general, wrapping several documents in a single doc is a useful workaround to keep in mind.
As for the p:xslt step, you can provide a sequence of documents as input; the documents in the sequence are available from XSLT as the default collection.
To provide several docs to a p:input, simply add several connections:

<p:input>
  <p:pipe>…</p:pipe>
  <p:pipe>…</p:pipe>
</p:input>
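
For illustration, here is a rough, untested sketch of the p:xslt variant. It assumes an XSLT 2.0 processor (a document sequence on the source port is not usable with XSLT 1.0) and that your XProc implementation exposes the source sequence through collection(). The variable name "ref-ids" is made up for the example, and the sketch only filters maindoc, which is the first document in the sequence and therefore the one being transformed:

<p:declare-step name="main" xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
    <p:input port="maindoc"/>
    <p:input port="refdoc"/>
    <p:output port="result"/>
    <p:xslt>
        <!-- Two connections on one port: maindoc first, refdoc second -->
        <p:input port="source">
            <p:pipe port="maindoc" step="main"/>
            <p:pipe port="refdoc" step="main"/>
        </p:input>
        <p:input port="stylesheet">
            <p:inline>
                <xsl:stylesheet version="2.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                    <!-- ids of every e2 found in the default collection -->
                    <xsl:variable name="ref-ids" select="collection()//e2/@id"/>
                    <!-- Identity template: copy everything by default -->
                    <xsl:template match="@*|node()">
                        <xsl:copy>
                            <xsl:apply-templates select="@*|node()"/>
                        </xsl:copy>
                    </xsl:template>
                    <!-- Drop each e1 whose id has no matching e2 -->
                    <xsl:template match="e1[not(@id = $ref-ids)]"/>
                </xsl:stylesheet>
            </p:inline>
        </p:input>
        <!-- No stylesheet parameters are passed -->
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>
</p:declare-step>

This avoids document() and any serialization to disk, since both documents reach the stylesheet directly from the pipeline.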

Hope this helps
Romain.



On 8 June 2012, at 17:09, Dr. Yves Forkl (SRZ) wrote:

> With the help of XProc, I would like to collect element pairs that match each other within two documents, where the matching is based on the criterion of having the same value in the "id" attribute. For instance, if I have two documents:
> 
> <maindoc>
>  <e1 id="a" xx="123"/>
>  <e1 id="c" xx="456"/>
> </maindoc>
> 
> and
> 
> <refdoc>
>  <e2 id="a" yy="789"/>
>  <e2 id="b" yy="012"/>
> </refdoc>
> 
> then I want to retain only <e1 id="a" xx="123"/> in <maindoc> and also need to get hold of <e2 id="a" yy="789"/> in <refdoc>.
> (It is safe to assume that the "id" values are indeed unique within each document.)
> 
> In XSLT, I would easily have filtered out the matching elements by iterating over the children of <maindoc>, putting the elements into variables and then comparing their values by means of a predicate. (Or by using keys, of course.)
> 
> In XProc, however, I have difficulty getting both node trees to "close up" because XProc variables, alas, can't hold nodes.
> 
> Alternatively, sending in both documents via two input ports seems promising, but I can't see how to connect p:viewport or p:xslt (or any other step that could do the filtering) to TWO input ports or input documents. Using document() within XSLT to read the second document would be less ideal, I think, because this would mean to serialize the document, which is the result of another XProc step, to disk first.
> 
> Comparing the nodes in some string representation would in itself be fine, maybe, but the matching nodes from both documents need to survive this because I need to continue to process them, so it could be only part of the solution.
> 
> Which is the best technique to solve this kind of problem in XProc?
> 
> Yves
> 
> --
> Dr. Yves Forkl - Softwareentwicklung
> SRZ, Bessemerstr. 83-91, 12103 Berlin
> www.srz.de | Firmengruppe: www.besscom.de
> tel +49 30 75301-335 | fax +49 30 75301-11335
> 
> Satz-Rechen-Zentrum Hartmann+Heenemann GmbH&Co. KG
> Sitz Berlin | AG Charlottenburg | HRA 8089
> Komplementärin Satz-Rechner-Betriebsgesellschaft mbH
> Sitz Berlin | AG Charlottenburg | HRB 4905
> Geschäftsführer: Walter Fock
> 
>
Received on Friday, 8 June 2012 16:04:17 UTC