Re: XProc: An XML Pipeline Language

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Mon, 10 Apr 2006 15:37:24 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87vethmdcb.fsf@nwalsh.com>
/ Rui Lopes <rlopes@di.fc.ul.pt> was heard to say:
| First, thanks for the nice bootstrap document.
|
| The introduction states "The outputs from the last component constitute the
| outputs of the pipeline itself."
|
| I'm not so sure about this. Imagine the following example:
|
|             O1
|            /
| I1 -> XSLT
|            \
|             O2 -> XSLT -> O2b
|
| The pipeline is defined by two XSLT steps. The first takes an input document
| (I1), splits it into two documents (O1, O2). O1 will not be processed further,
| yet I want it to be serialized. On the other hand, O2 will be processed in the
| next pipeline step into O2b. This output will be part of the pipeline output,
| therefore serialized.
|
| If the pipeline outputs are just the outputs from the last component, in this
| simple example O1 wouldn't be serialized, just O2b. Is this the expected
| behaviour we want to provide?

If your pipeline declares that it has two outputs, and O1 and O2b are
bound to those outputs, then they are both outputs of the pipeline.

If you want O1 and/or O2b to be serialized, but not delivered as
pipeline output, then you must bind them to a serializer:

             O1 -> Save
            /
 I1 -> XSLT
            \
             O2 -> XSLT -> O2b -> Save

Naturally, you could save one and send the other along as the pipeline
output if you wanted.

In your original example above, if O1 and O2b aren't bound to the
pipeline outputs, then your pipeline has a static error. We've said
you can't have outputs that flow onto the floor.
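
As a rough sketch of that rule (Python, purely for illustration; the
`Step` class and `dangling_outputs` function are hypothetical names of
mine, not anything from the draft), a static check might collect every
port that is read by a step or bound to a pipeline output, then report
whatever is left over:

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """A hypothetical pipeline step (not part of any draft)."""
    name: str
    inputs: dict = field(default_factory=dict)   # input name -> port it reads
    outputs: list = field(default_factory=list)  # ports this step produces


def dangling_outputs(steps, pipeline_outputs):
    """Return every produced port that no step reads and no pipeline output binds."""
    consumed = set(pipeline_outputs)
    for step in steps:
        consumed.update(step.inputs.values())
    return [port for step in steps for port in step.outputs if port not in consumed]


# The original example: O1 goes nowhere, O2 feeds the second XSLT step.
steps = [
    Step("xslt-1", inputs={"document": "I1"}, outputs=["O1", "O2"]),
    Step("xslt-2", inputs={"document": "O2"}, outputs=["O2b"]),
]

# Nothing bound to the pipeline outputs: O1 and O2b fall on the floor.
unbound = dangling_outputs(steps, pipeline_outputs=[])

# Both bound to pipeline outputs: nothing is left dangling.
both_bound = dangling_outputs(steps, pipeline_outputs=["O1", "O2b"])
```

A non-empty result would correspond to the static error; binding a
port to a Save step would count as reading it, just like any other
step input.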

Does that help?

I see another problem with this pipeline:

             O1
            /
 I1 -> XSLT
            \
             O2

That's not how XSLT works. XSLT has two inputs and one output:


 Stylesheet
            \
              XSLT -> Output
            /
 Document

I think we'll need a component that decomposes the sequence of output
documents into one or more sequences (if you want to pull out just one
document, for example).

I can imagine a "MatchBaseURI" component that takes as input a
sequence of documents and outputs two sequences, one containing all
the documents that match a particular URI and one that contains all
the ones that don't match:

<p:pipeline>
  <p:input name="stylesheet"/>
  <p:input name="document"/>

  <p:step name="xslt">
    <p:input name="stylesheet" select="$stylesheet"/>
    <p:input name="document" select="$document"/>
    <p:output name="results"/>
  </p:step>

  <p:step name="match-base-uri">
    <p:parameter name="match">.*/MANIFEST\.xml$</p:parameter>
    <p:input name="results"/>
    <p:output name="matched"/>
    <p:output name="notmatched"/>
  </p:step>

  <p:step name="save">
    <p:input name="notmatched"/>
  </p:step>

  <p:step name="xslt">
    <p:input name="stylesheet" href="manifest.xsl"/>
    <p:input name="document" select="$matched"/>
    <p:output name="manifest-result"/>
  </p:step>

  <p:step name="save">
    <p:parameter name="href" select="'manifest.html'"/>
    <p:input name="document" select="$manifest-result"/>
  </p:step>
</p:pipeline>

That pipeline processes the input document with the specified stylesheet,
then splits the result document named MANIFEST.xml off and serializes
everything else. It performs another transformation on the manifest and
serializes the result to manifest.html.
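
To make the splitting behaviour concrete, here is a minimal sketch of
what such a component might do, written in Python only for
illustration. The `Document` type and the `match_base_uri` function
are assumptions of mine, not a proposed API; the pattern escapes the
dot so that only a literal ".xml" suffix matches.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple


class Document(NamedTuple):
    """A hypothetical stand-in for a document flowing through the pipeline."""
    base_uri: str
    content: str


def match_base_uri(
    documents: Iterable[Document], pattern: str
) -> Tuple[List[Document], List[Document]]:
    """Split a sequence into (matched, notmatched) by testing each base URI."""
    regex = re.compile(pattern)
    matched: List[Document] = []
    notmatched: List[Document] = []
    for doc in documents:
        # Input order is preserved within each output sequence.
        (matched if regex.search(doc.base_uri) else notmatched).append(doc)
    return matched, notmatched


# Illustrative result documents from the first XSLT step (made-up URIs).
results = [
    Document("file:/out/index.xml", "<html/>"),
    Document("file:/out/MANIFEST.xml", "<manifest/>"),
    Document("file:/out/chapter1.xml", "<html/>"),
]

matched, notmatched = match_base_uri(results, r".*/MANIFEST\.xml$")
```

Here `matched` would feed the second transformation and `notmatched`
would go to the serializer.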

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Monday, 10 April 2006 19:37:37 GMT
