Use cases 5.19, 5.21, 5.22, 5.23, and 5.31 from vojtech.toman@emc.com on 2012-04-24 (public-xml-processing-model-wg@w3.org from April 2012)

From: <vojtech.toman@emc.com>
Date: Tue, 24 Apr 2012 09:13:50 -0400
To: <public-xml-processing-model-wg@w3.org>
Message-ID: <F3C7EBECE80AC346BE4D1C5A9BB4A41F2ED14E4E13@MX11A.corp.emc.com>

Hi all,

See below for my take on the use cases 5.19, 5.21, 5.22, 5.23, and 5.31.

Regards,
Vojtech


--
Vojtech Toman
Consultant Software Engineer
EMC | Information Intelligence Group
vojtech.toman@emc.com
http://developer.emc.com/xmltech

-----

5.19:
In my XML Prague paper "XProc: Beyond application/xml" I looked at one possible way of extending XProc to support non-XML media types. The basic idea is that XProc steps declare which media types they accept on their input ports and which media types they produce on their output ports. If data with media type A (for instance, text/csv) arrives on an input port that expects media type B (for instance, application/xml), the XProc processor will try to convert the data to the expected media type. Which conversions are supported and what they look like is not covered in the paper, because that is an issue of its own. I focused just on the implications of this for the XProc processing model (which, it turns out, are actually not that big).

You can find the conference proceedings here (my article is on page 27):
http://www.xmlprague.cz/2012/files/xmlprague-2012-proceedings.pdf

The specific use case described in 5.19 (converting a CSV file to XML) can be solved by using XSLT 2.0 to tokenize the CSV data and turn it into XML. The example below uses the stylesheet developed by Andrew Welch (http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html):

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="1.0">
  <p:output port="result"/>
  <p:option name="pathToCSV" required="true"/>

  <p:xslt template-name="main">
    <p:input port="source">
      <p:empty/>
    </p:input>
    <p:input port="stylesheet">
      <p:document href="http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.xslt"/>
    </p:input>
    <!-- note that relative paths are resolved against the stylesheet's base URI -->
    <p:with-param name="pathToCSV" select="$pathToCSV"/>
  </p:xslt>
</p:declare-step>

In this solution, the stylesheet loads the CSV file. I think it should be straightforward to modify the pipeline/stylesheet so that the pipeline itself loads the CSV file (using p:data or p:http-request) and passes the c:data-wrapped representation to the stylesheet.
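
A minimal sketch of that modification, under several assumptions: the file name data.csv is hypothetical (the href attribute of p:data is static, so it cannot be taken from an option), the inline stylesheet is a naive stand-in for the linked one and splits on commas without handling quoted fields, and the processor is assumed to deliver text/csv content as plain text inside c:data rather than base64-encoded.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="1.0">
  <p:output port="result"/>

  <p:xslt version="2.0">
    <p:input port="source">
      <!-- the pipeline loads the CSV file; it arrives wrapped in c:data -->
      <p:data href="data.csv" content-type="text/csv"/>
    </p:input>
    <p:input port="stylesheet">
      <p:inline>
        <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                        xmlns:c="http://www.w3.org/ns/xproc-step"
                        exclude-result-prefixes="#all" version="2.0">
          <xsl:template match="/c:data">
            <rows>
              <!-- one row per non-blank line -->
              <xsl:for-each select="tokenize(., '\r?\n')[normalize-space()]">
                <row>
                  <!-- naive split: quoted fields containing commas are not handled -->
                  <xsl:for-each select="tokenize(., ',')">
                    <field><xsl:value-of select="."/></field>
                  </xsl:for-each>
                </row>
              </xsl:for-each>
            </rows>
          </xsl:template>
        </xsl:stylesheet>
      </p:inline>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>
</p:declare-step>
```

With p:http-request in place of p:data, the CSV location could be supplied dynamically, at the cost of building a c:request document first.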

---

5.21:
This one is a little tricky as XProc does not support specifying serialization options on output ports dynamically. Because of that, it is not possible to write a pipeline with a single "result" output port that uses different serialization options that depend on the (dynamic) data content type. One solution is to have multiple output ports ("result-html", "result-xml", ...) with different serialization options, but that's probably silly and too inconvenient to work with (plus it does not work with non-XML data). Another solution is not to have any output ports at all and use p:store instead. The drawback of this is that p:store writes the data to an external location and therefore breaks the pipeline flow, but you can have multiple p:store steps with different serialization options, or you can even set the serialization options on p:store dynamically.

Because p:xsl-formatter renders the XSL-FO document to an external location, I went for the p:store solution:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="1.0"
                xmlns:html="http://www.w3.org/1999/xhtml"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <p:input port="source"/>
  <p:option name="output" required="true"/>

  <p:choose>
    <p:when test="/html:html">
      <!-- apply a theme using XSLT and serialize as HTML -->
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="style.xsl"/>
        </p:input>
        <p:input port="parameters">
          <p:empty/>
        </p:input>
      </p:xslt>
      <p:store method="html">
        <p:with-option name="href" select="$output"/>
      </p:store>
    </p:when>
    <p:when test="/fo:root">
      <!-- apply an XSL-FO processor -->
      <p:xsl-formatter>
        <p:with-option name="href" select="$output"/>
        <p:input port="parameters">
          <p:empty/>
        </p:input>
      </p:xsl-formatter>
    </p:when>
    <p:otherwise>
      <!-- serialize as XML -->
      <p:store>
        <p:with-option name="href" select="$output"/>
      </p:store>
    </p:otherwise>
  </p:choose>
</p:declare-step>
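
As a variation, the serialization options on p:store can also be computed at run time with p:with-option. The sketch below picks the serialization method from the root element of the input. It assumes a processor that evaluates XPath 2.0 (requested here with xpath-version="2.0") because of the if/then/else expression, and it leaves out the XSLT theming and XSL-FO branches of the pipeline above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:html="http://www.w3.org/1999/xhtml"
                version="1.0" xpath-version="2.0">

  <p:input port="source"/>
  <p:option name="output" required="true"/>

  <p:store>
    <p:with-option name="href" select="$output"/>
    <!-- the select expression is evaluated against the document on the source port -->
    <p:with-option name="method" select="if (/html:html) then 'html' else 'xml'"/>
  </p:store>
</p:declare-step>
```

This keeps a single p:store, but the pipeline still has no output port, so the data leaves the pipeline flow just as in the p:choose version.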

---

5.22:
The newsfeed example (the mobile example is just a combination of the newsfeed example and 5.21):

<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            version="1.0">

  <p:option name="configuration" required="true"/>

  <p:choose>
    <p:when test="$configuration='RSS 1.0'">
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="atom-to-rss-10.xsl"/>
        </p:input>
      </p:xslt>
    </p:when>
    <p:when test="$configuration='RSS 2.0'">
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="atom-to-rss-20.xsl"/>
        </p:input>
      </p:xslt>
    </p:when>
  </p:choose>
</p:pipeline>

---

5.23:
This pipeline takes an XMLRPC request document and invokes a method (an XProc pipeline) based on the value of /methodCall/methodName. Because there is no standard p:eval step for dynamic evaluation of XProc pipelines, we have to use p:choose, which lists all possible pipelines statically.

The pipeline below is rather simplistic in the sense that it does not try to interpret XMLRPC's "int", "string", "struct", etc. elements. The input data is passed in the original XMLRPC format to the invoked pipelines, and likewise, the pipelines are expected to represent their results in XMLRPC format.

<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            version="1.0"
            xmlns:ex="http://www.example.org">

  <!-- Defines various 'method' pipelines in the "http://www.example.org" namespace.
       Pipeline interface contract:
       - a single (primary) input port
       - a single (primary) output port
       - expect a single <params> input document
       - produce a single <params> or <fault> output document
  -->
  <p:import href="method-library.xpl"/>

  <p:pipeline type="ex:invoke-method">
    <p:variable name="method" select="/methodCall/methodName"/>

    <p:identity>
      <p:input port="source" select="/methodCall/params"/>
    </p:identity>
    
    <p:try>
      <p:group>
        <!-- Note: the p:choose could be replaced with a single call
             to p:eval if we had such a step -->
        <p:choose>
          <p:when test="$method = 'method1'">
            <ex:method1/>
          </p:when>
          <p:when test="$method = 'method2'">
            <ex:method2/>
          </p:when>
          <p:otherwise>
            <p:template name="error-message">
              <p:input port="template">
                <p:inline>
                  <message>Unsupported method: {$method}</message>
                </p:inline>
              </p:input>
              <p:with-param name="method" select="$method"/>
            </p:template>
            <p:error code="ex:error">
              <p:input port="source">
                <p:pipe step="error-message" port="result"/>
              </p:input>
            </p:error>
          </p:otherwise>
        </p:choose>
      </p:group>

      <p:catch name="catch">
        <p:template>
          <p:input port="source">
            <p:pipe step="catch" port="error"/>
          </p:input>
          <p:input port="template">
            <p:inline>
              <fault>
                <value>
                  <struct>
                    <member>
                      <name>faultCode</name>
                      <value><int>-1</int></value>
                    </member>
                    <member>
                      <name>faultString</name>
                      <value><string>{string(/*)}</string></value>
                    </member>
                  </struct>
                </value>
              </fault>
            </p:inline>
          </p:input>
        </p:template>
      </p:catch>
    </p:try>
    <p:wrap-sequence wrapper="methodResponse"/>
  </p:pipeline>

  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:data href="xmlrpc.rnc" content-type="text/plain"/>
    </p:input>
  </p:validate-with-relax-ng>

  <ex:invoke-method/>

  <p:validate-with-xml-schema>
    <p:input port="schema">
      <p:document href="xmlrpc-response.xsd"/>
    </p:input>
  </p:validate-with-xml-schema>

</p:pipeline>
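
For illustration, a sketch of what the imported method-library.xpl could contain. The two method bodies are hypothetical and only meant to satisfy the interface contract from the comment above: each is a p:pipeline (so it has a single primary input and a single primary output port) that consumes a <params> document and produces a <params> document.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:ex="http://www.example.org"
           version="1.0">

  <!-- hypothetical method: echoes its <params> input unchanged -->
  <p:pipeline type="ex:method1">
    <p:identity/>
  </p:pipeline>

  <!-- hypothetical method: returns the number of <param> elements it received -->
  <p:pipeline type="ex:method2">
    <p:template>
      <p:input port="template">
        <p:inline>
          <params>
            <param><value><int>{count(/params/param)}</int></value></param>
          </params>
        </p:inline>
      </p:input>
    </p:template>
  </p:pipeline>
</p:library>
```

A request such as <methodCall><methodName>method2</methodName><params>...</params></methodCall> would then be routed to ex:method2 by the p:choose in ex:invoke-method.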


---

5.31:
The pipeline below does the following:

1. Checks if XSLT 2.0 is supported
2. If XSLT 2.0 is available, it applies an XSLT 2.0 stylesheet to the input XML document. The stylesheet uses xsl:result-document to generate secondary output documents.
3. If XSLT 2.0 is not available, it applies an XSLT 1.0 stylesheet. The stylesheet uses either the exsl:document or redirect:write extension (whichever is available) to generate secondary output documents.

The pipeline has two output ports: the "result" output port for the primary result of the XSLT transformation, and "secondary" for the secondary documents.

...the pipeline almost works. The problem is with the XSLT 1.0 transformation, because the secondary documents do not appear on the "secondary" port of the p:xslt step. This is actually a requirement of the XProc specification: "If XSLT 1.0 is used, an empty sequence of documents must appear on the secondary port." The exact behavior of exsl:document and redirect:write in the XProc context is implementation-defined; in most cases, the generated documents will simply be written to the specified external location.

<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://www.example.org"
            name="main" version="1.0">

  <p:output port="secondary" sequence="true">
    <p:pipe step="process" port="secondary"/>
  </p:output>

  <p:declare-step type="ex:is-xslt20-supported">
    <p:output port="result"/>
    <p:try>
      <p:group>
        <p:xslt version="2.0">
          <p:input port="source">
            <p:inline>
              <foo/>
            </p:inline>
          </p:input>
          <p:input port="stylesheet">
            <p:inline>
              <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
                <xsl:template match="/">
                  <true><xsl:value-of select="1 to 2"/></true>
                </xsl:template>
              </xsl:stylesheet>
            </p:inline>
          </p:input>
          <p:input port="parameters">
            <p:empty/>
          </p:input>
        </p:xslt>
      </p:group>
      <p:catch>
        <p:identity>
          <p:input port="source">
            <p:inline><false/></p:inline>
          </p:input>
        </p:identity>
      </p:catch>
    </p:try>
  </p:declare-step>


  <ex:is-xslt20-supported/>

  <p:choose name="process">
    <p:when test="/true">
      <p:output port="result" primary="true"/>
      <p:output port="secondary" sequence="true">
        <p:pipe step="xslt" port="secondary"/>
      </p:output>
        
      <p:xslt name="xslt" version="2.0">
        <p:input port="source">
          <p:pipe step="main" port="source"/>
        </p:input>
        <p:input port="stylesheet">
          <p:inline>
            <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
              <xsl:template name="generate-secondary-content">
                <doc>Hello world!</doc>
              </xsl:template>

              <xsl:template match="/">
                <xsl:result-document href="foo.xml">
                  <xsl:call-template name="generate-secondary-content"/>
                </xsl:result-document>
                <ignored/>
              </xsl:template>
            </xsl:stylesheet>
          </p:inline>
        </p:input>
      </p:xslt>
    </p:when>

    <p:otherwise>
      <p:output port="result" primary="true"/>
      <p:output port="secondary" sequence="true">
        <p:pipe step="xslt" port="secondary"/>
      </p:output>

      <p:xslt name="xslt" version="1.0">
        <p:input port="source">
          <p:pipe step="main" port="source"/>
        </p:input>
        <p:input port="stylesheet">
          <p:inline>
            <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                            xmlns:exsl="http://exslt.org/common"
                            xmlns:redirect="http://xml.apache.org/xalan/redirect"
                            extension-element-prefixes="exsl redirect" version="1.0">
              <xsl:template name="generate-secondary-content">
                <doc>Hello world!</doc>
              </xsl:template>

              <xsl:template match="/">
                <exsl:document href="foo.xml">
                  <xsl:call-template name="generate-secondary-content"/>
                  <xsl:fallback>
                    <redirect:write file="foo.xml">
                      <xsl:call-template name="generate-secondary-content"/>
                    </redirect:write>
                  </xsl:fallback>
                </exsl:document>
                <ignored/>
              </xsl:template>
            </xsl:stylesheet>
          </p:inline>
        </p:input>
      </p:xslt>
    </p:otherwise>
  </p:choose>

</p:pipeline>
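
A sketch of how a caller might consume the "secondary" port in the XSLT 2.0 case, writing each secondary document to the URI given in xsl:result-document. It assumes the pipeline above has been given type="ex:transform" and saved as transform.xpl; both names are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://www.example.org"
                version="1.0">

  <p:input port="source"/>
  <p:input port="parameters" kind="parameter"/>
  <p:output port="result">
    <p:pipe step="transform" port="result"/>
  </p:output>

  <!-- hypothetical file holding the pipeline above, declared with type="ex:transform" -->
  <p:import href="transform.xpl"/>

  <ex:transform name="transform"/>

  <!-- store every secondary document at its own base URI -->
  <p:for-each>
    <p:iteration-source>
      <p:pipe step="transform" port="secondary"/>
    </p:iteration-source>
    <p:store>
      <p:with-option name="href" select="p:base-uri(/*)"/>
    </p:store>
  </p:for-each>
</p:declare-step>
```

In the XSLT 1.0 case the p:for-each simply iterates over an empty sequence, since nothing appears on the secondary port.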
Received on Tuesday, 24 April 2012 13:14:53 UTC