Re: Naming ports vs. naming documents from Norman Walsh on 2006-04-28 (public-xml-processing-model-wg@w3.org from April 2006)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Fri, 28 Apr 2006 09:22:30 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87k699rg15.fsf@nwalsh.com>
/ Richard Tobin <richard@inf.ed.ac.uk> was heard to say:
| Just to summarise what we were discussing at the end of yesterday's telcon:

Thank you, Richard.

| Since each connection passes a document (or sequence of documents, but
| let's ignore that for now) it would be possible instead to specify the
| connections by giving the documents names, which might be URLs, and
| specifying the document consumed or produced by each port.

So instead of:

  <p:step name="p:validate">
    <p:input name="source" href="document.xml"/>
    <p:input name="schema" href="schema.xsd"/>
    <p:output name="result" label="vout"/>
  </p:step>

  <p:step name="p:xslt">
    <p:input name="source" ref="vout"/>
    <p:input name="stylesheet" href="style.xsl"/>
    <p:output name="result" label="xsltout"/>
  </p:step>

which connects the "'result' output pipe" of the validate step to the
"'source' input pipe" of the XSLT step, we could say:

  <p:step name="p:validate">
    <p:input name="source" href="document.xml"/>
    <p:input name="schema" href="schema.xsd"/>
    <p:output name="result" produces="someURI"/>
  </p:step>

  <p:step name="p:xslt">
    <p:input name="source" consumes="someURI"/>
    <p:input name="stylesheet" href="style.xsl"/>
    <p:output name="result" label="xsltout"/>
  </p:step>

which says that the validate step produces a document with the URI
'someURI' and that the XSLT step consumes that URI.

The distinction is primarily that some other component could do a
"GET" on "someURI", even if that component didn't have an input that
explicitly consumed the result of the validate step, with the
expectation that it would get the document produced by the validate step.
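
As a sketch of that implicit access, assume the pipeline processor
intercepts URI dereferences made by a component. A stylesheet run by
some later step could then read the validated document through the
standard XSLT document() function, with no input declared for it in
the pipeline. Whether a processor would resolve 'someURI' this way is
an assumption, not something settled here, and the 'report' element is
made up for illustration:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <report>
      <!-- Assumption: the processor maps 'someURI' to the document
           produced on the validate step's 'result' output. -->
      <xsl:copy-of select="document('someURI')/*"/>
    </report>
  </xsl:template>
</xsl:stylesheet>
```

Nothing in the pipeline document itself records this dependency, which
is what separates it from an explicit port connection.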

I have to say, on the whole, I prefer the labelled ports approach for
several reasons:

1. These URI names for the documents flowing through the pipes are
   likely to be transient at best and never actually accessible outside
   the context of the running pipeline. The 'someURI' above, for example,
   is never explicitly serialized so I wouldn't expect it to be available
   after the pipeline was executed.

2. It introduces aliases. I expect the output of an XSLT (2.0 anyway)
   component to have an intrinsic base URI. Giving the document flowing
   through the pipe a potentially (and often) different URI seems
   likely to introduce a lot of complexity.

3. When we come to actually addressing the notion of sequences of
   documents, we have a tricky problem. We'd have URIs that referred
   to a sequence. While we can provide components for dealing with
   sequences, it's difficult to imagine how a GET can actually return
   a sequence in a meaningful way to your average chunk of code
   expecting to read an XML document.

I think we do want a mechanism for indicating that a component
produces an output with a URI so that some other component can access
it, but I don't think I want to expose all of the plumbing that way.
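
One hypothetical shape for such a mechanism: keep labelled ports for
the plumbing, and let an output opt in to a URI only where access by
other components is wanted. Allowing 'label' and 'produces' on the same
output is an assumption of this sketch, not agreed syntax:

```xml
<p:step name="p:validate">
  <p:input name="source" href="document.xml"/>
  <p:input name="schema" href="schema.xsd"/>
  <!-- Hypothetical: wired by label, and also exposed as 'someURI' -->
  <p:output name="result" label="vout" produces="someURI"/>
</p:step>

<p:step name="p:xslt">
  <p:input name="source" ref="vout"/>
  <p:input name="stylesheet" href="style.xsl"/>
  <!-- No 'produces' here, so this output stays internal plumbing -->
  <p:output name="result" label="xsltout"/>
</p:step>
```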

|   <step type="validate" name="val">
|     <input name="source" from="in"/>
|     <input name="schema" from="whatever1"/>
|   </step>
|   <step type="xslt" name="ss">
|     <input name="source" from="val.result"/>
|     <input name="stylesheet" ref="whatever2"/>
|   </step>

I think you still want to make the outputs explicit:

  <step type="validate" name="val">
    <input name="source" from="in"/>
    <input name="schema" from="whatever1"/>
    <output name="result"/>
  </step>
  <step type="xslt" name="ss">
    <input name="source" from="val.result"/>
    <input name="stylesheet" ref="whatever2"/>
    <output name="result"/>
  </step>

otherwise you can't (easily) have a component that produces two output
streams.
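
For example, a validate step might emit a report alongside the
validated document. The 'report' output, the 'fmt' step, and
'whatever3' below are invented for illustration:

```xml
<step type="validate" name="val">
  <input name="source" from="in"/>
  <input name="schema" from="whatever1"/>
  <output name="result"/>
  <output name="report"/>  <!-- hypothetical second output -->
</step>
<step type="xslt" name="ss">
  <input name="source" from="val.result"/>
  <input name="stylesheet" ref="whatever2"/>
  <output name="result"/>
</step>
<step type="xslt" name="fmt">
  <!-- Reads the second stream, distinguished only by its output name -->
  <input name="source" from="val.report"/>
  <input name="stylesheet" ref="whatever3"/>
  <output name="result"/>
</step>
```

With the outputs declared, 'val.result' and 'val.report' are both
unambiguous targets for a 'from' reference.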

| The second approach may reduce the number of arbitrary names you have
| to invent.

I think I like it. Though I think I'd be inclined to do the syntax
just a little differently:

  <step type="validate" xml:id="val">
    <input name="source" from="in"/>
    <input name="schema" from="whatever1"/>
    <output name="result"/>
  </step>
  <step type="xslt" xml:id="ss">
    <input name="source" from="#val/result"/>
    <input name="stylesheet" ref="whatever2"/>
    <output name="result"/>
  </step>

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Friday, 28 April 2006 13:22:56 UTC