Re: Component interfaces from Erik Bruchez on 2006-01-16 (public-xml-processing-model-wg@w3.org from January 2006)

From: Erik Bruchez <ebruchez@orbeon.com>
Date: Mon, 16 Jan 2006 14:08:06 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <43CB9AB6.3080909@orbeon.com>

Robin Berjon wrote:
 >
 > On Jan 16, 2006, at 12:07, Erik Bruchez wrote:
 >> Rui Lopes wrote:
 >>
 >> > I've thought a bit more about this issue. I agree with you,
 >> > regarding the XSLT processor. However, requiring infosets as inputs
 >> > is a problem: if you have an XQuery processor, your approach would
 >> > require queries to be written in XQueryX [1]; if you have a Relax NG
 >> > schema, compact syntax would not be allowed; a hypothetical SQL
 >> > processor would require queries to be wrapped into an XML envelope.
 >>
 >> The questions of XQuery and Relax NG compact are interesting. A few
 >> points:
 >
 > There may be another option that does not incur overly high complexity:
 > if instead of Infosets the inputs are sequences of nodes, XQuery, RNC,
 > or SQL could be input as text nodes

An interesting point.

In passing, the items of the "XQuery 1.0 and XPath 2.0 Data Model
(XDM)" are not necessarily nodes: an item is either a node or an
atomic value [1]. I think it is important for us to realize that while
the "node" terminology is quite anchored in people's minds, the latest
XML specifications introduce new terminology with which we should all
be familiar.

A solution to the "XQuery" debate would then be for the processing
model to allow passing arbitrary sequences of *items*, not just
*nodes*. This would create a strong dependency on XDM, of course, but
would also keep the XML processing model very much in sync with XSLT
2.0 and XQuery 1.0. You could pass your XQuery document simply by
passing an xs:string.

I find the idea of following XDM quite attractive:

1. We would not reinvent the wheel.

2. It looks like we could satisfy more common use cases.

3. The question of how XML pipeline components adapt to each other can
    be solved with simple typing a la XSLT (although I would really,
    really hope for a compliance level that does not require full XML
    Schema support, yet that does accept the simple XML Schema types,
    which by the way are also supported by Relax NG and other
    languages). For example, if your pipeline step input really expects
    a complete document, it can declare:

      as="document-node()"

    If your pipeline step expects pure text, then you can write:

      as="xs:string"

    Similarly, you could type outputs the same way, therefore ensuring
    that the pipeline engine can check types.

    Optionally, support could be extended to full XML Schema. Here I
    would also like to make sure that using alternate schema
    languages, like Relax NG, remains possible.
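
To make the typing idea in point 3 concrete, here is a minimal sketch
in Python of how an engine might check the two declarations shown
above. Everything in it is an assumption for illustration only: the
DocumentNode class, the CHECKS table, and the matches and can_connect
helpers are hypothetical names, not part of XSLT, XDM, or any existing
pipeline engine, and a real engine would need XDM's subtype rules
rather than plain equality.

```python
from dataclasses import dataclass


@dataclass
class DocumentNode:
    """Hypothetical stand-in for an XDM document node."""
    root_element: str


# Maps the two `as` declarations discussed above to a Python-level check.
# A real engine would cover the full SequenceType syntax; this does not.
CHECKS = {
    "xs:string": lambda item: isinstance(item, str),
    "document-node()": lambda item: isinstance(item, DocumentNode),
}


def matches(item, declared_type):
    """Return True if a single item satisfies the declared `as` type."""
    return CHECKS[declared_type](item)


def can_connect(output_type, input_type):
    """Naive static check: the two declared types must be identical."""
    return output_type == input_type
```

With this, an XQuery passed as text satisfies as="xs:string" but not
as="document-node()", and the engine can refuse to wire a step that
outputs xs:string into one that expects document-node() before
anything runs.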

 > (possibly document nodes containing only a text node child, though
 > that might be bending it too much).

Ouch. I don't think you will find that such a thing is allowed in any
W3C spec. In particular:

1. In XML 1.1, you find that the document production contains exactly
    one element production, that is, a single root element [2].

2. In the infoset [3], you find that a "document information item"
    must contain "exactly one element information item".
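
The well-formedness side of this can be checked with any XML parser.
The sketch below uses Python's standard xml.etree.ElementTree module,
which parses XML 1.0 rather than 1.1; I am assuming that difference
does not matter here, since both versions require a single root
element. The helper name is_well_formed_document is made up for the
example.

```python
import xml.etree.ElementTree as ET


def is_well_formed_document(text):
    """Return True if `text` parses as an XML document with one root element."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for text-only input and for multiple top-level elements.
        return False
```

Bare query text such as "for $x in (1, 2) return $x" is rejected, as
is "<a/><b/>", while the same text wrapped in a single element parses
fine.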

I think the best thing here is to not think in terms of *nodes*, but
in terms of *items*, as per the XDM [1].

-Erik

[1] http://www.w3.org/TR/2005/CR-xpath-datamodel-20051103/#types-hierarchy
[2] http://www.w3.org/TR/xml11/#NT-document
[3] http://www.w3.org/TR/xml-infoset/#infoitem.document
Received on Monday, 16 January 2006 13:08:13 UTC