XMLP Comment to XML Query on Data Model and Infoset

The XMLP WG concurs with XML Schema's comment
(http://lists.w3.org/Archives/Public/public-qt-comments/2004Mar/0057.html)
regarding Data Model and Infoset. While we agree with Schema, the XMLP WG
also wishes to clarify its own particular experiences and motivation.

Our XOP and MTOM work provides a means of optimizing the transmission of
XML elements known to be of type xsd:base64Binary. SOAP envelopes are
specified as Infosets, however we took a significant amount of time trying
to use the XPath/Query Data Model for XOP and MTOM. Our rational was that
the DM is emerging as the normative way of modelling a typed XML tree in
the W3C stack, and that's exactly what we needed.  The DM gave us a
normative way of saying element nodes have a dm:type, a dm:string-value,
and a dm:typed-value, thus enabling us to talk rather precisely about the
relationship between the string-value and typed-value as controlled by the
type. Furthermore, this would have made it really easy to talk about how to
send a query result through XOP/MTOM/SOAP.

This formulation worked fine architecturally and was used as the basis for
our published February WDs.  Unfortunately, we received user feedback which
boiled down to "The relationship between all these models is way too
obscure for ordinary mortals to see what you're doing. First the SOAP
envelope is an Infoset, then for optimization purposes it is a data model,
just so you can get at node types,  then it goes back to being an Infoset
for SOAP processing."

On the basis of such feedback, we removed the DM from XOP and MTOM and
introduced more or less ad-hoc typing by indicating that we optimize things
that happen to resemble the canonical form of base64binary. Now, if someone
ever wants to send a query result through SOAP and have it optimized,
they'll have to write prose in their spec saying (a) view this as an
infoset using whatever mapping you believe appropriate, and (b) use ad-hoc
means to remember which nodes in the DM were base64binary when talking to
XOP. We think this is an example of the sort of difficulties we all will
contine to have in using Query results with other parts of the XML stack
(perhaps a version of DOM that exposes PSVI?).

There are philosophical and structural differences between DM and Infoset,
but anyone can see that much of the critical information in the two
formulations is identical.  We think it is unfortunate to have two
normative places in the W3C XML stack that separately introduce the rules
that elements have attributes, elements have other elements and characters
as children, etc.  It would be really valuable if such fundamental rules
could be introduced in exactly one place, with optional add-ons providing
node identity, typing, or whatever additional content or other abstractions
are needed.

We think this is a very important issue to get right for the future of the
W3C XML stack, although it is not immediately clear what "right" is.
Certainly, XMLP's failed attempts to use the DM in MTOM and XOP have been
sobering and have shown that real energy can be wasted as a result of the
current state of affairs.  We are aware that the Query effort is at a point
where changes of this sort are potentially disruptive and in that sense for
good reason are potentially unwelcome.  That said, we think it is worth
some effort to separate the long-term technical analysis from the
discussion of what - if anything - needs to be done between now and when
XML Query becomes a Recommendation. Both are important issues, although
there is no lack of sympathy for finishing Query ASAP.  We would like to
believe that we could relatively quickly decide whether there is a long
term direction that will be better than the status quo, and if so whether
any immediate changes could or should be made without unduly delaying
Query's progress.

On behalf of the XMLP WG
David Fallside
Chair, XMLP WG

Received on Wednesday, 31 March 2004 15:38:13 UTC