DOM WG Comments and Issues WRT XQuery 1.0 and XPath 2.0 Data Model

The DOM Working Group appreciates the attention given to the DOM concerns communicated previously. The refactoring of that specification to not rely on functional-looking constructors has done a lot to make the specification easier for us to understand and integrate with DOM concepts. At the previous DOM F2F meeting, I was asked to express a number of remaining issues. I will refer to XQuery 1.0 and XPath 2.0 Data Model as XPDM2.

DOM1

The DOM WG appreciates the value of the infoset mapping of the model and the work involved in refining it. The DOM WG still finds it unfortunate that XPDM2 cannot rely primarily on an infoset mapping (see original issue 1a, Different Data Models in W3C). Issues like those below could otherwise be avoided, and there could be other mismatches we have missed in our reading of your latest revision.

DOM2

In XPDM2 4.3.4, [namespace attributes] are returned as "the sequence of namespace information items constructed from the nodes that are present in the difference between the sequence of nodes returned by the dm:namespaces accessor on this element and the sequence of nodes returned by the dm:namespaces accessor of this element's dm:parent".

But [namespace attributes] is information which was not accounted for in XPDM2. Claiming that this information is always the delta seems inaccurate and causes problems. If, for example, the data model is built on a model such as DOM which preserves this information, this would seem to force a processor to ignore the more-accurate available information and rely on false information. This means that DOM and XPDM2 cannot be seen as views of a single unified data model and may make it quite difficult to produce functions which rely on this information which XPath has lost.

For example, suppose the environment containing the XPath implementation provides a function which verifies a digital signature on a subtree, returning a boolean. Using XPath data alone, it is impossible to tell whether the signature is correct, because the signature depends upon exactly where each namespace declaration appears in the infoset, and different placements can produce identical XPath data. The function could gather the missing information from DOM or some other more-complete infoset representation, but XPath's claim to possess this infoset information, which turns out to be flawed, confuses matters and seems to hide the accurate information that other infoset operations may require.

This is a problem seen in other cases in XPDM2, where inaccurate information is used in the mapping to cover for information not carried by XPDM2. The infoset mapping should not claim to possess infoset information which it does not have, because that claim interferes when the actual information is present.
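The concern can be made concrete with a minimal Python sketch using the standard library's xml.dom.minidom. The helper names (declared, in_scope, delta_from_parent) and the two sample documents are ours, for illustration only, and the "delta" rule is our simplified reading of the XPDM2 4.3.4 wording quoted above. The sketch ignores the implicit xml prefix and default-namespace undeclaration.

```python
from xml.dom import minidom

def declared(el):
    """Namespace declarations physically present on this element (kept by DOM)."""
    found = {}
    attrs = el.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        if attr.name == "xmlns":
            found[""] = attr.value
        elif attr.name.startswith("xmlns:"):
            found[attr.name[len("xmlns:"):]] = attr.value
    return found

def in_scope(node):
    """In-scope bindings, built by applying declarations from the root down."""
    chain = []
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        chain.append(node)
        node = node.parentNode
    scope = {}
    for ancestor in reversed(chain):
        scope.update(declared(ancestor))
    return scope

def delta_from_parent(el):
    """Bindings in scope on el that differ from those in scope on its parent."""
    inherited = in_scope(el.parentNode)
    return {p: u for p, u in in_scope(el).items() if inherited.get(p) != u}

# Same in-scope namespaces on <b>, but the declaration sits in different places.
inherited_doc = minidom.parseString('<a xmlns:p="urn:x"><b/></a>')
redeclared_doc = minidom.parseString('<a xmlns:p="urn:x"><b xmlns:p="urn:x"/></a>')
b_inherited = inherited_doc.documentElement.firstChild
b_redeclared = redeclared_doc.documentElement.firstChild

print(in_scope(b_inherited) == in_scope(b_redeclared))   # identical in-scope view
print(delta_from_parent(b_inherited), delta_from_parent(b_redeclared))
print(declared(b_inherited), declared(b_redeclared))      # DOM still sees a difference
```

Under these assumptions the delta is empty for both b elements, while the DOM attribute lists differ, which is exactly the distinction that a signature-verifying function would need.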

DOM3

Also in XPDM2 4.3.4, the [namespace attributes] description cited above claims that the property is constructed as a sequence of [namespace] information items. This is not possible, because a sequence of [namespace] information items is not at all the same thing as [namespace attributes], which in the infoset is a set of attribute information items.

DOM4

In XPDM2 4.3.4, it is not clear whether the namespace information is even available as namespace nodes, since that is an optional presentation of namespace information in XPDM2, which may otherwise be available via accessors.

DOM5

When dealing with ID/IDREF, XPDM2 exposes XML Schema types, when in fact these are DTD types. See:

XML Schema: http://www.w3.org/TR/xmlschema-2/#ID

XML 1.0: http://www.w3.org/TR/REC-xml#id

XPDM2 says: If the [attribute type] property exists and has one of the following values: ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, or NMTOKENS, the {target namespace} is "http://www.w3.org/2001/XMLSchema" and the {name} is the [attribute type].

ID is defined differently in XML Schema (NCName) than in a DTD (Name). XPDM2 is mixing the two.

The target namespace should be a DTD one instead of the XML Schema one.
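To illustrate the Name versus NCName difference, here is a minimal Python sketch. The regular expressions are ASCII-only simplifications of the two productions (the real ones admit many more characters), and the identifiers NAME_ASCII, NCNAME_ASCII, is_dtd_id, is_schema_id, and the sample value are ours for illustration.

```python
import re

# ASCII-only approximations; the real productions allow many non-ASCII characters.
NAME_ASCII = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")   # XML 1.0 Name (DTD ID)
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")   # NCName (xs:ID): no colon

def is_dtd_id(value):
    """True if value is lexically acceptable as a DTD ID (simplified)."""
    return NAME_ASCII.fullmatch(value) is not None

def is_schema_id(value):
    """True if value is lexically acceptable as an xs:ID (simplified)."""
    return NCNAME_ASCII.fullmatch(value) is not None

candidate = "sec:intro"          # hypothetical ID value containing a colon
dtd_ok = is_dtd_id(candidate)
schema_ok = is_schema_id(candidate)
print(candidate, dtd_ok, schema_ok)
```

A value containing a colon passes the Name check but not the NCName check, so labeling a DTD-declared ID with the XML Schema type can misdescribe its value space.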

DOM6

In XPDM2 4.6.4, [element content whitespace] is purported to be false. This is another case like the one shown in issue DOM2, where a more-complete infoset might be available which knows the real value of this information and could make it available to various functions, rather than forcing the infoset value to be inaccurate.

DOM7

XPDM2 is often not just an extension of the infoset, but a relaxation of it. A mapping is provided, but there is no information about what happens to the mapping when the model conflicts with the infoset. These issues have been raised before. For example, what is the infoset mapping when there is more than one element child node of the document node? This needs to be clearly specified. DOM does not permit this particular violation of the infoset, so an XPath implementation on top of DOM would not be able to represent it. The same may be true wherever the data model is more relaxed than the infoset. In any cases where DOM does not rigorously guarantee a valid infoset, it reconciles deviations during serialization or normalization in a specified way. XPDM2 does not seem to do this. We need some sort of fix in the XPath specification, if only to be able to easily refer to this sort of infoset relaxation and the inability of some implementations to represent it (as well as the inability of such a model to be serialized as well-formed XML).
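As an illustration of the DOM restriction on document element children, the following small Python sketch uses xml.dom.minidom, one DOM implementation among many. The element names "first" and "second" and the variable rejected are ours; other DOM implementations are expected to behave similarly but are not shown here.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.Document()
doc.appendChild(doc.createElement("first"))    # becomes the document element

try:
    # A second element child of the document node is not representable in DOM.
    doc.appendChild(doc.createElement("second"))
    rejected = False
except xml.dom.HierarchyRequestErr:
    rejected = True

print("second element child rejected:", rejected)
```

An XPath implementation layered on such a DOM has no way to hold a document node with several element children, which is why the mapping for that case needs to be specified.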

DOM8

The DOM API has had a nodeName accessor since Level 1, and it was further refined as namespace support was incorporated in Level 2. The XPDM2 specification has a dm:node-name accessor which seems to behave quite differently from the DOM accessor. We believe that either the value and behavior should match, or a different name should be used to avoid conflict and confusion.
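The difference can be sketched in Python with xml.dom.minidom. The function dm_node_name below is a hypothetical stand-in written by us, based on our understanding that dm:node-name yields an expanded name (namespace URI plus local name) for elements and attributes and nothing for text nodes. It is not taken from any implementation, and it ignores processing instructions and namespace nodes.

```python
from xml.dom import minidom

doc = minidom.parseString('<p:item xmlns:p="urn:x">text</p:item>')
element = doc.documentElement
text = element.firstChild

def dm_node_name(node):
    """Hypothetical stand-in for dm:node-name: an expanded name, or None."""
    if node.nodeType in (node.ELEMENT_NODE, node.ATTRIBUTE_NODE):
        return (node.namespaceURI, node.localName)
    return None  # stands in for the empty sequence

# DOM nodeName: a qualified-name string, or a fixed "#..." string for other nodes.
dom_names = (element.nodeName, text.nodeName)
# Our approximation of dm:node-name for the same two nodes.
dm_names = (dm_node_name(element), dm_node_name(text))
print(dom_names)
print(dm_names)
```

The two accessors share a name but return different kinds of values for the same nodes, which is the source of the confusion we would like to avoid.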

Thanks for your consideration of these issues, and congratulations on the progress of your specifications,

Ray Whitmer
for the DOM Working Group