XQuery/XPath Data model comments from Jeni Tennison on 2001-09-16 (www-xml-query-comments@w3.org from September 2001)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Sun, 16 Sep 2001 21:42:19 +0100
To: www-xml-query-comments@w3.org
Message-ID: <12612633746.20010916214219@jenitennison.com>
Hi,

Here are some comments on the XQuery/XPath data model WD (dated 7th
June). I hope they're helpful.

3.2 Document Order. The second paragraph states that the relative
order of nodes in different documents is implementation-dependent but
stable. How 'stable' is 'stable'? Within a single XPath? Within an
XSLT stylesheet? Within multiple runs of the same stylesheet on the
same document?

3.4 Schema Components and Values. The third paragraph gives xs:ID and
xs:IDREF as examples of primitive value types, when actually they are
derived (from xs:NCName).
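
To make the derivation concrete, here is a minimal Python sketch of the chain from xs:ID up to its primitive ancestor. The table reflects the built-in type hierarchy in XML Schema Part 2; the names BASE_TYPE, derivation_chain and is_primitive are made up for this illustration and are not part of any specification.

```python
# Base type of each built-in type on the path from xs:ID / xs:IDREF up to
# xs:string, following the built-in hierarchy in XML Schema Part 2.
# Only the types needed for this example are listed.
BASE_TYPE = {
    "xs:ID": "xs:NCName",
    "xs:IDREF": "xs:NCName",
    "xs:NCName": "xs:Name",
    "xs:Name": "xs:token",
    "xs:token": "xs:normalizedString",
    "xs:normalizedString": "xs:string",
}


def derivation_chain(type_name):
    """Return the types from type_name up to its primitive ancestor."""
    chain = [type_name]
    while chain[-1] in BASE_TYPE:
        chain.append(BASE_TYPE[chain[-1]])
    return chain


def is_primitive(type_name):
    # Only meaningful for types covered by the small table above:
    # a type is treated as primitive when it has no recorded base type.
    return type_name not in BASE_TYPE


print(" -> ".join(derivation_chain("xs:ID")))
```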

3.6 Ignoring Comments, Processing Instructions, and Whitespace. The
definition of insignificant whitespace means that a text node can only
be classified as whitespace if its parent element has been validated
according to an XML Schema. This seems to be very limiting; perhaps it
could be rephrased, possibly something like:

  1. contains no characters other than white space characters (as
  defined in XML 1.0), and
  2. does not have a parent element with a [validity] property with
  the value 'valid' and a [type definition] property yielding a simple
  type definition or a complex type definition with a content type of
  mixed.
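
The rephrased rule above can be sketched as a small Python predicate. This is only an illustration of the two conditions: the parameter names and the string labels for the parent's content type are assumptions, and None stands for a parent element that has no PSVI properties at all.

```python
# White space characters as defined by the S production in XML 1.0.
XML_WHITESPACE = " \t\r\n"


def is_insignificant_whitespace(text, parent_validity=None, parent_content=None):
    """Apply the two proposed conditions to a text node.

    parent_validity: the parent's [validity] property, or None if the
        parent has not been schema-validated (hypothetical representation).
    parent_content: 'simple', 'mixed', 'element-only' or 'empty', or None
        when no type definition is available (hypothetical labels).
    """
    # Condition 1: nothing but white space characters.
    if any(ch not in XML_WHITESPACE for ch in text):
        return False
    # Condition 2: the parent is NOT a valid element whose type allows text.
    text_allowed = parent_content in ("simple", "mixed")
    return not (parent_validity == "valid" and text_allowed)
```

With this wording, a whitespace-only text node in an unvalidated document still counts as insignificant, which the original definition does not allow.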

4 Nodes.

Why don't namespace nodes have parents? It's useful to be able to
continue to traverse a tree from namespace nodes. For example, in
stylesheets for browsing XML documents, you can only work out whether
a namespace needs to be declared by looking at the namespace nodes on
its ancestors (e.g. ancestor::*/namespace::*[name() = name(current())
and . = current()]).
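
The same ancestor check can be sketched outside XPath. The Python below uses xml.dom.minidom, where namespace declarations are visible as xmlns attributes, as a stand-in for namespace nodes (the standard library has no namespace axis). The function name needs_declaration and the sample document are invented for this example.

```python
from xml.dom import minidom


def needs_declaration(element, prefix, uri):
    """True if binding prefix to uri must be declared on element itself.

    Walks up the ancestors; the nearest ancestor that declares the prefix
    decides whether the binding is already in scope.
    """
    attr = "xmlns:" + prefix if prefix else "xmlns"
    node = element.parentNode
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute(attr):
            # Redundant only if the nearest declaration binds the same URI.
            return node.getAttribute(attr) != uri
        node = node.parentNode
    # No ancestor declares the prefix at all.
    return True


doc = minidom.parseString(
    '<a xmlns:p="urn:one"><b xmlns:p="urn:one"><c xmlns:p="urn:two"/></b></a>'
)
a = doc.documentElement
b = a.firstChild
c = b.firstChild
```

The upward walk is exactly what is impossible from a namespace node that has no parent.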

Can attributes be roots of trees? In other words, is it possible to
have a node tree that contains a single attribute node? I don't think
it's explicitly prohibited by the description here.

The last couple of paragraphs in the introduction to Section 4 are
confusing because they introduce the concept of an InfoItem object
type. We were told in Section 1 (Introduction) that there were five
categories of values - nodes, simple values, sequences, errors and
schema components. InfoItem objects seem to be another type
altogether, and it's unclear how they fit in. Can this be elaborated
earlier in the document?

4.2 Elements.

The constructor for the element node probably should include the type
definition in the constructor as well, for cases where the [type
definition] of the element information item is not the same as the
[type definition] of the [element declaration], which can occur if
xsi:type is used in the document.

Possibly it will be part of the update to incorporate schema-less and
DTD valid data, but the declaration and type accessors should probably
return Sequences that might be empty.

I think it might be useful to be able to access the [member type
definition] property of the PSVI for element information items, to
know exactly which type the element value is, perhaps as a separate
accessor:

  member-type : ElementNode -> Sequence(0,1)<SchemaComponent>

or altering the type accessor to return a sequence in such cases:

  type : ElementNode -> Sequence(1,2)<SchemaComponent>

or changing the type accessor to return the [member type definition]
where appropriate. If included, the constructor should involve the
member type definition as well.

It's not clear what happens with nil elements. Is something special
done with the type to indicate that they have a nil value?

4.3 Attributes.

As with the element nodes, it would be useful to access the [member
type definition] properties of the attribute information items as well
as their type definitions. The constructor doesn't need to incorporate
the type definition, since that cannot be set through xsi:type, but it
would have to include the member type definition.

4.4 Namespaces. There's a typo: "The accessors name, node-kind and
string-value also apply to comment nodes." should read "The accessors
name, node-kind and string-value also apply to namespace nodes."

4.7 References. It's unclear how reference nodes fit into the data
model, or what their purpose is. As far as I can tell, document nodes
and element nodes cannot have reference nodes as children, so I
suspect that the parent accessor applied to a reference node will
always result in an empty sequence. If so, that should be indicated
with:

  parent(ReferenceNode) : Sequence(0,0)<ElementNode | DocumentNode>

Some informative examples of the kind of thing that *might* be
returned by an implementation accessing the string value of a
reference node would be helpful.

5.1 Primitive Values.

There's a typo in the first paragraph, which contains 'xs:hexbinary'
rather than 'xs:hexBinary'.

I think that the id accessor needs to be altered to include a document
context, since a single IDREF value might access different element
nodes in different documents. Perhaps this needs to be a function
rather than an accessor?
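
A small Python sketch of why the document context matters: the same ID value resolves to different elements depending on which document is searched. The function element_by_id and the two sample documents are hypothetical, and the sketch simply assumes that an attribute called "id" is of type ID, since no DTD or schema is involved here.

```python
import xml.etree.ElementTree as ET


def element_by_id(root, id_value, id_attr="id"):
    """Return the first element under root (inclusive) with the given ID.

    Assumption: the attribute named id_attr plays the role of an xs:ID.
    """
    for elem in root.iter():
        if elem.get(id_attr) == id_value:
            return elem
    return None


# Two unrelated documents that happen to use the same ID value.
doc_a = ET.fromstring('<parts><part id="p1" name="nut"/></parts>')
doc_b = ET.fromstring('<orders><order id="p1" qty="3"/></orders>')

print(element_by_id(doc_a, "p1").tag)
print(element_by_id(doc_b, "p1").tag)
```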

Similarly, I think that the referent accessor of xs:anyURI requires
some extra contextual information in case it's a relative URI rather
than an absolute URI. Plus, given that the URI could have an XPointer
fragment (I imagine), then shouldn't it return a sequence consisting
of any number of any type of nodes, for full flexibility?
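
To illustrate the dependence on context, the Python sketch below resolves the same relative reference against two different base URIs using the standard urllib.parse module. The function name resolve_reference and the example.org URIs are invented for the example; what the fragment identifies (for instance via XPointer) is left open.

```python
from urllib.parse import urljoin, urldefrag


def resolve_reference(base_uri, reference):
    """Resolve reference against base_uri; return (resource URI, fragment)."""
    absolute = urljoin(base_uri, reference)
    resource, fragment = urldefrag(absolute)
    return resource, fragment


# The same xs:anyURI value points at different resources depending on
# the base URI of the document it appears in.
print(resolve_reference("http://example.org/a/doc.xml", "parts.xml#p1"))
print(resolve_reference("http://example.org/b/doc.xml", "parts.xml#p1"))
```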

5.2 Derived Simple Values. The way that this section uses the term
'primitive' is confusing. Is the intention that the only value types
that are supported within the data model are the primitive types from
XML Schema? If so, what's the purpose of separate constructors for all
the built-in data types, as given in the F&O WD, as xf:short() will
give exactly the same type of value as xf:decimal()? If that's not the
intention, could you use something else instead of the term 'primitive
value' within this section?

9 Equality. I wonder whether it would also be useful to define
a string-value-equal function that could be used to test the equality
of the string values of two nodes.
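
As a rough sketch of what such a function might do, the Python below compares the concatenated descendant text of two elements. The names string_value and string_value_equal are only suggestions, and ElementTree elements stand in for data model nodes.

```python
import xml.etree.ElementTree as ET


def string_value(elem):
    """Concatenate all descendant text of elem in document order."""
    return "".join(elem.itertext())


def string_value_equal(a, b):
    """Compare two nodes by string value only, ignoring structure and type."""
    return string_value(a) == string_value(b)


x = ET.fromstring("<price>10.50</price>")
y = ET.fromstring("<cost><amount>10</amount>.50</cost>")
z = ET.fromstring("<price>10.5</price>")
```

Here x and y have the same string value despite different structure, while x and z differ as strings even though they would be equal as decimals.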

10 Example.

The example document isn't valid according to the namespace Rec
because there's no namespace declaration for the 'xs' prefix used for
xs:schemaLocation. It's also not well-formed because there aren't any
quotes around the version number in the XML declaration. I think you
want:

<?xml version="1.0"?>
<p:part xmlns:p="http://www.mywebsite.com/PartSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation = "http://www.mywebsite.com/PartSchema
                              http://www.mywebsite.com/PartSchema"
        name="nutbolt">
  <mfg>Acme</mfg>
  <price>10.50</price>
</p:part>
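
Both problems can be checked mechanically. The Python sketch below runs the corrected document, and two variants that reintroduce each fault, through the namespace-aware parser in xml.etree.ElementTree. The helper name parses_cleanly is made up; note that the parser reports the undeclared prefix as a parse error too, although strictly that is a Namespaces constraint rather than an XML 1.0 well-formedness one.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(text):
    """True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


corrected = '''<?xml version="1.0"?>
<p:part xmlns:p="http://www.mywebsite.com/PartSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation = "http://www.mywebsite.com/PartSchema
                              http://www.mywebsite.com/PartSchema"
        name="nutbolt">
  <mfg>Acme</mfg>
  <price>10.50</price>
</p:part>'''

# Reintroduce each fault separately.
unquoted_version = corrected.replace('version="1.0"', 'version=1.0')
undeclared_prefix = corrected.replace("xsi:schemaLocation", "xs:schemaLocation")

for label, text in [("corrected", corrected),
                    ("unquoted version", unquoted_version),
                    ("undeclared prefix", undeclared_prefix)]:
    print(label, parses_cleanly(text))
```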

I don't think that the schema is valid. The namespace declaration uses
the wrong namespace; it's using an old namespace anyway; and it uses
both a type attribute and the content of an xs:element to indicate the
type, which is not legal. I think you want either:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.mywebsite.com/PartSchema">
  <xs:element name="part">
    <xs:complexType>
      <xs:sequence>
        <xs:element name = "mfg" type="xs:string"/>
        <xs:element name = "price" type="xs:decimal"/>
      </xs:sequence>
      <xs:attribute name = "name" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

or:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.mywebsite.com/PartSchema"
           xmlns="http://www.mywebsite.com/PartSchema">
  <xs:element name="part" type="part-type" />
  <xs:complexType name="part-type">
    <xs:sequence>
      <xs:element name = "mfg" type="xs:string"/>
      <xs:element name = "price" type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name = "name" type="xs:string"/>
  </xs:complexType>
</xs:schema>

Cheers,

Jeni
---
Jeni Tennison
http://www.jenitennison.com/
Received on Monday, 17 September 2001 04:11:18 UTC