unbound prefixes and validity with xs:QName


(This message discharges my action of raising the issue formally, and attempts
to pull together the relevant data as well. It should capture the information
in messages [1] and [2] on the IG.)

The question is whether this instance:

   <tricky qname="unbound:something"/>

would be invalid per this element declaration:

   <xs:element name="tricky">
     <xs:complexType>
       <xs:attribute name="qname" type="xs:QName"/>
     </xs:complexType>
   </xs:element>

It is valid per the lexical rules in part 2, so the only question is where
we make an appeal to there being a value, and a value that we know.

The relevant clause in "Validation Rule: Datatype Valid" [3] is:

A string is datatype-valid with respect to a datatype definition if:
   ...
   2 the value denoted by the literal ·match·ed in the previous step is a
   member of the ·value space· of the datatype, as determined by it being Facet
   Valid (§4.1.4) with respect to each member of {facets} (except for
   ·pattern·). 
this being the only clause that makes any appeal to the value space.

Some read this as implicitly requiring you to have the value in hand in order
to do the facet check.

I read this as saying that "Facet Valid" decides whether the value is OK.
In this case, Facet Valid does not obtain, because there are no facets. 
(The clause "as determined by..." I take to be definitional.) 
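
A minimal sketch of this reading, in Python. Everything here is hypothetical
and for illustration only: the function name datatype_valid, the idea of
passing the lexical check, the literal-to-value mapping and the facets in as
callables. None of it is taken from the spec text.

```python
def datatype_valid(literal, lexically_ok, to_value, facets):
    """One reading of clause 2 (an assumption, not spec text).

    Membership in the value space is decided solely by the facet checks,
    so the value is computed only if some facet asks for it.
    """
    if not lexically_ok(literal):
        return False
    # all() over an empty sequence is True, and the generator never runs,
    # so to_value() is not called when there are no facets.
    return all(facet(to_value(literal)) for facet in facets)


def unresolved(literal):
    # Stand-in for a QName mapping that finds no binding for the prefix.
    raise LookupError("prefix not bound")


# No facets: passes without the value ever being needed.
print(datatype_valid("unbound:something", lambda s: True, unresolved, []))
```

Under this sketch, the instance passes with an empty facet list, while adding
any facet at all (an enumeration, say) forces the value to be computed, and
the unbound prefix surfaces at once.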

Further, even if it did obtain, the difficulty with a QName with an unbound
prefix isn't that there isn't a value, it is only that you don't _know_ what it
is. So I would look at this as very much analogous to undischarged component
references, where it isn't that you know something is wrong, it is that you
don't know what the state of affairs is. And indeed, if the instance document
were a schema, and the "qname" attribute above were instead a "type" attribute,
this would be an entirely consistent view to take.

Similarly, the only place in Structures that makes an appeal to the value
is where there is some kind of value constraint. In this case, there is no value
constraint, so again, those constraints don't apply.  I don't believe there is
any disagreement about the interpretation of the applicability of those clauses.

I did not find anywhere that explicitly requires that (a) there be a value and
(b) you know what it is. 

There is a note in 3.2.18 of Datatypes [4] that says:
	NOTE: The mapping between literals in the ·lexical space· and values in the
	·value space· of QName requires a namespace declaration to be in scope for
	the context in which QName is used. 
Some read this as noting that you need to have an in-scope namespace definition
in order to have a good QName. But again, I don't read this as compelling me to
know what the value happens to be, but only noting that indeed there are some
cases where things get sticky if you want to get your hands on the right value
in the value space.
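
The three states at issue (not lexically valid, lexically valid with an
unknown value, lexically valid with a known value) can be sketched in Python
as follows. The function name classify_qname, the state labels, the dictionary
of in-scope bindings and the ASCII-only approximation of NCName are all
assumptions made for illustration.

```python
import re

# ASCII-only approximation of the NCName production (an assumption made for
# brevity; the real production admits many more characters).
_NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
_QNAME = re.compile(rf"(?:({_NCNAME}):)?({_NCNAME})")


def classify_qname(literal, in_scope):
    """Return (state, value) for a QName literal.

    in_scope maps prefixes to namespace names; the key "" holds the
    default namespace, if one is declared.
    """
    match = _QNAME.fullmatch(literal)
    if match is None:
        return ("not-lexically-valid", None)
    prefix, local = match.group(1), match.group(2)
    if prefix is None:
        # Unprefixed: default namespace if declared, else no namespace.
        return ("value-known", (in_scope.get(""), local))
    if prefix not in in_scope:
        # Lexically fine, but the value cannot be determined.
        return ("value-unknown", None)
    return ("value-known", (in_scope[prefix], local))


print(classify_qname("unbound:something", {}))
print(classify_qname("unbound:something", {"unbound": "http://example.org/ns"}))
```

With no binding in scope, the literal from the instance lands in the
"value-unknown" state rather than the "not-lexically-valid" one. Supply a
binding from any source and the same literal resolves to a value.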

I am not entirely sure that it is a good idea to require that there both
be a value and that you know exactly what it is in the case of QName. (And I
wonder about whether it would be a good one in some of those float/double edge
cases, too.)  I have always been uncomfortable with the pun of using the
mechanism for declaring namespaces for tags to provide namespace declarations
for content (i.e. architecturally I would always prefer seeing explicit,
distinct kinds of declarations). In the context of XQuery, for example, this
gets strange, because the above instance would be valid if there were the
necessary "declare namespace..." in the prolog, even though the instance in
itself would be invalid per the putative XML Schema "must have a known value"
rules. I find that odd.

Note that once you put a value constraint on the attribute or derive a type
that adds some facets (enumerations, for example), you do run afoul of the
cited clauses, among other things. On this we all agree.

So the questions are:
(1) Do we in fact require that an instance of a simple type have a value
    to be valid? And if not, should we?
(2) Do we in fact require that you know what the (or a?) value of an instance
    of a simple type is in order to be valid? And if not, should we?

//Mary

[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Feb/0070.html

[2] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2004Feb/0104.html

[3] http://www.w3.org/TR/xmlschema-2/#cvc-datatype-valid

[4] http://www.w3.org/TR/xmlschema-2/#QName

Received on Friday, 20 February 2004 14:05:08 UTC