- From: <noah_mendelsohn@us.ibm.com>
- Date: Thu, 8 Aug 2002 23:35:04 -0400
- To: "Ashok Malhotra" <ashokma@microsoft.com>
- Cc: "Dan Connolly" <connolly@w3.org>, www-rdf-comments@w3.org, www-xml-schema-comments@w3.org
I see where you're coming from, Dan, but I suspect the horse has already left the barn on this one. A few comments on what you've written:

Dan Connolly writes:

>> But on careful review, I don't see
>> that anywhere in the spec. I see stuff like
>> "Each value in the value space of a datatype
>> is denoted by one or more
>> literals in its *lexical space*."

I'm not sure we can change this retroactively, even if it were desirable. I'm curious: how would values of IDREF fit into this world? (We can probably cheat on that, since validation of IDREFs against IDs is actually done in Part 1, sort of.)

>> I suggest that the lexical form of QNames
>> should be considered to include the relevant
>> namespace name; that'll make it unambiguous,
>> though it won't correspond exactly to
>> the attribute value.

Section 2.3 says:

"[Definition:] A lexical space is the set of valid literals for a datatype."

While the term literal is not defined, unfortunately, one of the few really clean aspects of the datatype design is that it clearly refers to an ordered list of Unicode characters. Furthermore, it's clear in Structures that those are exactly the characters that are validated as the contents of an attribute or element. I believe your proposal on QNames would violate this fundamental invariant, and for that reason among others I am strongly opposed. I think we have to admit that the lexical space for QName is context dependent, for better or worse.

Actually, I had always wanted to include the pertinent prefix in the value space of QName, making it a triple, not a pair. I lost that one, but I'm not sure that would have dealt with your concern in any case.

>> " constraint: the union
>> of (string, decimal) has the decimal 10
>> in its value space, but nothing in its
>> lexical space to denote it.

I'm surprised, and upon review I think you've discovered a contradiction in the recommendation.
In sections such as 2.3.1 it says things like:

  [Definition:] A canonical lexical representation is a set of literals from among the valid set of literals for a datatype such that there is a one-to-one mapping between literals in the canonical lexical representation and values in the ·value space·.

implying that there must be at least one lexical form for every value. On the other hand, the definition of union is:

  [Definition:] Union datatypes are those whose ·value space·s and ·lexical space·s are the union of the ·value space·s and ·lexical space·s of one or more other datatypes.

and section 2.5.1.3 says:

  [Definition:] The datatypes that participate in the definition of a ·union· datatype are known as the memberTypes of that ·union· datatype. The order in which the ·memberTypes· are specified in the definition (that is, the order of the <simpleType> children of the <union> element, or the order of the QNames in the memberTypes attribute) is significant. During validation, an element or attribute's value is validated against the ·memberTypes· in the order in which they appear in the definition until a match is found. The evaluation order can be overridden with the use of xsi:type.

So, the rec. seems contradictory to me. My preferred resolution would be different from yours, I think. I think we should work backwards from the validation rules, make clear that order matters, and that in your example the decimal 10 is NOT in the value space of the union. So the value space of a union would be the values corresponding to lexical forms that validate per the order-sensitive rule. Thus, neither the value spaces nor the lexical spaces can be a union. Actually, I think it's clear that the lexical spaces can't be a union, since the form "10" would appear twice, which seems wrong to me.
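To make the order-sensitive rule concrete, here is a minimal Python sketch of "validate against the memberTypes in order until a match is found". It is an illustration only, not the algorithm from the recommendation: the names `as_string`, `as_decimal`, and `validate_union` are hypothetical, the decimal pattern is a simplified stand-in for the xs:decimal lexical space, and xsi:type overrides are ignored.

```python
import re
from decimal import Decimal

# Simplified stand-in for the xs:decimal lexical space (an assumption).
DECIMAL_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def as_string(literal):
    # In this sketch every literal is accepted as a string.
    return literal


def as_decimal(literal):
    if DECIMAL_RE.fullmatch(literal) is None:
        raise ValueError("not a decimal literal: %r" % literal)
    return Decimal(literal)


def validate_union(literal, member_types):
    """Try each member type in declaration order; the first match wins."""
    for name, parse in member_types:
        try:
            return name, parse(literal)
        except ValueError:
            continue
    raise ValueError("no member type accepts %r" % literal)


STRING_THEN_DECIMAL = [("string", as_string), ("decimal", as_decimal)]
DECIMAL_THEN_STRING = [("decimal", as_decimal), ("string", as_string)]

# With string listed first, "10" always validates as a string, so no
# literal ever yields the decimal 10 through this union.
print(validate_union("10", STRING_THEN_DECIMAL))
# With decimal listed first, the same literal yields the decimal 10.
print(validate_union("10", DECIMAL_THEN_STRING))
```

Under the first ordering the decimal member is unreachable for any literal, which is the sense in which the decimal 10 would not be in the value space of the union if that space is defined from the validation rule.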
------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

"Ashok Malhotra" <ashokma@microsoft.com>
Sent by: www-xml-schema-comments-request@w3.org
08/02/2002 02:22 PM

To: "Dan Connolly" <connolly@w3.org>, <www-xml-schema-comments@w3.org>
cc: <www-rdf-comments@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: RE: QName is ambiguous; aren't datatypes unambiguous? union types total?

Dan:
The datatypes spec does not quite say "every legal lexical form for a datatype denotes a single value in the value space of that datatype." We should consider adding such wording when we rewrite for Schema 1.1. We should then, carefully, address the exceptions that you point out.

All the best, Ashok

-----Original Message-----
From: Dan Connolly [mailto:connolly@w3.org]
Sent: Thursday, August 01, 2002 10:32 PM
To: www-xml-schema-comments@w3.org
Cc: www-rdf-comments@w3.org
Subject: QName is ambiguous; aren't datatypes unambiguous? union types total?

Consider:

  <aDoc>
    <eltA xmlns:x="http://example/vocab1#" aQNameAttr="x:n"/>
    <eltB xmlns:x="http://example/vocab2#" aQNameAttr="x:n"/>
  </aDoc>

Suppose we look at that document using a schema that says aQNameAttr has type QName (in both cases). According to

  http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#QName

there's a value from eltA; i.e. the pair (http://example/vocab1#, "n"), but the value from eltB is the pair (http://example/vocab2#, "n"), while their lexical forms are the same in both cases: x:n.

I thought that a fundamental property of datatypes was that they're unambiguous; i.e. for any datatype, there's exactly one value that corresponds to each item from the lexical space. The designs that the RDF Core WG is considering for using XML Schema datatypes in RDF depend on this property.

But on careful review, I don't see that anywhere in the spec.
I see stuff like "Each value in the value space of a datatype is denoted by one or more literals in its *lexical space*." But I don't see "each literal in the lexical space of a datatype denotes exactly one value." That should be in there somewhere, no?

I suggest that the lexical form of QNames should be considered to include the relevant namespace name; that'll make it unambiguous, though it won't correspond exactly to the attribute value.

QName is certainly a special case w.r.t. using XML Schema datatypes in RDF. Hmm... but I guess union datatypes are too.

On the other hand, union datatypes don't even obey the "Each value in the value space of a datatype is denoted by one or more literals in its *lexical space*." constraint: the union of (string, decimal) has the decimal 10 in its value space, but nothing in its lexical space to denote it.

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
see you in Montreal in August at Extreme Markup 2002?
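
For readers who want the aDoc example spelled out, the following Python sketch shows why the literal x:n alone does not determine a QName value: the mapping also needs the namespace declarations in scope at the element. The function `resolve_qname` and the dictionary representation of in-scope namespaces are hypothetical, introduced only for illustration; storing a default namespace under the empty-string key is likewise an assumption of this sketch.

```python
def resolve_qname(literal, in_scope_namespaces):
    """Map a QName literal to a (namespace name, local part) pair.

    in_scope_namespaces is a prefix -> namespace-name dict (hypothetical
    representation); a default namespace, if any, sits under the key "".
    """
    prefix, sep, local = literal.partition(":")
    if not sep:
        # Unprefixed literal: the whole literal is the local part.
        return (in_scope_namespaces.get(""), literal)
    if prefix not in in_scope_namespaces:
        raise ValueError("undeclared prefix: %r" % prefix)
    return (in_scope_namespaces[prefix], local)


# The two elements from the aDoc example declare the same prefix
# with different namespace names.
elt_a_scope = {"x": "http://example/vocab1#"}
elt_b_scope = {"x": "http://example/vocab2#"}

value_a = resolve_qname("x:n", elt_a_scope)
value_b = resolve_qname("x:n", elt_b_scope)

# Same literal, different values.
print(value_a)
print(value_b)
```

The second argument is the point of contention in the thread: either the lexical-to-value mapping takes context as an input, or the lexical form itself has to carry the namespace name.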
Received on Thursday, 8 August 2002 23:36:47 UTC