- From: Pat Hayes <phayes@ai.uwf.edu>
- Date: Thu, 29 Nov 2001 17:40:02 -0600
- To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
- Cc: www-rdf-interest@w3.org, joint-committee@daml.org

>From: Patrick.Stickler@nokia.com
>Subject: RE: Cutting the Patrician datatype knot
>Date: Thu, 29 Nov 2001 14:09:09 +0200
>
>> > For example, if you allow union XML Schema datatypes there is
>> > a model of
>> >
>> >   <rdfs:range foo xsd:[integer union string]>
>> >   <John foo 7>
>>
>> As I think I've said earlier, I don't consider
>> [integer union string] to be a "valid" data type.
>
>And why not?
>
>> The definition of a data type that I subscribe to is
>> that a data type defines a value space and (optionally)
>> a lexical space, and a member of the lexical space maps
>> to one and only one member of the value space.
>
>[integer union string] satisfies this definition.  In [integer union
>string] the lexical item "7" maps to the integer 7.
>
>> In the above union "data type", the literal "7" maps to
>> two members of the value space. Therefore, it is not a
>> valid data type.
>
>Not correct.  Please read the XML Schema recommendation to see how union
>datatypes work.

OK, now I see what you have been getting at (and what was wrong with
my earlier replies on this thread, sorry if they created more heat
than light.) Since the lexical space of integer is a subset of that
of string, and since the ordering of unions is significant, the
lexical-to-value mapping of the union datatype [integer union string]
is exactly the same as that of integer on numerals; it takes numerals
to integers, while non-numerals are treated as strings.

OK, but now what is the problem with my MT extension, again?

Of course, with THIS sense of "union", one cannot treat [integer
union string] as in any sense the simple class-union of the classes
[integer] and [string]. So as far as the RDF reasoner is concerned,
[integer union string] is just another datatype, which might as well
be called [foodle]. If someone were to assert that

[integer] rdfs:subClassOf [integer union string] .

that would be correct, and that would cause no problem in the MT,
since [integer union string] agrees with [integer] on numerals; they
have the same lexical-to-value mapping on anything that would map to
an integer. Similarly, it would be correct to assert

[string] rdfs:subClassOf [string union integer] .

though that seems rather pointless, since in this case the datatypes
are identical: even numerals will be mapped as strings by that union.
But again, this poses no problems for the MT.

On the other hand, if someone were to assert

[string] rdfs:subClassOf [integer union string] .

then that would be simply false, as all numeral strings are in the
former value space but not the latter.

Here's an artificial example that would screw things up in the way I
think you intend. Suppose there was a datatype xsd:gnirts for
backward strings, so that if "abcd" were in that datatype then it
would denote the value "dcba". Then [gnirts union string] would have
the same value space as [string], but would have an incompatible
lexical mapping. (Similarly for octal and decimal integers, eg.)
There are ways past this issue, if it is really an issue. Are there
any such cases in XML Schema, however? I can't find any, unless they
are somehow buried in the details of the Gregorian calendar.

>> What you seem to be defining is just a union of lexical
>> space. I.e., the union of the lexical space of integers with
>> the lexical space of strings; which, however possible to do,
>> is not particularly useful if you want to deal with the
>> values themselves.
>
>No, XML Schema has a method for creating union datatypes that satisfies
>your requirements.  If you want to exclude such datatypes you have to
>provide a criterion other than ``usefulness''.
>
>> XML Schema is not concerned with values the same way that
>> an application would be. XML Schema only has to ensure
>> the integrity of the lexical and structural space.
>> Thus, a union such as above is reasonable, as XML Schema does
>> not itself worry about the ambiguity that arises in the
>> lexical to value mapping.
>
>Again, XML Schema does *not* have ambiguous lexical-to-value mappings.
>Although this is not explicitly stated in the XML Schema datatype document,
>it can be inferred from lots of places in section 2.  [Note to XML Schema
>people:  This property of datatypes should be explicitly stated.  Also,
>datatypes really should be four-tuples, one element being the
>lexical-to-value map!]

Amen to that last point. That is a serious blunder in the way that
XML schema are stated.

>
>> You do, though, raise an important question -- whether it
>> is possible to define XML Schema simple data types which
>> do not have a N:1 mapping from lexical space to value space.
>> If we can have 1:N or N:N mappings, then we are going to
>> have problems, and that might mean that perhaps XML Schema
>> may need to be more constrained with regards to some
>> simple type derivations.
>
>No XML Schema datatype has a 1:N or N:N lexical-to-value map.  It is not
>the presence of such datatypes that causes problems.
>
>Instead, again, it is the presence of two (different) datatypes that have
>overlapping value spaces but different lexical-to-value maps within this
>overlap!

Which are, exactly? As far as I can see, this situation never arises
with the combinations [integer], [string], [integer union string],
[string union integer]. The lexical and value spaces are respectively

[integer]                 numerals    ---> N (integers)
[string]                  S (strings) ---> S (strings)
[integer union string]    S           ---> N union (S - numerals)
[string union integer]    (same as [string])

None of these have the pathological behavior that you describe, since
their lexical-to-value mappings coincide on the parts of the value
spaces that overlap.

>
>> I'm presuming, of course, that RDF is only concerned with
>> simple data types, not all XML Schema definable types in
>> general.
>
>This is true even if you include all XML Schema datatypes, even the
>composite ones.
>
>> > For example, what is the theory of rdf:type on datatype classes?
>>
>> Good question. I'm not the best person to offer an answer,
>> insofar as the formal MT is concerned, but I would expect
>> that the theory of rdf:type is the same for all classes, datatype
>> or otherwise, and it is the knowledge about a particular class
>> that tells us it is a data type class, and data type classes
>> have distinct characteristics, such as defining a value space
>> and (optionally) lexical space.

Right, exactly.

>> If we declare that literals
>> may only be bound to data type classes,

No need to say this; only that any binding to a non-datatype class
does not fix the interpretation of the literal. Only datatype class
bindings have the semantic power to constrain the lexical-to-value
mapping used in the interpretation, but other bindings are harmless,
even if datatype-uninformative.

>> then we know that a
>> given class is a data type class if it is bound to a literal,
>> and thus know how to interpret the pairing of literal (lexical
>> form) to data type.
>
>But if you don't provide a theory of rdf:type on datatype classes, then
>others cannot evaluate your mechanism, as it uses rdf:type to determine the
>lexical-to-value mapping.

Right, exactly. If one tries to apply ordinary RDFS reasoning to the
style adopted in Peter's proposal, which is to use rdfs:range on the
predicate to define the datatype of the literal in the object
position, eg

aaa eg:prop "10" .
eg:prop rdfs:range xsd:integer .

then the closure rules for RDFS reasoning will not generate the
required restrictions of the literal occurrence unless the literal is
somehow allowed to be the subject of a triple of the form

"10" rdf:type xsd:integer .

It is worth bearing in mind that all of the rdfs: vocabulary is
eliminable in RDF, and can all be defined in terms of rdf:type. That
includes rdfs:range.

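For readers who want to experiment with the union datatypes discussed
in this message, here is a minimal sketch in Python. It is only an
illustration under stated assumptions, not anything taken from the XML
Schema recommendation or from the RDF model theory: a datatype is
modelled as a name, a lexical-space test and a lexical-to-value map,
and an ordered union picks the first member whose lexical space
contains the literal. The names Datatype, union, is_numeral and GNIRTS
are hypothetical, and xsd:gnirts is the artificial backward-string
datatype from the example above, not a real XML Schema type.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Datatype:
    """Hypothetical model of a datatype: lexical space plus lexical-to-value map."""
    name: str
    accepts: Callable[[str], bool]     # is the literal in the lexical space?
    to_value: Callable[[str], object]  # lexical-to-value mapping


def is_numeral(lexical: str) -> bool:
    # Optional sign followed by decimal digits (an assumed simplification
    # of the integer lexical space).
    return re.fullmatch(r"[+-]?[0-9]+", lexical) is not None


INTEGER = Datatype("integer", is_numeral, int)
STRING = Datatype("string", lambda s: True, str)
# Artificial datatype from the message: "abcd" denotes the value "dcba".
GNIRTS = Datatype("gnirts", lambda s: True, lambda s: s[::-1])


def union(*members: Datatype) -> Datatype:
    """Ordered union: the first member that accepts the literal decides its value."""
    def accepts(lexical: str) -> bool:
        return any(m.accepts(lexical) for m in members)

    def to_value(lexical: str) -> object:
        for m in members:
            if m.accepts(lexical):
                return m.to_value(lexical)
        raise ValueError(f"{lexical!r} is not in the lexical space of the union")

    return Datatype(" union ".join(m.name for m in members), accepts, to_value)


int_or_str = union(INTEGER, STRING)
str_or_int = union(STRING, INTEGER)
gnirts_or_str = union(GNIRTS, STRING)

print(repr(int_or_str.to_value("7")))        # 7 (an integer)
print(repr(int_or_str.to_value("abc")))      # 'abc'
print(repr(str_or_int.to_value("7")))        # '7' (a string)
print(repr(gnirts_or_str.to_value("abcd")))  # 'dcba'
```

In this sketch [integer union string] agrees with [integer] on every
numeral, and [string union integer] behaves exactly like [string],
while [gnirts union string] shares the value space of [string] but
maps the same literal to a different value, which is the incompatible
case described above.
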
Pat Hayes
--
---------------------------------------------------------------------
IHMC                            (850)434 8903   home
40 South Alcaniz St.            (850)202 4416   office
Pensacola, FL 32501             (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes

Received on Thursday, 29 November 2001 18:39:10 UTC