Datatyping: questions about TDL proposal from Pat Hayes on 2002-01-30 (w3c-rdfcore-wg@w3.org from January 2002)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Wed, 30 Jan 2002 15:46:28 -0600
To: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>, patrick.stickler@nokia.com
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05101049b87dfcb81d25@[65.212.118.208]>
Guys, sorry I'm only now getting up to speed on this stuff, and if any 
of these questions/issues have been already covered in the email 
record then just say so and I'll get to them eventually.

Q=question
C=comment

Q1. Definition of TDL refers to a 'pairing'. Does that mean some kind 
of syntactic combination operation, or is this just a mathematical 
definition of some abstract entity? And what exactly is a 'datatype 
identity'?

Q2. In the figure immediately below, what is meant by 'internal 
value' and 'application value space'?

C[3.  Style comment. I suggest it would be better to not modify the 
definition of RDF interpretation, but to introduce a new notion of 
'datatyped interpretation' or whatever. That would enable us to keep 
all the different notions of entailment straight. ]

Q4. Terminology section refers to 'before' and 'after' datatyping. I 
have trouble understanding what this means. Do you have in mind that 
there is some kind of process which 'datatypes' an RDF graph? If so, 
what is the difference between the graph before and the graph after 
that operation? Or does this mean something  else altogether?

C5. The literal-value pairs are well-defined but seem odd, since they 
pair a unicode string with a semantic value, ie a denotation. Is that 
really what you mean? If so, then a set of these things would be a 
datatype mapping, right?

C6. "....A datatype class corresponds to its map, ie a set of pairs..."

Well, OK, but this seems a very odd decision. First, the natural RDF 
object corresponding to a set of pairs is a property (extension), not 
a class. Second, while a property can of course have a class 
extension as well as its property extension, there isn't any implied 
connection between them in RDF, so if you treat a set of pairs as a 
class then that amounts to saying that the fact that it is a class of 
*pairs* is irrelevant to its behavior as a class. Third, there is in 
fact no way to specify in RDF that any particular class is a class of 
pairs; whereas if you had characterized this as a property, then the 
RDF semantic conditions imply that it has a property extension (if it 
is ever used in a triple).

Q/C7. Interpretation.  "...the type information is checked by 
requiring this pair to be a member of each class associated with this 
node. "
What does 'associated with this node' mean?? I think what you mean is 
'each class which the denotation of the node is required to be a 
member of', right? (That is what a range constrai...sorry, assertion 
of a triple using rdfs:range, would imply, for example.) If so, that 
is what the RDF MT says already. But notice that according to your 
convention about datatype classes, that says that the node labelled 
with the unicode string denotes a pair, not the value inside the 
pair. Is that really what you want it to say? That would mean that, 
for example, the 'same' date written using different date formats 
would be different dates, and so on. In fact, as far as identity is 
concerned, it means that any two values from any two distinct 
datatyping schemes are never the same value.
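
To make the identity point concrete, here is a minimal sketch in Python. 
The two date datatypes and their formats are hypothetical, invented only 
for this illustration; neither is taken from the TDL document. Each 
datatype is modelled as a lexical-to-value function, and a literal node 
is taken to denote the pair <lexical form, value>, as the 'pairs' 
convention seems to require.

```python
from datetime import date, datetime

# Two hypothetical date datatypes (assumptions of this sketch), each
# modelled as a function from lexical forms to values.
def iso_date(lexical: str) -> date:
    return datetime.strptime(lexical, "%Y-%m-%d").date()

def dmy_date(lexical: str) -> date:
    return datetime.strptime(lexical, "%d/%m/%Y").date()

# Under the 'pairs' reading, a literal node denotes <lexical form, value>.
pair_a = ("2002-01-30", iso_date("2002-01-30"))
pair_b = ("30/01/2002", dmy_date("30/01/2002"))

same_value = pair_a[1] == pair_b[1]    # the values inside the pairs agree
same_denotation = pair_a == pair_b     # the pairs themselves do not

print(same_value, same_denotation)
```

If the nodes denote the pairs, the two literals denote different things 
even though the dates inside them are equal.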

Q8. That same paragraph refers to 'untyped Unicode nodes'. Does that 
imply that there are two kinds of Unicode node? If so, how are they 
distinguished in the syntax?

Q9. In section 3.1 example 1, the figure has this new kind of (green 
hexagonal) node in it. What is this thing, exactly? (Is it an 
extension to the RDF graph syntax? Or some kind of external addition 
to the graph?? Or what?)  If it is just an annotation, then this is 
one of the old proposals (called DC in 
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0295.html); 
but it reinterprets the xsd: classes in an odd way that makes the 
assertions wrong, for some reason best known to you guys :-)

Q/C10. Model Theoretic Interpretation of local idiom. "...Hence x is 
the integer 30."  OK, but what this  graph asserts is that the age of 
Bob is the pair <"30",30>, right? Not that the age of Bob is 30. (If 
that is wrong, how do the interpretation rules for the Bob ex:age ... 
triple manage to extract the second item in the denoted pair?)

Also, if what you say about rdf:value is correct, then since the 
unicode node has to have the same denotation as the blank node, and 
since that denotation has to be in the class xsd:integer, it has to 
be a pair; so the unicode node itself has to denote a pair.  And the 
first item of that pair is the unicode string itself, right?

C11.  Relevant to the above: look, you don't need to have *pairs* in 
the class. They are just getting in the way. If the xsd:integer class 
were the value space of the datatype (and if rdf:value were identity) 
then this idiom would work just fine, and Bob's age would be 30.  You 
do need some semantic constraint to interpret the unicode strings 
properly, but then you need that anyway. The pairs don't seem to help 
any.
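
A rough sketch of that alternative, in Python, under stated assumptions: 
the function names below are hypothetical, the class xsd:integer is 
modelled as a membership test on the value space, rdf:value is modelled 
as identity, and the lexical-to-value mapping is kept as a separate 
semantic constraint on the unicode string. None of this is taken from 
the TDL document itself.

```python
# Hypothetical lexical-to-value mapping for xsd:integer: the separate
# semantic constraint that interprets the unicode string.
def xsd_integer_l2v(lexical: str) -> int:
    return int(lexical)

# The class xsd:integer taken as the value space: a test on values,
# with no pairs involved (bool is excluded since it subclasses int).
def in_xsd_integer(x) -> bool:
    return isinstance(x, int) and not isinstance(x, bool)

# rdf:value treated as identity: the blank node denotes whatever the
# literal node denotes.
def rdf_value(x):
    return x

literal = "30"
literal_denotation = xsd_integer_l2v(literal)
bobs_age = rdf_value(literal_denotation)

# The type triple only has to check membership in the value space.
type_check_ok = in_xsd_integer(bobs_age)
print(bobs_age, type_check_ok)
```

On this reading the object of the ex:age triple is the integer itself, 
and nothing has to be extracted from a pair.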

Q12. In section 3.2, global idiom: "Per the following, the lexical 
form "30" is required to be a member of the lexical space of the 
datatype xsd:integer". HOW? I really don't see how this works. Since 
xsd:integer denotes a set of pairs, the range of ex:age must be a set 
of pairs, so whatever the unicode node denotes must be a pair. But 
you have it marked as 'representing a value'; and you also say that 
the lexical form is thereby required to be a member of a lexical 
space. As far as I can see, this is saying the following. The unicode 
string denotes a pair <a,x> consisting of a unicode string and a 
datatype value (eg <"13", 13>, which is a member of the extension of 
the xsd:integer datatype mapping), and it thereby 'represents' a 
value, and also simultaneously is 'required' to be a member of a 
lexical space. But it can't be all three of "13", 13 and <"13",13> at 
the same time, right?

(Also, re.  'required': what if the  unicode string is NOT a member 
of the lexical space of the asserted datatype? Is that just an 
inconsistency?)

C13. (Following on from C11). It seems clear from the diagram that 
what this RDF is supposed to mean is that the range of ex:age is 
integers (according to xsd:integer, ie the value space), and that the 
age of Bob is 30. What's wrong with just declaring that that is 
indeed what it does mean? See 
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0011.html.
The problem with the old P(++) proposals was the nasty flaw detected 
by Patrick, which is that a super-datatype-class of a datatype class 
might have a different lexical-to-value mapping. But your 'pairs' 
proposal has exactly the same problem. In fact it is worse, since if 
the lexical-to-value mapping is different, then the datatypes are not 
even subclasses of one another,  with your convention; so there isn't 
even any way to *say* that one datatype is 'sub' another. (Of course, 
just refusing to say that, say, xxsd:octal is a subclass of 
xxsd:number, or whatever the example is that screws up datatype 
inheritance, was always an option in the old P(++)-style proposal as 
well. )
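
The subclass point can be illustrated with a small Python sketch. The 
two datatypes are hypothetical stand-ins for xxsd:octal and xxsd:number, 
restricted to small finite ranges so the sets can be built explicitly; 
the ranges and mappings are assumptions made for this example only.

```python
# Hypothetical finite slices of two datatypes (assumptions of this sketch).
# 'octal' reads its lexical forms in base 8, 'number' reads them in base 10.
octal_lexicals = [format(n, "o") for n in range(64)]   # "0" .. "77"
number_lexicals = [str(n) for n in range(100)]         # "0" .. "99"

# Datatype as a set of <lexical form, value> pairs (the 'pairs' convention).
octal_pairs = {(s, int(s, 8)) for s in octal_lexicals}
number_pairs = {(s, int(s, 10)) for s in number_lexicals}

# Datatype class as the value space only.
octal_values = {v for _, v in octal_pairs}
number_values = {v for _, v in number_pairs}

values_nested = octal_values <= number_values   # value spaces are nested
pairs_nested = octal_pairs <= number_pairs      # the sets of pairs are not

print(values_nested, pairs_nested)
```

The pair <"10", 8> belongs to the octal set but not to the number set, 
which contains <"10", 10> instead, so with classes of pairs the subclass 
relation cannot hold even though the value spaces are nested.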

C14. The paragraph "Whether the rdfs:range statement....property in 
question." isn't going to work in ANY model theory, unless we 
effectively redefine RDF syntax to provide some way to distinguish 
local from global. The MT has to be defined on triples, not on 
triples in some kind of undefined 'context'. (How far out do we have 
to look in the graph, or on the web, to see if there is a 'more 
local' assertion?)

C15. The list of Satisfactions looks good, but omits the one rather 
central one which I guess people didn't think to write out explicitly: 
that the idiom used actually means what it ought to mean.

C16. Neat hack for almost handling union datatypes, ie ignore the 
part that gives all the trouble. If I can use that same hack, I can 
do them too :-)

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 30 January 2002 16:46:19 UTC