Re: datatypes and MT from Pat Hayes on 2001-11-05 (w3c-rdfcore-wg@w3.org from November 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Mon, 5 Nov 2001 14:03:58 -0600
To: Dan Connolly <connolly@w3.org>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05101026b80c97d0198c@[65.212.118.166]>
>Pat Hayes wrote:
>[...]
>>  Let me try to first summarize the MT changes that I managed to
>>  extract from the pfps/ph interchange
>
>I object to this proposal entirely.
>
>>  For example, the following graph written in bnode-style:
>>
>>  aaa bbb _:1 .
>>  _:1 rdf:value "345" .
>>  _:1 rdf:type xsd:integer .
>>
>>  would be boiled down into:
>>
>>  aaa bbb _:1:"345" .
>>  _:1 rdf:type xsd:integer .
>
>Let's not muck up the model theory like this.

It doesn't muck up the model theory.

>Let's
>keep it simple:

?? See below for why this isn't simple.

>abstract syntax:
>	terms:
>		constants (URIs w/fragids)
>		string literals
>		bnodes (existentially quantified variables)
>	statement:
>		term term term.

We've already gone into why this simple a syntax does not work.

>	formula:
>		statement*
>with the traditional interpretation structure
>(with the IEXT() indirection trick).
>
>That's it.
>
>If we want to say "my shoe size is some
>integer whose decimal representation is '10'",

But I don't want to say that. I want to say that my shoe size is 10. 
(If you know about shoe sizes, you ought to know what that "10" 
means.) I also, for similar reasons, don't want to have to say that 
the result of adding some integer whose decimal representation is "6" 
and another integer whose decimal representation is "4" is a third 
integer whose decimal representation is "10"; I would much rather not 
have to even mention representations at all, and just write 4+6=10, 
like people have done for about the last millennium. And the reason 
in both cases is because it is SIMPLER.

Here's why it is simpler, apart from the obvious avoidance of needless 
literal-bloat. That innocent-looking decimalRep only works if you 
change the semantic rules at that point. Rather than treat the 
literal as naming what it denotes, like all the other nodes in the 
RDF graph, you have to read it as naming itself. At literal nodes, 
and only at literal nodes, the semantic rules have to be rewritten 
just to accommodate this oddity. This would be OK if everyone were to 
keep very clear in their heads that the rules are being changed, but 
I suspect that is going to be too much to ask, and that some people, 
not unreasonably, are going to interpret literals like all other 
names (that being the way they are usually understood everywhere 
else) and will thus be dropped into a mire of confusion; and others, 
perhaps, may push the balance the other way, and start treating all 
names as self-denoting (a tendency that the generic DPH seems to have 
innately in any case), which is an even deeper mire.

The great advantage, to me, of the MT extension is that it keeps the 
RDF simpler. The graphs are simpler, and the inferences are simpler. 
All datatyping inference is done by RDFS up to the point where 
rdf:type connects a datatype to a literal, then that datatype mapping 
handles that literal; that's all most people will need to know. Also 
the basic semantic rules are simpler; ALL labels denote in the same 
way, for example, and datatype value domains are classes, as any 
reasonable user might expect. The only thing that is more complicated 
to state is the model theory itself, for technical reasons; but even 
there, the extra complexity only affects literals in the presence of 
datatyping, and can be factored away from the rest of the MT very 
cleanly. And most of the complexity is just in stating datatypes 
themselves as semantic constructs (which is why we need DT and DTS 
and DTC).
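
To illustrate the "rdf:type connects a datatype to a literal, then that 
datatype mapping handles that literal" step, here is a minimal Python 
sketch. It is not the model theory, only a toy: the table 
DATATYPE_MAPPINGS and the function literal_value are hypothetical names 
invented for this example, and Python's int and str stand in for the 
real lexical-to-value mappings.

```python
# Hypothetical table of datatype mappings (lexical form -> value).
# Python's int/str are stand-ins, an assumption made for illustration.
DATATYPE_MAPPINGS = {
    "xsd:integer": int,
    "xsd:string": str,
}


def literal_value(node, lexical, triples):
    """Return the value denoted by a literal-labelled node.

    Looks for an rdf:type triple linking the node to a known datatype
    and applies that datatype's mapping to the lexical form. Returns
    None when no datatype is found; the sketch takes no position on
    what an untyped literal denotes.
    """
    for subject, predicate, obj in triples:
        if subject == node and predicate == "rdf:type" and obj in DATATYPE_MAPPINGS:
            return DATATYPE_MAPPINGS[obj](lexical)
    return None


# The example graph from earlier in the thread, with the literal "345"
# attached to the bnode _:1.
triples = [
    ("aaa", "bbb", "_:1"),
    ("_:1", "rdf:type", "xsd:integer"),
]

value = literal_value("_:1", "345", triples)
```

Under these assumptions the node labelled "345" denotes the number 345, 
and nothing in the graph needs to mention its representation.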

>that's easy:
>
>	:me :shoeSize _:x.
>	_:x rdf:type :integer.
>	_:x :decimalRep "10".
>
>which can be written in RDF/xml 1.0 very easily:
>
>	<rdf:Description rdf:about="#me">
>	  <shoeSize>
>	    <integer decimalRep="10"/>
>	  </shoeSize>
>	</rdf:Description>
>
>To fill in the details... let dt:
>be the namespace of XML Schema primitive datatypes,
>and let rdfs:str be a new property
>that relates XML Schema datatype to strings;
>it's unambiguous over each of the primitive datatypes;

Are you sure? What about things like leading zero suppression? But in 
any case, the natural way to describe a datatype mapping is going 
from the lexical domain to the value domain. You are forced to talk 
about the inverse of this mapping. As well as being unnecessary, this 
assumes that there always is a unique inverse mapping.
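
The leading-zero worry can be made concrete with a small Python sketch. 
The function lexical_to_value is a hypothetical name, and Python's int() 
is only a stand-in for an integer datatype's lexical-to-value mapping; 
both are assumptions for illustration.

```python
def lexical_to_value(lexical: str) -> int:
    """Hypothetical stand-in for an integer datatype mapping."""
    return int(lexical, 10)


# Three distinct lexical forms that a decimal integer parser accepts.
lexical_forms = ["10", "010", "+10"]

# Going from lexical form to value is a function: one value per string.
values = {lexical_to_value(s) for s in lexical_forms}

# Going the other way, a single value has several lexical forms, so
# there is no unique inverse to name with a property like rdfs:str.
inverse = {}
for s in lexical_forms:
    inverse.setdefault(lexical_to_value(s), []).append(s)
```

Here values holds the single number 10, while inverse[10] lists all 
three strings, so the inverse direction is a relation, not a function.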

>in the case of dt:string, it's the identity relation.
>
>	<rdf:Description rdf:about="#me">
>	  <shoeSize>
>	    <dt:decimal rdfs:str="10"/>
>	  </shoeSize>
>	</rdf:Description>

I still have trouble with RDF/XML,  so sorry if I'm confused, but 
does that create *three* triples, with two bnodes in common:

#me shoeSize _:1 .
_:1 dt:decimal _:2 .
_:2 rdfs:str "10" .

(If not, how does the 'rdfs:str' get into the second triple??)

>I use rdfs:str rather than rdf:value because M&S 1.0
>(and some dublin core documentation) suggest that
>rdf:value is for some weird sort of currying.

I agree that rdf:value seems to be both a bad term and hopelessly 
compromised at this point.

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Monday, 5 November 2001 15:04:00 UTC