Possible tweak to datatype semantics

I know it's very late to even be talking about this, but Antoine's test cases made me notice an oddity that the current semantics for datatyped literals produces, and that would be easy to fix. So I'm outlining it here in case the WG feels it would be worth doing.

We distinguish 'recognized' datatype IRIs from the others, and right now, if you see a literal with an unrecognized datatype IRI in it, say x:dt, then you know nothing at all about what that literal means. Absolutely nothing. So this inference:

:a :p "foo"^^x:dt .

|=

:a :p _:x .
_:x rdf:type x:dt .

is not a valid entailment. But if x:dt were recognized, it would be; and moreover, you know this without knowing anything about x:dt. This entailment is valid for ANY recognized datatype, and ANY string "foo". So why isn't it valid for any datatype, recognized or not? This is clearly slightly irrational. A rational way to reason would be: I know now, even without recognizing that datatype, that this inference will be valid when I do recognize it; and I won't need to know anything more about the datatype in order to make that inference; so why not just pretend that I recognize the datatype and make the inference now, to save time?

We could fix this with the following changes.

In section 7.1, add the condition (to the table, it would be the third line out of three):

For any literal "sss"^^aaa, if IL("sss"^^aaa) is defined then <IL("sss"^^aaa), I(aaa)> is in IEXT(I(rdf:type))

and add the explanatory text immediately below:
"The third condition applies to all datatyped literals, whether the datatype IRI is recognized or not."

And in section 7.2.1, at the beginning of the text, add the entailment pattern (moved from section 8.1.1, and with "for ddd in D" removed):

rdfD1
   <if S contains>       xxx aaa "sss"^^ddd .
   <then S D-entails>    xxx aaa _:nnn .
                         _:nnn rdf:type ddd .

together with its explanatory text from 8.1.1.
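
To make the effect of the modified rule concrete, here is a minimal Python sketch of rdfD1 applied to a set of triples. It is only an illustration, not taken from any RDF engine: the TypedLiteral class, the apply_rdfD1 function, the "recognized" parameter and the _:n0 style of blank node labels are all made up for this example. Passing recognized=None models the proposed rule (no "for ddd in D" restriction); passing a set of datatype IRIs models the current rule. The sketch reuses one blank node per distinct literal, which is an assumption made for brevity.

```python
from typing import NamedTuple, Optional, Set, Tuple, Union

RDF_TYPE = "rdf:type"


class TypedLiteral(NamedTuple):
    """Hypothetical stand-in for a datatyped literal "sss"^^ddd."""
    lexical: str
    datatype: str


Term = Union[str, TypedLiteral]
Triple = Tuple[str, str, Term]


def apply_rdfD1(graph: Set[Triple],
                recognized: Optional[Set[str]] = None) -> Set[Triple]:
    """Return the graph plus the conclusions of rule rdfD1.

    recognized=None  : proposed rule, fires for every datatyped literal.
    recognized=a set : current rule, fires only for datatype IRIs in the set.
    """
    result = set(graph)
    bnode_for = {}  # one fresh blank node label per distinct literal
    # Sort only to make the blank node labels deterministic.
    for s, p, o in sorted(graph, key=repr):
        if not isinstance(o, TypedLiteral):
            continue
        if recognized is not None and o.datatype not in recognized:
            continue  # current semantics: nothing is known about this literal
        b = bnode_for.setdefault(o, f"_:n{len(bnode_for)}")
        result.add((s, p, b))                 # xxx aaa _:nnn .
        result.add((b, RDF_TYPE, o.datatype))  # _:nnn rdf:type ddd .
    return result


graph = {(":a", ":p", TypedLiteral("foo", "x:dt"))}

# Proposed rule: the inference goes through without recognizing x:dt.
proposed = apply_rdfD1(graph)

# Current rule with x:dt not in D: no new triples are derived.
current = apply_rdfD1(graph, recognized={"xsd:integer"})
```

With the proposed rule, the engine never has to consult the set of recognized datatypes for this pattern; the "recognized" check in the sketch is exactly the case that would go away.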


The advantage to RDF engines is that this is one fewer case where they have to check whether or not a datatype is "recognized", and it means that the interpolation lemma is more useful when there are datatyped literals around.

Any comments? Is this worth doing? Is this legally possible to do at this LC stage? I would be willing to declare the current version an error if that is what it takes :-)

Pat

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Friday, 13 September 2013 05:26:18 UTC