Re: Possible tweak to datatype semantics from Pat Hayes on 2013-09-15 (public-rdf-wg@w3.org from September 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Sun, 15 Sep 2013 03:04:32 -0700
To: Sandro Hawke <sandro@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: RDF WG <public-rdf-wg@w3.org>, Antoine Zimmermann <antoine.zimmermann@emse.fr>
Message-Id: <46144B07-D38C-4F84-840B-ABDCC0463850@ihmc.us>
On Sep 13, 2013, at 5:57 AM, Sandro Hawke wrote:

> On 09/13/2013 02:03 AM, Peter F. Patel-Schneider wrote:
>> What good would this change from the 2004 situation do?
>> 
>> Even if inertia was strongly indicating that this change should not be made, I would vote against it.
>> 
> 
> Assuming you mean "if inertia was NOT strongly indicating".
> 
> To my mind, this change is a simple bug fix, not unreasonable to do after Last Call.   I could be wrong about that.
> 
> 
>> If you make this change, you have the situation that if x:dt is not a recognized datatype, the empty graph does not RDFS entail
>>  x:dt rdf:type rdfs:Class .
>> but
>>  :a :p "foo"^^x:dt .
>> does.
>> 
> 
> As a human, that seems perfectly correct to me.   When I see the empty graph, I do not know that x:dt is an rdfs:Class.  When I see :a :p "foo"^^x:dt, I do.   Just like I also know :p is an rdf:Property when I see that triple.

I agree with Sandro that this makes intuitive sense. I don't find the conclusion that x:dt is a class to be particularly significant. Personally I would be happy to declare that everything is a class, as a basic axiom. There is nothing in RDF to prevent anyone writing any IRI as the object of an rdf:type triple.

> 
>> I believe that your argument falls apart when you look closer at it.  You are saying, in effect, that if x:dt is a recognized datatype then any well-typed literal with it as the datatype belongs to it, and the appearance of an ill-typed literal causes a contradiction, and thus entails any graph, including the graph that states that the ill-typed literal belongs to x:dt, so why not make this hold even if x:dt is not a recognized datatype. However, when x:dt is *not* a recognized datatype this reasoning doesn't hold water, so there is no reason to modify the semantics to make it valid.

The argument applies to any datatype, recognized or not.

I see a literal "foo"^^ex:dt whose datatype IRI is not known to me, so I have no idea what it denotes. However, I can reason as follows. There are two cases. Either "foo" is a well-formed literal string for this datatype, or it isn't. If it is, then this literal denotes something in the class ex:dt, so I can infer the existential conclusion. If it isn't, then this RDF I am looking at is false, so it is also correct to infer that conclusion (in fact, any conclusion, by ex falso quodlibet). Either way, it is correct to draw the conclusion. Therefore, the conclusion is a valid consequence of the RDF which contains this literal.
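
The two-case argument above can be written down as a purely propositional sketch in Lean 4. The names WellFormed, GraphTrue and Conclusion are hypothetical labels chosen for illustration; the two hypotheses simply restate the two cases and are assumptions of the sketch, not quotations from the semantics document.

```lean
-- Hypothetical propositions, named for illustration only:
--   WellFormed : "foo" is a well-formed lexical form for the datatype
--   GraphTrue  : the graph containing the literal is true
--   Conclusion : something related to :a by :p has rdf:type ex:dt
theorem conclusion_either_way
    (WellFormed GraphTrue Conclusion : Prop)
    -- Case 1: a well-formed literal in a true graph denotes a member of the class.
    (h_ok : WellFormed → GraphTrue → Conclusion)
    -- Case 2: an ill-formed literal makes the graph false.
    (h_bad : ¬WellFormed → ¬GraphTrue) :
    GraphTrue → Conclusion :=
  fun hg =>
    Classical.byCases
      (fun hw : WellFormed => h_ok hw hg)
      (fun hn : ¬WellFormed => absurd hg (h_bad hn))
```

Nothing in the statement mentions whether the datatype is recognized; only the case split on WellFormed is used.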

And my point is that this line of reasoning does *not* depend upon my recognizing the datatype. I have not used any knowledge of the datatype or its L2V mapping here, only the basic logical rules of RDF(S) and how it handles typed literals.
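
As a rough illustration of what this means for an engine, here is a minimal Python sketch of rule rdfD1 with the "for ddd in D" restriction dropped. The triple representation (plain string tuples), the regular expression, and the names TYPED_LITERAL and apply_rdfd1 are all assumptions made for the sketch; it is not taken from any existing RDF library.

```python
import itertools
import re

# Hypothetical, simplified representation: a graph is a set of
# (subject, predicate, object) tuples of strings, and a datatyped
# literal is written as "lexical form"^^datatypeIRI.
TYPED_LITERAL = re.compile(r'^"(?P<lex>.*)"\^\^(?P<dt>\S+)$')


def apply_rdfd1(graph):
    """Return the graph plus the triples licensed by rdfD1.

    For every triple  xxx aaa "sss"^^ddd  this adds
        xxx aaa _:nnn .
        _:nnn rdf:type ddd .
    No check is made on whether ddd is a recognized datatype.
    """
    inferred = set(graph)
    counter = itertools.count(1)
    bnode_for = {}  # one blank node per distinct literal
    for s, p, o in sorted(graph):
        match = TYPED_LITERAL.match(o)
        if match is None:
            continue  # not a datatyped literal, rule does not apply
        if o not in bnode_for:
            bnode_for[o] = f"_:n{next(counter)}"
        bnode = bnode_for[o]
        inferred.add((s, p, bnode))
        inferred.add((bnode, "rdf:type", match.group("dt")))
    return inferred


example = {(":a", ":p", '"foo"^^x:dt')}
closure = apply_rdfd1(example)
```

The point of the sketch is only that the rule body never consults a table of recognized datatypes.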

>> 
> 
> I don't think that's actually Pat's argument.    I understood the argument to be that this change makes the formal semantics more closely match human intuition.   Not to be provocative, but I thought that was the goal of formal semantics -- to match human intuition as much as possible while still being completely precise and being fairly simple and comprehensible.
> 
>> You might just as well argue that if x:dt is a recognized datatype then it is a subclass of rdfs:Literal therefore anything should be a subclass of rdfs:Literal.
>> 
> 
> Perhaps this is relevant to the argument you think/thought Pat was making; I don't see its relevance.

Neither do I.

Pat

> 
>       -- Sandro
> 
> 
>> peter
>> 
>> On 09/12/2013 10:25 PM, Pat Hayes wrote:
>>> I know it's very late to even be talking about this, but Antoine's test cases made me notice an oddity which the current semantics for datatyped literals produces, and which would be easy to fix. So I'm outlining it here in case the WG feels it would be worth doing.
>>> 
>>> We distinguish 'recognized' datatype IRIs from the others, and right now, if you see a literal with an unrecognized datatype IRI in it, say x:dt, then you know nothing at all about what that literal means. Absolutely nothing. So this inference:
>>> 
>>> :a :p "foo"^^x:dt .
>>> 
>>> |=
>>> 
>>> :a :p _:x .
>>> _:x rdf:type x:dt .
>>> 
>>> is not a valid entailment. But if x:dt were recognized, it would be: and moreover, you know this without knowing anything about x:dt. This entailment is valid for ANY recognized datatype, and ANY string "foo". So why isn't it valid for any datatype, recognized or not?  This is clearly slightly irrational. A rational way to reason would be: I know now, even without recognizing that datatype, that this inference will be valid when I do recognize it; and I won't need to know anything more about the datatype in order to make that inference; so why not just pretend that I recognize the datatype and make the inference now, to save time?
>>> 
>>> We could fix this with the following changes.
>>> 
>>> In section 7.1, add the condition (to the table, it would be the third line out of three):
>>> 
>>> For any literal "sss"^^aaa, if IL("sss"^^aaa) is defined then <IL("sss"^^aaa), I(aaa)> is in IEXT(I(rdf:type))
>>> 
>>> and add the explanatory text immediately below:
>>> "The third condition applies to all datatyped literals, whether the datatype IRI is recognized or not."
>>> 
>>> And in section 7.2.1, at the beginning of the text, add the entailment pattern (moved from section 8.1.1, and with "for ddd in D" removed):
>>> 
>>> rdfD1  <if S contains>  xxx aaa "sss"^^ddd  <then S D-entails> xxx aaa _:nnn .      _:nnn rdf:type ddd .
>>> 
>>> together with its explanatory text from 8.1.1.
>>> 
>>> 
>>> The advantage to RDF engines is that this is one less case where they have to check whether or not a datatype is "recognized", and it means that the interpolation lemma is more useful when there are datatyped literals around.
>>> 
>>> Any comments? Is this worth doing? Is this legally possible to do at this LC stage? I would be willing to declare the current version an error if that is what it takes :-)
>>> 
>>> Pat
>>> 
>>> ------------------------------------------------------------
>>> IHMC                                     (850)434 8903 home
>>> 40 South Alcaniz St.            (850)202 4416   office
>>> Pensacola                            (850)202 4440   fax
>>> FL 32502                              (850)291 0667   mobile (preferred)
>>> phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Sunday, 15 September 2013 10:05:08 UTC