Re: Sloppy inference rules from Steve Harris on 2012-11-07 (public-rdf-wg@w3.org from November 2012)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 7 Nov 2012 12:45:18 +0000
To: Pat Hayes <phayes@ihmc.us>
Cc: Markus Lanthaler <markus.lanthaler@gmx.net>, "'Ivan Herman'" <ivan@w3.org>, "'Guus Schreiber'" <guus.schreiber@vu.nl>, "'RDF WG'" <public-rdf-wg@w3.org>
Message-Id: <CFBCE250-900D-456F-B910-C32652A06F77@garlik.com>

On 2012-11-06, at 16:24, Pat Hayes wrote:

> 
> On Nov 6, 2012, at 5:46 AM, Steve Harris wrote:
> 
>> On 2012-11-05, at 23:04, Pat Hayes wrote:
>>> 
>>> On Nov 5, 2012, at 6:15 AM, Steve Harris wrote:
>>> 
>>>> On 2012-11-01, at 09:50, Markus Lanthaler wrote:
>>>> 
>>>>> On Thursday, November 01, 2012 6:56 AM, Ivan Herman wrote:
>>>>> 
>>>>>> As Antoine notes, the OWL 2 group has faced the same issue for OWL 2
>>>>>> RL. I do not see any problem doing that in this case either. I do not
>>>>>> think we should reopen, at this point, the bnode-in-predicate and
>>>>>> literal-in-subject issue and, with this, using this 'generalized
>>>>>> triples for the rules' seems to be the clean approach...
>>>>> 
>>>>> Honestly it sounds a bit strange to me to simply accept that there is a
>>>>> fundamental problem without trying to address it - especially considering
>>>>> that the problem has been known since at least 2005 (2002?).
>>>>> The other thing that worries me even more is the fact that a number of RDF
>>>>> serialization formats are in the process of being standardized right now. At
>>>>> least JSON-LD doesn't have this artificial restriction but that was heavily
>>>>> criticized by the RDF WG and, as it seems at the moment, we will have to
>>>>> introduce it.
>>>>> 
>>>>> I think there won't be a better point in time to fix this once for all.
>>>> 
>>>> It is a matter of opinion that there is anything broken, to "fix".
>>> 
>>> True. Let me try to explain why this current situation seems brain-damaged to anyone with logical training. A well-built logic does more than allow you to state facts: it supports inference rules (or sometimes, inference machinery of a different kind, but anyway inference machinery) which allows you to derive facts from other facts. Rules typically interact and support one another, by the outputs (conclusions) of some rules being usable as inputs to other rules, so that chains of reasoning can be supported, sometimes quite complicated chains of reasoning. Ideally, the rules should exactly "capture" the logic's own semantic notion of entailment, so that some sentences entail another just when it can be derived from them by applying the rules. 
>>> 
>>> RDF syntax, however, doesn't let you do this. There are RDF graphs which entail others, but the obvious rule derivation is blocked because the 'intermediate' sentences needed to make the rules connect properly are deemed illegal, even though they actually make semantic sense and indeed would be true under the logic's own semantic rules, and are needed in order for the rules to work properly on the "legal" sentences. Which is brain-damaged :-)
>> 
>> Sure, I understand that point of view, though that's a nice, succinct summary of it. I've built a number of inferencing engines, and you butt your head against this problem every now and again.
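
[Illustrative aside, not part of the original exchange.] The blocked derivation described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the names `Term`, `iri`, `bnode`, `is_legal`, `step` and `closure` are hypothetical helpers, only the RDFS subPropertyOf and domain rules are modelled, and the `ex:` terms are made up for the example. The graph is a commonly cited case where a legal conclusion is only reachable through an intermediate triple with a blank node in predicate position.

```python
from typing import NamedTuple


class Term(NamedTuple):
    kind: str   # "iri", "bnode", or "literal"
    value: str


def iri(value):
    return Term("iri", value)


def bnode(value):
    return Term("bnode", value)


SUBPROP = iri("rdfs:subPropertyOf")
DOMAIN = iri("rdfs:domain")
TYPE = iri("rdf:type")


def is_legal(triple):
    """Standard RDF syntax: no literal subjects, predicates must be IRIs."""
    s, p, _ = triple
    return s.kind != "literal" and p.kind == "iri"


def step(graph):
    """Apply the subPropertyOf rule and the domain rule once over the graph."""
    new = set()
    for (a, p1, b) in graph:
        for (s, p2, o) in graph:
            if p2 != a:
                continue
            if p1 == SUBPROP:
                new.add((s, b, o))       # s a o, a subPropertyOf b  =>  s b o
            elif p1 == DOMAIN:
                new.add((s, TYPE, b))    # s a o, a domain b  =>  s type b
    return new


def closure(graph, generalized):
    """Forward-chain to a fixpoint; drop illegal triples unless generalized."""
    result = set(graph)
    while True:
        derived = step(result)
        if not generalized:
            derived = {t for t in derived if is_legal(t)}
        if derived <= result:
            return result
        result |= derived


# Assumed example graph; every input triple is legal RDF.
graph = {
    (iri("ex:a"), SUBPROP, bnode("b")),
    (bnode("b"), DOMAIN, iri("ex:c")),
    (iri("ex:d"), iri("ex:a"), iri("ex:e")),
}
goal = (iri("ex:d"), TYPE, iri("ex:c"))

print(goal in closure(graph, generalized=True))   # True
print(goal in closure(graph, generalized=False))  # False
```

With generalized triples the rules chain through `ex:d _:b ex:e`; with the syntactic restriction enforced at every step, that intermediate triple is discarded and the (perfectly legal) conclusion is never reached by these two rules alone.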
>> 
>> The counter to that though is more of a human factors thing: if we allowed literals as subjects in triples then people would use them as identifiers.
> 
> Perhaps they would. However, I don't think it is our job, as writers of a standard, to set out to police the world's behavior. 

Well, actually it is, surely? The question is whether this is going too far.

>> It's familiar from the DB world, and not obviously wrong to people who don't grok "Linked Data".
>> 
>> Sometimes it's harmless, e.g.
>>  23765 a :Integer .
> 
> Not only is that "harmless", it is in fact true. 

Perhaps - it depends on what else is said about :Integer.

>> Other times it's not harmless:
>>  23765 a :Widget .
> 
> Well, it depends on what you take "Widget" to mean. If this is the class of widgets, then yes this is a mistake. But if it's the class of widget numbers, it's fine. And I have no problem with it being the class of widgets *and* widget numbers, by the way, though I know I am in a minority here. 

The issue arises when you have the above two statements in a single graph. They may both be true, but it's deeply unhelpful.

>> Other times it's even hard to demonstrate that it's a bad idea:
>>  "8d8b0e54-6b8f-43ab-aff9-26a7a12890a0" a :LogEntry .
>> 
>> It's not speculation, I've heard people complain that they can't use integers to identify e.g. people, and have to stick a URI prefix on the front.
> 
> But all of these are just as valid as arguments against allowing literals in the object position in a triple, yet apparently you have no problem with that. 

Correct, because then they can only appear at the "leaves" (I'm sure there's a better term in graphs).

> And on the other side, there are clear uses for being able to say things about literal values: classifying dates and times, talking about SS numbers, saying things about what language a character string is written in, or from which news source it was extracted, etc.

But none of them stop you from saying things you might want to; it just becomes trickier.

>> We'd have the same issues with lexical "tags", and other things that are identifiers in some defined context.
> 
> AFAIKS, you haven't actually said what the "issues" are. 

I believe I have, but given that this is explicitly outside our charter I don't think it's a good thing to consume WG time on.

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
+44 7854 417 874  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
80 Victoria Street, London, SW1E 5JL
Received on Wednesday, 7 November 2012 12:45:54 UTC