The entailment rules (was: Re: RDF Semantics Editors Draft?) from Richard Cyganiak on 2012-05-21 (public-rdf-wg@w3.org from May 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 21 May 2012 13:49:31 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: public-rdf-wg WG <public-rdf-wg@w3.org>
Message-Id: <1C3DDA60-0231-4892-9C89-6613AEF9D4AD@cyganiak.de>

Hi Pat,

Reviving this thread from last week…

On 14 May 2012, at 06:11, Pat Hayes wrote:
> Then let us have a wholly new document which just describes the rules, or perhaps in an appendix to the Primer. I would like to separate the rules from the semantics: they really are separate topics.

The rules are important because they make the Semantics stuff accessible to readers without a background in model theory. If we cut this connection, e.g. by not claiming completeness of the rules, or by hiding the rules away in a document that no one reads, like the test cases, we achieve two things:

1. Fewer people will pay attention to the model theory because fewer readers can make sense of it.

2. We increase demand for *someone else*, *outside of W3C*, to write up a more easily digestible version of the semantics, and chances are that this will not get the same level of scrutiny and review.

> The rules are a kind of introduction to implementing reasoners, although they are pretty awful when viewed in this light. 

You may have intended the rules as an introduction to implementing reasoners. But this is *not* why I and Ivan and others care about them.

We like the rules because they are easy to understand. Reading the rules is a great way of understanding, for example, how rdfs:domain or rdfs:subPropertyOf really works. Trying to work out how, say, rdfs:subPropertyOf works just based on the model theory takes a good few hours of work and study because one first has to understand the document's formalisms and get a good part of the entailment regime into one's head. Each of the rules, on the other hand, can be read and understood pretty much in isolation. That makes the relevant rule a great addition to the less formal language that defines rdfs:subPropertyOf in RDF Schema.
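
To illustrate how little context a single rule needs, here is a minimal Python sketch of two of the 2004 rules, rdfs7 (for rdfs:subPropertyOf) and rdfs2 (for rdfs:domain). The tuple representation of triples, the function names, and the ex: terms are all assumptions made up for this example; it is not meant to show how any real reasoner works.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# All names below are illustrative assumptions, not from any spec or library.
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
TYPE = "rdf:type"

def rdfs7(graph):
    """rdfs7: (p rdfs:subPropertyOf q) and (s p o) entail (s q o)."""
    return {(s, q, o)
            for (p, sp, q) in graph if sp == SUBPROP
            for (s, p2, o) in graph if p2 == p}

def rdfs2(graph):
    """rdfs2: (p rdfs:domain c) and (s p o) entail (s rdf:type c)."""
    return {(s, TYPE, c)
            for (p, d, c) in graph if d == DOMAIN
            for (s, p2, o) in graph if p2 == p}

graph = {
    ("ex:hasMother", SUBPROP, "ex:hasParent"),
    ("ex:hasParent", DOMAIN, "ex:Person"),
    ("ex:alice", "ex:hasMother", "ex:carol"),
}

# Apply rdfs7 first, then rdfs2 on the enlarged graph.
step1 = graph | rdfs7(graph)
step2 = step1 | rdfs2(step1)
```

Each function can be read on its own: rdfs7 copies a triple up to the super-property, and rdfs2 types the subject once a triple with the right predicate is present. In this example rdfs2 only fires after rdfs7 has added the ex:hasParent triple.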

>>> We will try to make this as complete as we can, but will not claim nor set out to prove that they are indeed complete. (That alone will get rid of pages of impenetrable maths.) 
>> 
>> Well, we have a form that many readers reportedly find easier to understand, and one that they find harder. As long as they're known to be equivalent,
> 
> They are not equivalent.

The RDF Semantics document contains proofs of equivalence between the various rulesets and entailment regimes.

>> I think it doesn't matter too much which one is informative and which one is normative.
> 
> I think it matters hugely.

Why? You have proven that they are equivalent.

>>> I think they should be in a separate document entirely, perhaps as part of the test cases,
>> 
>> This would be inappropriate. The test cases are supposed to be machine-processable so that one can have a test harness that automatically verifies an implementation.
> 
> I don't see why this would prohibit the rules being there. All a rule is, is a pattern of an entailment. The test cases have entailments in them now. It would be easy to rephrase the rules in a test-case-like format, using tables of patterns. They would be used to test reasoners, which should be able to demonstrate the conclusion when told to assume the antecedents. 

The rules are good for teaching the semantics. This means they should be someplace where they will be read and where people that look for them will find them. Test case documents are read by vendors who want to certify their implementation, not by general users trying to work out what rdfs:subPropertyOf does.

>>> and no claims should be made as to their completeness, and no long and extremely opaque (and flawed) completeness proofs should be included, even in an appendix. Nobody gives a damn about completeness in any case.) 
>> 
>> -1. I do give a damn about the completeness of the rules.
> 
> I really do not understand your position. How can you care about completeness when you don't want us to even have the content which allows us to define the very concept of completeness? 

1. I believe that we should present all content in the most digestible form possible. This is why I like the rules.

2. If we present the same content in multiple forms (which can be a good idea), then it is important to characterize the relationship between the different representations. Stating that they are equivalent is a strong characterization.

3. The form that's most readily accessible should be normative. This doesn't diminish the value of the model theory as a means for proving that the rules are complete.

4. Ultimately, correctness is not what's written in some dusty tome sitting on W3C's webserver. Correctness is what's actually interoperably implemented. In that regard, RDF Semantics is, I have to say, a failure. I want to help make RDF Semantics easier to implement, and more closely connected to what's actually being implemented out there.

>> And making the model theory normative instead of the rules saves us from typos in the normative parts how?
> 
> The actual statement of the truth conditions can be checked fairly easily. Small errors in rules, and especially missing cases needed for completeness, are MUCH harder to spot. There are errors in the 2004 rules, in fact, which were only discovered months after publication and after the most thorough vetting by some extremely careful people (try reading the email logs shortly before LC to see how anal this got at times). Even the early implementers (Jos deRoo had a rule engine in 2003 which we used often to check completeness) did not notice them. 

So you are saying that transforming the semantics into a form that is easy to implement is next to impossible, and can't even be done with the considerable community attention that comes with the W3C standardization process. So implementations are basically guaranteed to be broken. This strengthens the case that making RDF Semantics a normative recommendation in the first place was a big mistake.

> Semantics does not describe or even mandate ANY behavior. An "engine" which inputs RDF and does exactly nothing to it except print it back out unchanged, is a perfectly correct and conformant RDF engine.

That is nonsense. There is no such thing as a conformant RDF engine. For such a thing to exist, there would have to be a spec that defines conformance criteria for such a beast. I would be *thrilled* if this existed, and I would be *thrilled* if RDF 1.1 Semantics would define some conformance criteria.

(Also, if all one has to do to conform to RDF Semantics is sit there and print back the triples unchanged, then the spec could have said that in a lot fewer words.)

>>> What would it mean to make those rules normative? Would an efficient tableaux-based reasoner be then illegal?
>> 
>> No, why would it?
> 
> Because it would not be using rules at all; but the rules are normative. Which I take to mean, if you aren't using the rules, you are doing it wrong.

If you get the same results by whatever means, you aren't doing it wrong. I have said this several times already in this thread.

And still, the (vast?) majority of RDFS “reasoners” are rule-based.
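
As a rough sketch of what "same results by whatever means" could look like, the Python below answers the same entailment question in two ways: by forward-chaining rdfs5 and rdfs7 to a fixpoint, and by a backward search that never applies a rule. The function names, the tuple encoding, and the ex: terms are assumptions for illustration, and the comparison only covers instance triples in the rdfs:subPropertyOf fragment, not RDFS entailment as a whole.

```python
SUBPROP = "rdfs:subPropertyOf"

def entailed_by_rules(graph, triple):
    """Forward-chain rdfs5 and rdfs7 until nothing new appears, then look up."""
    g = set(graph)
    while True:
        new = set()
        for (a, p, b) in g:
            if p != SUBPROP:
                continue
            for (s, p2, o) in g:
                if p2 == a:
                    new.add((s, b, o))        # rdfs7: copy up to super-property
                if p2 == SUBPROP and s == b:
                    new.add((a, SUBPROP, o))  # rdfs5: transitivity
        if new <= g:
            return triple in g
        g |= new

def entailed_by_search(graph, triple):
    """No rules: walk subPropertyOf links upward from asserted predicates."""
    s, q, o = triple
    supers = {}
    for (a, p, b) in graph:
        if p == SUBPROP:
            supers.setdefault(a, set()).add(b)
    for (s2, p, o2) in graph:
        if (s2, o2) != (s, o):
            continue
        seen, stack = set(), [p]
        while stack:
            cur = stack.pop()
            if cur == q:
                return True
            if cur not in seen:
                seen.add(cur)
                stack.extend(supers.get(cur, ()))
    return False

graph = {
    ("ex:hasMother", SUBPROP, "ex:hasParent"),
    ("ex:hasParent", SUBPROP, "ex:hasAncestor"),
    ("ex:alice", "ex:hasMother", "ex:carol"),
}
yes = ("ex:alice", "ex:hasAncestor", "ex:carol")
no = ("ex:carol", "ex:hasAncestor", "ex:alice")
```

On this graph both functions agree on `yes` and on `no`, even though only the first one mentions the rules at all. Under this reading, the rules would pin down the expected answers rather than the implementation strategy.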

>>>>> I think we all agree that this is *not* the document you should be looking at in your first encounter with RDF.
>>> 
>>> If you know nothing about logical methods, inference engines, machine inference? Yes, it might not be a good starting point if you are this ignorant, indeed.
>> 
>> Inappropriate display of arrogance.
> 
> Why? There seems to be a ground assumption here that our typical reader knows absolutely nothing about semantics, logic, reasoners, reasoning, or indeed almost anything about the technical field in which these specifications are situated

For RDF to be successful, many people — domain modellers, implementers, technical writers — need to understand, for example, what can be inferred from a handful of triples under RDFS-entailment. This doesn't mean we need to write for idiots. But it means we need to write for non-logicians.

>>>> This problem could be solved easily with the insertion of some language in the introduction pointing to the Primer in the first paragraph (instead of just the Vocab and Concepts, as it does now).
>>> 
>>> See above. It refers to the primer in the second paragraph of the document. 
>> 
>> In the section that nobody reads. The Introduction is silent on the topic of which other documents you should already be familiar with before starting this one.
> 
> By all means let us copy paragraph 2 of the Status section into the Introduction, where people might possibly see it. I presume this editorial decision will apply to all the technical documents.

All documents except RDF Semantics already spend quite a bit of their introductions explaining their role in the big picture of RDF and their relationships to the other documents. RDF Semantics is the odd one out. But ok, we can easily fix that.

>>> I could try to write a short explanation of how to test a proposed inference for validity by constructing a formal counterexample. This might give naive readers a better grasp of how to connect interpretations with rules, in fact. It could also go into the 'test cases' document, or even into the primer somewhere (?) 
>> 
>> Not keen on having this in the test cases. Covering more of the semantics in the Primer is an intriguing idea.
> 
> It might help. It can be expressed in non-mathematical terms. I will take a stab at producing some text for this.

If you need a clueless non-mathematician to test this on, I'm happy to read and comment.

Best,
Richard
Received on Monday, 21 May 2012 12:50:29 UTC