Re: Reorganizing the RDF Semantics from Ivan Herman on 2011-05-25 (public-rdf-wg@w3.org from May 2011)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 25 May 2011 09:23:37 +0200
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>, Lee Feigenbaum <lee@thefigtrees.net>
Message-Id: <DCACB2CE-F219-4D5A-A7AA-17907EBB6458@w3.org>
On May 25, 2011, at 06:18 , Pat Hayes wrote:

[snip]
> 
>> we have to look at the effects this may have on the deployment and standardization landscape... In particular:
>> 
>>   - I simply do not know (and I trust some others in the group will know that) whether various implementations of RDF stores and engines have bought into the current layering or not. Ie, do we have deployment of engines that claim: "I implement RDF interpretation and entailment but I do not implement RDFS interpretation and entailment". They might be in trouble...
> 
> Yes. The regimes will still exist and the new version will of course define them exactly as they are defined now, for exactly this reason. But it will also allow other combinations to be used. 

Great! I misunderstood your intentions then, sorry about that. 

> 
>> 
>>   - We have a 'SPARQL 1.1 Entailment Regimes' draft [1] in Last Call that carves up its space along the layers in the Semantics document. If your proposal goes ahead, we may have to decide very quickly whether to modify that document too, essentially by simplifying it. I am not saying that is impossible, but I am (again :-) concerned about a possible delay to SPARQL 1.1 (I am sure Lee will agree with me on this :-)
> 
> I hope this could be resolved very quickly. I am sure that RDF can adapt smoothly to whatever SPARQL requires, actually. 
> 

Well... in view of your previous remark, this may become moot, in the sense that the current SPARQL entailment document will remain valid as is. The only additional issue is that it may make sense for the entailment document to say something about the other possibilities an implementation may provide through those 'other combinations' that you refer to. It may be worth checking with Birte Glimm (editor of that document) whether this is necessary and possible. I would regard that as a basically editorial remark, though, which should not create a problem for the advancement of SPARQL.

>> 
>> 
>> - While we are looking at the reorganization of the RDF Semantics, there are some 'wishes' that I'd also have; these are not incompatible with what you describe. Actually, it might be even easier.
>> 
>>   - We know that there are certain rules/interpretations that make RDF implementations complicated, and the community has come up with non-standard tricks to work around this. The most obvious one is the infinite number of axioms due to our friends rdf:_i. ter Horst describes the approach which is most commonly used afaik (use an upper limit for rdf:_i based on the ones used in the graph); the SPARQL document makes it even more restrictive in [2] by considering only those that really appear in the graph being queried. I would love to see these approaches explicitly reflected in the Semantics document.
> 
> Me too. ter Horst has a number of bug fixes and improvements which I plan to adopt, with suitable thanks.

Ok. As you said, the devil is in the details, but I am happy something will happen along these lines.
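
The two finite restrictions mentioned above can be sketched as follows. This is only an illustration, not text from either document: it assumes a graph is a Python set of (subject, predicate, object) tuples of IRI strings, and the function names are hypothetical. The only axiom generated is "rdf:_n rdf:type rdf:Property".

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Matches container membership properties rdf:_1, rdf:_2, ...
_MEMBER = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)")


def membership_indices(graph):
    """Collect every n such that rdf:_n occurs anywhere in the graph."""
    found = set()
    for triple in graph:
        for term in triple:
            m = _MEMBER.fullmatch(term)
            if m:
                found.add(int(m.group(1)))
    return found


def finite_axioms(graph, up_to_max=False):
    """Return 'rdf:_n rdf:type rdf:Property' for a finite set of n.

    up_to_max=False: only the rdf:_n that really appear in the graph.
    up_to_max=True:  every n from 1 up to the largest index used.
    """
    used = membership_indices(graph)
    if up_to_max and used:
        used = set(range(1, max(used) + 1))
    return {(f"{RDF}_{n}", RDF + "type", RDF + "Property") for n in used}


g = {
    ("http://example.org/bag", RDF + "_1", "http://example.org/a"),
    ("http://example.org/bag", RDF + "_3", "http://example.org/b"),
}
print(len(finite_axioms(g)))                  # axioms for rdf:_1 and rdf:_3
print(len(finite_axioms(g, up_to_max=True)))  # axioms for rdf:_1 .. rdf:_3
```

Either way the set of axioms stays finite and is determined by the graph at hand.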

>  
> 
>> 
>>   - Both for implementers and for casual readers the current Semantics document, ie, the way it is formulated, is fairly difficult to follow. Most readers are not familiar with the model-theoretic formulation. However, all computer scientists can understand the entailment rules pretty easily; they are obvious to anyone who has written a line of computer code. In the current document those rules are fairly hidden and explicitly stated as informal; they feel like an add-on. I think they should be way more prominent than they are now, in many respects more prominent than the interpretation constraints. You hint at that in your proposal below, actually, which makes me confident that this could be done without compromising the mathematics...
> 
> Yes. I agree absolutely (and was, in spite of my spirited defense of mathematical foundational thinking, very persuaded by Richard's earlier volley to this effect.)

:-)

> 
> I have a few editorial reservations, though. It is misleading to give the impression that *all* semantic constraints can be captured by rules in this way. As soon as one gets to OWL (or several of the more expressive languages) there are going to be semantic conditions that cannot be captured by Horn rules.

Correct. It is probably wise to make a comment about that in the document, but for the rest we should leave that to OWL. 


To have our conscience clear I have made some checks on the new OWL documents. The OWL 2 RDF-Based Semantics:

http://www.w3.org/TR/owl2-rdf-based-semantics/

has been defined from scratch, so to say, ie, it does not rely on the RDF Semantics. In other words whatever we do will not directly affect that document. The OWL RL Profile with rules:

http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules

may be more relevant at first glance, but there is no direct reference either; the only thing it says is:

[[[
In order to avoid potential performance problems in practice, OWL 2 RL/RDF rules do not include the axiomatic triples of RDF and RDFS (i.e., those triples that must be satisfied by, respectively, every RDF and RDFS interpretation)... ; moreover, OWL 2 RL/RDF rules include most, but not all of the entailment rules of RDFS.... An OWL 2 RL/RDF implementation may include these triples and entailment rules as necessary without invalidating the conformance requirements for OWL 2 RL....
]]]

In other words, we should not worry about OWL in this respect...

> And, I am reluctant to give the impression that RDF/S semantic constraints *must* be implemented by rules, as there are many other techniques for checking consistency or entailment which may be far more efficient in some applications. (This BTW is why all those rules are explicitly labeled as informative in the current version.)

My reading is that providing the rules as normative does _not_ mean that this _is_ the way you must implement RDF/S. Returning to the OWL RL case, the rule set (which is normative) includes rules for owl:sameAs which, if implemented naïvely, can increase the size of a graph tremendously. However, implementations may (and do) avoid that through local tricks. So, maybe with a clear comment, I would be in favour of adding these rules as normative content, and even pushing them editorially to the forefront, so to say, to catch the attention of all readers.
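
To make the distinction concrete: a rule fixes which triples are entailed, not how an engine must compute them. Below is a rough sketch of the subclass transitivity rule (rdfs11 in the current document) applied naively until nothing new is derived. The triple representation (a set of IRI-string tuples) and the function name are assumptions made for this illustration only; a real engine is free to answer the same question by other means, for example graph reachability at query time.

```python
SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def rdfs11_closure(graph):
    """Naive forward chaining of:
    (u subClassOf v), (v subClassOf x)  =>  (u subClassOf x)
    """
    closed = set(graph)
    while True:
        pairs = [(s, o) for (s, p, o) in closed if p == SUBCLASS]
        derived = {
            (u, SUBCLASS, x)
            for (u, v) in pairs
            for (v2, x) in pairs
            if v == v2
        } - closed
        if not derived:
            return closed  # fixpoint reached, nothing new to add
        closed |= derived


g = {
    ("ex:A", SUBCLASS, "ex:B"),
    ("ex:B", SUBCLASS, "ex:C"),
    ("ex:C", SUBCLASS, "ex:D"),
}
for triple in sorted(rdfs11_closure(g) - g):
    print(triple)  # the derived subclass triples
```

Any implementation that returns the same entailed triples conforms to the rule, however it gets there.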

> 
> Still, I am sure we can, with careful editing, provide the clarity of rules for the programmers and also hedge them around with enough warnings to allow the inference wizards to do their exotic stuff. 

Pat-as-inference-wizard:-) Yes, I am sure we can, and we can significantly change the perception of the RDF semantics by doing so. Pat, I am happy to help and act as a sounding board in this exercise, which I believe is very important!

Thanks

Ivan



> 
> Pat
> 
>> 
>> Thanks!
>> 
>> Cheers
>> 
>> Ivan
>> 
>> [1] http://www.w3.org/TR/sparql11-entailment/
>> [2] http://www.w3.org/TR/sparql11-entailment/#RDFEntRegime
>> 
>> 
>> On May 24, 2011, at 06:53 , Pat Hayes wrote:
>> 
>>> I would like to propose some structural changes to the RDF Semantics document, in addition to the various local changes that will be required by various decisions the WG takes, and the need to correct noted errors. I wonder what the WG thinks...
>>> 
>>> In many ways the RDF semantics follows a textbook presentation of model theory. However, the way it is organized, so that each entailment regime is associated with a namespace, giving simple, RDF, RDFS and D-entailments, is *not* textbook stuff. This kind of thing just doesn't happen in textbook logics, so we were on new ground. We did it in this way largely because we couldn't think of anything else and it seemed natural to carve the space up by the URI prefix. I now think that this 'chunking' of entailments into distinct entailment regimes is not particularly useful, and probably causes more harm than good, and have a different proposal. 
>>> 
>>> Another, related, point is that the Semantics document follows logic textbook style in its focus on the vocabularies. The classical logical view is that a logic, such as RDF, is not itself a 'language': rather, a logical language is a set of particular names, and interpretations are always relative to such a set. We called these vocabularies. I now think that this is not really appropriate for a Web language such as RDF (or indeed OWL or RIF or any of the others); rather, we should always have a single 'vocabulary' consisting of all possible Web names, ie *all* IRIs. A web interpretation is then a mapping from all possible IRIs to elements of a universe, so this universal vocabulary does not need to be mentioned more than once. This eliminates the need to speak of RDF-interpretations, RDFS-interpretations, etc.; they are all just interpretations. (An RDF interpretation is now an interpretation which satisfies all the RDF semantic conditions, and similarly for the others; but this is no longer a different *sort* of interpretation.) This simplifies and unifies the semantic treatment, and it also gets rid of some odd technical glitches associated with empty vocabularies.
>>> 
>>> So, the idea is that we will list all the semantic conditions, just as we do now (though see below) but instead of grouping them into distinct entailment regimes, we will associate them with the vocabulary that is used to state them. We simply say that if you use any of the rdf: or rdfs: URIs in your graph, then you are buying into (that is, you agree to accept the truth of) all the semantic conditions that apply to your vocabulary items, ie all the axioms and rules that are stated using only the vocabulary items you use. For example, if you use rdfs:subClassOf, then you are agreeing that it is transitive, since this rule only uses rdfs:subClassOf. Similarly, if you use any RDF literal syntax, then you are buying into the semantic conditions that apply to whatever type URIs you are using, and so on. We can still define the RDF- and RDFS- entailment regimes, but these would now be in an appendix rather than being the overall organizing backbone of the whole semantic system. (Simple entailment will always be a well-defined option, by the way: it is the entailment that you get when you ignore all vocabulary semantic conditions.)
>>> 
>>> This has the merits of simplicity and uniformity, but more importantly, it allows the semantic commitment made by an RDF user to be tailored to the particular pieces of RDF/S vocabulary she wants to use, without necessarily buying into a whole entailment regime; and it means that the question, of which entailment regime is relevant (should we be doing RDF or RDFS reasoning?) is now avoided, or maybe answered in a uniform and automatic way. An example is the recent request to include XSD datatyping without being forced to buy into RDFS entailment: this would follow automatically in this new regime, simply by using XSD vocabulary in literals but not as class names. 
>>> 
>>> Obviously, the devil is in the details, but I would be interested in feedback (positive or negative) before getting too embroiled in those. 
>>> 
>>> I would also like to adopt a more 'regular' way to express the various semantic conditions. Right now some of them are written as model-theoretic constraints on interpretations, others as 'axioms' and others as entailment 'rules' . There is no real reason to have things this mixed, and I think it would be easier if all the conditions were presented uniformly, perhaps in both model-theoretic and axiom/rule styles, in different tables, but in a uniform format throughout. 
>>> 
>>> Pat
>>> 
>>> ------------------------------------------------------------
>>> IHMC                                     (850)434 8903 or (650)494 3973   
>>> 40 South Alcaniz St.           (850)202 4416   office
>>> Pensacola                            (850)202 4440   fax
>>> FL 32502                              (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> 
>> 
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf


Received on Wednesday, 25 May 2011 07:21:33 UTC