- From: Dave Reynolds <der@hplb.hpl.hp.com>
- Date: Mon, 09 Jul 2007 15:11:15 +0100
- To: Sandro Hawke <sandro@w3.org>
- CC: public-rif-wg@w3.org
Sandro Hawke wrote:
> Dave Reynolds <der@hplb.hpl.hp.com> writes:
>> On the telecon we discussed that part of the proposal and I thought we
>> agreed to change it, that the entailmentRegime should be an attribute of
>> the ruleset rather than the dataset. At least that's what I was
>> suggesting and I thought Jos agreed and no one else objected.
>
> Yeah, I didn't object because I couldn't figure out how to articulate my
> concern. I thought about it some more, and then sent the e-mail at the
> start of this thread.
>
>> Does that address your issue?
>
> I don't think so.
>
>> As discussed in the parallel thread with Jos and Axel I would prefer to
>> do that by providing an import mechanism rather than metadata. So that a
>> rule set where one wanted to process RDF data and assume RDFS semantics
>> could be something like:
>>
>> <rif:RuleSet>
>>   <rif:import uri="http://www.w3.org/2007/rif#RDFSruleset.rif" />
>>   ... rules ...
>> </rif:RuleSet>
>>
>> However, if that were done indirectly via ruleset metadata that would be
>> OK too (I can make arguments both ways round if you like :-)).
>>
>> If neither ruleset import nor ruleset metadata are acceptable to you
>> what's the alternative?
>
> As I understand it, the semantics of an RDF file, like anything else
> with a MIME type, should stand on their own. To add additional
> parameters affecting the interpretation is like saying "fetch
> http://example.com/bar and interpret it as JPEG, no matter what MIME
> type is received." (That may actually be what you need to do some
> times, but clearly one "SHOULD NOT" do that.)

Nah, I think it is more like fetching the jpeg, respecting its image/jpeg
mime type but saying "but don't bother with incremental rendering". You
haven't ignored all its semantics but have permitted some optional
processing to be skipped.

> I've been meaning to poke at that other thread -- let me do it by
> suggesting the particular Web address at which we should publish rules
> that implement RDFS: "http://www.w3.org/2000/01/rdf-schema". That is,
> we should provide, to the best of our ability, executable semantics for
> RDFS.
>
> I'm hearing in that thread that it wont work, but I'm having trouble
> understanding the difficulty.

Well there's a difference between "won't work" and "not a good idea in
practice". I'm only claiming the latter.

In practice in an application there are many cases where one wants to
control the amount of entailments to be performed. These include:
 - performance trade-offs
 - termination (kind of an extreme performance trade-off)
 - to do some processing such as validation where the full set of
   entailments would get in the way

In the specific case of RDFS the most clear cut problem area is the
infinite number of axiomatic triples of the form:

   rdf:_N rdf:type rdfs:ContainerMembershipProperty .
   rdf:_N rdfs:domain rdfs:Resource .
   rdf:_N rdfs:range rdfs:Resource .

A correct and complete ruleset for RDFS would include these. [Clearly it
couldn't do so as ground facts but could provide rules which deliver
these on demand.]

Now the problem is that any application which uses this RDFS
specification must never ask queries such as "what are all the properties
in this dataset" because the answer will be infinite. Yet that is a
common and important query. This also rules out ever using a forward
chaining system such as a PR engine for interpreting the RDFS ruleset.

Yet those axiomatic facts are fairly useless for most applications.
Systems in practice solve this by either ignoring the
ContainerMembershipProperties altogether or arranging that only the
rdf:_N which exist in the base data are reported, or (in the case of
Jena) allowing either at the developer's discretion.
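
To make the second of those approaches concrete (report only the rdf:_N
which exist in the base data), here is a minimal Python sketch. It is
purely illustrative and is not how Jena or any other system implements
this: the function name container_axioms and the representation of
triples as tuples of URI strings are assumptions made for the example.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Matches the container membership properties rdf:_1, rdf:_2, ...
_MEMBER = re.compile(re.escape(RDF) + r"_[1-9][0-9]*")


def container_axioms(base_triples):
    """Return the axiomatic triples for only those rdf:_N that occur
    somewhere in base_triples (hypothetical helper, triples are
    (subject, predicate, object) tuples of strings)."""
    used = set()
    for triple in base_triples:
        for term in triple:
            if _MEMBER.fullmatch(term):
                used.add(term)

    axioms = set()
    for prop in used:
        axioms.add((prop, RDF + "type", RDFS + "ContainerMembershipProperty"))
        axioms.add((prop, RDFS + "domain", RDFS + "Resource"))
        axioms.add((prop, RDFS + "range", RDFS + "Resource"))
    return axioms


# Example data: a container with two members (example.org URIs are made up).
data = {
    ("http://example.org/bag", RDF + "_1", "http://example.org/a"),
    ("http://example.org/bag", RDF + "_2", "http://example.org/b"),
}
axioms = container_axioms(data)
```

With this restriction the set of derived axioms is finite (three per
rdf:_N actually used), so a "list all properties" query terminates and a
forward chaining engine can materialise the result.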

The minimum range of entailments required is a property of the
application, the assumptions that the rest of the code is going to make.
It is not a property of the data. Which is why if we are going to specify
the entailment regime at all it has to either be out-of-band as part of
the application or associated with the rule set, not with the dataset.

So for RIF I see us having 3 options:

(1) Pick a single subset of RDFS semantics which we think is sufficiently
complete to satisfy most applications of RIF and enforce that as the one
true way. We don't have the realistic option of having a single
*complete* RDFS ruleset given the infinite axiom problem so this is a
blessed subset.

(2) Pick a few subsets of RDFS semantics to encode, capturing the most
common useful trade-offs. Leave the machinery open for people to specify
other rulesets. We might in fact just pick one such subset but the point
is we leave it open to allow applications to pick others.

(3) Provide nothing. Say it is up to the RIF processor to decide how much
RDFS entailment to apply, it is not a property of a RIF document.

You are arguing for #1, I'm arguing for #2.

Specifically I'm suggesting that we just have a generic "include RIF
ruleset" feature and provide one or more RIF rulesets capturing RDFS
semantics (these may or may not be normative). That way any rule set
publisher can be clear on what RDFS semantics the rule set assumes but
has the option to assume none or a different subset from any ones that
the WG blesses.

My reasons for this are:

(a) allows applications to make the performance trade-offs when they need
to;

(b) because I'm not sure we'll agree on what the one-true subset should
be. Specifically Harold and others have suggested rhoDF (which is indeed
the most useful core of RDFS) whereas at least some applications need
deduction of things like rdfs:member. Arguably it shouldn't be RIF's job
to bless a single RDFS subset.

(c) because I have use cases for different subsets of entailment, e.g.

  (i) Validation. One use of rules is publishing data validation
  constraints. This is particularly useful in the RDF world where
  validation is tricky and under-supported. Some validation is only
  possible in the absence of certain entailments. For example the rule
  "all things used as an rdfs:Class should be declared as rdf:type
  rdfs:Class" is in practice very useful for identifying errors in RDF
  documents but is meaningless given implied RDFS entailment.

  (ii) Publishing RDFS subsets. One legitimate use of RIF is to publish
  rule-based (i.e. proof theoretic) semantics for RDF specifications
  including various OWL/Tiny subsets that several groups have talked
  about. That might well require control over the minimal RDFS
  entailments to be assumed.

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Monday, 9 July 2007 14:11:31 UTC