- From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
- Date: Fri, 19 Feb 2010 10:57:18 +0000
- To: Ivan Herman <ivan@w3.org>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>, Chimezie Ogbuji <ogbujic@ccf.org>, Axel Polleres <axel.polleres@deri.org>
On 19 February 2010 08:26, Ivan Herman <ivan@w3.org> wrote:
> I am not sure, procedurally, what the next step is. Would we need a nod
> from the WG to make these changes on your draft, or would you just go
> ahead and do it? It affects the (C2) condition on previously published
> entailment regimes, too.

You know the official process much better than I do, I guess ;-)

> Personally, I would prefer some other 'go-ahead' reactions from all
> parties interested... You can also raise this shortly on the next telco
> if there is time enough (noting that this document still has the
> 'time-permitting' label attached to it...)

How about putting that on the agenda of our (RIF) entailment regimes
teleconf? At least there we'll have a bigger group and everybody cares
about entailments. If we all agree, then I would just go ahead and
change it unless there are official requirements that require a thumbs
up from the whole group.

Birte

> Ivan
>
> On 2010-2-18 13:18 , Birte Glimm wrote:
>> [snip]
>>>> I am not 100% happy. I think it is better because I can imagine tools
>>>> that allow you to choose between getting all triples and getting only
>>>> those that use the input vocabulary. Thinking even more about this,
>>>> here is another refinement that might capture even better what most
>>>> users would expect.
>>>> (C2) For each variable x in V(BGP), sk(μ(x)) occurs in sk(SG) or in Vocab.
>>>> (C3) For each triple s p o in P(BGP), either s or o occurs in sk(SG).
>>>>
>>> So... I wonder whether we are not going down the false route here. The
>>> _only_ requirement we have is to ensure finiteness. (C2) ensures that,
>>> but we could, maybe, make it even more lax by concentrating only on the
>>> rdf:_i properties and disallow any of those in the conclusion that do
>>> not appear in the original graph. (This is what ter Horst does.) That
>>> might ensure finiteness.
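The refined conditions (C2) and (C3) quoted above can be sketched in Python roughly as follows. This is only an illustration under assumptions that are not in the thread: terms are plain strings, variables start with "?", blank nodes are taken to be skolemized already (so sk is the identity), P(BGP) is read as the BGP instantiated with μ, and all function names are made up for this sketch.

```python
# Hypothetical sketch of conditions (C2) and (C3); not taken from the draft.
# A triple is a (subject, predicate, object) tuple of strings.

def terms_of(graph):
    """All subjects, predicates and objects occurring in a set of triples."""
    return {term for triple in graph for term in triple}

def instantiate(bgp, mu):
    """Apply the solution mapping mu to every triple pattern of the BGP."""
    return [tuple(mu.get(t, t) for t in pattern) for pattern in bgp]

def satisfies_c2(bgp, mu, scoping_graph, vocab):
    """(C2): every variable binding occurs in the scoping graph or in vocab."""
    allowed = terms_of(scoping_graph) | set(vocab)
    variables = {t for pattern in bgp for t in pattern if t.startswith("?")}
    return all(v in mu and mu[v] in allowed for v in variables)

def satisfies_c3(bgp, mu, scoping_graph):
    """(C3): in each instantiated triple, subject or object occurs in the graph."""
    present = terms_of(scoping_graph)
    return all(s in present or o in present
               for s, _p, o in instantiate(bgp, mu))

# The rdf:type example from the thread: { rdf:type a ?type }
bgp = [("rdf:type", "rdf:type", "?type")]
mu = {"?type": "rdf:Property"}
vocab = {"rdf:type", "rdf:Property"}      # assumed excerpt of the RDF(S) vocabulary
data = {("ex:b", "rdf:type", "ex:c")}     # "ex:b a ex:c ."
```

With these definitions the candidate answer ?type/rdf:Property is rejected over the empty graph (C3 fails) and accepted over the one-triple graph, because rdf:type occurs there as the predicate written "a".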
>>
>> C2 is that lax at the moment since we said vocab is the reserved
>> vocabulary for the regime you are using minus the rdf:_i, and variable
>> bindings come either from there or from the queried graph, so if the
>> graph talks about rdf:_3, then rdf:_3 is allowed. Ter Horst goes even
>> a little bit further, if I remember correctly, by taking the largest n
>> such that rdf:_n occurs in the input and allowing all rdf:_m with m<=n.
>>
>>> The result set will be large? Sure it will. But this _is_ what the
>>> RDF/OWL semantics dictates, isn't it, so why would we want to second
>>> guess the user? Does he/she want to control the size of the output?
>>> Well, that is why, in their infinite wisdom, the authors of SPARQL
>>> invented FILTER-s...:-)
>>
>> That's a good point. It might be a bit slower to first add all the
>> results and then remove them again, but in the end there are not that
>> many. (More than a user might have expected, but in total numbers not
>> many).
>>
>>> If I look at the particular example that you gave, here is a modified
>>> version:
>>>
>>> SELECT ?ind ?class
>>> WHERE {
>>>   ?ind a ?class .
>>>   FILTER(
>>>     !regex( str(?ind), "^http://www.w3.org/2000/01/rdf-schema#" ) &&
>>>     !regex( str(?ind), "^http://www.w3.org/1999/02/22-rdf-syntax-ns#")
>>>   )
>>> }
>>>
>>> with the data:
>>>
>>> ex:a a ex:C .
>>>
>>> and, I believe, with RDFS entailment we would get what we want.
>>
>> True.
>>
>>> So I believe we may want to examine an alternative route:
>>>
>>> - define (C2) to be the absolute strict minimum to ensure a finite solution
>>> - look at the existing FILTER possibilities to see if we can handle
>>> those common use cases. We may want to propose some shortcuts, like the
>>> one above, ie, some sort of an operation which says
>>>
>>> inNamespace( ?x, URI )
>>>
>>> which means that ?x as a term starts with the URI.
>>>
>>> What do you (and others!) think?
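To make the proposed inNamespace( ?x, URI ) shortcut concrete, here is a small Python sketch of the same prefix test applied to solutions after the fact. It is an illustration only: the function names, the dictionary representation of solutions, and the expansion of ex: to http://example.org/ are assumptions, and the sample rows are invented rather than the output of any entailment engine.

```python
# Hypothetical sketch of the inNamespace idea as a post-filter on solutions.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

def in_namespace(term, namespace_uri):
    """True if the term, taken as a string, starts with the namespace URI."""
    return str(term).startswith(namespace_uri)

def drop_reserved(solutions, var="ind", namespaces=(RDF_NS, RDFS_NS)):
    """Keep only solutions whose binding for `var` lies outside the namespaces."""
    return [
        row for row in solutions
        if not any(in_namespace(row[var], ns) for ns in namespaces)
    ]

# Invented sample solutions for SELECT ?ind ?class WHERE { ?ind a ?class }
rows = [
    {"ind": "http://example.org/a", "class": "http://example.org/C"},
    {"ind": RDF_NS + "type", "class": RDF_NS + "Property"},
    {"ind": RDFS_NS + "Class", "class": RDFS_NS + "Class"},
]
kept = drop_reserved(rows)
```

Here kept contains only the ex:a row, which mirrors what the two negated regex tests in the FILTER above are meant to achieve.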
>>
>> I think that could work out very nicely :-)
>>
>> Birte
>>
>>> I have the gut feeling it will also help in defining the RIF
>>> alternative, b.t.w.
>>>
>>> Ivan
>>>
>>>
>>>> Here, C2 is very lax and it allows almost all axiomatic triples, but
>>>> still guarantees finiteness of the answers in all cases. Now C3
>>>> basically says that if the instantiated BGP has nothing to do with
>>>> your data, then omit it. I omit the predicate as an option in C3
>>>> since predicates such as rdf:type occur in most graphs (possibly by
>>>> implicit triples as in OWL) and will then let through most axiomatic
>>>> triples. Answers that are then filtered out are basically axiomatic
>>>> triples that you didn't mention at all. This could be relaxed to (not
>>>> sure which one makes more sense, needs more thinking)
>>>> (C3) There is at least one triple s p o in P(BGP) such that either s
>>>> or o occurs in sk(SG).
>>>>
>>>> Going back to the examples, if we have the query:
>>>> SELECT ?ind ?class WHERE { ?ind a ?class }
>>>> and data:
>>>> [[
>>>> ex:a a ex:b .
>>>> ]]
>>>> as in your generated output, then you would get under OWL RL
>>>> ?ind/ex:a, ?class/ex:b
>>>>
>>>> For the query
>>>> SELECT ?r WHERE { ex:a ?r ex:a }
>>>> over
>>>> [[
>>>> ex:a ex:b ex:c .
>>>> ]]
>>>> you would get
>>>> ?r/owl:sameAs
>>>> no matter whether there is ex:something owl:sameAs ex:something or not.
>>>>
>>>> It is still possible to generate artificial examples, but less so. E.g.,
>>>> SELECT ?type WHERE { rdf:type a ?type }
>>>> over the empty graph gives you nothing since C3 cannot be satisfied in
>>>> an empty graph. If we query over
>>>> [[
>>>> ex:b a ex:c.
>>>> ]]
>>>> we get
>>>> ?type/rdf:Property
>>>> C2 holds because rdf:Property is in vocab (assuming we do RDF(S) entailment)
>>>> C3 holds because the instantiated BGP is rdf:type a rdf:Property and
>>>> the subject rdf:type occurs in its abbreviated form in your data. That
>>>> seems a nicer compromise.
>>>> If we don't exclude axiomatic triples, we
>>>> get infinite answers, and if we exclude some, we get some non-local
>>>> side effects, but that cannot be totally avoided.
>>>>
>>>>> Well... this is clearly not our decision. To be formal, we should
>>>>> definitely flag that as an issue to be discussed. In some ways, the
>>>>> question is: is it better to have many potential responses (ie, the user
>>>>> will have to filter things out) or a small number though some expected
>>>>> results will not be returned (only via ugly tricks).
>>>>
>>>> Yes.
>>>> I would prefer better too many than too few. Tools will hopefully
>>>> provide a way to be configured such that they hide what I don't want
>>>> to see, while still giving me the chance to see it if I want to.
>>>>
>>>> [snip]
>>>>>> Another would be to define
>>>>>> something like "extensible entailment regimes", where the entailment
>>>>>> regime is a combination of one of the defined semantics plus some
>>>>>> rules/axioms that the endpoint will always apply, e.g., the SKOS
>>>>>> axioms are always assumed to be present.
>>>>>
>>>>> I think the answer is: this is where RIF comes in. That gives me the
>>>>> necessary flexibility (and I can always simulate OWL 2 RL level with a
>>>>> RIF rule set).
>>>>
>>>> Yes. That would be nice.
>>>>
>>>>> B.t.w.: a point here for the future RIF discussion:
>>>>>
>>>>> - we have something for OWL 2 RL
>>>>> - we will have something for RIF (hopefully)
>>>>> - there is a RIF document that, essentially, defines a RIF rule set for
>>>>> OWL 2 RL
>>>>>
>>>>> Surely the result of the entailment should be the same whether I use the
>>>>> OWL 2 RL entailment definition or the RIF one and use the official rule
>>>>> set...
>>>>
>>>> Yes. It would be not nice at all if that were not the case.
>>>>
>>>> Cheers,
>>>> Birte
>>>>
>>>>> Cheers
>>>>>
>>>>> Ivan
>>>>>
>>>>>> Birte
>>>>>>
>>>>>>>> We might still want to exclude axiomatic triples unless they occur in
>>>>>>>> the input because they potentially add lots of answers that you don't
>>>>>>>> really want.
>>>>>>>
>>>>>>> See above. But if I am wrong, I do not mind that either.
>>>>>>>
>>>>>>>> Another possibility would be to apply C2 only to variables in subject
>>>>>>>> and object position and allow anything in the predicate position. That
>>>>>>>> still can cause counterintuitive side effects, but many more cases are
>>>>>>>> covered. In that case, inconsistencies need extra care because in
>>>>>>>> principle this would still allow infinite answers if the given graph
>>>>>>>> is inconsistent and we just assume the scoping graph to be equivalent
>>>>>>>> to the queried graph no matter what.
>>>>>>>
>>>>>>> I guess you had that in one of the first drafts and we did not really
>>>>>>> like it:-(
>>>>>>> [snip]
>>>>>>>>>
>>>>>>>>> Sigh.
>>>>>>>>
>>>>>>>> Sigh too.
>>>>>>>
>>>>>>> :-)
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>>> Birte
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ivan
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> http://www.w3.org/TR/2009/REC-owl2-rdf-based-semantics-20091027/#Appendix:_Axiomatic_Triples_.28Informative.29
>>>>>>>>> [2]
>>>>>>>>> http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/#Entity_Declarations_and_Typing
>>>>>>>>>
>>>>>>>>> On 2010-2-16 18:42 , Birte Glimm wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>> I have committed a new version of the entailment regimes document:
>>>>>>>>>> http://www.w3.org/2009/sparql/docs/entailment/xmlspec.xml
>>>>>>>>>>
>>>>>>>>>> There is now a description of the OWL RDF-Based Semantics incl. the
>>>>>>>>>> OWL 2 RL profile. The OWL 2 RL profile can also be used with Direct
>>>>>>>>>> Semantics, so I have added that there too.
>>>>>>>>>> Further I have added a
>>>>>>>>>> section about aggregates with RDF(S) entailment, addressing at least
>>>>>>>>>> parts of Axel's comments (no owl:sameAs discussion yet for
>>>>>>>>>> aggregation). I also defined the behaviour for inconsistent graphs
>>>>>>>>>> more clearly because the previous spec didn't define the scoping graph
>>>>>>>>>> in the case of inconsistencies. It was rather assumed that the scoping
>>>>>>>>>> graph is still equivalent to the active graph, so that systems can
>>>>>>>>>> just use the graph as is modulo bnode renaming, but that allowed
>>>>>>>>>> infinite answers for inconsistent graphs. I now use Axel's suggestion
>>>>>>>>>> for condition C2 and require not only bindings for variables in
>>>>>>>>>> subject position to occur in the input, but require this for all
>>>>>>>>>> variables. This also solves the OWL RDF-Based semantics problem where
>>>>>>>>>> you can have infinite answers from owl:topDataProperty, which relates
>>>>>>>>>> an individual to all data values. Now all RDF-Based regimes (RDF,
>>>>>>>>>> RDFS, OWL 2 RDF-Based (for OWL Full and OWL RL)) use the same
>>>>>>>>>> definitions, which is nice IMO.
>>>>>>>>>>
>>>>>>>>>> Birte
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>>>> mobile: +31-641044153
>>>>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>>>>> FOAF : http://www.ivan-herman.net/foaf.rdf
>>>>>>>>> vCard : http://www.ivan-herman.net/HermanIvan.vcf

--
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Friday, 19 February 2010 10:57:52 UTC