- From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
- Date: Thu, 18 Feb 2010 12:18:13 +0000
- To: Ivan Herman <ivan@w3.org>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>, Chimezie Ogbuji <ogbujic@ccf.org>
[snip]
>> I am not 100% happy. I think it is better because I can imagine tools
>> that allow you to choose between getting all triples and getting only
>> those that use the input vocabulary. Thinking even more about this,
>> here is another refinement that might capture even better what most
>> users would expect.
>> (C2) For each variable x in V(BGP), sk(μ(x)) occurs in sk(SG) or in Vocab.
>> (C3) For each triple s p o in P(BGP), either s or o occurs in sk(SG).
>>
> So... I wonder whether we are not going down the false route here. The
> _only_ requirement we have is to ensure finiteness. (C2) ensures that,
> but we could, maybe, make it even more lax by concentrating only on the
> rdf:_i properties and disallow any of those in the conclusion that do
> not appear in the original graph. (This is what ter Horst does.). That
> might ensure finiteness.

C2 is that lax at the moment, since we said vocab is the reserved
vocabulary for the regime you are using minus the rdf:_i, and variable
bindings come either from there or from the queried graph. So if the
graph talks about rdf:_3, then rdf:_3 is allowed. Ter Horst goes even a
little bit further, if I remember correctly, by taking the largest n
such that rdf:_n occurs in the input and allowing all rdf:_m with m<=n.

> The result set will be large? Sure it will. But this _is_ what the
> RDF/OWL semantics dictates, isn't it, so why would we want to second
> guess the user? Does he/she want to control the size of the output?
> Well, that is why, in their infinite wisdom, the authors of SPARQL
> invented FILTER-s...:-)

That's a good point. It might be a bit slower to first add all the
results and then remove them again, but in the end there are not that
many. (More than a user might have expected, but in total numbers not
many).

> If I look at the particular example that you gave, here is a modified
> version:
>
> SELECT ?ind ?class
> WHERE {
>   ?ind a ?class .
>   FILTER(
>     !regex( str(?ind), "^http://www.w3.org/2000/01/rdf-schema#" ) &&
>     !regex( str(?ind), "^http://www.w3.org/1999/02/22-rdf-syntax-ns#")
>   )
> }
>
> with the data:
>
> ex:a a ex:C .
>
> and, I believe, with RDFS entailment we would get what we want.

True.

> So I believe we may want to examine an alternative route:
>
> - define (C2) to be the absolute strict minimum to ensure a finite solution
> - look at the existing FILTER possibilities to see if we can handle
> those common use cases. We may want to propose some shortcuts, like the
> one above, ie, some sort of an operation which says
>
> inNamespace( ?x, URI )
>
> which means that ?x as a term starts with the URI.
>
> What do you (and others!) think?

I think that could work out very nicely :-)

Birte

> I have the gut feeling it will also help in defining the RIF
> alternative, b.t.w.
>
> Ivan
>
>
>> Here, C2 is very lax and it allows almost all axiomatic triples, but
>> still guarantees finiteness of the answers in all cases. Now C3
>> basically says that if the instantiated BGP has nothing to do with
>> your data, then omit it. I omit the predicate as an option in C3
>> since predicates such as rdf:type occur in most graphs (possibly by
>> implicit triples as in OWL) and will then let through most axiomatic
>> triples. Answers that are then filtered out are basically axiomatic
>> triples that you didn't mention at all. This could be relaxed to (not
>> sure which one makes more sense, needs more thinking)
>> (C3) There is at least one triple s p o in P(BGP) such that either s
>> or o occurs in sk(SG).
>>
>> Going back to the examples, if we have the query:
>> SELECT ?ind ?class WHERE { ?ind a ?class }
>> and data:
>> [[
>> ex:a a ex:b .
>> ]]
>> as in your generated output, then you would get under OWL RL
>> ?ind/ex:a, ?class/ex:b
>>
>> For the query
>> SELECT ?r WHERE { ex:a ?r ex:a }
>> over
>> [[
>> ex:a ex:b ex:c .
>> ]]
>> you would get
>> ?r/owl:sameAs
>> no matter whether there is ex:something owl:sameAs ex:something or not.
>>
>> It is still possible to generate artificial examples, but less so. E.g.,
>> SELECT ?type WHERE { rdf:type a ?type }
>> over the empty graph gives you nothing since C3 cannot be satisfied in
>> an empty graph. If we query over
>> [[
>> ex:b a ex:c.
>> ]]
>> we get
>> ?type/rdf:Property
>> C2 holds because rdf:Property is in vocab (assuming we do RDF(S) entailment)
>> C3 holds because the instantiated BGP is rdf:type a rdf:Property and
>> the subject rdf:type occurs in its abbreviated form in your data. That
>> seems a nicer compromise. If we don't exclude axiomatic triples, we
>> get infinite answers, and if we exclude some, we get some non-local
>> side effects, but that cannot be totally avoided.
>>
>>> Well... this is clearly not our decision. To be formal, we should
>>> definitely flag that as an issue to be discussed. In some ways, the
>>> question is: is it better to have many potential responses (ie, the user
>>> will have to filter things out) or a small number though some expected
>>> results will not be returned (only via ugly tricks).
>>
>> Yes.
>> I would prefer too many rather than too few. Tools will hopefully
>> provide a way to be configured such that they hide what I don't want
>> to see, while still giving me the chance to see it if I want to.
>>
>> [snip]
>>>> Another would be to define
>>>> something like "extensible entailment regimes", where the entailment
>>>> regime is a combination of one of the defined semantics plus some
>>>> rules/axioms that the endpoint will always apply, e.g., the SKOS
>>>> axioms are always assumed to be present.
>>>
>>> I think the answer is: this is where RIF comes in. That gives me the
>>> necessary flexibility (and I can always simulate OWL 2 RL level with a
>>> RIF rule set).
>>
>> Yes. That would be nice.
>>
>>> B.t.w.: a point here for the future RIF discussion:
>>>
>>> - we have something for OWL 2 RL
>>> - we will have something for RIF (hopefully)
>>> - there is a RIF document that, essentially, defines a RIF rule set for
>>> OWL 2 RL
>>>
>>> Surely the result of the entailment should be the same whether I use the
>>> OWL 2 RL entailment definition or the RIF one and use the official rule
>>> set...
>>
>> Yes. It would be not nice at all if that were not the case.
>>
>> Cheers,
>> Birte
>>
>>> Cheers
>>>
>>> Ivan
>>>
>>>> Birte
>>>>
>>>>>> We might still want to exclude axiomatic triples unless they occur in
>>>>>> the input because they potentially add lots of answers that you don't
>>>>>> really want.
>>>>>
>>>>> See above. But if I am wrong, I do not mind that either.
>>>>>
>>>>>> Another possibility would be to apply C2 only to variables in subject
>>>>>> and object position and allow anything in the predicate position. That
>>>>>> still can cause counterintuitive side effects, but many more cases are
>>>>>> covered. In that case, inconsistencies need extra care because in
>>>>>> principle this would still allow infinite answers if the given graph
>>>>>> is inconsistent and we just assume the scoping graph to be equivalent
>>>>>> to the queried graph no matter what.
>>>>>
>>>>> I guess you had that in one of the first drafts and we did not really
>>>>> like it:-(
>>>>> [snip]
>>>>>>>
>>>>>>> Sigh.
>>>>>>
>>>>>> Sigh too.
>>>>>
>>>>> :-)
>>>>>
>>>>> Ivan
>>>>>
>>>>>> Birte
>>>>>>
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>> [1]
>>>>>>> http://www.w3.org/TR/2009/REC-owl2-rdf-based-semantics-20091027/#Appendix:_Axiomatic_Triples_.28Informative.29
>>>>>>> [2]
>>>>>>> http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/#Entity_Declarations_and_Typing
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2010-2-16 18:42 , Birte Glimm wrote:
>>>>>>>> Hi all,
>>>>>>>> I have committed a new version of the entailment regimes document:
>>>>>>>> http://www.w3.org/2009/sparql/docs/entailment/xmlspec.xml
>>>>>>>>
>>>>>>>> There is now a description of the OWL RDF-Based Semantics incl. the
>>>>>>>> OWL 2 RL profile. The OWL 2 RL profile can also be used with Direct
>>>>>>>> Semantics, so I have added that there too. Further I have added a
>>>>>>>> section about aggregates with RDF(S) entailment, addressing at least
>>>>>>>> parts of Axel's comments (no owl:sameAs discussion yet for
>>>>>>>> aggregation). I also defined the behaviour for inconsistent graphs
>>>>>>>> more clearly because the previous spec didn't define the scoping graph
>>>>>>>> in the case of inconsistencies. It was rather assumed that the scoping
>>>>>>>> graph is still equivalent to the active graph, so that systems can
>>>>>>>> just use the graph as is modulo bnode renaming, but that allowed
>>>>>>>> infinite answers for inconsistent graphs. I now use Axel's suggestion
>>>>>>>> for condition C2 and require not only bindings for variables in
>>>>>>>> subject position to occur in the input, but require this for all
>>>>>>>> variables. This also solves the OWL RDF-Based semantics problem where
>>>>>>>> you can have infinite answers from owl:topDataProperty, which relates
>>>>>>>> an individual to all data values. Now all RDF-Based regimes (RDF,
>>>>>>>> RDFS, OWL 2 RDF-Based (for OWL Full and OWL RL)) use the same
>>>>>>>> definitions, which is nice IMO.
>>>>>>>>
>>>>>>>> Birte
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>>> FOAF : http://www.ivan-herman.net/foaf.rdf
>>>>>>> vCard : http://www.ivan-herman.net/HermanIvan.vcf

--
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Thursday, 18 February 2010 12:18:46 UTC
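
Illustration: the conditions (C2) and (C3) discussed in this message can be read as a post-filter on candidate solutions. The following is only a minimal sketch under several assumptions that are not part of the message: terms are plain IRI strings, variables are strings starting with "?", the graph contains no blank nodes (so skolemization is skipped), VOCAB is a tiny illustrative subset of the RDF(S) reserved vocabulary with the rdf:_i properties left out, and the ex: namespace and all function names are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/"  # hypothetical namespace standing in for ex:

# Assumption: a small illustrative subset of the reserved vocabulary.
# The rdf:_i membership properties are deliberately not included.
VOCAB = frozenset({RDF + "type", RDF + "Property", RDFS + "Class", RDFS + "Resource"})


def graph_terms(graph):
    """All terms occurring anywhere in the graph (subject, predicate or object)."""
    return {term for triple in graph for term in triple}


def instantiate(bgp, mu):
    """Replace every variable in the pattern by its binding in mu."""
    return [tuple(mu.get(term, term) for term in triple) for triple in bgp]


def satisfies_c2(mu, graph, vocab=VOCAB):
    """(C2): every bound value occurs in the graph or in the vocabulary."""
    allowed = graph_terms(graph) | vocab
    return all(value in allowed for value in mu.values())


def satisfies_c3(bgp, mu, graph, every_triple=True):
    """(C3): subject or object of the instantiated triples occurs in the graph.

    every_triple=True is the strict reading (each triple must qualify);
    every_triple=False is the relaxed reading (at least one triple qualifies).
    """
    terms = graph_terms(graph)
    hits = [s in terms or o in terms for s, _p, o in instantiate(bgp, mu)]
    return all(hits) if every_triple else any(hits)


def keep_solution(bgp, mu, graph):
    """Keep a candidate solution only if it passes both conditions."""
    return satisfies_c2(mu, graph) and satisfies_c3(bgp, mu, graph)


# The example from the message: SELECT ?type WHERE { rdf:type a ?type }
bgp = [(RDF + "type", RDF + "type", "?type")]
mu = {"?type": RDF + "Property"}
data = {(EX + "b", RDF + "type", EX + "c")}

print(keep_solution(bgp, mu, set()))  # over the empty graph
print(keep_solution(bgp, mu, data))   # over "ex:b a ex:c."
```

Under these assumptions the candidate ?type/rdf:Property is rejected over the empty graph, because (C3) cannot be satisfied there, and kept over the one-triple graph, because rdf:type occurs in the data. A binding to rdf:_3 would pass (C2) only if the queried graph itself mentions rdf:_3.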