- From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
- Date: Sun, 23 May 2010 14:23:41 +0100
- To: Chimezie Ogbuji <ogbujic@ccf.org>
- Cc: Axel Polleres <axel.polleres@deri.org>, Sandro Hawke <sandro@w3.org>, Ivan Herman <ivan@w3.org>, W3C SPARQL WG <public-rdf-dawg@w3.org>
Hi Chime, somehow the mail got lost in all the other SPARQL mails, so I am a bit late with answering it. I only comment on what needs clarification. For all the comments that are already addressed by changes in the document I am happy with the outcome. 2010/5/20 Chimezie Ogbuji <ogbujic@ccf.org>: > Birte, thanks for the comments. I intend to address them (hopefully by the > end of the night) but there are some principled concerns that I wanted to > raise ASAP (because I believe they warrant an editorial note) so there is > enough time to work through the changes we resolved to make with your > approval prior to publication. [snip] >> There are still several places where we seem to enforce RIF strongly >> safe core documents, which is no longer necessary given the relaxed >> conditions on extensions to BGP matching. > > This is the part that is problematic for me. For instance, RDF entailment > *enforces* finite answers and appeals to the relaxation of the restrictions > by (essentially) categorizing entailment from BNodes and axiomatic triples > as 'trivial' (I put trivial in quotes because - as Andy mentioned - the word > is problematic and often used subjectively as is the case here). > > It seems to me like the same standard is being applied differently and > justified by a subjective determination of what kind of infiniteness is > appropriate to rule out. I consider them trivial, because all answers that are omitted from the actual solutions can be reconstructed from the solutions without even knowing the queried graph (at least under RDF/RDFS entailments). E.g., if a query SELECT ?x WHERE { ?x ex:b ex:c } gives the answers ?x/ex:a1 and ?x/ex:a2, then I can get all the infinitely many possible solutions by replacing ex:a1 and ex:a2 with all (infinitely many) blank nodes. Each so obtained answer, e.g., _:xxx ex:b ex:c is entailed by the queried graph, although you don't know which triples were in the graph. Thus, computing also all possible answers can be done by a post-procssor that only has to know the query and the returned answers. Giving you all the (infinite) answers straight away is really not necessary and I could even start returning binding of the for ?x/_:bnode1, ?x/_:bnode2, etc which means you never get to see the important 2 answers in a finite amount of time. I believe a similar argument can be given for the axiomatic triples. With rules, however, this is not the case, i.e., there are rule sets that cause infinite answers, but I cannot construct all the answers from a finite subset of the answers and the infinite answer set has no redundancies. I am still not sure about how to writ RIF rules properly, but imagine a rule that generates odd numbers (I believe that can be done given Odd(1) as a fact and then a rule that uses built-ns to increment a given x by two like Odd(x) implies Odd(x+2)). Now if you query for odd numbers and cut of the solutions at any point, you cannot say without knowing the rue set whether the returned solutions come from asserted fact or from inferences and whether somehow the inferences stopped because a rule was no longer applicable or whether the solutions have just been omitted because of some cut-off that is used to guarantee finiteness. Maybe we should put in something stronger in condition 4 that really properly defines some notion of redundancy in solutions, but that requires than proof that the regimes meet the defined redundancy condition and it will take some time to work this out properly. BTW, another problem the you easily get, when you don't filter out bnode redundancies is that condition 3 is violated. E.g., if you have SG: ex:a ex:b ex:c . ex:d ex:e ex:f . and the query SELECT ?x WHERE { ex:a ex:b ?x . ?y ex:e ex:f . } Then without filtering bnodes in any way, you get among other answers mu1: ?x/_:bnode1, ?y/_:bnode2 but also mu2: ?x/_:bnode2, ?y/_:bnode1 Condition 3 requires that SG entails SG union P1(BGP) union P2(BGP) Note that since the BGP of the query has no bnodes no renaming into BGP1 and BGP2 is really required and even the empty RDF instance mapping sigma can be used both for P1 and P2 to give: SG entails ex:a ex:b ex:c . ex:d ex:e ex:f . ex:a ex:b _:bnode1 . _:bnode2 ex:e ex:f . ex:a ex:b _:bnode2 . _:bnode1 ex:e ex:f . most interesting here is SG entails ex:a ex:b _:bnode1 . _:bnode1 ex:e ex:f . and SG entails ex:a ex:b _:bnode2 . _:bnode2 ex:e ex:f . These two entailments do NOT hold. Reusing the same bnode in different query answers, where the bnodes refer to different elements in the graph causes these problems. Each query answer by itself is entailed, but putting them all together as condition 3 requires is not possibe. Thus, filtering out bnode redundancies is not only required for finiteness. This is what C1 does and although you define the Skolemisation as I do, you don't have C1 any more, which requires in particular that Skolemizing P(BGP) results in ground triples, i.e., all bnodes in P(BGP) are from the known bnodes in SG. >> ..snip.. Ideally, I would just >> declare this as solutions without messing around, but unfortunately >> blank nodes and axiomatic triples can introduce infinitely many >> redundant answers that nobody really wants to see. This infiniteness >> is quite different than those that you get from recursive rules say. > > I don't think this form of infiniteness is different from those you get from > recursive rules. An infinite solution set is an infinite solution set > regardless of how you come about it, and if the general goal was to allow > infinite solution sets then we should just remove the restriction altogether > rather than introducing ambiguity in order to justify removing some forms of > infiniteness. Unfortunately, I wasn't able to participate in the F2F to > argue this point. see above >> ..snip..constant in question cannot have been in the queried graph. Anyway, >> long answer with mostly theoretical concerns ;-), but I would prefer >> to use sk as in the other regimes or, if you don't want that, at least >> say that sk is defined for any bnode in any BGP. > > I appreciate the long answer, it does help me understand skolemization as it > relates to how we reduce infiniteness via Bnode. However, I'm not sure > exactly where my use of sk differs from the other regimes since I > essentially copied the text from that of the other regimes. Can you show me > exactly where the difference is? as pointed out above the keyword in C1 (which you don't have) is that sk(P(BGP)) are ground, i.e., bnode free triples, which means all bnodes are from the known ones in the scoping graph. For other bnodes sk is not defined and maps them to themselves. >> 7.3 >> ... This entailment regime restricts the legal graphs to only those >> that refer to strongly safe RIF core documents. ... >> That is no longer required. We can now allow also not strongly safe >> RIF core documents, > > Yes, but as I've mentioned before, by that same logic RDF entailment (for > example) should not include provisions to rule out infiniteness by relying > only on a subjective interpretation of what is trivial. If we want to apply > the standard in the same way, then RDF entailment (and the others) should be > defined such that infiniteness is *not* ruled and an informative section > should be added where the conditions that rule them out are placed. > Otherwise, it is a double standard. Not having C1 means that any answer set is infinite and full of redundancy. This would render the regimes pretty much useless and condition 3 cannot be satisfied either. One could argue that infiniteness from axiomatic triples is slightly different, but all these triples are know and do not really depend on the queried graph, so adding them to the answer does again not really give you anything that you didn't know before, so for me there is a difference. >> but in that case, finiteness is not guaranteed. >> What the changed condition on extensions to BGP matching requires is >> that trivial sources of infinite answers should be defined and >> excluded by the regime. This is deliberately a bit vague, > > Yes, very vague. So much so, that I wonder about the value of using > 'trivial' at all. For me the previous condition was pretty ok and if you have anything that is close to the original one, but works for RIF, I am happy to strengthen it. Loosen it by even dropping trivial is not an option for me. Birte
Received on Sunday, 23 May 2010 13:24:15 UTC