- From: Pat Hayes <phayes@ihmc.us>
- Date: Tue, 17 Jan 2006 20:44:02 -0600
- To: Enrico Franconi <franconi@inf.unibz.it>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>, bparsia@isr.umd.edu
> A quick reaction:
>
> On 17 Jan 2006, at 17:42, Pat Hayes wrote:
>
>> Enrico Franconi wrote:
>>
>>> The new proposal of Pat
>>> <http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0061.html>
>>> does not work for any approach where bnodes are implicit, and this
>>> happens not only for OWL-Lite entailment, as we already pointed out in
>>> <http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0064.html>,
>>> but also for RDF entailment. For example, given the graph
>>>
>>>    :john :age "25"^^xsd:decimal .
>>>
>>> the query
>>>
>>>    ASK { _:b rdf:type rdf:XMLLiteral }
>>>
>>> should clearly return YES if RDF entailment is considered.
>>
>> I agree, and see below.
>>
>>> However, according to the latest proposal from Pat the answer to the
>>> query would be NO regardless of the type of entailment considered. The
>>> reason is that Pat restricts bnodes in the query to be bound only to
>>> URIs and bnodes explicitly appearing in the graph.
>>
>> This restriction is to terms occurring in the scoping graph, not in the
>> original graph. For the SPARQL case of simple entailment, the scoping
>> graph is defined to be equivalent to the dataset graph, but for RDF or
>> RDFS entailment, the appropriate adjustment would be to define the
>> scoping graph G' to be the RDF or RDFS closure of the dataset graph G.
>
> Oh, no. We are not going through the graph closure discussion again.

No, no. Relax. The basic condition on an answer is still that the answer
instance is appropriately entailed by the original graph. The discussion is
only about how best to restrict the vocabulary of legal answer bindings, and
for this purpose the finiteness of the closure is not important. I think this
can be done even for OWL, simply by defining the relevant closure relative to
the comprehension conditions described in the OWL/RDF semantics: it does not
have to entail the answer, only provide a name for it. But forget about
closures: I agree with your quick reaction that appealing to closures is kind
of tacky, so let me make an alternative suggestion, detailed below, which
tries to separate out three different roles in the definition of an answer
and put constraints on each of them.

> The decision to have entailment was exactly to replace the closure of the
> graph. Your proposal does not scale up: the closure of an RDF (or RDFS)
> graph is infinite, and the closure of an OWL-DL graph is not unique
> (OWL-disjunction). Moreover, if you put the scoping graph G' into the
> definition of basic graph pattern matching, algebraic expressions
> involving more than one BGP would not work properly (since you cannot
> control whether there are distinct scoping graphs Gi').

Of course we can control this: we simply impose it by fiat in the
definitions. There must be a single scoping graph used for all answers to a
query. We say this explicitly.

> Full stop. Unacceptable by FUB.

Please try to be more collegial. I believe I have answered your technical
objections.

> As we already pointed out, if you really want a scoping graph G', this
> should appear once and for all outside the definition of basic graph
> pattern matching, most likely at the beginning of any processing of the
> original graph G by the SPARQL server.

Seems to me that you are here confusing definition scope with process scope.
I agree that the definitions must be phrased so that all answers to a single
query are defined with respect to a single scoping graph. (There is no need
for a server to actually generate the scoping graph, of course: it plays a
purely definitional role, so processing-sequence questions seem irrelevant
here (?).)
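To make the single-scoping-graph point concrete, here is a small
illustration (the ex: URIs and the data are invented purely for the
example). Take a dataset graph containing

    _:a  ex:name  "Alice" .
    _:a  ex:age   "30" .

and the query

    PREFIX ex: <http://example.org/>
    SELECT ?x ?n
    WHERE { ?x ex:name "Alice" OPTIONAL { ?x ex:age ?n } }

There are two basic graph patterns here, one of them inside the OPTIONAL. If
each BGP could be matched against its own independently chosen scoping graph,
the data bnode might surface as _:c1 in one and as _:c2 in the other, the
join on ?x would then fail, and ?n would come back unbound. With one scoping
graph fixed for the whole query, both BGPs see the same bnode label and the
answer has ?n = "30". That is all the 'by fiat' condition is for.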
-----

Here is the suggestion, which sticks closely to your construction and
wording, and so does not assimilate bnodes to variables. (I still think that
alternative is better, on both semantic and pragmatic grounds, and I do not
accept your distinction between variables and "true existentials", but I will
not press that particular point any further, in the interests of coming to an
agreement. It makes no difference up to RDFS, other than keeping the
definitions more elegant and clarifying the semantics.)

Comments to the WG are written <<thus>>.

<<Same definitions of OrderedMerge, term, basic graph pattern, etc.>>

First we give a general form for the definition of "basic query answer" which
applies to any entailment regime. SPARQL itself uses the simplest case of
this. <<It might make more sense in the document to do this LAST, in a final
section which is clearly demarcated from the rest of the document and can
refer back to everything else, hence making the rather oblique reference to
'SPARQL syntax and protocols' less enigmatic. Also, we would then have
explained the rationale for all the oddities in the definition.>>

Given an entailment regime E, a basic graph pattern BGP E-matches with
pattern solution S on graph G, with respect to a scoping graph G' and a
scoping set B, just when the following three conditions all hold:

(1) S(G' OrderedMerge BGP) is an appropriately well-formed RDF graph for
    E-entailment;
(2) G E-entails S(G' OrderedMerge BGP);
(3) the identifiers introduced by S all occur in B.

Several conditions must be met by the scoping graph and scoping set: the
scoping graph and scoping set must be identical for all answers to a query;
the scoping graph G' must be graph-equivalent to G; and B must contain every
term in G'.

Any query language which uses the SPARQL syntax and protocols and satisfies
all these conditions [may] be called a SPARQL extension, and be referred to
by the use of an appropriate prefix, such as RDF-SPARQL. SPARQL extensions
[may] impose further conditions on scoping sets, and [may] further restrict
the set of answers, for example to avoid redundancy in answer sets; but any
answer [must] correspond to a pattern solution which satisfies the three
conditions above. Any such extra conditions [must] be stated so as to apply
uniformly to all datasets.

<<This idea of 'extension' is intended to be in alignment with that of an RDF
semantic extension, which I guess is kind of obvious. Seems to me that this
kind of "restriction-in-advance" style fits well with our charter requirement
1.6.>>

For example, it might be considered appropriate for an RDFS extension to
SPARQL to require that all scoping sets contain all the URI references in the
rdf: and rdfs: namespaces. This would allow 'tautological' answers to queries
against the empty dataset, and would correspond exactly to SPARQL queries
posed against the RDFS closure of the dataset graph. Omitting such vocabulary
from B would prohibit such answers, corresponding to a regime in which only
identifiers in the dataset graph could be used in query answers.
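To illustrate (the particular query is only an illustration; any of the RDF
axiomatic triples would do): under RDF or RDFS entailment the empty graph
entails rdf:subject rdf:type rdf:Property, so the query

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?c
    WHERE { rdf:subject rdf:type ?c }

posed against the empty dataset has the 'tautological' answer ?c =
rdf:Property (and, under RDFS entailment, others besides). That binding uses
a URI which occurs nowhere in the dataset graph, so it is a legal answer only
if B contains the rdf: vocabulary; with B restricted to the terms of G'
(here, the empty set), no such answer is allowed.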
<<We could actually define RDF- and RDFS-SPARQL ourselves, with B being all
the terms in G' plus, respectively, the rdf: or the rdf: + rdfs: namespaces.
And we could also define datatyped versions. I would however suggest, and not
merely in a spirit of being obstructive, that we do not actually *define*
OWL-DL-SPARQL. The only point we are leaving open, really, is exactly how to
define the scoping vocabulary B for OWL. I remain concerned that this may
have to be allowed to contain enough vocabulary to construct OWL syntax using
RDF collections.>>

For SPARQL as defined in this document, the conditions are simplified so as
to apply appropriately to simple entailment. Here the scoping set B is simply
the set of terms in the scoping graph G', and the third condition need only
be stated for bnodes.

A basic graph pattern BGP matches with pattern solution S on graph G, with
respect to a scoping graph G', just when the following three conditions all
hold:

(1) S(G' OrderedMerge BGP) is an RDF graph;
(2) G simply entails S(G' OrderedMerge BGP);
(3) the bnodes introduced by S all occur in G'.

Moreover, we require that any scoping graph G' must be graph-equivalent to G,
and that a single scoping graph must be used for all answers to a single
query. <<Note, this last point will need to be firmed up by text elsewhere in
the document.>>

<<This is Enrico's definition modified with the scoping graph, which I
suggest is necessary to avoid requiring that engines deliver the actual
bnodeIDs from the dataset in all answer bindings. Without the scoping graph,
we are in effect defining things so that told-bnodes are automatic. But in
any case, I think it is intuitive, as the actual role of G' has nothing to do
with entailment: it is semantically transparent. It is only there to keep the
bnodes properly in line in answer sets.>>

<<I think the document should have some prose explaining this definition, by
the way: why G' needs to be included in the consequent, what condition (3) is
for anyway, and why S has to be applied 'after' the merge. This can all be
illustrated with a single example. These points are all unconventional, the
reasons for them are not obvious at first blush, and they all have to do with
bnodes, so this will be new thinking for many database-savvy readers. I can
try drafting this if nobody else wants to (a very rough first sketch is in
the PS below).>>

<<There are also a bunch of easy lemmas we could give, such as that this
definition is equivalent to the instance/subgraph definition, and that RDF-
and RDFS-SPARQL are equivalent to SPARQL applied to the closure graphs, which
gives a nice opportunity also to point out why that won't work for OWL. Or
maybe all this should be in a 'user tutorial' document :-)>>

Pat

--
---------------------------------------------------------------------
IHMC                  (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.  (850)202 4416   office
Pensacola             (850)202 4440   fax
FL 32502              (850)291 0667   cell
phayesAT-SIGNihmc.us  http://www.ihmc.us/users/phayes
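PS. The kind of single example I had in mind, as a very rough sketch. The
property ex:p, the nodes ex:a and ex:b, and all the bnode labels are invented
purely for illustration, and I take G' OrderedMerge BGP here to be just the
union of G' and the BGP, since they share no bnode labels. Let the dataset
graph G be

    _:x ex:p ex:a .
    _:y ex:p ex:b .

let the scoping graph G' be G with its bnodes relabelled,

    _:c1 ex:p ex:a .
    _:c2 ex:p ex:b .

and let the BGP be

    { ?u ex:p ex:a . ?v ex:p ex:b }

Take the solution S1 = { ?u -> _:c1, ?v -> _:c2 }. Then S1(G' OrderedMerge
BGP) is just G' itself, which G simply entails, and the bnodes introduced by
S1 all occur in G', so S1 is an answer. Now take the 'crossed' solution
S2 = { ?u -> _:c2, ?v -> _:c1 }. G does simply entail S2(BGP) taken on its
own, but it does not entail S2(G' OrderedMerge BGP); so S2 is not an answer.
Including G' in the consequent is what ties the bnode labels used in answers
to the corresponding nodes of the scoping graph. And condition (3) is what
stops, say, ?u being bound to some arbitrary new bnode _:z that occurs
nowhere in G'.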
Received on Wednesday, 18 January 2006 02:44:12 UTC