- From: Pat Hayes <phayes@ihmc.us>
- Date: Sun, 5 Mar 2006 11:26:31 -0800
- To: Enrico Franconi <franconi@inf.unibz.it>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
>On 1 Mar 2006, at 22:08, Pat Hayes wrote:
>
>>>1) In the first part of my original message I argued that, by carefully
>>>looking at the normative semantics RDF-MT, bnodes are always
>>>interpreted autonomously within the RDF graph where they appear, no
>>>matter where bnodes come from or which (abstract) syntax identity they
>>>have. So, while this is an argument against having BGP', this also
>>>shows that it is absolutely harmless to have BGP'.
>>
>>The RDF MT defines a graph simply as a set of triples, so itself
>>does not provide any way to determine the scope of a blank node. A
>>single triple, and any bnode which it contains, might occur in many
>>different RDF graphs. And the RDF MT refers to graphs, not to
>>entities such as query patterns and answer binding sets, neither of
>>which are RDF graphs. The SPARQL spec does not mention where the
>>RDF graph boundaries are intended to be maintained, so it is up to
>>us to make sure that the definitions draw them correctly.
>
>G' is a graph, S(BGP') is a graph, (G' union S(BGP')) is a graph,
>the CONSTRUCTed graph from an answer set is a graph. The abstract
>syntax representations of these graphs shouldn't care about the
>identity of the bnodes in them (since their interpretation is
>autonomously defined)

No, this is a mistake. The abstract syntax does care about identity of bnodes between graphs, because graphs behave differently with respect to combination depending on whether or not they share bnodes. This is exactly how the abstract syntax handles what is more usually described using some notion of identifier scope. That is, the abstract syntactic properties of RDF graphs are affected by the co-occurrence of blank nodes. This point is independent of the semantics of an RDF graph: it has to do with the nature of the ways that RDF graphs are combined.
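
A minimal sketch of this point, assuming graphs are modelled as Python sets of triples and blank nodes as label-free objects compared only by identity. The names BNode and bnodes are hypothetical, chosen for illustration; they are not taken from the RDF specs or from any RDF library.

```python
class BNode:
    """A blank node: it carries no label, only its own object identity."""
    __slots__ = ()


def bnodes(graph):
    """Return the set of blank nodes occurring anywhere in a set of triples."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


x, y, z = BNode(), BNode(), BNode()
u, v = BNode(), BNode()

# Graph A contains three blank nodes.
A = {(x, "ex:p", y), (y, "ex:p", z)}

# Three candidate graphs B, each a single triple joining two blank nodes.
# Taken alone they are indistinguishable; they differ only in which
# blank nodes they share with A.
B_shared = {(y, "ex:q", z)}    # shares both of its bnodes with A
B_partial = {(z, "ex:q", u)}   # shares one bnode with A
B_disjoint = {(u, "ex:q", v)}  # shares no bnode with A

# Set union is the same operation each time, but the number of blank
# nodes in the result depends on the sharing.
count_shared = len(bnodes(A | B_shared))
count_partial = len(bnodes(A | B_partial))
count_disjoint = len(bnodes(A | B_disjoint))
```

Each candidate B has the same shape in isolation, yet the three unions with A contain different numbers of blank nodes. That difference is a property of how the graphs combine, not of what any one of them means.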
If A has three blank nodes and B has two blank nodes, then their union may have any number between three and five blank nodes, depending on the ways that A and B share or do not share blank nodes.

As you point out, the particular identity of bnodes in RDF graphs is unimportant to their meaning, so imposing disjointness is not any limitation on expressiveness of answer sets. For the purposes of interpretation, we can think of an RDF graph as a mathematical ideal under the group of 1:1-onto mappings from the set of all bnodes to itself. (We did at one point consider making such a definition.) However, this view of graphs does not support the operations of the graph syntax: it does not allow for the possibility of forming the union of two graphs. For unions to be meaningful, we have to respect the particular identity of bnodes: but this is the only reason.

>, nor should limit them in any way.
>
>>>2) In the second part of my original message I argued that if we don't
>>>have BGP' then the abstract syntax of answer sets is limited in a very
>>>peculiar way, disallowing answer sets that contain bnodes that may
>>>appear in the query.
>>
>>But this is not a limitation, since given the document conventions
>>already in use, there is no way they could possibly share a bnode.
>
>What you are saying has nothing to do with the abstract syntax; this
>has to do with the concrete document syntax.

I am aware of the implications of what I say. My point was that the abstract syntactic conditions should accurately mirror the scoping conditions in place for the surface syntax that we use, in the spec, to describe the abstract syntax. These conditions require that the sets of bnodes in the answer bindings, and those in the query pattern, are disjoint.

>The definitions in 2.5 shape the way an answer set could look like
>in its abstract syntax, independently of its linearisation.

Quite.
But there is little point in phrasing them so as to allow shapes which cannot be specified by any surface syntax; and in fact, to do so is an error, IMO, since it implies a generality - in this case, the possibility of identifying a blank node in an answer with a blank node from the query - which we cannot in fact provide to users. If a user were to interpret the answer bindings in a way which conforms to this possibility, that would be a misinterpretation of the SPARQL spec.

>>>This restriction is useless, since we know (point
>>>1 above) that bnodes are always interpreted autonomously within the
>>>graph where they appear, so having the same bnodes as in the query is
>>>fine anyway. This restriction is bad, since not every equivalent answer
>>>set would be legal in sparql.
>>
>>Can you elaborate on that last point? Perhaps with an example? It
>>seems to be a new point in this discussion.
>
>Uh? This has been my point since ever. Here we go again:
>
>For example, suppose to have two engines that, given the same data
>and the same query, differ only on how they represent (in abstract
>syntax)

The representation used by an engine will of necessity be some concrete syntax, not the abstract syntax itself.

>the final CONSTRUCTed graph. The first one chooses not to use in the
>final CONSTRUCTed graph any bnode appearing in the query.

That does not make sense. Engines cannot use blank nodes from the abstract syntax, which are abstract mathematical entities: they must use some concrete data structure.

>The second one uses in the final CONSTRUCTed graph some bnode
>appearing in the query.

What does that mean? Engines cannot access particular bnodes: bnodes are not data structures. Do you mean it to be the case that if an engine were to perform an operation on its representation corresponding to forming the union of the query and the CONSTRUCTed graph from this query, that they would have a common bnode?
This would not be supported by the SPARQL spec, but it would follow from your definitions. I suggest that it should not follow from them, and that to allow this possibility is an error, as it could imply consequences which are not intended by the definitions, and entailments which are not supported by any intended reading of the SPARQL surface syntactic rules.

>The two CONSTRUCTed graphs are graph-equivalent, but if we choose
>not to have BGP' then the second engine would not satisfy the
>conditions in 2.5, and therefore could not be called SPARQL
>compliant.

As it should not be. We should require that SPARQL engines keep bnodes in answer bindings distinct from bnodes in queries, as our own document scoping rules imply this separation.

>Compare this with "2.7 Blank Nodes in Query Results"
><http://www.w3.org/2001/sw/DataAccess/rq23/#BlankNodesInResults>.
>There it is said that the bnode identity in the result is obviously
>irrelevant:
>"These two results have the same information: the blank nodes used
>to match the query are different in the two solutions. There is no
>relation between using _:a in the results and any blank node label
>in the data graph."

This is talking about bnode LABELS in documents. The 'no relation between' is an informal way of saying that the query and results documents have disjoint label scopes, so the label "_:a" in one does not indicate the same bnode as the same label in the other. The corresponding mathematical way of saying the same thing is that the sets of bnodes in the underlying abstract structures are disjoint. If the sets were not disjoint, then it would be incorrect to say that the bnode label scopes of those documents were separate.

>However, without BGP' you limit the choice of bnodes in the result
>*not* to include some bnodes, namely the ones appearing in the query.
>This is bad.

Not only is it not bad, it is required, in order to make the abstract syntax definitions correctly correspond to the document scope definitions. There is of course no way in any surface syntax to identify a particular blank node, so discussions of absolute identity of blank nodes are syntactically meaningless: it is meaningful only to discuss sharing or not of bnodes between structures. Imposing disjointness between bnode sets is however a real syntactic constraint, one that is reflected in detectable properties of a surface syntax: in our case, it means that SPARQL does not support an answer semantics which would require bnodes to be shared between query patterns and answer bindings. And indeed, it does not. So, we should phrase the definitions to reflect this fact.

Pat

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes

Received on Sunday, 5 March 2006 19:26:52 UTC