- From: Enrico Franconi <franconi@inf.unibz.it>
- Date: Sat, 3 Sep 2005 18:52:24 +0200
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
1) On 2 Sep 2005, at 12:58, Seaborne, Andy wrote <http://www.w3.org/mid/43183071.1070508@hp.com>:

> = subgraph / entailment
>
> The RDF MT defines three kinds of entailment - simple, RDF and RDFS.
> RDF and RDFS are examples of vocabulary entailment.
>
> SPARQL basic patterns are defined to match by subgraph - the graph
> being matched against contains RDF and can have some level of
> entailment applied or not. Your first example misses this because
> you show the data, without a declaration of the entailment to be
> applied. The SPARQL query can execute against a simple entailed
> version or RDF entailed version (or "zero entailment").

OK, then it is necessary to change:

s/subgraph of/entailed by/

in defn of basic pattern (we have noticed that this change has been done already twice...) and somewhere there should be the ability to declare the type of entailment (simple, RDF, RDFS - as defined in RDF-MT).

======

2) On 2 Sep 2005, at 12:58, Seaborne, Andy wrote <http://www.w3.org/mid/43183071.1070508@hp.com>:

> = Blank Nodes in query results
>
> Blank nodes as distinguished variables can't be returned in SELECT
> queries. This is by design. An application should use a named
> variable if it wants to return the binding in a solution.

Uhu? We said "Blank nodes as binding of distinguished variables", which are clearly allowed (see, e.g., the example <http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/#BlankNodes>).

So, the problem we mention in the comments remains: the minimality of answers is not guaranteed, since tuples in the answer set may be redundant. In our example we show how two equivalent graphs (i.e., they entail each other) give different answers to the same query. This is due to the fact that minimality is not required. We need to enforce in the definition the minimality of answers. This may happen if there are bnodes in the result (and \top unbound nodes in the result - see below).
This can be easily and efficiently implemented (e.g., just hash the bnodes in the answer whenever you get them, and check).

======

3) On 2 Sep 2005, at 12:58, Seaborne, Andy wrote <http://www.w3.org/mid/43183071.1070508@hp.com>:

> = Blank Nodes in Queries
>
> """
> A blank node in a query pattern “behaves as a variable; a blank node
> in a query pattern may match any RDF term”.
> """
>
> then the solution of a basic pattern is described in terms of
> matching variables. This is supposed to cover the case of bNodes
> from the query pattern as they are treated as variables and so have
> bindings.
>
> Could you suggest wording that would make that clearer?

We suggest updating the definition of "pattern solution" to explicitly include the bnodes in addition to the variables.

======

4) About unbound values in the answer.

Statement: unbound values generated by queries with an "optional" part are different from unbound values generated by unsafe queries. We suggest forbidding unsafe queries.

Consider a variation of the example by Andy from <http://www.w3.org/mid/431852CC.5030402@hp.com>:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

_:a rdf:type foaf:Person .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@example.com> .
_:b rdf:type foaf:Person .
_:b foaf:name "Bob" .

Consider the following query with an "optional" part:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
        OPTIONAL { ?x foaf:mbox ?mbox }
      }

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" | <mailto:alice@example.com> |
| "Bob"   |                            |
----------------------------------------

Here, the meaning of the unbound value is that *no* RDF term may be the mbox of Bob. We call this unbound value "\bottom".
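To make the \bottom reading concrete, here is a minimal Python sketch of the "optional" query evaluated as a left outer join over the example data. It is only an illustration: the names `DATA`, `BOTTOM` and `name_with_optional_mbox` are hypothetical, the triples are plain string tuples, and none of this is taken from the SPARQL draft.

```python
# Triples from the example data, written as (subject, predicate, object).
DATA = {
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@example.com>"),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", '"Bob"'),
}

BOTTOM = None  # hypothetical stand-in for the \bottom unbound value


def name_with_optional_mbox(graph):
    """Evaluate { ?x foaf:name ?name . OPTIONAL { ?x foaf:mbox ?mbox } }."""
    rows = []
    for subject, predicate, name in graph:
        if predicate != "foaf:name":
            continue
        # Look up every mbox of the same subject (the optional part).
        mboxes = [o for s, p, o in graph if s == subject and p == "foaf:mbox"]
        if mboxes:
            rows.extend((name, mbox) for mbox in mboxes)
        else:
            # No match for the optional part: keep the row, leave ?mbox unbound.
            rows.append((name, BOTTOM))
    return sorted(rows, key=lambda row: row[0])


for row in name_with_optional_mbox(DATA):
    print(row)
```

In this sketch the unbound value for Bob appears only because the lookup over the data found nothing, which is the sense in which it is a \bottom.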
On the other hand, consider the unsafe query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name }

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" |                            |
| "Bob"   |                            |
----------------------------------------

Here, the meaning of the unbound values in the result is that *any* RDF term may be in that part of the answer. We call this type of unbound value "\top". As a matter of fact, the \top value is just a shortcut generating an (infinite) answer set that contains any possible RDF term in place of the \top unbound value.

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" | <mailto:alice@example.com> |
| "Alice" | <mailto:bruno@example.com> |
| "Alice" | "Bob"                      |
...
| "Bob"   | <mailto:alice@example.com> |
| "Bob"   | <mailto:bruno@example.com> |
| "Bob"   | "Bob"                      |
...
----------------------------------------

Let's now extend the query as in the original example by Andy:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { { ?x foaf:name ?name }
        UNION
        { ?x foaf:name ?name .
          ?x foaf:mbox ?mbox }
      }

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" |                            |
| "Bob"   |                            |
| "Alice" | <mailto:alice@example.com> |
----------------------------------------

Here, both unbound values are \top values, since they came from an unsafe subquery. However, note that this answer is not minimal, i.e., it contains a redundant part: in fact, the last row is already expressed by the first row. This can be understood also by noting that the first subquery contains completely the second subquery, as expected for logical reasons.
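The redundancy just described can be checked mechanically. The following Python sketch is one possible way to do it, under the assumption that a \top value stands for every RDF term: a row is dropped when another row agrees with it on every column or has \top there. The names `TOP`, `subsumes` and `minimize` are hypothetical and not part of any SPARQL draft; \bottom values are not modelled here.

```python
TOP = object()  # hypothetical marker for the \top unbound value


def subsumes(general, specific):
    """True if every column of `general` is TOP or equals that of `specific`."""
    return all(g is TOP or g == s for g, s in zip(general, specific))


def minimize(rows):
    """Drop duplicate rows and rows already expressed by a more general row."""
    unique = []
    for row in rows:
        if row not in unique:
            unique.append(row)
    return [
        row
        for row in unique
        if not any(other is not row and subsumes(other, row) for other in unique)
    ]


# The answer to the UNION query, with the unbound values read as \top.
answer = [
    ('"Alice"', TOP),
    ('"Bob"', TOP),
    ('"Alice"', "<mailto:alice@example.com>"),
]

minimal = minimize(answer)
print(len(minimal))  # the third row is subsumed by the first
```

The check is quadratic in the number of rows; it is meant only to show the subsumption test, not to suggest an efficient implementation.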
So, by enforcing minimality (like we do when there are bnodes in the result, representing existential values), the expected answer is:

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" |                            |
| "Bob"   |                            |
----------------------------------------

Please note that a \bottom value in the result - like in the first example with the "optional" part - does not lead to any simplification.

Also note that the "bound" operator is false only in the case of a \bottom unbound value, while it is true in the case of a \top unbound value.

Moreover, note that a \top value in a filter construct is really problematic to handle. Consider an extension of the previous unsafe query with a filter operating on the \top unbound value (which cannot be caught by a "bound" construct):

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
        FILTER ?mbox = <mailto:alice@example.com>
      }

Since the ?mbox variable is unsafe, it may take any possible RDF term as value. So, the following is the correct answer:

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" | <mailto:alice@example.com> |
| "Bob"   | <mailto:alice@example.com> |
----------------------------------------

Even worse, you may need to generate infinite answer sets:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
        FILTER ?mbox >= "27"^^xs:decimal
      }

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" | 27                         |
| "Alice" | 28                         |
| "Alice" | 29                         |
...
| "Bob"   | 27                         |
| "Bob"   | 28                         |
| "Bob"   | 29                         |
...
----------------------------------------

As a final note, as the following query shows, it is really necessary to distinguish explicitly between \top and \bottom unbound values, since they may appear in the same result:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { { ?x foaf:name ?name }
        UNION
        { ?x foaf:name ?name .
          OPTIONAL { ?x foaf:mbox ?mbox } }
      }

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" |                            |
| "Bob"   |                            |
| "Alice" | <mailto:alice@example.com> |
| "Bob"   |                            |
----------------------------------------

This result has the following meaning:

----------------------------------------
| name    | mbox                       |
========================================
| "Alice" | \top                       |
| "Bob"   | \top                       |
| "Alice" | <mailto:alice@example.com> |
| "Bob"   | \bottom                    |
----------------------------------------

Our strong suggestion is to forbid unsafe queries completely, so that the \top unbound value will never appear. This requires a precise syntactical definition of safe queries. This restriction is customary in any database query language. If you don't want to forbid unsafe queries, we guess that you have to be ready to deal with the cases mentioned above.

======

5) We understand now the difference between optional and union (thanks to Andy's example in <http://www.w3.org/mid/1125667385.16011.761.camel@dirk>).

New observation: as it is currently defined, the "optional" construct makes the query language *non-monotonic*; i.e., by adding triples to the RDF data the answer set to a query may decrease. The logic becomes intrinsically harder.

======

6) We would still like to give a name to a simpler sublanguage, which should have a clear semantics - and therefore all the implementations should agree on it.
We propose to call "Rich SPARQL" the current language, and "SPARQL" the language without "description of resources" and without "specification and query of RDF datasets" (that is, the provenance issue), since everybody acknowledges that there are very serious semantic problems with those constructs.

In fact, we agree with Dan's comment in <http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Sep/0008.html>:

> I suppose we could make explicit that for a query pattern P, if S
> is a solution w.r.t. an input graph G, then S(P) is entailed by
> G. Is that what you have in mind?
>
> I think the idea can be expanded to cover UNION
> straightforwardly, and perhaps OPTIONAL with some effort, but I
> don't know how this applies to queries that use the GRAPH
> keyword.

We believe that nice work can be done for what we called above SPARQL, but not for Rich SPARQL. And we volunteer to do it (see [1] for our first attempt).

The current document is definitely not enough to give a precise account of the semantics of SPARQL. In principle, a new document on the semantics of SPARQL should become somehow official for W3C.

cheers
-enrico+sergio

[1] <http://www.inf.unibz.it/krdb/w3c/rdf-sparql-semantics.pdf>

Enrico Franconi - franconi@inf.unibz.it
Free University of Bozen-Bolzano - http://www.inf.unibz.it/~franconi/
Faculty of Computer Science - Phone: (+39) 0471-016-120
I-39100 Bozen-Bolzano BZ, Italy - Fax: (+39) 0471-016-129
Received on Saturday, 3 September 2005 16:52:37 UTC