From: Seaborne, Andy <andy.seaborne@hp.com>

Date: Thu, 19 May 2005 11:43:22 +0100

Message-ID: <428C6DCA.4000804@hp.com>

To: Dan Connolly <connolly@w3.org>

CC: www-archive@w3.org, Pat Hayes <phayes@ihmc.us>, "Eric Prud'hommeaux" <eric@w3.org>

Date: Thu, 19 May 2005 11:43:22 +0100

Message-ID: <428C6DCA.4000804@hp.com>

To: Dan Connolly <connolly@w3.org>

CC: www-archive@w3.org, Pat Hayes <phayes@ihmc.us>, "Eric Prud'hommeaux" <eric@w3.org>

Dan, Thanks for spending the time on this - your comments are very helpful. This message responses to the graph pattern comments. I'll rework the solution processing sections along the lines you suggest. Dan Connolly wrote: > On Wed, 2005-05-18 at 14:53 +0100, Seaborne, Andy wrote: > >>Formal Definitions from SPARQL Query Language for RDF > > >>Definition: Query Variable >> >>A named query variable is a member of the set V where V is infinite >> and disjoint from RDF-T. V is otherwise arbitrary. >> >>An unnamed query variable is a blank node in the query pattern and is >>disjoint from any blank node in the RDF graphs being matched. > > > That 2nd sentence seems like it belongs elsewhere. > Ah... I think it's best to leave it out of the abstract > definitions, and note in/near the grammar (or just > outside the box) that a parser > needs to come up with a variable name that's distinct > from all the others when it parses the [] construct. Agreed. A general point that comes up several times is that the definitions and description shoudl largely be independent of the concrete syntax and the parser. There may be other concrete syntaxes; queries may come from other places (e.g. build programmatically). Here, the key point is that blank nodes ([] or _:a form) must be disjoin from the dataset. It kinda falls out naturally because the fundamental matching is entailment so I will move the second part out of the definition. I do think that the definition of "query variable" needs to cover blanks nodes (hence "named"). Does that work for you? > > > >>Definition: Graph Pattern >> >>A Graph Pattern is constructed from triple patterns and value >>constraints, using the following operations to combine patterns: >> >> * Basic Graph Pattern, set set of triple patterns >> * Group Pattern, where each of a set of graph patterns must all >> match >> * Optional Graph Pattern, where a graph patterns may extend the >> solution >> * Union Graph Pattern, where one of a set of graph patterns must >> match >> * Patterns on Named Graphs, where patterns are matched against >> named graphs > > > I think it suffices to say: > > a Graph Pattern is either a Basic Graph Pattern, > a Group Pattern, and Optional Graph Pattern, ... > or a Pattern on Named Graphs. OK - I think you advocate a terse style of definition. I can move the list outside the definition (in the last working draft it was but that was only because the HTML was broken!). > > add to this list: a Constraint Pattern; more below. See comments below. Not sure about the combination with another pattern. > > > >>Definition: SPARQL Query >> >>A SPARQL query is a tuple consisting of a graph pattern, an RDF >>dataset description, a number of solution sequence modifiers and a >>result form. >> > > Dataset description? Not just a dataset? > hmm. I wonder where that comes in... > It doesn't seem to show up anywhere else. Could do - the query has dataset, and FROM/FROM NAMED are a description. How the description gets turns into the datset for the query is left open. I want to avoid talking about GET (if http://) or what aggregators might do with URIs. > As I think I suggested earlier, the semantics of FROM and FROM NAMED > should be specified by reference to the protocol, not as part of > the query language formalism. The QL should not depend on the protocol: 1/ The QL can be used without the protocol 2/ If the protocol does not specify the dataset, it comes from the query syntax so we end out looping here. > > That reminds me... the protocol spec needs to say how > to get from the default-graph-uri and the > named-graph-uri's to the corresponding graphs. > I'll try to remember to send mail about that separately. > > >>Definition: Query Pattern >> >>A query has one main graph pattern called the Query Pattern. >> > > A SPARQL Query has... > > Or: > > If (GP, D, M, R) is a SPARQL Query, then GP is its Query Pattern. > > This whole definition might be left out, as it's reasonably > clearly implicit in the defintion of SPARQL Query. Agreed. > > >>Definition: Triple Pattern >> >>A triple pattern is member of the set: >>(RDF-T union V) x (RDF-U union V) x (RDF-T union V) >> >>Definition: Solution > > > Solution or Substitution? Mixup - I'll read for consistency. > >>A substitution is a function from a subset of the set of variables to >>the set of RDF terms, RDF-T. >> >>The result of replacing every v in a graph pattern P by S(v) is >>written S(P). >> >>We say that S is a solution of P if P matches G with substitution S. >> >>Definition: Basic Graph Pattern >> >>A Basic Graph Pattern is a set of Triple Patterns. >> >>A basic graph pattern matches on graph G with substitution S if S(GP) >>is entailed by G. >> >>Definition: Query Solution >> >>A Query Solution is a Pattern Solution for the Query Pattern. A >>substitution in a query solution only contains variables mentioned in >>the query. >> > > suggest: > > Given a Query Pattern, a Query Solution is one of its Pattern Solutions > whose substitutions only contains variables mentioned in the Query > Pattern. I don't see the need for a restriction to only contain variables mentioned in the query. Is there a technical reason for this? Allowing other variables/values in a solution does no harm as far as I can see and allows for later extension to nested queries. I have to change the Query Results from saying plain "all" because of variable inference capabilities. > > or: > > Given a Query Q = (QP, D, M, R), S is a Query Solution for Q iff > S is a Pattern Solution for QP and every variable in the domain > of S is mentioned in QP. I'll adopt that tuple-defintion of query. > > > >>Definition: Query Results >> >>The Query Results, for a given graph pattern GP on graph G, is written >>R(GP,G), and is the set of all query solutions S such that S is a >>solution for GP on G. > > > hmm... rather: of all pattern solutions S, no? > > I'm looking for the definition of Pattern Solution and I don't see it. > Hmm. It was merged into substitutions because substitutions are solutions - different words for different points of view. Must be some junk around still. I see if I can pick one over the other - if not, I'll put back a defn for pattern solution > > >>R(GP, G) may be the empty set. >> >>Definition: Value Constraint >> >>A graph pattern may involve a value constraint, which is a >>boolean-valued expression of variables and RDF Terms that restricts >>query solutions. > > > Umm... "may involve"... the definition of Graph Pattern given > didn't involve value constraints. It did in the first sentence: A Graph Pattern is constructed from triple patterns and value constraints, using the following operations to combine patterns: (list of pattern operators follows). If I put the list outside the defn, have a sentence list of all possibilities, that should be clearer by being more uniform in the treatment of constraints. > Suggest rather adding > Constraint Pattern above and: > > Definition: Value Constraint > > A Value Constraint is a boolean-valued expression of variables > and RDF Terms. > > Definition: Constraint Pattern > > A Constraint Pattern is a pair of a Value Constraint > and a Graph Pattern. S is a solution for a Constraint > Pattern (C, GP) iff C(S) is true and S is a solution for GP. Hmm ... thinks ... How about defining a value constraint C as accepting S if C(S) is true? ---- Definition: Value Constraint A Value Constraint is a boolean-valued expression of variables and RDF Terms. For constriant C, a solution S matches C is C(S) is true. ---- The problem is that we have used the term "matches" - a solution causes a pattern to match and it's a bit odd using "match" but using the same work seems better. Tieing to a pattern does not work out completely: Example 1: SELECT WHERE { ?x rdf:type :foo :y :q ?v { ?x :p ?v } UNION { FILTER ?v < 3 } } either the triple { ?x :p ?v } is in the graph or the filter is true. Example 2: SELECT * WHERE { { ?x :p ?v } { FILTER ?v < 3 } } where the FILTER is in a group of one, no pattern in the group. Otherwise, it might be better to have a grammar to require FILTER to be associated with a pattern of some kind. Something like: pattern constraint* [unchecked if that grammar change works] but now we have order in the query and this is syntactically illegal: SELECT * WHERE { FILTER ?v < 3 ?x :p ?v } and the UNION example also fails. Our filters are not quite general patterns: 1/ Evaluation is according to XPath/XQuery F&O rules 2/ May be n-ary you don't write binary operators as (?x 3) math:lessMath ?result which is what it woudl be for the generalization to n-ary Constraints may be implemented as graph patterns - but they may not. > >>Definition: Group Graph Pattern >> >>A Group Graph Pattern GP is a set of graph patterns, GPi. >> >>A solution of Group Graph Pattern GP on graph G is any solution S such >>that, for every element GPi of GP, S is a solution of GPi. >> >>Definition: Optional Graph Pattern >> >>An Optional Graph Pattern is a graph pattern that can create new >>solutions, but will always match. >> >>Given graph pattern GP, the Optional Graph Pattern Opt(GP) matches >>with solution S if S is a solution for a match of GP, or S is not a >>solution for GP. > > > That doesn't work with my suggested clarification of Graph Pattern. > I think you had it better earlier, when an Optional Graph Pattern > had two parts: > > An Optional Graph Pattern is a pair of graph patterns (R, O). > S is a solution for the optional graph pattern (R, O) iff > - S is a solution for R and S is not a solution for O, or > - S is a solution for the merge of R and O Tieing to another pattern was something I was trying to avoid when making a change. Maybe explaining this matching can be explained outside definition. > > > >>Definition: Union Graph Pattern >> >>A union graph pattern is a set of graph patterns GP i . >> >>A union graph pattern matches a graph G with substitution S if there >>is some GP i such that GP i matches G with substitution S. > > > OK. > > >>Definition: RDF Dataset >> >>An RDF dataset is a set = { G, (u1, G1), (u2, G2), . . . (un, Gn) } >>where G and each Gi are graphs, and each ui is a URI. Each ui is >>distinct. >> >>G is called the default graph. Gi are named graphs. > > > OK > > >>Definition: DataSet Graph Pattern >> >>If D is a dataset {G, (<u1> G1), ...}, and P is a graph pattern then S >>is a pattern solution of GRAPH(g, P) if: >> >>g is a URI where g = <ui> for some i, and S is pattern solution of P >>on Gi or g is a variable, S maps the variable g to <ui> and S is a >>pattern solution of P on Gi. > > > OK > > Hmm... all the other definitions of graph pattern solution are > given w.r.t. just a graph, not a whole dataset. Ooops. > We should specify how they work with a whole DataSet. Agreed. Need to say that is a graph that is being matched against and GRAPH chnages it. I'll deal with the solution sequence comments separately. <snip/> AndyReceived on Thursday, 19 May 2005 10:46:17 UTC

*
This archive was generated by hypermail 2.3.1
: Wednesday, 7 January 2015 14:42:52 UTC
*