Comments from Kevin: comments/changes from Kevin Wilkinson are in green font with deletions bracketed by '-' and additions bracketed by '+'. For simple typos, I just indicate the change with '+' and don't show the deletion.
An RDF graph is a set of triples, each consisting of a +subject+, an +object+, and +predicate that specifies+ a property relationship between them+,+ as defined in RDF Concepts and Abstract syntax.
KW Comment: for consistency with the rest of this document, I added the word "predicate" above. Use it either in addition to or in place of "property".
Queries match graph patterns against the target graph of the query. Patterns are like graphs but may +have+ named variables in place of some of the nodes or predicates; the simplest graph patterns are single triple patterns. -and graph- +Graph+ patterns can be combined using various operators into more complicated graph patterns.
A binding is a mapping from -the- a variable in a query to -terms- +RDF terms (see Section 2.2)+. A pattern solution is a set of bindings which, when applied to the variables in the query, -cab-+can+ be used to produce a subgraph of the target graph; -query results are- +a query result is+ a set of pattern solutions. If there are no -result mappings-+pattern solutions+, the query result is an empty set. (KW Comment: result mappings is not defined at this point.)
Pictorially, suppose we have a graph with two triples and the given triple pattern:
triple1
triple2
triplePattern1
with the result:
reference | author |
---|---|
http://www.w3.org/TR/xpath | "James Clark" |
http://www.w3.org/TR/xpath | "Steve DeRose" |
RDF graphs are constructed from one or more triples, e.g., graph1.
-A query for graphPattern1 will return the email address of people known by Alice (specifically, the person with the mbox alice@work.example). When matched against the example RDF graph, we get one result mapping which binds three variables:-
(KW Comment: the above paragraph (1) makes no sense as there is no Alice in graph1, (2) uses the phrase "result mapping" which has not been defined. An attempted rewrite is below. Also, graph1 and graphPattern1 use the prefix "dc:" which is not defined.)
+The query pattern graphPattern1 will return the URI (via dc:relation) and authors (via dc:creator) for documents referenced by the (document identified by the) bound variable referrer. When matched against graph1, we get two pattern solutions which bind three variables:+
referrer | reference | author |
---|---|---|
http://www.w3.org/TR/xpath | http://www.w3.org/TR/xpath | "James Clark" |
http://www.w3.org/TR/xpath | http://www.w3.org/TR/xpath | "Steve DeRose" |
(KW Comment: in the above result, I think you want the referrer result to be ~TR/xslt rather than ~TR/xpath.)
The example below shows a +SPARQL+ query to find the title of a book from the information in an RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. Here, the SELECT clause names the variable of interest to the application, and the WHERE clause has one triple pattern.
Data:
<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .
Query:
SELECT ?title WHERE ( <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title )
Query Result:
title |
---|
"SPARQL Tutorial" |
The terms delimited by "<>" are URI References [13] (URIRefs); URIRefs can also be abbreviated with an XML QName-like form [14]; this is syntactic assistance and is translated to the full URIRef. -Other RDF terms-+The terms delimited by double quotes+ are literals which, following N-Triples syntax [7], are a string and -optional language tag (introduced with '@') and datatype URIRef (introduced by '^^')-+optionally, either a language tag (indicated by '@') or a datatype URIRef (indicated by '^^')+.
...
-RDF has typed literals. Such literals are written using "^^". Integers can be directly written and are interpreted as typed literals of datatype xsd:integer.-
+RDF has typed literals. These are written by concatenating the lexical form of the literal value (in double quotes) with the URI of the datatype, separated by "^^". As a convenience, integers can be directly written (i.e. unquoted with no datatype URI) and are interpreted as typed literals of datatype xsd:integer.+
The building blocks of queries are triple patterns. Syntactically, a SPARQL triple pattern is a subject, predicate and object delimited by parentheses. The -previous- example +in section 2.1+ shows a triple pattern with a -variable subject- +subject variable+ (the variable book), a predicate of dcore:title and -a variable object-+an object variable+ (the variable title).
( ?book dcore:title ?title )
A triple pattern applied to a graph matches all triples with identical RDF terms for the corresponding subject, predicate and object. The variables in the triple pattern, if any, are bound to the corresponding RDF terms in the matching triples.
(KW Comment: I think you need to elaborate on this definition of "matching". It should be precise. By identical, I assume you mean the lexical forms match, i.e., identical character strings. You need to add the caveat that prefixes are expanded prior to matching and that directly-written integers are converted to typed integers.)
-In SPARQL, a triple pattern is an RDF triple but with the addition that components can be a query variable instead.-
+In SPARQL, a triple pattern is an RDF triple in which any component can be a query variable.+
...
(KW Comment: the example below is confusing and does not illustrate the definition of binding. The table, in fact, shows two variables rather than one variable and the semantics are not defined. I would suggest replacing the table.)
In this document, we illustrate bindings in results in tabular form:
x | y |
---|---|
"Alice" | "Bob" |
(KW Comment: the example above is confusing and does not illustrate the definition of binding (which only mentions a single variable). The table above, in fact, shows two variables rather than one variable and the semantics of binding multiple variables is not defined. I would suggest replacing the table of two columns with a single column that shows two bindings for one variable, as shown below.)
x |
---|
"Alice" "Bob" |
(KW Comment: the paragraph below on optional matches doesn't need to go here. It's confusing. I suggest removing it or moving it to section 4.)
-Not every binding needs to exist in every row of the table. So far, the examples have shown queries that either exactly match the graph, or do not match at all. Optional Matches can cause bindings, but if they fail to match, they do not cause the solution to be rejected, and so can leave variables unset in a row of the table.-
(KW Comment: the definition of Substitution, below, is confusing, coming immediately after the definition of binding. They seem similar. Could you relate the two? How is a binding related to a substitution? You need to motivate the definition of substitution.)
Definition: Substitution
A substitution S is a partial functional relation from variables to RDF terms or variables. We write S[v] for the RDF term that S pairs with the variable v and define S[v] to be v where there is no such pairing.
Definition: Triple Pattern Matching
For +substitution+ S and Triple Pattern T, S(T) is -the-+a+ triple pattern +formed+ by replacing any variable v in T with S[v]. (KW Comment: there may be more than one such triple pattern, correct?)
Triple Pattern T matches RDF graph G with substitution S, if S(T) is a triple of G.
(KW Comment: the above definition (of Triple Pattern T match G) is a second definition of triple pattern matching. Previously, at the start of section 2.2, you say that a pattern matches all triples with "identical" RDF terms. Is it obvious that these two definitions are identical? Maybe prefix the first definition by saying it is an informal definition.)
If the same variable name is used more than once in a pattern then, within each -solution-+match+ to the query, the variable has the same value.
(KW Comment: "solution" is undefined in the above sentence; perhaps "match" is a better word choice.)
(KW Comment: I found the above definitions of Triple Pattern Matching and Substitution confusing. 'S' is used in different ways: as a partial function of one variable and as a mapping from a triple pattern to another triple.)
For example, the query:
SELECT * WHERE ( ?x ?x ?v )
matches the triple:
rdf:type rdf:type rdf:Property .
with solution:
x | v |
---|---|
rdf:type | rdf:Property |
It does not match the triple:
rdfs:seeAlso rdf:type rdf:Property .
because the variable x would need to be both rdfs:seeAlso and rdf:type in the same solution.
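As an informal illustration of the Substitution and Triple Pattern Matching definitions, the following is a minimal Python sketch. It is not part of the draft: the helper names (is_var, apply_substitution, match_triple) are hypothetical, terms are modelled as plain strings, and a variable is assumed to be any string starting with "?".

```python
# Illustrative sketch only; helper names are hypothetical, not SPARQL syntax.

def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return isinstance(term, str) and term.startswith("?")

def apply_substitution(s, pattern):
    """S(T): replace each variable v in T with S[v]; S[v] is v when unpaired."""
    return tuple(s.get(t, t) if is_var(t) else t for t in pattern)

def match_triple(pattern, triple):
    """Return a substitution S such that S(pattern) == triple, or None."""
    s = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A repeated variable must pair with the same term every time.
            if s.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return s

pattern = ("?x", "?x", "?v")
ok = match_triple(pattern, ("rdf:type", "rdf:type", "rdf:Property"))
bad = match_triple(pattern, ("rdfs:seeAlso", "rdf:type", "rdf:Property"))
```

Here ok pairs x with rdf:type and v with rdf:Property, while bad is None because x cannot be both rdfs:seeAlso and rdf:type.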
(KW Comment: again, the word "solution" is used here. It has not been defined. It should be replaced by "match" or some other word or else "solution" should be defined, if only informally).
-The keyword WHERE is followed by a Graph Pattern which is made of one or more Triple Patterns. These Triple Patterns are "and"ed together. More formally, the Graph Pattern is the conjunction of the Triple Patterns. In each query solution, all the triple patterns must be satisfied with the same binding of variables to values.-
+A Graph Pattern is one or more Triple Patterns "and"ed together, i.e., a conjunction of Triple Patterns. In a match, all the triple patterns must be satisfied with the same binding of variables to values.+
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Johnny Lee Outlaw" .
_:a foaf:mbox <mailto:jlow@example.com> .
-There is a bNode [12] in this dataset. Just within the file, for encoding purposes, the bNode is identified by _:a but the information about the bNode label is not in the RDF graph. No query will be able to identify that bNode by the label used in the serialization.-
+Note that there is a bNode [12] in this dataset, identified by _:a. This label is used only for encoding within a file; once the file is read into an RDF graph, a new label may be assigned. Consequently, applications should not assume that bNode labels used in a serialization (e.g., a file) can be used to query an RDF graph containing that serialized dataset.+
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?mbox
WHERE ( ?x foaf:name "Johnny Lee Outlaw" )
      ( ?x foaf:mbox ?mbox )
Query Result:
mbox |
---|
<mailto:jlow@example.com> |
-This query contains a conjunctive graph pattern. A conjunctive graph pattern is a set of triple patterns, each of which must match for the graph pattern to match.-
+The above query contains a conjunctive graph pattern of two triple patterns, each of which must match for the graph pattern to match.+
Definition: Graph Pattern (Partial Definition) - Conjunction
A set of triple patterns is a graph pattern GP. For such a graph pattern to match with substitution +S(T)+, each triple pattern in GP must match with substitution +S(T)+.
Definition: Graph Pattern Matching
For substitution S, we write S(GP) for the graph pattern produced by applying S to each triple pattern T in GP.
If GP = { T | T triple pattern } then S(GP) = { S(T) }
Graph Pattern GP matches RDF graph G with substitution S if G simply entails S(GP).
(KW Comment: here's yet another definition of matching. Does this supersede the definition of matching for triple patterns? At any rate, having a hyperlink to the definition is not good since this document is likely to be printed. So, as a convenience, it would be very, very nice to provide here an informal definition of "simply entails" (i.e., the gotchas, which, I think, mainly have to do with bnodes but there may be other ones, too) and then refer the reader to the other document. Also, rather than a hyperlink, this should be a reference, i.e., to some document in Appendix B.)
The results of a query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query, depending on the data.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Johnny Lee Outlaw" .
_:a foaf:mbox <mailto:jlow@example.com> .
_:b foaf:name "Peter Goodguy" .
_:b foaf:mbox <mailto:peter@example.org> .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name, ?mbox
WHERE ( ?x foaf:name ?name )
      ( ?x foaf:mbox ?mbox )
Query Result:
name | mbox |
---|---|
"Johnny Lee Outlaw" | <mailto:jlow@example.com> |
"Peter Goodguy" | <mailto:peter@example.org> |
The results enumerate the RDF terms to which the selected variables can be bound in the graph pattern. In the above example, the following two subsets of the data caused the two matches.
_:a foaf:name "Johnny Lee Outlaw" . _:a foaf:mbox <mailto:jlow@example.com> .
_:b foaf:name "Peter Goodguy" . _:b foaf:mbox <mailto:peter@example.org> .
(KW Comment: I don't like the above example because it illustrates two concepts. First, it shows that a query may have multiple solutions. That's fine. But, it also illustrates the results can be a projection of the query variables. This raises additional questions, specifically, how are duplicates handled. I'd feel better if this example included variable 'x' in the result list.).
For a simple, conjunctive graph pattern match, all the variables used in the -query-+graph+ pattern will be bound in every solution.
Definition: Pattern Solution
A Pattern Solution of Graph Pattern GP on graph G is any substitution S such that GP matches G with S.
For a graph pattern GP formed as a set of triple patterns, -S(G)-+S(GP)+ has no variables and is a subgraph of G.
Definition: Query Solution
A Query Solution is a Pattern Solution where the pattern is the whole pattern of the query.
Definition: Query Results
The Query Results, for a given graph pattern GP on G, is written R(GP,G), and is the set of all query solutions such that GP matches G.
R(GP, G) may be the empty set.
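To make R(GP, G) concrete for a conjunctive graph pattern, here is a hedged Python sketch using the two-person dataset from the example above. The function names (extend, query_results) are hypothetical, terms are plain strings, and bNode labels are compared as ordinary strings, so this ignores the subtleties of simple entailment.

```python
# Illustrative sketch only; not a conforming implementation.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def extend(s, pattern, triple):
    """Extend substitution s so that pattern maps onto triple, or return None."""
    s = dict(s)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if s.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return s

def query_results(gp, graph):
    """R(GP, G): all substitutions under which every triple pattern matches."""
    solutions = [{}]
    for pattern in gp:
        next_solutions = []
        for s in solutions:
            for triple in graph:
                s2 = extend(s, pattern, triple)
                if s2 is not None:
                    next_solutions.append(s2)
        solutions = next_solutions
    return solutions

graph = {
    ("_:a", "foaf:name", "Johnny Lee Outlaw"),
    ("_:a", "foaf:mbox", "mailto:jlow@example.com"),
    ("_:b", "foaf:name", "Peter Goodguy"),
    ("_:b", "foaf:mbox", "mailto:peter@example.org"),
}
gp = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]
results = query_results(gp, graph)
empty = query_results([("?x", "foaf:age", "?age")], graph)
```

Every solution binds all three variables (x, name and mbox), and a pattern with no matching triples yields the empty set.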
There is no standard representation of bNodes in RDF and the syntax of SPARQL queries does not allow them. They can form part of a pattern match and do take part in the pattern matching process.
Suggestions for better wording most welcome.
(KW Comment: I'm not sure how to reword it because I'm not sure what it's trying to say. Here are two possible interpretations. 1. The representation of bNodes (i.e., the bNode label) is not specified in RDF and is implementation-specific. Consequently, SPARQL syntax does not prescribe a representation. Applications may use bNode labels in a pattern match but such queries are implementation-specific and should not be considered portable. 2. The representation of bNodes (i.e., the bNode label) is not specified in RDF. Consequently, bNode labels should not be used in a triple pattern because that representation is not portable across implementations and, in fact, the label may even change within one implementation (labels need not be persistent). However, variables in a triple pattern may be bound to bNodes. Regardless of the interpretation, the definition of Triple Pattern in section 2.2 should perhaps be modified to reflect the notion that bNodes should not be in a triple pattern (or should they?).)
In the results of queries, the presence of bNodes can be indicated but the internal system -identification-+label+ -is not-+may not be+ preserved. Thus, a client can tell that two solutions to a query differ in bNodes needed to perform the graph match but this information is only scoped to the results (result set or RDF graph). +Repeating the query (on the identical graph) may produce different labels for the bNodes.+
Redo when XML syntax document is available and use the syntax there.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> . _:a foaf:name "Alice" . _:b foaf:name "Bob" .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?x ?name WHERE ( ?x foaf:name ?name )
Query Result +(query run twice)+:
x | name |
---|---|
_:a | "Alice" |
_:b | "Bob" |
x | name |
---|---|
_:r | "Alice" |
_:s | "Bob" |
These two results have the same information: the blank node used to match the query was different in the two solutions. There is no relation between using _:a in the results and any internal blank node label in the data graph; the labels in the results only indicate whether -elements-+terms+ in the +solutions+ were the same or different.
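One way to read "the same information" is that the two tables are equal up to a consistent renaming of bNode labels. The Python sketch below illustrates that reading only; the canonicalize helper is hypothetical, literals are written with their quotes so they cannot be confused with labels, and rows are assumed to arrive in the same order in both runs.

```python
# Illustrative sketch only; assumes both runs return rows in the same order.

def canonicalize(rows):
    """Rename blank node labels in order of first appearance."""
    names = {}
    out = []
    for row in rows:
        new_row = []
        for term in row:
            if term.startswith("_:"):
                term = names.setdefault(term, f"_:b{len(names)}")
            new_row.append(term)
        out.append(tuple(new_row))
    return out

run1 = [("_:a", '"Alice"'), ("_:b", '"Bob"')]
run2 = [("_:r", '"Alice"'), ("_:s", '"Bob"')]
same_node = [("_:a", '"Alice"'), ("_:a", '"Bob"')]  # hypothetical: one bNode for both names

equivalent = canonicalize(run1) == canonicalize(run2)
different = canonicalize(run1) != canonicalize(same_node)
```

The labels carry no meaning beyond telling terms apart, so run1 and run2 compare equal, whereas a result that reuses one bNode for both rows does not.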
-RDF Literals are written in SPARQL as strings, with optional language tag (indicated by '@') or optional datatype (indicated by '^^'), with additional convenience forms for xsd:integers and xsd:doubles:-
+An RDF Literal is written in SPARQL as a string containing the lexical form of the literal delimited by double quotes, optionally followed by a language tag (indicated by '@') or a datatype (indicated by '^^'). There are convenience forms for the numeric-typed literals xsd:integer and xsd:double in which the datatype may be omitted.+
Examples of literal syntax in SPARQL:
The dataset below contains a number of RDF literals:
@prefix dt: <http://example.org/datatype#> .
@prefix ns: <http://example.org/ns#> .
@prefix : <http://example.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
:x ns:p "42"^^xsd:integer .
:x ns:p "abc"^^dt:specialDatatype .
:x ns:p "cat"@en .
-The pattern in the query matches because 42 is interpreted as "42" with datatype URI http://www.w3.org/2001/XMLSchema#integer.-
+The pattern in the following query has a solution because 42 is interpreted as "42"^^http://www.w3.org/2001/XMLSchema#integer.+
SELECT ?v WHERE (?x ?p 42)
-This query matches, without requiring the query processor to have any understanding of the values in the space:-
+The following query has one solution. Note that the query processor has no understanding of the datatype. This is a syntactic match.+
SELECT ?v WHERE ( ?x ?p "abc"^^<http://example.org/datatype#specialDatatype> )
-This query has a pattern that-+The following query+ fails to match because "cat" is not the same RDF literal as "cat"@en:
SELECT ?v WHERE ( ?x ?p "cat" )
but -this does find-+the following query does have+ a solution:
SELECT ?v WHERE ( ?x ?p "cat"@en )
Implementation Requirements
An implementation of SPARQL only needs to -be able to- match +lexical+ forms and datatypes in graph patterns. It is not required to -provide- support the datatype hierarchy of XML schema nor -for-+any+ application-defined hierarchies. It is not required to provide matching in patterns based on value spaces. Thus, testing +numerical+ equality in a constraint is not identical to +literal matching+ in pattern matching.
(KW Comment: I don't follow this. It seems to say that constraint matching is different from pattern matching but I don't understand why. Could you motivate this? It seems intuitive that the two queries below should return the same result.)
In this dataset,
@prefix ns: <http://example.org/ns#> .
@prefix : <http://example.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
:x ns:p "42"^^xsd:short .
there is no match required for the query:
SELECT ?v WHERE ( ?x ?p 42 )
but there is for this query,
SELECT ?v WHERE ( ?x ?p ?v ) AND ?v == 42
because of the use of numeric +equality+.
An implementation may choose to -provide-+support+ datatype hierarchies and value based pattern matching. Applications using a SPARQL processor should not assume that the processor provides datatype hierarchies or matching based on value-spaces of literals unless the application knows explicitly that this is the case.
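The difference between literal matching in a pattern and numeric equality in a constraint can be sketched in Python as follows. This is an illustration under stated assumptions, not the draft's definition: a literal is modelled as a (lexical form, datatype URI) pair, the NUMERIC set is a small hypothetical subset of numeric datatypes, and the function names are invented for the example.

```python
# Illustrative sketch only; the literal model and helper names are assumptions.

XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC = {XSD + "integer", XSD + "short"}  # small illustrative subset

def literal(lexical, datatype):
    """Model a typed literal as a (lexical form, datatype URI) pair."""
    return (lexical, datatype)

def term_match(a, b):
    """Pattern matching: same lexical form and same datatype URI."""
    return a == b

def numeric_equal(a, b):
    """Constraint-style equality: compare values, not lexical forms."""
    if a[1] not in NUMERIC or b[1] not in NUMERIC:
        raise TypeError("not a numeric literal")
    return int(a[0]) == int(b[0])

stored = literal("42", XSD + "short")      # the triple in the dataset
written = literal("42", XSD + "integer")   # how a bare 42 in a query is read

pattern_matches = term_match(stored, written)
constraint_holds = numeric_equal(stored, written)
```

Under this model the pattern ( ?x ?p 42 ) is not required to match the xsd:short literal, while the constraint comparing the bound value with 42 succeeds.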
Graph pattern matching creates bindings of variables. It is possible to further restrict possible solutions by constraining the allowable binding of variables to RDF Terms. Constraints in SPARQL take the form of boolean-valued expressions; the language also allows application-specific filter functions.
Data:
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://example.org/book/> .
@prefix ns: <http://example.org/ns#> .
:book1 dc:title "SPARQL Tutorial" .
:book1 ns:price 42 .
:book2 dc:title "The Semantic Web" .
:book2 ns:price 23 .
Query:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ns: <http://example.org/ns#>
SELECT ?title ?price
WHERE ( ?x dc:title ?title )
      ( ?x ns:price ?price )
      AND ?price < 30
Query Result:
title | price |
---|---|
"The Semantic Web" | 23 |
By having a constraint on the "price" variable, only one of the books matches the query. Like a triple pattern, this is just a restriction on the allowable values of a variable.
Definition: Constraints
A constraint is a boolean-valued expression of variables and RDF
Terms that can be applied to restrict query solutions.
Definition: Graph Pattern (Partial Definition) - Constraints
A graph pattern can also include constraints. These constraints further restrict the possible query solutions of matching a graph pattern with a graph.
SPARQL defines a set of operations that all implementations must provide. In addition, there is an extension mechanism for boolean tests that are specific to an application domain or kind of data.
(KW Comment: 1) you should reference the section number (11.3?). this helps readers who print this document. 2) may a constraint reference another variable, e.g. ?x < ?y. it's not clear if this is allowed or not.)
A constraint may lead to an error condition when testing some variable binding. The exact error will depend on the constraint: in numeric operations, supplying a non-number or a bNode will lead to such an error. Any potential solution that causes an error condition in a constraint will not form part of the final results.
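The following Python sketch illustrates how a constraint might filter candidate solutions and how an error in the constraint removes a candidate. It assumes the error is silent, which the draft does not state; the function names and the third candidate row (a bNode label bound to price) are hypothetical and do not come from the dataset above.

```python
# Illustrative sketch only; silent error handling is an assumption.

def price_below_30(solution):
    """Constraint ?price < 30; raises on a non-numeric binding."""
    price = solution["?price"]
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        raise TypeError("numeric comparison on a non-number")
    return price < 30

def apply_constraint(solutions, constraint):
    """Keep solutions where the constraint is true; drop errors and falses."""
    kept = []
    for s in solutions:
        try:
            if constraint(s):
                kept.append(s)
        except TypeError:
            pass  # an error in the constraint drops the candidate
    return kept

solutions = [
    {"?title": "SPARQL Tutorial", "?price": 42},
    {"?title": "The Semantic Web", "?price": 23},
    {"?title": "Untitled", "?price": "_:b1"},  # hypothetical row with a bNode
]
filtered = apply_constraint(solutions, price_below_30)
```

Only the "The Semantic Web" row survives: one candidate fails the test and the hypothetical bNode row is dropped because the comparison raises an error.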
(KW Comment: Is any error indication provided, or is it silent or implementation-dependent? Please specify.)
...
Optional -portions of the-+matches on a+ graph may be specified in either of two equivalent ways:
OPTIONAL (?s ?p ?o)...
[ (?s ?p ?o)... ]
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
_:a rdf:type foaf:Person .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .
_:b rdf:type foaf:Person .
_:b foaf:name "Bob" .
-Query (these two are the same query using slightly different syntax):-
+The following two queries have different syntax but identical semantics:+
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE ( ?x foaf:name ?name )
      OPTIONAL ( ?x foaf:mbox ?mbox )
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE ( ?x foaf:name ?name )
      [ ( ?x foaf:mbox ?mbox ) ]
Query result:
name | mbox |
---|---|
"Alice" | <mailto:alice@work.example> |
"Bob" | |
Now, there is no value of mbox where the name is "Bob". It is left -unset-+unbound+ in the result.
(KW Comment: I think this raises the question of HOW to determine if a variable is unset/unbound in a result. Does SPARQL assume this is implementation-dependent? Or does it require that an implementation provide an isBound function? This should be stated.)
This query finds the names of people in the dataset, and, if there is an mbox property, +retrieves+ that as well. In the example, only a single triple pattern is given in the optional match part of the query but in general it is a graph pattern.
For each optional block, the query processor attempts to match the query pattern. Failure to match the block does not cause this query solution to be rejected. The whole graph pattern of an optional block must match for the optional to add to the query solution.
(KW Comment: you should state if or if not there may be constraints in an optional block. Also, I think a good conceptual model for understanding optional blocks is that all solutions are found for the required part of the query. Then, for each solution, each optional block is tried and additional bindings are defined. Is this description worth including? Perhaps it will help in understanding the optional blocks.)
A query may have zero or more top-level optional blocks. These blocks will fail or provide bindings independently. Optional blocks can also be nested, that is, an optional block may appear inside another optional block +(as described in Section 5)+.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
_:a foaf:name "Alice" .
_:a foaf:homepage <http://work.example.org/alice/> .
_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@work.example> .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?hpage
WHERE ( ?x foaf:name ?name )
      [ ( ?x foaf:mbox ?mbox ) ]
      [ ( ?x foaf:homepage ?hpage ) ]
Query result:
name | mbox | hpage |
---|---|---|
"Alice" | | <http://work.example.org/alice/> |
"Bob" | <mailto:bob@work.example> | |
In this example, there are two independent optional blocks. Each depends only on variables defined in the non-optional part of the graph pattern. If a new variable is mentioned in an optional block (as mbox and hpage are mentioned in the previous example), that variable can be mentioned in that block and can not be mentioned in a subsequent block.
In an optional match, either a graph pattern matches a graph and so defines one or more pattern solutions, or gives an empty pattern solution but does not cause matching to fail overall.
Definition: Optional Matching
Given graph pattern GP1, and graph pattern GP2, let GP = (GP1 union GP2).
The optional match of GP2 on graph G, given GP1, defines a pattern solution PS such that:
If GP matches G, then the solutions are the pattern solutions of GP; else the solutions are the pattern solutions of GP1 matching G.
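One conceptual model for optional matching (find all solutions of the required part, then try to extend each with the optional block) can be sketched in Python as below. This is an illustration of that model only, using the Alice and Bob dataset from the first optional example; the function names (extend, solve, optional) are hypothetical and terms are plain strings.

```python
# Illustrative sketch only; models a single top-level optional block.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def extend(s, pattern, triple):
    """Extend substitution s so that pattern maps onto triple, or return None."""
    s = dict(s)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if s.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return s

def solve(gp, graph, start):
    """All extensions of substitution start that match every pattern in gp."""
    solutions = [dict(start)]
    for pattern in gp:
        solutions = [s2 for s in solutions for triple in graph
                     for s2 in [extend(s, pattern, triple)] if s2 is not None]
    return solutions

def optional(required, opt, graph):
    """Keep each required solution; add optional bindings where they exist."""
    out = []
    for s in solve(required, graph, {}):
        extended = solve(opt, graph, s)
        out.extend(extended if extended else [s])
    return out

graph = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@work.example"),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", "Bob"),
]
rows = optional([("?x", "foaf:name", "?name")],
                [("?x", "foaf:mbox", "?mbox")], graph)
by_name = {r["?name"]: r.get("?mbox") for r in rows}
```

In by_name, Alice maps to her mbox and Bob maps to None, mirroring the unbound cell in the result table.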
Graph patterns may contain nested patterns. -We've seen this earlier in optional matches. Nested patterns are delimited with ()s:- (KW Comment: you haven't really seen it; it was just mentioned but not explained. I'd drop this sentence.)
{ ( ?s ?p ?n2 ) ( ?n2 ?p2 ?n3 ) }
Definition: Graph Pattern - Nesting
A graph pattern GP can contain other graph patterns GPi. A query solution of Graph Pattern GP on graph G is any B such that each element GPi of GP matches G with binding B.
For example:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE ( ?x foaf:name ?name )
      { ( ?x foaf:mbox ?mbox ) }
Because this example has a simple conjunction for the nested pattern, and because the nested pattern is a conjunctive element in the outer pattern, this has the same results:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE ( ?x foaf:name ?name )
      ( ?x foaf:mbox ?mbox )
Optional blocks can be nested. The outer optional block must match for any nested one to apply. That is, the outer graph pattern is fixed for the purposes of any nested optional block.
Data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .
_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example> .
_:a vcard:N _:d .
_:d vcard:Family "Hacker" .
_:d vcard:Given "Alice" .
_:b foaf:name "Bob" .
_:b foaf:mbox <mailto:bob@work.example> .
_:c foaf:name "Eve" .
_:c vcard:N _:e .
_:e vcard:Family "Hacker" .
_:e vcard:Given "Eve" .
Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?foafName ?mbox ?fname ?gname
WHERE ( ?x foaf:name ?foafName )
      [ ( ?x foaf:mbox ?mbox ) ]
      [ ( ?x vcard:N ?vc )
        [ ( ?vc vcard:Family ?fname )
          ( ?vc vcard:Given ?gname ) ]
      ]
Query result:
foafName | mbox | fname | gname |
---|---|---|---|
"Alice" | <mailto:alice@work.example> | "Hacker" | "Alice" |
"Bob" | <mailto:bob@work.example> | | |
"Eve" | | "Hacker" | "Eve" |
This query finds the name, optionally the mbox, and also optionally the vCard structured name components. By nesting the optional access to vcard:Family and vcard:Given, the query only reaches these if there is a vcard:N property. It is possible to expand out optional blocks to remove nesting at the cost of duplication of expressions. Here, the expression is a simple triple pattern on vcard:N but it could be a complex graph match with value constraints.
(KW Comment: this example could be more illustrative. In particular, the result of this query is the same without the nesting, i.e., if Family and Given are part of the top-level, vcard:N block. It would be more interesting if there were an additional person with a vcard but no family or given name. Then, the top-level optional block would match but not the nested blocks. Would it return a result or not? This would improve the example since, for me, the result is somewhat ambiguous.)
There is an additional condition that must be met for nested optional blocks. Considering the -query-+graph+ pattern as a tree of blocks, then a variable in an optional block can only be mentioned in other optional blocks nested within -this one-+it+. A variable can not be used in two optional blocks where the outermost mention (shallowest +occurrence+ in the tree for each +occurrence+) of the two uses is not the same block.
-All occurrences of variable, v, in a query, the outermost mention of v must be the same.-+For each variable v that occurs in a nested block, consider all paths from that variable in any block to the root of the tree. Those paths must all intersect at a block that also contains the variable v.+
Suggestions for better wording most welcome!
The purpose of this condition is to enable the query processor to process the query blocks in arbitrary (or optimized) order. If a variable was introduced in one optional block and mentioned in another, it would be used to constrain the second. Reversing the order of the optional blocks would reverse the blocks in which the variable was -was- introduced and was used to constrain. Such a query could give different results depending on the order in which those blocks were evaluated.
-SPARQL provides a means of combining graph patterns into more complex ones so that one of several possibilities is attempted to see if it matches.-+SPARQL provides a means of combining graph patterns so that one of several alternative graph patterns may match.+ If more than one of the alternatives matches, all the possible pattern solutions are found.
The UNION keyword is the syntax for pattern alternatives.
Data:
@prefix dc10: <http://purl.org/dc/elements/1.0/> .
@prefix dc11: <http://purl.org/dc/elements/1.1/> .
_:a dc10:title "SPARQL Query Language Tutorial" .
_:a dc10:creator "Alice" .
_:b dc11:title "SPARQL Protocol Tutorial" .
_:b dc11:creator "Bob" .
Query:
PREFIX dc10: <http://purl.org/dc/elements/1.0/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>
SELECT ?title
WHERE ( ?book dc10:title ?title )
      UNION
      ( ?book dc11:title ?title )
Query result:
title |
---|
"SPARQL Protocol Tutorial" |
"SPARQL Query Language Tutorial" |
This query finds titles of the books in the dataset, whether the title is recorded using Dublin Core properties from version 1.0 or version 1.1. If the application wishes to know how exactly the information was recorded, then the query:
PREFIX dc10: <http://purl.org/dc/elements/1.0/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>
SELECT ?title10 ?title11
WHERE ( ?book dc10:title ?title10 )
      UNION
      ( ?book dc11:title ?title11 )
title11 | title10 |
---|---|
"SPARQL Protocol Tutorial" | |
 | "SPARQL Query Language Tutorial" |
will return results with the variables title10 or title11 bound depending on which way the query processor matches the pattern to the dataset. Note that, unlike optionals, if no part of the union pattern matched, then the query pattern would not match.
(KW Comment: the above examples are a bit misleading as they suggest that the variables used in the disjuncts must be identical. In fact, they need have no intersection at all. This is worth pointing out.)
More than one triple pattern can be given in a +graph+ pattern being used in a pattern union:
PREFIX dc10: <http://purl.org/dc/elements/1.0/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?author
WHERE { ( ?book dc10:title ?title ) ( ?book dc10:creator ?author ) } UNION { ( ?book dc11:title ?title ) ( ?book dc11:creator ?author ) }
author | title |
---|---|
"Alice" | "SPARQL Query Language Tutorial" |
"Bob" | "SPARQL Protocol Tutorial" |
This query will only match a book if it has both a title and creator property from the same version of Dublin Core.
Definition: Pattern Matching (Union)
Given graph patterns GP1 and GP2, and graph G, then a
union pattern solution of GP1 and GP2 is any
pattern solution S such that either S(GP1) matches G or
S(GP2) matches G with substitution S.
Query results involving a pattern containing GP1 and GP2 will include
separate solutions for each match where GP1 and GP2 give rise to different
sets of bindings.
...
...
...
SPARQL has a number of query forms for returning results. These result forms use the solutions from pattern matching the query pattern (KW Comment: is query pattern defined anywhere?) to form result sets or RDF graphs. A result set is a serialization of the bindings in a query result. The query forms are:
...
Results can be thought of as a table, with one row per query solution. Some cells may be empty because a variable is not bound in that particular solution. (KW Comment: specify if SPARQL defines some way to check if a variable is not bound in a solution.)
-Results form a set of tuples. However, implementations may include duplicates for implementation and performance reasons unless indicated otherwise by the presence of the DISTINCT keyword.-+Implementations may return the solutions as either a bag or set of results, i.e., with or without duplicates. However, if the DISTINCT keyword is specified, a set of results must be returned, i.e., no duplicates.+
The LIMIT form puts an upper bound on the number of solutions returned. -A query may return a number of results up to and including the limit.-+The number of results returned is the minimum of LIMIT and the actual number of query results.+
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE ( ?x foaf:name ?name )
LIMIT 20
Limits on the number of results can also be applied via the SPARQL query protocol [@@ protocol document not yet published @@].
+Note that if both DISTINCT and LIMIT are specified, then duplicates are eliminated before the LIMIT is applied.+
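The ordering of the two modifiers can be sketched as follows. This is an illustrative model only, not an implementation of any query processor: solutions are represented as Python dicts in a list, the function name `apply_modifiers` is hypothetical, and the list order stands in for whatever order an implementation happens to produce.

```python
def apply_modifiers(solutions, distinct=False, limit=None):
    """Apply DISTINCT first, then LIMIT, to a list of solutions (dicts)."""
    if distinct:
        seen = set()
        unique = []
        for solution in solutions:
            # A frozenset of the bindings serves as a hashable identity.
            key = frozenset(solution.items())
            if key not in seen:
                seen.add(key)
                unique.append(solution)
        solutions = unique
    if limit is not None:
        # Slicing returns min(limit, len(solutions)) solutions.
        solutions = solutions[:limit]
    return solutions


names = [
    {"?name": "Alice"},
    {"?name": "Alice"},
    {"?name": "Bob"},
    {"?name": "Carol"},
]
print(apply_modifiers(names, distinct=True, limit=2))
print(apply_modifiers(names, limit=2))
```

With DISTINCT and LIMIT 2 the sketch returns Alice and Bob. With LIMIT 2 alone it returns the two duplicate Alice solutions, which is why the order of application matters.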
Gives the result graph having just the triples with property foaf:name:
(KW Comment: what does [ ] denote? This is new notation, I think, and should be explained.)
...
If a triple template has a +variable+, and in a query solution, the variable is unset, then the +substitution+ of this triple template is skipped but other triple templates are still processed for the same solution and any triples from other solutions are included in the result graph.
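The skipping rule can be sketched as follows, under the same kind of assumptions as a toy model: triple templates and triples are tuples of strings, a variable is a string starting with "?", and the function name `instantiate` and the sample solutions are hypothetical. A triple template that mentions an unset variable is dropped for that solution only.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def instantiate(template, solutions):
    """Build a result graph (set of triples) from triple templates and solutions."""
    graph = set()
    for solution in solutions:
        for triple_template in template:
            triple = []
            for term in triple_template:
                if term.startswith("?"):
                    if term not in solution:
                        break  # unset variable: skip only this triple template
                    triple.append(solution[term])
                else:
                    triple.append(term)
            else:
                # Every variable was bound, so the triple is emitted.
                graph.add(tuple(triple))
    return graph


name_and_mbox = [
    ("?x", FOAF + "name", "?name"),
    ("?x", FOAF + "mbox", "?mbox"),
]
sample_solutions = [
    {"?x": "_:a", "?name": "Alice", "?mbox": "mailto:alice@work.example"},
    {"?x": "_:b", "?name": "Bob"},  # ?mbox is unset in this solution
]
result_graph = instantiate(name_and_mbox, sample_solutions)
for triple in sorted(result_graph):
    print(triple)
```

The result graph holds three triples: both triples for Alice and only the name triple for Bob.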
...
A template can create an RDF graph containing bNodes, indicated by the syntax of a prefixed name with prefix _ and some label for the local name. +The+ labels are scoped to the template for each solution. If two such prefixed names share the same label in the template, then there will be one bNode created for each query solution but there will be different bNodes across triples generated by different query solutions.
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:givenname "Alice" .
_:a foaf:family_name "Hacker" .
_:b foaf:firstname "Bob" .
_:b foaf:surname "Hacker" .
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
CONSTRUCT ( ?x vcard:N _:a ) ( _:n vcard:givenName ?gname ) ( _:n vcard:familyName ?fname )
WHERE { ( ?x foaf:firstname ?gname ) OR ( ?x foaf:givenname ?gname ) } { ( ?x foaf:surname ?fname ) OR ( ?x foaf:family_name ?fname ) }
(KW Comment: the notation { ... } is not explained, I think.)
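The scoping of bNode labels to each solution can be sketched as below. This is a toy model, not the behavior of any particular implementation: the function name `instantiate_with_bnodes` and the generated labels `_:gen0`, `_:gen1` are hypothetical, every variable is assumed to be bound, and the sketch assumes all three triple templates use the same label `_:n`.

```python
import itertools

VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"


def instantiate_with_bnodes(template, solutions):
    """Instantiate a template, allocating fresh bNodes for each solution."""
    counter = itertools.count()
    graph = set()
    for solution in solutions:
        fresh = {}  # template label -> new bNode, reset for every solution
        for triple_template in template:
            triple = []
            for term in triple_template:
                if term.startswith("_:"):
                    # The same label within one solution maps to the same bNode.
                    if term not in fresh:
                        fresh[term] = "_:gen%d" % next(counter)
                    triple.append(fresh[term])
                elif term.startswith("?"):
                    triple.append(solution[term])  # assumes the variable is bound
                else:
                    triple.append(term)
            graph.add(tuple(triple))
    return graph


vcard_template = [
    ("?x", VCARD + "N", "_:n"),
    ("_:n", VCARD + "givenName", "?gname"),
    ("_:n", VCARD + "familyName", "?fname"),
]
people = [
    {"?x": "_:a", "?gname": "Alice", "?fname": "Hacker"},
    {"?x": "_:b", "?gname": "Bob", "?fname": "Hacker"},
]
vcard_graph = instantiate_with_bnodes(vcard_template, people)
for triple in sorted(vcard_graph):
    print(triple)
```

Each solution gets one bNode shared by its three triples, and the two solutions get different bNodes, so the result graph holds six triples.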
...
...
(KW Comment: It is not clear what value is returned by an Ask query form. One would hope it is a boolean. This might enable nested query forms in the future. But it appears to be a plain literal, "yes" or "no". Why not make it a boolean-typed literal?)
...