SPARQL Query Language for RDF

Comments from Kevin: comments/changes from Kevin Wilkinson are in green font with deletions bracketed by '-' and additions bracketed by '+'. For simple typos, I just indicate the change with '+' and don't show the deletion.

1 Introduction

An RDF graph is a set of triples, each consisting of a +subject+, an +object+, and +predicate that specifies+ a property relationship between them+,+ as defined in RDF Concepts and Abstract syntax.

KW Comment: for consistency with the rest of this document, I added the word "predicate" above. Use it either in addition to or in place of "property".

2 Making Simple Queries

Queries match graph patterns against the target graph of the query. Patterns are like graphs but may +have+ named variables in place of some of the nodes or predicates; the simplest graph patterns are single triple patterns. -and graph- +Graph+ patterns can be combined using various operators into more complicated graph patterns.

A binding is a mapping from -the- a variable in a query to -terms- +RDF terms (see Section 2.2)+. A pattern solution is a set of bindings which, when applied to the variables in the query, -cab-+can+ be used to produce a subgraph of the target graph; -query results are- +a query result is+ a set of pattern solutions. If there are no -result mappings-+pattern solutions+, the query result is an empty set. (KW Comment: result mappings is not defined at this point.)

Pictorially, suppose we have a graph with two triples and the given triple pattern:

triple1

_:2 foaf:mbox "robt@home.example"

triple2

?who foaf:mbox ?addr

triplePattern1

reference	author
http://www.w3.org/TR/xpath	"James Clark"
http://www.w3.org/TR/xpath	"Steve DeRose"

graph1

?who foaf:mbox "alice@work.example". ?who foaf:knows ?whom. ?whom foaf:mbox ?address

graphPattern1

-A query for graphPattern1 will return the email address of people known by Alice (specifically, the person with the mbox alice@work.example). When matched against the example RDF graph, we get one result mapping which binds three variables:-

(KW Comment: the above paragraph (1) makes no sense as there is no Alice in graph1, (2) uses the phrase "result mapping" which has not been defined. An attempted rewrite is below. Also, graph1 and graphPattern1 use the prefix "dc:" which is not defined.)

+The query pattern graphPattern1 will return the URI (via dc:relation) and authors (via dc:creator) for documents referenced by the (document identified by the) bound variable referrer. When matched against graph1, we get two pattern solutions which bind three variables:+

referrer	reference	author
http://www.w3.org/TR/xpath	http://www.w3.org/TR/xpath	"James Clark"
http://www.w3.org/TR/xpath	http://www.w3.org/TR/xpath	"Steve DeRose"

(KW Comment: in the above result, I think you want the referrer result to be ~TR/xslt rather than ~TR/xpath.)

2.1 Writing a Simple Query

The example below shows a +SPARQL+ query to find the title of a book from the information in an RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. Here, the SELECT clause names the variable of interest to the application, and the WHERE clause has one triple pattern.

<http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .

SELECT ?title
WHERE  ( <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title )

title
"SPARQL Tutorial"

The terms delimited by "<>" are URI References [13] (URIRefs); URIRefs can also be abbreviated with an XML QName-like form [14]; this is syntactic assistance and is translated to the full URIRef. -Other RDF terms-+The terms delimited by double quotes+ are literals which, following N-Triples syntax [7], are a string and -optional language tag (introduced with '@') and datatype URIRef (introduced by '^^')-+optionally, either a language tag (indicated by '@') or a datatype URIRef (indicated by '^^')+.

-RDF has typed literals. Such literals are written using "^^". Integers can be directly written and are interpreted as typed literals of datatype xsd:integer.-

+RDF has typed literals. These are written by concatenating the lexical form of the literal value (in double quotes) with the URI of the datatype, separated by "^^". As a convenience, integers can be directly written (i.e. unquoted with no datatype URI) and are interpreted as typed literals of datatype xsd:integer.+

2.2 Triple Patterns

The building blocks of queries are triple patterns. Syntactically, a SPARQL triple pattern is a subject, predicate and object delimited by parentheses. The -previous- example +in section 2.1+ shows a triple pattern with a -variable subject- +subject variable+ (the variable book), a predicate of dcore:title and -a variable object-+an object variable+ (the variable title).

( ?book dcore:title ?title )

A triple pattern applied to a graph matches all triples with identical RDF terms for the corresponding subject, predicate and object. The variables in the triple pattern, if any, are bound to the corresponding RDF terms in the matching triples.

(KW Comment: I think you need to elaborate on this definition of "matching". It should be precise. By identical, I assume you mean the lexical forms match, i.e., identical character strings. You need to add the caveat that prefixes are expanded prior to matching and that directly-written integers are converted to typed integers.)

-In SPARQL, a triple pattern is an RDF triple but with the addition that components can be a query variable instead.-

+In SPARQL, a triple pattern is an RDF triple in which any component can be a query variable.+

(KW Comment: the example below is confusing and does not illustrate the definition of binding. The table, in fact, shows two variables rather than one variable and the semantics are not defined. I would suggest replacing the table.)

x	y
"Alice"	"Bob"

(KW Comment: the example above is confusing and does not illustrate the definition of binding (which only mentions a single variable). The table above, in fact, shows two variables rather than one variable and the semantics of binding multiple variables is not defined. I would suggest replacing the table of two columns with a single column that shows two bindings for one variable, as shown below.)

x
"Alice" "Bob"

(KW Comment: the paragraph below on optional matches doesn't need to go here. It's confusing. I suggest removing it or moving it to section 4.)

-Not every binding needs to exist in every row of the table. So far, the examples have shown queries that either exactly match the graph, or do not match at all. Optional Matches can cause bindings, but if they fail to match, they do not cause the solution to be rejected, and so can leave variables unset in a row of the table.-

(KW Comment: the definition of Substitution, below, is confusing, coming immediately after the definition of binding. They seem similar. Could you relate the two? How is a binding related to a substitution? You need to motivate the definition of substitution.)

Definition: Substitution

A substitution S is a partial functional relation from variables to RDF terms or variables. We write S[v] for the RDF term that S pairs with the variable v and define S[v] to be v where there is no such pairing.

Definition: Triple Pattern Matching

For +substitution+ S and Triple Pattern T, S(T) is -the-+a+ triple pattern +formed+ by replacing any variable v in T with S[v]. (KW Comment: there may be more than one such triple pattern, correct?)

Triple Pattern T matches RDF graph G with substitution S, if S(T) is a triple of G.

(KW Comment: the above definition (of Triple Pattern T match G) is a second definition of triple pattern matching. Previously, at the start of section 2.2, you say that a pattern matches all triples with "identical" RDF terms. Is it obvious that these two definitions are identical? Maybe prefix the first definition by saying it is an informal definition.)

If the same variable name is used more than once in a pattern then, within each -solution-+match+ to the query, the variable has the same value.

(KW Comment: "solution" is undefined in the above sentence; perhaps "match" is a better word choice.)

(KW Comment: I found the above definitions of Triple Pattern Matching and Substitution confusing. 'S' is used in different ways: as a partial function of one variable and as a mapping from a triple pattern to another triple. The definition of Substitution is also confusing, coming immediately after the definition of binding. They seem similar. Could you relate the two? How is a binding related to a substitution? You need to motivate the definition of substitution.)

SELECT * WHERE ( ?x ?x ?v )

rdf:type rdf:type rdf:Property .

x	v
rdf:type	rdf:Property

rdfs:seeAlso rdf:type rdf:Property .

because the variable x would need to be both rdfs:seeAlso and rdf:type in the same solution.
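The Substitution and Triple Pattern Matching definitions above can be made concrete with a small sketch. It is illustrative only and assumes a representation that the draft does not prescribe: terms are plain strings, variables are strings beginning with "?", and the helper names (is_var, apply_substitution, match_triple_pattern) are hypothetical.

```python
def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return isinstance(term, str) and term.startswith("?")

def apply_substitution(s, pattern):
    """S(T): replace each variable v in T with S[v]; S[v] is v when unmapped."""
    return tuple(s.get(t, t) if is_var(t) else t for t in pattern)

def match_triple_pattern(pattern, graph):
    """Return every substitution S such that S(T) is a triple of the graph."""
    solutions = []
    for triple in graph:
        s = {}
        ok = True
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must take the same value everywhere.
                if s.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            solutions.append(s)
    return solutions

graph = {
    ("rdf:type", "rdf:type", "rdf:Property"),
    ("rdfs:seeAlso", "rdf:type", "rdf:Property"),
}
pattern = ("?x", "?x", "?v")
solutions = match_triple_pattern(pattern, graph)
print(solutions)
```

Only the first triple yields a substitution; for the second, ?x would have to be both rdfs:seeAlso and rdf:type.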

(KW Comment: again, the word "solution" is used here. It has not been defined. It should be replaced by "match" or some other word or else "solution" should be defined, if only informally).

2.3 Graph Patterns

-The keyword WHERE is followed by a Graph Pattern which is made of one or more Triple Patterns. These Triple Patterns are "and"ed together. More formally, the Graph Pattern is the conjunction of the Triple Patterns. In each query solution, all the triple patterns must be satisfied with the same binding of variables to values.-

+A Graph Pattern is one or more Triple Patterns "and"ed together, i.e., a conjunction of Triple Patterns. In a match, all the triple patterns must be satisfied with the same binding of variables to values.+

@prefix foaf:    <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .

-There is a bNode [12] in this dataset. Just within the file, for encoding purposes, the bNode is identified by _:a but the information about the bNode label is not in the RDF graph. No query will be able to identify that bNode by the label used in the serialization.-

+Note that there is a bNode [12] in this dataset, identified by _:a. This label is used only for encoding within a file; once the file is read into an RDF graph, a new label may be assigned. Consequently, applications should not assume that bNode labels used in a serialization (e.g., a file) can be used to query an RDF graph containing that serialized dataset.+

PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?mbox
WHERE
  ( ?x foaf:name "Johnny Lee Outlaw" )
  ( ?x foaf:mbox ?mbox )

mbox
<mailto:jlow@example.com>

-This query contains a conjunctive graph pattern. A conjunctive graph pattern is a set of triple patterns, each of which must match for the graph pattern to match.-

+The above query contains a conjunctive graph pattern of two triple patterns, each of which must match for the graph pattern to match.+

Definition: Graph Pattern (Partial Definition) – Conjunction

A set of triple patterns is a graph pattern GP. For such a graph pattern to match with substitution +S(T)+, each triple pattern in GP must match with substitution +S(T)+.

Definition: Graph Pattern Matching

For substitution S, we write S(GP) for the graph pattern produced by applying S to each triple pattern T in GP.

If GP = { T | T triple pattern } then S(GP) = { S(T) }

Graph Pattern GP matches RDF graph G with substitution S if G simply entails S(GP).

(KW Comment: here's yet another definition of matching. Does this supersede the definition of matching for triple patterns? At any rate, having a hyperlink to the definition is not good since this document is likely to be printed. So, as a convenience, it would be very, very nice to provide here an informal definition of "simply entails" (i.e., the gotchas; which, I think mainly have to do with bnodes but there may be other ones, too) and then refer the reader to the other document. Also, rather than a hyperlink, this should be a reference, i.e., to some document in Appendix B.)

2.4 Multiple Matches

The results of a query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query, depending on the data.

@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Johnny Lee Outlaw" .
_:a  foaf:mbox   <mailto:jlow@example.com> .
_:b  foaf:name   "Peter Goodguy" .
_:b  foaf:mbox   <mailto:peter@example.org> .

PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?name, ?mbox
WHERE
  ( ?x foaf:name ?name )
  ( ?x foaf:mbox ?mbox )

name	mbox
"Johnny Lee Outlaw"	<mailto:jlow@example.com>
"Peter Goodguy"	<mailto:peter@example.org>

The results enumerate the RDF terms to which the selected variables can be bound in the graph pattern. In the above example, the following two subsets of the data caused the two matches.

(KW Comment: I don't like the above example because it illustrates two concepts. First, it shows that a query may have multiple solutions. That's fine. But, it also illustrates the results can be a projection of the query variables. This raises additional questions, specifically, how are duplicates handled. I'd feel better if this example included variable 'x' in the result list.)

For a simple, conjunctive graph pattern match, all the variables used in the -query-+graph+ pattern will be bound in every solution.

Definition: Pattern Solution

A Pattern Solution of Graph Pattern GP on graph G is any substitution S such that GP matches G with S.

For a graph pattern GP formed as a set of triple patterns, -S(G)-+S(GP)+, has no variables and is a subgraph of G.

Definition: Query Solution

A Query Solution is a Pattern Solution where the pattern is the whole pattern of the query.

Definition: Query Results

The Query Results, for a given graph pattern GP on G, is written R(GP,G), and is the set of all query solutions such that GP matches G.

R(GP, G) may be the empty set.
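As a rough illustration of the Pattern Solution and Query Results definitions, the sketch below computes R(GP, G) for a conjunctive graph pattern by extending substitutions one triple pattern at a time. It assumes plain subgraph matching with bNode labels treated as ordinary constants, which sidesteps the subtleties of simple entailment; the function names are hypothetical. Unlike the SELECT in the example above, it keeps ?x in each solution.

```python
def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return term.startswith("?")

def extend(solution, pattern, triple):
    """Extend a substitution so that S(pattern) equals triple, else None."""
    s = dict(solution)
    for p_term, g_term in zip(pattern, triple):
        if is_var(p_term):
            if s.setdefault(p_term, g_term) != g_term:
                return None
        elif p_term != g_term:
            return None
    return s

def query_results(graph_pattern, graph):
    """R(GP, G): every substitution under which all triple patterns match."""
    solutions = [{}]
    for pattern in graph_pattern:
        next_solutions = []
        for s in solutions:
            for triple in graph:
                s2 = extend(s, pattern, triple)
                if s2 is not None:
                    next_solutions.append(s2)
        solutions = next_solutions
    return solutions

graph = [
    ("_:a", "foaf:name", "Johnny Lee Outlaw"),
    ("_:a", "foaf:mbox", "mailto:jlow@example.com"),
    ("_:b", "foaf:name", "Peter Goodguy"),
    ("_:b", "foaf:mbox", "mailto:peter@example.org"),
]
gp = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]
results = query_results(gp, graph)
for row in results:
    print(row)
```

Every variable of the graph pattern is bound in every solution, and a pattern with no match yields the empty list.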

2.5 Blank Nodes

Blank Nodes and Queries

There is no standard representation of bNodes in RDF and the syntax of SPARQL queries does not allow them. They can form part of a pattern match and do take part in the pattern matching process.

(KW Comment: I'm not sure how to reword it because I'm not sure what it's trying to say. Here are two possible interpretations. 1. The representation of bNodes (i.e., the bNode label) is not specified in RDF and is implementation-specific. Consequently, SPARQL syntax does not prescribe a representation. Applications may use bNode labels in a pattern match but such queries are implementation-specific and should not be considered portable. 2. The representation of bNodes (i.e., the bNode label) is not specified in RDF. Consequently, bNode labels should not be used in a triple pattern because that representation is not portable across implementations and, in fact, the label may even change within one implementation (labels need not be persistent). However, variables in a triple pattern may be bound to bNodes. Regardless of the interpretation, the definition of Triple Pattern in section 2.2 should perhaps be modified to reflect the notion that bNodes should not be in a triple pattern (or should they?).)

Blank Nodes and Results

In the results of queries, the presence of bNodes can be indicated but the internal system -identification-+label+ -is not-+may not be+ preserved. Thus, a client can tell that two solutions to a query differ in bNodes needed to perform the graph match but this information is only scoped to the results (result set or RDF graph). +Repeating the query (on the identical graph) may produce different labels for the bNodes.+

@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name   "Alice" .
_:b  foaf:name   "Bob" .

PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?x ?name
WHERE  ( ?x foaf:name ?name )

x	name
_:a	"Alice"
_:b	"Bob"

x	name
_:r	"Alice"
_:s	"Bob"

These two results have the same information: the blank node used to match the query was different in the two solutions. There is no relation between using _:a in the results and any internal blank node label in the data graph; the labels in the results only indicate whether -elements-+terms+ in the +solutions+ were the same or different.
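One way to read "the same information" is that the two tables are equal under a one-to-one renaming of bNode labels. The sketch below checks this under simplifying assumptions that are not part of the draft: terms are strings, bNodes are strings beginning with "_:", and both tables list their rows in the same order. The function name is hypothetical.

```python
def same_up_to_bnode_labels(rows_a, rows_b):
    """True if rows_b equals rows_a under a one-to-one renaming of bNode labels.

    Assumption: both tables list their rows in the same order.
    """
    if len(rows_a) != len(rows_b):
        return False
    forward, backward = {}, {}
    for row_a, row_b in zip(rows_a, rows_b):
        if len(row_a) != len(row_b):
            return False
        for a, b in zip(row_a, row_b):
            a_is_bnode = a.startswith("_:")
            b_is_bnode = b.startswith("_:")
            if a_is_bnode != b_is_bnode:
                return False
            if a_is_bnode:
                # The renaming must be consistent in both directions.
                if forward.setdefault(a, b) != b:
                    return False
                if backward.setdefault(b, a) != a:
                    return False
            elif a != b:
                return False
    return True

first = [("_:a", '"Alice"'), ("_:b", '"Bob"')]
second = [("_:r", '"Alice"'), ("_:s", '"Bob"')]
merged = [("_:r", '"Alice"'), ("_:r", '"Bob"')]

print(same_up_to_bnode_labels(first, second))
print(same_up_to_bnode_labels(first, merged))
```

The third table is not equivalent to the first because it uses one label for what were two distinct bNodes.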

3 Working with RDF Literals

-RDF Literals are written in SPARQL as strings, with optional language tag (indicated by '@') or optional datatype (indicated by '^^'), with additional convenience forms for xsd:integers and xsd:doubles:-

+An RDF Literal is written in SPARQL as a string containing the lexical form of the literal delimited by double quotes, optionally followed by a language tag (indicated by '@') or a datatype (indicated by '^^'). There are convenience forms for the numeric-typed literals xsd:integer and xsd:double in which the datatype may be omitted.+

3.1 Matching RDF Literals

@prefix dt:   <http://example.org/datatype#> .
@prefix ns:   <http://example.org/ns#> .
@prefix :     <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:x   ns:p     "42"^^xsd:integer .
:x   ns:p     "abc"^^dt:specialDatatype .
:x   ns:p     "cat"@en .

+The pattern in the following query has a solution because 42 is interpreted as "42"^^<http://www.w3.org/2001/XMLSchema#integer>.+

-This query matches, without requiring the query processor to have any understanding of the values in the space:-

+The following query has one solution. Note that the query processor has no understanding of the datatype. This is a syntactic match.+

-This query has a pattern that-+The following query+ fails to match because "cat" is not the same RDF literal as "cat"@en:

An implementation of SPARQL only needs to -be able to- match +lexical+ forms and datatypes in graph patterns. It is not required to -provide- support the datatype hierarchy of XML schema nor -for-+any+ application-defined hierarchies. It is not required to provide matching in patterns based on value spaces. Thus, testing +numerical+ equality in a constraint is not identical to +literal matching+ in pattern matching.

(KW Comment: I don't follow this. It seems to say that constraint matching is different from pattern matching but I don't understand why. Could you motivate this? It seems intuitive that the two queries below should return the same result.)

@prefix ns:   <http://example.org/ns#> .
@prefix :     <http://example.org/ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:x   ns:p     "42"^^xsd:short .

An implementation may choose to -provide-+support+ datatype hierarchies and value based pattern matching. Applications using a SPARQL processor should not assume that the processor provides datatype hierarchies or matching based on value-spaces of literals unless the application knows explicitly that this is the case.
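The difference between syntactic literal matching and value-based comparison, which the comment above asks about, can be sketched as follows. The Literal tuple and the helper names are assumptions made for illustration; a literal is modelled as lexical form, datatype URI and language tag, and a directly written integer becomes an xsd:integer literal.

```python
from typing import NamedTuple, Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

class Literal(NamedTuple):
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

def from_query_token(token):
    """Unquoted integers in a query are read as xsd:integer literals."""
    if isinstance(token, int):
        return Literal(str(token), XSD + "integer")
    return token

def syntactic_match(a, b):
    """Pattern matching: lexical form, datatype and language tag all agree."""
    return a == b

def numeric_value_equal(a, b):
    """Value-based comparison, as a constraint might perform it."""
    return int(a.lexical) == int(b.lexical)

in_query = from_query_token(42)
as_integer = Literal("42", XSD + "integer")
as_short = Literal("42", XSD + "short")

print(syntactic_match(in_query, as_integer))   # same lexical form and datatype
print(syntactic_match(in_query, as_short))     # datatypes differ
print(numeric_value_equal(in_query, as_short)) # same value in the value space
print(syntactic_match(Literal("cat"), Literal("cat", lang="en")))
```

Under these assumptions, a pattern containing 42 does not match "42"^^xsd:short, while a numeric equality test on the two values succeeds; a processor that supports value-based matching may behave differently.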

3.2 Constraining Values

Graph pattern matching creates bindings of variables. It is possible to further restrict possible solutions by constraining the allowable binding of variables to RDF Terms. Constraints in SPARQL take the form of boolean-valued expressions; the language also allows application-specific filter functions.

@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix :     <http://example.org/book/> .
@prefix ns:   <http://example.org/ns#> .

:book1  dc:title  "SPARQL Tutorial" . 
:book1  ns:price  42 .
:book2  dc:title  "The Semantic Web" . 
:book2  ns:price  23 .

PREFIX  dc:  <http://purl.org/dc/elements/1.1/>
PREFIX  ns:  <http://example.org/ns#> 
SELECT  ?title ?price
WHERE   ( ?x dc:title ?title )
        ( ?x ns:price ?price ) AND ?price < 30

title	price
"The Semantic Web"	23

By having a constraint on the "price" variable, only one of the books matches the query. Like a triple pattern, this is just a restriction on the allowable values of a variable.

Definition: Constraints

A constraint is a boolean-valued expression of variables and RDF Terms that can be applied to restrict query solutions.

Definition: Graph Pattern (Partial Definition) – Constraints

A graph pattern can also include constraints. These constraints further restrict the possible query solutions of matching a graph pattern with a graph.

SPARQL defines a set of operations that all implementations must provide. In addition, there is an extension mechanism for boolean tests that are specific to an application domain or kind of data.

(KW Comment: 1) you should reference the section number (11.3?). this helps readers who print this document. 2) may a constraint reference another variable, e.g. ?x < ?y. it's not clear if this is allowed or not.)

A constraint may lead to an error condition when testing some variable binding. The exact error will depend on the constraint: in numeric operations, supplying a non-number or a bNode will lead to such an error. Any potential solution that causes an error condition in a constraint will not form part of the final results.
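The rule that a solution causing an error in a constraint is dropped can be sketched as below. This is a minimal illustration: solutions are dictionaries, the constraint is an ordinary Python predicate, and errors are dropped silently, which is an assumption since the text does not say whether any error indication is given. The third data row is hypothetical and exists only to trigger an error.

```python
def apply_constraint(solutions, constraint):
    """Keep solutions for which the constraint is true; drop errors."""
    kept = []
    for s in solutions:
        try:
            if constraint(s):
                kept.append(s)
        except (TypeError, KeyError):
            # An error condition (non-number, unbound variable) rejects
            # the candidate solution rather than aborting the query.
            continue
    return kept

solutions = [
    {"?title": "SPARQL Tutorial", "?price": 42},
    {"?title": "The Semantic Web", "?price": 23},
    {"?title": "Untitled Draft", "?price": "n/a"},  # hypothetical bad value
]

cheap = apply_constraint(solutions, lambda s: s["?price"] < 30)
print(cheap)
```

Comparing the string "n/a" with a number raises an error, so that row is excluded just like the row that fails the test.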

(KW Comment: Is any error indication provided or is it silent or implementation-dependent. Please specify.)

4 Including Optional Values

4.1 Optional Matching

Optional -portions of the-+matches on a+ graph may be specified in either of two equivalent ways:

 OPTIONAL (?s ?p ?o)...

@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .

_:a  rdf:type        foaf:Person .
_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .

_:b  rdf:type        foaf:Person .
_:b  foaf:name       "Bob" .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name  ?name )
       [ ( ?x  foaf:mbox  ?mbox ) ]

name	mbox
"Alice"	<mailto:alice@work.example>
"Bob"

Now, there is no value of mbox where the name is "Bob". It is left -unset-+unbound+ in the result.

(KW Comment: I think this raises the question of HOW to determine if a variable is unset/unbound in a result. Does SPARQL assume this is implementation-dependent? Or does it require that an implementation provide an isBound function? This should be stated.)

This query finds the names of people in the dataset, and, if there is an mbox property, +retrieves+ that as well. In the example, only a single triple pattern is given in the optional match part of the query but in general it is a graph pattern.

For each optional block, the query processor attempts to match the query pattern. Failure to match the block does not cause this query solution to be rejected. The whole graph pattern of an optional block must match for the optional to add to the query solution.

(KW Comment: you should state if or if not there may be constraints in an optional block. Also, I think a good conceptual model for understanding optional blocks is that all solutions are found for the required part of the query. Then, for each solution, each optional block is tried and additional bindings are defined. Is this description worth including? Perhaps it will help in understanding the optional blocks.)

4.2 Multiple Optional Blocks

A query may have zero or more top-level optional blocks. These blocks will fail or provide bindings independently. Optional blocks can also be nested, that is, an optional block may appear inside another optional block +(as described in Section 5)+.

@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .

_:a  foaf:name       "Alice" .
_:a  foaf:homepage   <http://work.example.org/alice/> .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?hpage
WHERE  ( ?x foaf:name  ?name )
       [ ( ?x foaf:mbox ?mbox ) ]
       [ ( ?x foaf:homepage ?hpage ) ]

name	mbox	hpage
"Alice"		<http://work.example.org/alice/>
"Bob"	<mailto:bob@work.example>

In this example, there are two independent optional blocks. Each depends only on variables defined in the non-optional part of the graph pattern. If a new variable is mentioned in an optional block (as mbox and hpage are mentioned in the previous example), that variable can be mentioned in that block and can not be mentioned in a subsequent block.

4.3 Optional Matching – Formal Definition

In an optional match, either a graph pattern matches a graph and so defines one or more pattern solutions, or gives an empty pattern solution but does not cause matching to fail overall.

Definition: Optional Matching

Given graph pattern GP1, and graph pattern GP2, let GP= (GP1 union GP2).

The optional match of GP2 of graph G, given GP1, defines a pattern solution PS such that:

If GP matches G, then the solutions are the pattern solutions of GP; otherwise, the solutions are the pattern solutions of GP1 matching G.
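The sketch below follows the per-solution reading suggested in the comment in Section 4.1: find all solutions of the required pattern, then try the optional block for each one and keep the original solution when the block fails. That reading, the data representation, and the function names are assumptions, not the draft's definition.

```python
def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return term.startswith("?")

def extend(solution, pattern, triple):
    """Extend a substitution so that S(pattern) equals triple, else None."""
    s = dict(solution)
    for p_term, g_term in zip(pattern, triple):
        if is_var(p_term):
            if s.setdefault(p_term, g_term) != g_term:
                return None
        elif p_term != g_term:
            return None
    return s

def match_all(graph_pattern, graph, start):
    """All extensions of `start` under which every triple pattern matches."""
    solutions = [start]
    for pattern in graph_pattern:
        solutions = [s2 for s in solutions for t in graph
                     for s2 in [extend(s, pattern, t)] if s2 is not None]
    return solutions

def optional_match(required, optional, graph):
    """Required pattern must match; the optional block only adds bindings."""
    results = []
    for s in match_all(required, graph, {}):
        extended = match_all(optional, graph, s)
        results.extend(extended if extended else [s])
    return results

graph = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@work.example"),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", "Bob"),
]
rows = optional_match([("?x", "foaf:name", "?name")],
                      [("?x", "foaf:mbox", "?mbox")], graph)
for row in rows:
    print(row.get("?name"), row.get("?mbox"))
```

In this representation an unbound variable is simply absent from the solution, so row.get returns None for Bob's mbox.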

5 Nested Patterns

Graph patterns may contain nested patterns. -We've seen this earlier in optional matches. Nested patterns are delimited with ()s:- (KW Comment: you haven't really seen it; it was just mentioned but not explained. I'd drop this sentence.)

{ ( ?s ?p ?n2 ) ( ?n2 ?p2 ?n3 ) }

Definition: Graph Pattern – Nesting

A graph pattern GP can contain other graph patterns GP_i. A query solution of Graph Pattern GP on graph G is any B such that each element GP_i of GP matches G with binding B.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name ?name )
         { ( ?x foaf:mbox ?mbox ) }

Because this example has a simple conjunction for the nested pattern, and because the nested pattern is a conjunctive element in the outer pattern, this has the same results:

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE  ( ?x foaf:name ?name ) ( ?x foaf:mbox ?mbox )

Optional blocks can be nested. The outer optional block must match for any nested one to apply. That is, the outer graph pattern is fixed for the purposes of any nested optional block.

@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix rdf:        <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vcard:      <http://www.w3.org/2001/vcard-rdf/3.0#> .
 
_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .
_:a  vcard:N         _:d .

_:d  vcard:Family    "Hacker" .
_:d  vcard:Given     "Alice" .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .

_:c  foaf:name       "Eve" .
_:c  vcard:N         _:e .

_:e  vcard:Family    "Hacker" .
_:e  vcard:Given     "Eve" .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?foafName ?mbox ?fname ?gname
WHERE  ( ?x foaf:name ?foafName )
       [ ( ?x foaf:mbox ?mbox ) ]
       [ ( ?x  vcard:N  ?vc )
          [ ( ?vc vcard:Family ?fname ) 
            ( ?vc vcard:Given  ?gname )
          ]
       ]

foafName	mbox	fname	gname
"Alice"	<mailto:alice@work.example>	"Hacker"	"Alice"
"Bob"	<mailto:bob@work.example>
"Eve"		"Hacker"	"Eve"

This query finds the name, optionally the mbox, and also optionally the vCard structured name components. By nesting the optional access to vcard:Family and vcard:Given, the query only reaches these if there is a vcard:N property. It is possible to expand out optional blocks to remove nesting at the cost of duplication of expressions. Here, the expression is a simple triple pattern on vcard:N but it could be a complex graph match with value constraints.

(KW Comment: this example could be more illustrative. In particular, the result of this query is the same without the nesting, i.e., if Family and Given are part of the top-level, vcard:N block. It would be more interesting if there were an additional person with a vcard but no family or given name. Then, the top-level optional block would match but not the nested blocks. Would it return a result or not? This would improve the example since, for me, the result is somewhat ambiguous.)

5.1 Nested Optional Blocks

There is an additional condition that must be met for nested optional blocks. Considering the -query-+graph+ pattern as a tree of blocks, then a variable in an optional block can only be mentioned in other optional blocks nested within -this one-+it+. A variable can not be used in two optional blocks where the outermost mention (shallowest +occurrence+ in the tree for each +occurrence+) of the two uses is not the same block.

-All occurences of variable, v, in a query, the outermost mention of v must be the same.-+For each variable v that occurs in a nested block, consider all paths from that variable in any block to the root of the tree. Those paths must all intersect at a block that also contains the variable v.+

The purpose of this condition is to enable the query processor to process the query blocks in arbitrary (or optimized) order. If a variable was introduced in one optional block and mentioned in another, it would be used to constrain the second. Reversing the order of the optional blocks would reverse the blocks in which the variable was -was- introduced and was used to constrain. Such a query could give different results depending on the order in which those blocks were evaluated.

6 More Pattern Matching – Alternatives

-SPARQL provides a means combining graph patterns in to more complex ones so that one of several possibilities is attempted to see if it matches.-+SPARQL provides a means of combining graph patterns so that one of several alternative graph patterns may match.+ If more than one of the alternatives matches, all the possible pattern solutions are found.

6.1 Joining Patterns with UNION

@prefix dc10:  <http://purl.org/dc/elements/1.0/> .
@prefix dc11:  <http://purl.org/dc/elements/1.1/> .

_:a  dc10:title     "SPARQL Query Language Tutorial" .
_:a  dc10:creator   "Alice" .

_:b  dc11:title     "SPARQL Protocol Tutorial" .
_:b  dc11:creator   "Bob" .

PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title
WHERE  ( ?book dc10:title  ?title ) UNION ( ?book dc11:title  ?title )

title
"SPARQL Protocol Tutorial"
"SPARQL Query Language Tutorial"

This query finds titles of the books in the dataset, whether the title is recorded using Dublin Core properties from version 1.0 or version 1.1. If the application wishes to know how exactly the information was recorded, then the query:

PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title10 ?title11
WHERE  ( ?book dc10:title ?title10 ) UNION ( ?book dc11:title  ?title11 )

title11	title10
"SPARQL Protocol Tutorial"
	"SPARQL Query Language Tutorial"

will return results with the variables title10 or title11 bound depending on which way the query processor matches the pattern to the dataset. Note that, unlike optionals, if no part of the union pattern matched, then the query pattern would not match.

(KW Comment: the above examples are a bit misleading as they suggest that the variables used in the disjuncts must be identical. In fact, they need have no intersection at all. This is worth pointing out.)

6.2 Blocks in Union Patterns

More than one triple pattern can be given in a +graph+ pattern being used in a pattern union:

PREFIX dc10:  <http://purl.org/dc/elements/1.0/>
PREFIX dc11:  <http://purl.org/dc/elements/1.1/>

SELECT ?title ?author
WHERE  { ( ?book dc10:title ?title )  ( ?book dc10:creator ?author ) }
     UNION
       { ( ?book dc11:title ?title )  ( ?book dc11:creator ?author ) }

This query will only match a book if it has both a title and creator property from the same version of Dublin Core.

author	title
"Alice"	"SPARQL Query Language Tutorial"
"Bob"	"SPARQL Protocol Tutorial"

6.3 Alternative Matching – Formal Definition

Definition: Pattern Matching (Union)

Given graph patterns GP1 and GP2, and graph G, then a union pattern solution of GP1 and GP2 is any pattern solution S such that either S(GP1) matches G or S(GP2) matches G with substitution S.

Query results involving a pattern containing GP1 and GP2, will include separate solutions for each match where GP1 and GP2 give rise to different sets of bindings.
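A union can be sketched as collecting the solutions of each alternative separately. The example uses different variables in the two branches to show that the branches need not share variables. The representation (string terms, prefixed names treated as opaque constants) and the function names are assumptions for illustration.

```python
def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return term.startswith("?")

def extend(solution, pattern, triple):
    """Extend a substitution so that S(pattern) equals triple, else None."""
    s = dict(solution)
    for p_term, g_term in zip(pattern, triple):
        if is_var(p_term):
            if s.setdefault(p_term, g_term) != g_term:
                return None
        elif p_term != g_term:
            return None
    return s

def match_all(graph_pattern, graph):
    """All substitutions under which every triple pattern matches."""
    solutions = [{}]
    for pattern in graph_pattern:
        solutions = [s2 for s in solutions for t in graph
                     for s2 in [extend(s, pattern, t)] if s2 is not None]
    return solutions

def union_solutions(gp1, gp2, graph):
    """A union solution is any solution of GP1 or any solution of GP2."""
    return match_all(gp1, graph) + match_all(gp2, graph)

graph = [
    ("_:a", "dc10:title", "SPARQL Query Language Tutorial"),
    ("_:a", "dc10:creator", "Alice"),
    ("_:b", "dc11:title", "SPARQL Protocol Tutorial"),
    ("_:b", "dc11:creator", "Bob"),
]
rows = union_solutions([("?book", "dc10:title", "?title10")],
                       [("?book", "dc11:title", "?title11")], graph)
for row in rows:
    print(row.get("?title10"), row.get("?title11"))
```

Each solution binds only the variables of the branch that produced it, and if neither branch matches the result is empty.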

7 More Pattern Matching – Unsaid

8 Choosing What to Query

9 Querying the Origin of Statements

10 Result Forms

SPARQL has a number of query forms for returning results. These result forms use the solutions from pattern matching the query pattern (KW Comment: is query pattern defined anywhere?) to form result sets or RDF graphs. A result set is a serialization of the bindings in a query result. The query forms are:

Results can be thought of as a table, with one row per query solution. Some cells may be empty because a variable is not bound in that particular solution. (KW Comment: specify if SPARQL defines some way to check if a variable is not bound in a solution.)

-Results form a set of tuples. However, implementations may include duplicates for implementation and performance reasons unless indicated otherwise by the presence of the DISTINCT keyword.-+Implementations may return the solutions as either a bag or set of results, i.e., with or without duplicates. However, if the DISTINCT keyword is specified, a set of results must be returned, i.e., no duplicates.+

SELECT LIMIT

The LIMIT form puts an upper bound on the number of solutions returned. -A query may return a number of results up to and including the limit.-+The number of results returned is the minimum of LIMIT and the actual number of query results.+

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE ( ?x foaf:name ?name )
LIMIT 20

Limits on the number of results can also be applied via the SPARQL query protocol [@@ protocol document not yet published @@].

+Note that if both DISTINCT and LIMIT are specified, then duplicates are eliminated before the LIMIT is applied.+
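The ordering of DISTINCT and LIMIT can be sketched as below. Rows are modelled as tuples and the function name finalize is hypothetical; keeping first occurrences in their original order is an assumption, since the text does not define an ordering of results.

```python
def finalize(solutions, distinct=False, limit=None):
    """Apply DISTINCT first, then LIMIT, to a list of result rows."""
    rows = list(solutions)
    if distinct:
        seen = set()
        unique = []
        for row in rows:
            if row not in seen:
                seen.add(row)
                unique.append(row)
        rows = unique
    if limit is not None:
        # Returns min(limit, number of rows) rows.
        rows = rows[:limit]
    return rows

names = [("Alice",), ("Alice",), ("Bob",), ("Carol",)]

print(finalize(names, limit=2))
print(finalize(names, distinct=True, limit=2))
print(len(finalize(names, limit=20)))
```

With duplicates removed before the limit is applied, a limit of 2 yields two different names rather than the same name twice.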

10.2 Constructing an Output Graph

Gives the result graph having just the triples with property foaf:name:

(KW Comment: what does [ ] denote? This is new notation, I think, and should be explained.)

CONSTRUCT with a template

If a triple template has a +variable+, and in a query solution, the variable is unset, then the +substitution+ of this triple template is skipped but other triple templates are still processed for the same solution and any triples from other solutions are included in the result graph.

Templates with bNodes

A template can create an RDF graph containing bNodes, indicated by the syntax of a prefixed name with prefix _ and some label for the local name. +The+ labels are scoped to the template for each solution. If two such prefixed names share the same label in the template, then there will be one bNode created for each query solution but there will be different bNodes across triples generated by different query solutions.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT   ( ?x vcard:N _:n )
            ( _:n vcard:givenName  ?gname )
            ( _:n vcard:familyName ?fname )
WHERE
       { ( ?x foaf:firstname ?gname ) OR (?x foaf:givenname   ?gname ) }
       { ( ?x foaf:surname   ?fname ) OR (?x foaf:family_name ?fname ) }

creates vcard properties corresponding to the FOAF information:

(KW Comment: the notation { ... } is not explained, I think.)
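The template rules described under "CONSTRUCT with a template" and "Templates with bNodes" can be sketched as follows. This is illustrative only: solutions are dictionaries, the generated labels (_:b0, _:b1, ...) and the resource names ex:alice and ex:bob are hypothetical, and the function name construct is not part of SPARQL.

```python
import itertools

def construct(template, solutions):
    """Instantiate a triple template once per solution.

    bNode labels in the template are scoped to a single solution, and a
    triple template that mentions an unbound variable is skipped.
    """
    counter = itertools.count()
    graph = set()
    for s in solutions:
        fresh = {}  # template label -> bNode created for this solution
        for triple_template in template:
            triple = []
            for term in triple_template:
                if term.startswith("?"):
                    if term not in s:
                        triple = None  # unbound variable: skip this triple
                        break
                    triple.append(s[term])
                elif term.startswith("_:"):
                    if term not in fresh:
                        fresh[term] = "_:b%d" % next(counter)
                    triple.append(fresh[term])
                else:
                    triple.append(term)
            if triple is not None:
                graph.add(tuple(triple))
    return graph

template = [
    ("?x", "vcard:N", "_:n"),
    ("_:n", "vcard:givenName", "?gname"),
    ("_:n", "vcard:familyName", "?fname"),
]
solutions = [
    {"?x": "ex:alice", "?gname": "Alice", "?fname": "Hacker"},
    {"?x": "ex:bob", "?gname": "Bob"},  # ?fname is unbound here
]
result = construct(template, solutions)
for triple in sorted(result):
    print(triple)
```

The shared label _:n yields one bNode per solution, distinct across solutions, and the familyName triple is omitted for the solution in which ?fname is unbound.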

10.3 Descriptions of Resources

10.4 Asking "yes or no" questions

(KW Comment: It is not clear what value is returned by an Ask query form. One would hope it is a boolean. This might enable nested query forms in the future. But, it appears to be a plain literal, "yes" or "no". Why not make it a boolean-typed literal?)