NOTE: deletions are delimited by '-', insertions delimited by '+'.  comments/questions begin with "(Comment:".

SPARQL Query Language for RDF

Editors working draft.
Live Draft - version:
$Revision: 1.264 $ of $Date: 2005/03/22 10:23:24 $
Editors:
Eric Prud'hommeaux, W3C <eric@w3.org>
Andy Seaborne, Hewlett-Packard Laboratories, Bristol <andy.seaborne@hp.com>
published W3C Technical Report Version:
17 Feb 2005; see also http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/

2 Making Simple Queries

Combining these gives a basic pattern, where an exact match to a graph is needed to -fulfil-+fulfill+ a pattern.

In this section, we cover simple triple patterns, basic patterns and the SPARQL syntax that +is+ related to these.

2.1 Writing a Simple Query

Data descriptions used in this document

Prefixes are syntactic: the prefix name does not -effect-+affect+ the query, nor do prefix names in queries need to be the same prefixes as used for data. This query is equivalent to the previous one and will give the same results when applied to the same graph.

2.2 Graph Patterns

Definition: RDF Ground Term

The set of RDF Ground Terms, RDF-G, is the set of RDF terms that are not -a- bNodes. RDF-G is RDF-U union RDF-L.

2.3 Graph Pattern Matching

Definition: Substitution

Substitution is a function from a subset of the set of variables, V, the domain of the substitution, dom(S), to the set of RDF terms, T. (Comment: I think you want to indicate the notation to be used, e.g., "A substitution is written as S(X)=v". Otherwise, S(v) in the definition for restriction is not defined.)

2.4 Examples of Graph Patterns

This query contains a basic -graph- pattern of two triple patterns, each of which must match for the graph pattern to match.

2.6 Blank Nodes

Blank Nodes and Queries

A blank node can appear in a SPARQL query pattern-s-. In the query pattern, a blank node behaves as a variable, although it can not be named in the query result form, such as SELECT.

(Comment: this contradicts the definition of triple pattern, which excludes bnodes. also, you don't specify whether a bnode in a query pattern matches only bnodes or can also match a literal or a URI. if it matches only bnodes, then it does NOT behave as a variable. if it does match literals and URIs, then it is really not needed, so why not leave it out? an example here would be helpful.)

2.7 Other Syntactic Forms

Triples with +a+ common subject can be written so that the subject is written once, and used for more than one triple using the ";" notation.

3 Working with RDF Literals

An RDF Literal is written in SPARQL as +a+ string containing the lexical form of the literal, delimited by "", followed by an optional language tag (-indicted-+indicated+ by '@') or optional datatype (indicated by '^^'). There are convenience forms for numeric-type-s-+d+ literals which are xsd:integers and xsd:doubles.

3.2 Constraining Values

Definition: Constraints

A pattern may be a constraint (Comment: huh? reword the start of the sentence.) is a boolean-valued expression of variables and RDF Terms that can be applied to restrict query solutions.

Solutions are required to bind variables occurring in constraints to typed literals such that the constraint is true when applied to the value of the typed literal. In this way, patterns may constrain literals to have particular values rather than having particular syntactic forms.

(Comment: this is a subtle point, i think. in fact, i'm confused. is this different from query patterns, i.e., do query patterns match values or lexical forms for typed literals? if values, then there's no difference and it's not clear why you're making this statement. if lexical forms, then there is a difference and you should give an example of how two seemingly identical queries might return different results.)
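
The distinction raised in the comment can be pictured with two typed literals that differ in lexical form but denote the same value. This is a rough sketch only: the pair encoding of a literal and the helper `value_of` are assumptions for illustration, and the sketch does not claim which of the two comparisons the draft intends for query patterns.

```python
# Hypothetical model of a typed literal as (lexical form, datatype URI).
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

def value_of(literal):
    """Map a typed literal to its value; only xsd:integer is handled here."""
    lexical, datatype = literal
    if datatype == XSD_INTEGER:
        return int(lexical)
    raise ValueError("unsupported datatype in this sketch")

a = ("1", XSD_INTEGER)
b = ("01", XSD_INTEGER)

same_syntax = (a == b)                      # compares lexical form and datatype
same_value = (value_of(a) == value_of(b))   # compares the integers denoted
print(same_syntax, same_value)
```

A comparison on syntactic form treats `a` and `b` as different terms, while a comparison on values treats them as equal, which is the case where two seemingly identical queries could return different results.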

4 Combining Patterns

Definition: Graph Pattern – Grouping

A graph pattern GP may be a set of graph patterns, GPi.+ + and set of constraints Cj. A solution of Graph Pattern GP on graph G is any solution S such that for each element GPi of GP, S is a solution of GPi and each constraint Cj is true.

5 Including Optional Values

Basic patterns and value constraints allow queries to perform queries where al the part of the query pattern must match for there to be a solution (Comment: huh? need to reword the last part of this sentence.).

5.1 Optional Pattern Matching

There is no value of mbox in the solution where the name is "Bob". It is left -unset-+unbound+.

5.2 Constraints in Optional Blocks

Constraints can be given in optional blocks as this example-s- shows:

5.3 Multiple Optional Blocks

Query patterns are defined recursively.  A query may have zero or more optional blocks and any part -fo-+of+ a query may have an optional part.

5.4 Optional Matching – Formal Definition

Definition: Optional Matching

Given graph pattern GP1, and graph pattern GP2, let GP= (GP1 union GP2).
(Comment: do you mean UNION as in the definition in 6.1? or is it an informal usage?)

The optional match of GP2 of graph G, given GP1, defines a pattern solution PS such that:

If GP matches G, then the solutions of GP -is-+are+ the pattern-s- solutions of GP else the solutions are the pattern solutions of GP1 matching G.
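
One literal reading of this definition can be sketched as follows. The sketch assumes that "union" here means putting the triple patterns of GP1 and GP2 together (the informal reading asked about in the comment above), that a graph is a list of triples, and that variables are "?"-prefixed strings. The function names are hypothetical.

```python
def match_triple(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    b = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if b.setdefault(p, t) != t:   # variable already bound elsewhere
                return None
        elif p != t:
            return None
    return b

def solutions(patterns, graph, binding=None):
    """All bindings under which every triple pattern matches some triple."""
    binding = binding or {}
    if not patterns:
        return [binding]
    out = []
    for triple in graph:
        b = match_triple(patterns[0], triple, binding)
        if b is not None:
            out.extend(solutions(patterns[1:], graph, b))
    return out

def optional_match(gp1, gp2, graph):
    """Solutions of GP1 with GP2 if that matches, else solutions of GP1."""
    combined = solutions(gp1 + gp2, graph)
    return combined if combined else solutions(gp1, graph)

gp1 = [("?x", "foaf:name", "?name")]
gp2 = [("?x", "foaf:mbox", "?mbox")]
with_mbox = [("_:a", "foaf:name", '"Alice"'),
             ("_:a", "foaf:mbox", "<mailto:alice@example.org>")]
without_mbox = [("_:b", "foaf:name", '"Bob"')]

print(optional_match(gp1, gp2, with_mbox))
print(optional_match(gp1, gp2, without_mbox))
```

In the second call GP does not match, so the solution comes from GP1 alone and `?mbox` is left unbound, modelled here as an absent key.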

 

5.6 Requirements for Nested Optional Patterns

@@ToDo@@ Move to execution order section.

There is an additional condition that must be met for nested optional blocks. Considering the graph pattern as a tree of blocks, then a variable in an optional block can only be mentioned in other optional blocks nested within it (or in the SELECT clause). A variable can not be used in two optional blocks where the outermost mention (shallowest occurrence in the tree for each occurrence) of the two uses is not the same block.

(Comment: the above seems inconsistent with the claim in 5.5 that optional blocks can be unnested, in which case a variable would appear in two optional blocks after unnesting. maybe i'm confused on this.)

6 More Pattern Matching – Alternatives

6.1 Joining Patterns with UNION

The UNION keyword is the syntax for pattern alternatives.

Data:

@prefix dc10:  <http://purl.org/dc/elements/1.0/> .
@prefix dc11:  <http://purl.org/dc/elements/1.1/> .

_:a  dc10:title     "SPARQL Query Language Tutorial" .
_:a  dc10:creator   "Alice" .

_:b  dc11:title     "SPARQL Protocol Tutorial" .
_:b  dc11:creator   "Bob" .

(Comment: to make this example more illuminating, i would add two more triples:
_:c dc10:title "foo" .
_:c dc11:title "bar" .)
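
The effect of the two extra triples can be sketched in Python. The sketch assumes a query whose pattern is the UNION of { ?book dc10:title ?title } and { ?book dc11:title ?title }. That query shape, the helper name `titles`, and the reading of UNION as "solutions of either alternative" are assumptions for illustration.

```python
graph = [
    ("_:a", "dc10:title", "SPARQL Query Language Tutorial"),
    ("_:a", "dc10:creator", "Alice"),
    ("_:b", "dc11:title", "SPARQL Protocol Tutorial"),
    ("_:b", "dc11:creator", "Bob"),
    ("_:c", "dc10:title", "foo"),   # triples suggested in the comment above
    ("_:c", "dc11:title", "bar"),
]

def titles(predicate):
    """Solutions of the single triple pattern { ?book <predicate> ?title }."""
    return [{"book": s, "title": o} for s, p, o in graph if p == predicate]

# UNION read as "solutions of either alternative", kept as a sequence.
union_solutions = titles("dc10:title") + titles("dc11:title")
for row in union_solutions:
    print(row["book"], row["title"])
```

Under this reading the node _:c contributes one solution per alternative, which is the case the original four triples do not exercise.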

7 RDF Dataset

The RDF data model expresses information as a graph comprising -of- triples with subject, predicate and object. 

A query processor is not required to support named graphs. (Comment: is it required to support background graphs, i.e., might it only support named graphs along with some convention that the graph named "default" is the background graph?)

7.1 Examples of RDF Datasets

Example 2:

In this next example, the named graphs contain the same information as before. The RDF dataset includes an RDF merge of the named graphs in the background graph, relabelling blank nodes to keep them distinct. Doing this is trusting the contents of the named graphs. An implementation can efficiently provide datasets of this form without duplicating stored triples.

(Comment: this example is a little confusing. what if i change the content of a named graph? is it reflected in the background graph or have they diverged?)

# Background graph
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:x foaf:name "Bob" .
_:x foaf:mbox <mailto:bob@oldcorp.example.org> .

_:y foaf:name "Alice" .
_:y foaf:mbox <mailto:alice@work.example.org> .
# Graph: http://example.org/bob
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Bob" .
_:a foaf:mbox <mailto:bob@oldcorp.example.org> .
# Graph: http://example.org/alice
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:a foaf:name "Alice" .
_:a foaf:mbox <mailto:alice@work.example.org> .

8 Querying the Dataset

When querying a collection of graphs, the GRAPH keyword allows access to the URIs naming the graphs in the RDF Dataset, or -allows restriction-+restricts+ a graph pattern to be applied to a specific named graph.

8.4 GRAPH and a background graph

The -default-+background+ graph is being used to record the provenance information and the RDF data actually read is kept in two separate graphs, each of which is given a different URI by the system. The RDF dataset consists of two, named graphs and the information about them.

The URI for the date data- -type has been abbreviated in the results -just- for convenience.

10 Result Forms

10.1 Solution Sequences and Result Forms

DISTINCT

The solution sequence can be modified by adding the DISTINCT keyword which ensures that every combination of variable bindings (i.e. each solution) in a -the- sequence is unique. Thought of as a table, each row is different.

If DISTINCT and LIMIT/OFFSET are specified, then duplicates are eliminated before the limit or offset is applied. (Comment: you may want to specify how unbound values are treated for distinct, i.e., are all nulls equal?)
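
The stated order of operations matters, as the following minimal Python sketch suggests. Solutions are modelled as dictionaries and an unbound variable as an absent key, so two rows that are unbound in the same variables compare equal. That treatment of unbound values is a choice made for this sketch, not something the draft states, and the helper names are hypothetical.

```python
def distinct(solutions):
    """Drop later duplicates, keeping first occurrences in sequence order."""
    seen, out = set(), []
    for row in solutions:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def slice_solutions(solutions, limit=None, offset=0):
    """Apply OFFSET, then LIMIT, to a solution sequence."""
    end = None if limit is None else offset + limit
    return solutions[offset:end]

rows = [{"name": "Alice"}, {"name": "Alice"}, {"name": "Bob"}, {"name": "Clare"}]

distinct_first = slice_solutions(distinct(rows), limit=2)   # order described above
limit_first = distinct(slice_solutions(rows, limit=2))      # the other order
print(distinct_first)
print(limit_first)
```

Eliminating duplicates first yields two different rows, whereas applying the limit first can leave fewer rows than the limit even though more distinct solutions exist.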

LIMIT

The LIMIT form puts an upper bound on the number of solutions returned. -A query may return a number of results up to and including the limit.- If the number of actual solutions is less than the limit, all solutions will be returned.+ If the number of actual solutions is greater than or equal to the limit, then the limit number of solutions will be returned.+

(Comment: you may want to specify what a limit of 0 or a limit of -1 means, if anything.)

OFFSET

The order in which solutions are returned is undefined so using LIMIT and OFFSET to select different subsets of the query solutions will -given- not be useful unless the order is made -predicable-+predictable+ by ensuring ordered results using ORDER BY. (Comment: you may want to add a note that it may be impossible to guarantee consistent results if the underlying graphs are concurrently modified between successive queries.)
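
A short Python sketch of why paging depends on ordering, assuming solutions are dictionaries and a hypothetical helper `page` that sorts on one variable before slicing:

```python
def page(solutions, key, limit, offset):
    """Sort by key first so that successive pages do not overlap."""
    ordered = sorted(solutions, key=lambda row: row[key])
    return ordered[offset:offset + limit]

rows = [{"name": "Clare"}, {"name": "Alice"}, {"name": "Bob"}]
first = page(rows, "name", limit=2, offset=0)
second = page(rows, "name", limit=2, offset=2)
print(first)
print(second)
```

With the sort in place the two pages together cover every solution exactly once, provided the underlying data does not change between the two calls.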

10.2 Selecting which Variables to Return

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:name   "Alice" .
_:a    foaf:knows  _:b .
_:a    foaf:knows  _:c .

_:b    foaf:name   "Bob" .

_:c    foaf:name   "Clare" .
_:c    foaf:nick   "CT" .
(Comment: i suggest making the object of foaf:nick a URI so you can show in the resulting serialization how URIs are formatted.)

Result sets can be accessed by the local API but also can be serialized into either XML or an RDF graph. (Comment: Is this an option in the query statement or a programmatic option only? if in the statement, give an example.) The XML result set form gives:

10.3 Constructing an Output Graph

Templates with bNodes

A template can create an RDF graph containing bNodes, indicated by the syntax of a prefixed name with prefix "_" and some label for the local name.  The labels are scoped to the template for each solution. If two such prefixed names share the same label in the template, then there will be one bNode created for each query solution but there will be different bNodes across triples generated by different query solutions.
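
The scoping rule can be sketched in Python as follows. The template uses an explicit "_:v" label, and the function `instantiate`, the "_:bN" naming of fresh nodes, and the string encoding of terms are all assumptions made for illustration rather than anything the draft prescribes.

```python
import itertools

_fresh = itertools.count(1)   # source of new bNode labels

def instantiate(template, solution):
    """Instantiate a template for one solution with solution-scoped labels."""
    renamed = {}   # template label -> fresh bNode, reset for every solution

    def term(t):
        if t.startswith("?"):
            return solution[t]
        if t.startswith("_:"):
            if t not in renamed:
                renamed[t] = "_:b%d" % next(_fresh)
            return renamed[t]
        return t

    return [tuple(term(t) for t in triple) for triple in template]

template = [
    ("?x", "vcard:N", "_:v"),
    ("_:v", "vcard:givenName", "?gname"),
    ("_:v", "vcard:familyName", "?fname"),
]
solutions = [
    {"?x": "_:a", "?gname": '"Alice"', "?fname": '"Hacker"'},
    {"?x": "_:b", "?gname": '"Bob"', "?fname": '"Hacker"'},
]
graph = [t for s in solutions for t in instantiate(template, s)]
for triple in graph:
    print(triple)
```

Within one solution every occurrence of "_:v" maps to the same new bNode, while the second solution gets a different one.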

(Comment: the example below does not illustrate the creation of a graph with a bnode by use of a "_" prefixed name as per the paragraph above.)

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:a    foaf:givenname   "Alice" .
_:a    foaf:family_name "Hacker" .

_:b    foaf:firstname   "Bob" .
_:b    foaf:surname     "Hacker" .
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX vcard:   <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT { ?x vcard:N [ vcard:givenName ?gname ; vcard:familyName ?fname ] }
WHERE
 {
    { ?x foaf:firstname ?gname } UNION  { ?x foaf:givenname   ?gname } .
    { ?x foaf:surname   ?fname } UNION  { ?x foaf:family_name ?fname } .
 }  
creates vcard properties corresponding to the FOAF information:
@prefix vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> .

_:v1 vcard:N         _:x .
_:x vcard:givenName  "Alice" .
_:x vcard:familyName "Hacker" .

_:v2 vcard:N         _:z .
_:z vcard:givenName  "Bob" .
_:z vcard:familyName "Hacker" .

10.4 Descriptions of Resources

The DESCRIBE form returns a single RDF graph containing RDF data -associated- about resources.

The query pattern is used to create a result set. The DESCRIBE form takes each of the resources identified in a solution-s-, together with any resources directly named by URI, and assembles a single RDF graph by taking a "description" from the target knowledge base.

11 Testing Values

11.1 Operand Data Types

11.1.1 Type Promotion

In summary: -Each-+each+ of the numeric types may be promoted to any type higher in the above list.

11.2 SPARQL Functions and Operators

SPARQL provides a subset of the functions and operators defined by XQuery Operator Mapping. The XPath evaluation rules are -ammended-+amended+ by the following rules to -accomodate-+accommodate+ the additional types and states introduced by RDF and SPARQL:

11.2.0 Invocation

11.2.0.1 -Effictive-+Effective+ Boolean Value

Table 11.1 Operator Mapping

The following table associates SPARQL infix operators taking specific argument types with the XPath -opperation-+operation+ name.