DISTINCT and LOOSE

9 Solution Sequences and Modifiers

Query patterns generate an unordered collection of solutions, each solution being a function from variables to RDF terms. These solutions are then treated as a sequence (the solution sequence), initially in no specific order; any sequence modifiers are then applied to create another sequence. Finally, this latter sequence is used to generate one of the SPARQL result forms.

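A solution sequence modifier is one of:

  1. Order modifier: put the solutions in order.
  2. Projection modifier: choose certain variables.
  3. Distinct modifier: ensure solutions in the sequence are unique.
  4. Loose modifier: permit elimination of some non-unique solutions.
  5. Offset modifier: control where the solutions start from in the overall sequence.
  6. Limit modifier: restrict the number of solutions.

Modifiers are applied in the order given by the list above.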
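The fixed application order can be illustrated with a small sketch. It is not part of the specification: it assumes solutions are plain Python dictionaries from variable names to strings, follows the order of the subsections of this section (ORDER BY, projection, DISTINCT, OFFSET, LIMIT), and omits LOOSE because an implementation may treat it as a no-op. The function name `apply_modifiers` and its parameters are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Assumption: a solution is modelled as a dict from variable name to a term string.
Solution = Dict[str, str]


def apply_modifiers(
    solutions: List[Solution],
    order_key: Optional[Callable[[Solution], object]] = None,
    project: Optional[Sequence[str]] = None,
    distinct: bool = False,
    offset: int = 0,
    limit: Optional[int] = None,
) -> List[Solution]:
    """Apply the modifiers in the order: order, project, distinct, offset, limit."""
    seq = list(solutions)
    if order_key is not None:              # 1. ORDER BY
        seq = sorted(seq, key=order_key)
    if project is not None:                # 2. projection
        seq = [{v: s[v] for v in project if v in s} for s in seq]
    if distinct:                           # 3. DISTINCT (keeps the first occurrence)
        seen = set()
        unique = []
        for s in seq:
            key = frozenset(s.items())
            if key not in seen:
                seen.add(key)
                unique.append(s)
        seq = unique
    seq = seq[offset:]                     # 4. OFFSET
    if limit is not None:                  # 5. LIMIT
        seq = seq[:limit]
    return seq


people = [
    {"x": "_:a", "name": "Bob"},
    {"x": "_:b", "name": "Alice"},
    {"x": "_:c", "name": "Alice"},
]
result = apply_modifiers(
    people,
    order_key=lambda s: s["name"],
    project=["name"],
    distinct=True,
    offset=1,
    limit=5,
)
print(result)  # [{'name': 'Bob'}]
```

Because projection runs before DISTINCT, the two "Alice" solutions collapse into one even though they differ in ?x.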

Grammar rules:
[5]   SelectQuery   ::=   'SELECT' ( 'DISTINCT' | 'LOOSE' )? ( Var+ | '*' ) DatasetClause* WhereClause SolutionModifier
[14]   SolutionModifier   ::=   OrderClause? LimitOffsetClauses?
[15]   LimitOffsetClauses   ::=   ( LimitClause OffsetClause? | OffsetClause LimitClause? )
[16]   OrderClause   ::=   'ORDER' 'BY' OrderCondition+
[17]   OrderCondition   ::=   ( ( 'ASC' | 'DESC' ) BrackettedExpression )
| ( Constraint | Var )
[18]   LimitClause   ::=   'LIMIT' INTEGER
[19]   OffsetClause   ::=   'OFFSET' INTEGER

9.1 ORDER BY

The ORDER BY clause establishes the order of a solution sequence.

Following the ORDER BY clause is a sequence of ordering comparators, each composed of an expression and an optional order modifier (either ASC() or DESC()). Each ordering comparator is either ASCENDING (indicated by the ASC() modifier, or by no modifier) or DESCENDING (indicated by the DESC() modifier).

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE { ?x foaf:name ?name }
ORDER BY ?name

PREFIX     :    <http://example.org/ns#>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>

SELECT ?name
WHERE { ?x foaf:name ?name ; :empId ?emp }
ORDER BY DESC(?emp)

PREFIX     :    <http://example.org/ns#>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE { ?x foaf:name ?name ; :empId ?emp }
ORDER BY ?name DESC(?emp)

The "<" operator (see the Operator Mapping) defines the relative order of pairs of numerics, simple literals, xsd:strings, xsd:booleans and xsd:dateTimes. IRIs are ordered by comparing their codepoint representations with the "<" operator.

SPARQL also defines a fixed, arbitrary order between some kinds of RDF terms that would not otherwise be ordered. This arbitrary order is necessary to provide consistent slicing of query solutions using LIMIT and OFFSET.

  1. (Lowest) no value assigned to the variable or expression in this solution.
  2. Blank nodes
  3. IRIs
  4. RDF literals
  5. A plain literal is lower than an RDF literal with type xsd:string of the same lexical form.
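As an illustration of the ordering between kinds of terms, the sketch below ranks terms by kind only. The encoding is an assumption made for the example: `None` stands for "no value assigned", and every other term is a hypothetical `(kind, value)` pair. The rule about plain literals and xsd:string literals (item 5) is not modelled here.

```python
# Assumption: None means "unbound"; other terms are (kind, value) pairs.
KIND_RANK = {"bnode": 1, "iri": 2, "literal": 3}


def kind_rank(term):
    """Return the rank of a term's kind: unbound < blank node < IRI < literal."""
    return 0 if term is None else KIND_RANK[term[0]]


terms = [
    ("literal", "Alice"),
    ("iri", "http://example.org/a"),
    None,
    ("bnode", "b1"),
]
ordered = sorted(terms, key=kind_rank)
print(ordered)
```

Since the ranks are fixed, repeated evaluations of the same query slice the sequence at the same places, which is what makes LIMIT and OFFSET predictable over mixed kinds of terms.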

The ASCENDING order of two solutions with respect to an ordering comparator is established by substituting the solution bindings into the expressions and comparing them with the "<" operator. The DESCENDING order is the reverse of the ASCENDING order.

The relative order of two solutions is the relative order of the two solutions with respect to the first ordering comparator in the sequence. For solutions where the substitutions of the solution bindings produce the same RDF term, the order is the relative order of the two solutions with respect to the next ordering comparator. The relative order of two solutions is undefined if no order expression evaluated for the two solutions produces a distinct RDF term.
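The way later comparators break ties left by earlier ones can be sketched as follows. This is an illustration under simplifying assumptions, not the normative algorithm: solutions are dictionaries, every expression evaluates to a value Python can compare with `<`, and the helper name `order_by` is hypothetical. It relies on Python's sort being stable, sorting by the last comparator first.

```python
from typing import Callable, Dict, List, Tuple

Solution = Dict[str, object]
# A comparator is (expression, descending?); the expression maps a solution to a value.
Comparator = Tuple[Callable[[Solution], object], bool]


def order_by(solutions: List[Solution], comparators: List[Comparator]) -> List[Solution]:
    """Order solutions by the first comparator, breaking ties with the following ones."""
    seq = list(solutions)
    # Stable sorts applied from the last comparator to the first give the
    # first comparator the highest priority.
    for expression, descending in reversed(comparators):
        seq.sort(key=expression, reverse=descending)
    return seq


rows = [
    {"name": "Bob", "emp": 7},
    {"name": "Alice", "emp": 3},
    {"name": "Alice", "emp": 9},
]
# Corresponds to: ORDER BY ?name DESC(?emp)
ordered_rows = order_by(rows, [(lambda s: s["name"], False), (lambda s: s["emp"], True)])
print([(s["name"], s["emp"]) for s in ordered_rows])
# [('Alice', 9), ('Alice', 3), ('Bob', 7)]
```

The two "Alice" solutions tie on ?name, so their relative order is decided by DESC(?emp).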

Ordering a sequence of solutions always results in a sequence with the same number of solutions in it.

Using ORDER BY on a solution sequence for a CONSTRUCT or DESCRIBE query has no direct effect because only SELECT returns a sequence of results. Used in combination with LIMIT and OFFSET, ORDER BY can be used to return results generated from a different slice of the solution sequence.

9.2 Projection

The solution sequence can be transformed into one involving only a subset of the variables. For each solution in the sequence, a new solution is formed using a specified selection of the variables.
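A minimal sketch of projection, assuming solutions are modelled as Python dictionaries (the helper name `project` is hypothetical). A variable that is unbound in a solution is left out of the projected solution.

```python
def project(solutions, variables):
    """Keep only the selected variables in each solution, preserving sequence order."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]


solutions = [
    {"x": "_:b", "name": "Bob"},
    {"x": "_:a", "name": "Alice"},
]
names = project(solutions, ["name"])
print(names)  # [{'name': 'Bob'}, {'name': 'Alice'}]
```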

The following example shows a query to extract just the names of people described in an RDF graph using FOAF properties.

@prefix foaf:        <http://xmlns.com/foaf/0.1/> .

_:a  foaf:name       "Alice" .
_:a  foaf:mbox       <mailto:alice@work.example> .

_:b  foaf:name       "Bob" .
_:b  foaf:mbox       <mailto:bob@work.example> .

PREFIX foaf:       <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE
 { ?x foaf:name ?name }

name
"Bob"
"Alice"

9.3 DISTINCT

A solution sequence with no DISTINCT or LOOSE modifier preserves duplicate solutions; its cardinality is defined by the SPARQL algebra in section 12, Definition of SPARQL. For example:

@prefix  foaf:  <http://xmlns.com/foaf/0.1/> .

_:x    foaf:name   "Alice" .
_:x    foaf:mbox   <mailto:alice@example.com> .

_:y    foaf:name   "Alice" .
_:y    foaf:mbox   <mailto:asmith@example.com> .

_:z    foaf:name   "Alice" .
_:z    foaf:mbox   <mailto:alice.smith@example.com> .

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?x foaf:name ?name }

name
"Alice"
"Alice"
"Alice"

The DISTINCT solution modifier eliminates duplicate solutions. Specifically, when two or more solutions bind the same variables to the same RDF terms, only one of them is kept in the solution set.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name WHERE { ?x foaf:name ?name }

name
"Alice"

If DISTINCT and LIMIT or OFFSET are specified, then duplicates are eliminated before the limit or offset is applied.
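The difference this ordering makes can be seen in a short sketch. It assumes solutions are Python dictionaries and uses a hypothetical helper `distinct` that keeps the first occurrence of each solution.

```python
def distinct(solutions):
    """Remove duplicate solutions, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for s in solutions:
        key = frozenset(s.items())
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


rows = [{"name": "Alice"}, {"name": "Alice"}, {"name": "Bob"}]

# Order described above: eliminate duplicates first, then apply LIMIT 2.
distinct_then_limit = distinct(rows)[:2]
# The reverse order, shown only for contrast, loses a solution.
limit_then_distinct = distinct(rows[:2])

print(distinct_then_limit)  # [{'name': 'Alice'}, {'name': 'Bob'}]
print(limit_then_distinct)  # [{'name': 'Alice'}]
```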

9.4 LOOSE

While the DISTINCT modifier ensures that duplicate solutions are eliminated from the solution set, LOOSE simply permits them to be eliminated. The cardinality of any set of variable bindings in a LOOSE solution set is at least one and not more than the cardinality of the solution set with no DISTINCT or LOOSE modifier. For example, the query

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
SELECT LOOSE ?name WHERE { ?x foaf:name ?name }

may have one, two (shown here) or three solutions:

name
"Alice"
"Alice"

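The cardinality constraint on LOOSE can be expressed as a check over two solution sequences. The sketch below is illustrative only; it assumes solutions are Python dictionaries, and `is_valid_loose` is a hypothetical helper name.

```python
from collections import Counter


def is_valid_loose(unmodified, candidate):
    """Check that each solution occurs at least once in candidate and no more often than in unmodified."""
    def counts(solutions):
        return Counter(frozenset(s.items()) for s in solutions)

    original, reduced = counts(unmodified), counts(candidate)
    if set(original) != set(reduced):
        return False  # a solution was dropped entirely, or a new one appeared
    return all(1 <= reduced[key] <= original[key] for key in original)


alice = {"name": "Alice"}
unmodified = [alice, alice, alice]

print(is_valid_loose(unmodified, [alice]))         # True
print(is_valid_loose(unmodified, [alice, alice]))  # True
print(is_valid_loose(unmodified, []))              # False
print(is_valid_loose(unmodified, [alice] * 4))     # False
```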
9.5 OFFSET

OFFSET causes the solutions generated to start after the specified number of solutions. An OFFSET of zero has no effect.

Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT  ?name
WHERE   { ?x foaf:name ?name }
ORDER BY ?name
LIMIT   5
OFFSET  10
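In terms of a list of already ordered solutions, the query above selects a window of the sequence. The sketch below assumes the ordered solutions are held in a Python list; the helper name `page` and the sample names are made up for the example.

```python
def page(ordered_solutions, offset, limit):
    """Skip `offset` solutions, then return at most `limit` solutions."""
    return ordered_solutions[offset:offset + limit]


# Twenty hypothetical names, already in ORDER BY ?name order.
names = sorted(f"name{i:02d}" for i in range(1, 21))

window = page(names, offset=10, limit=5)
print(window)  # ['name11', 'name12', 'name13', 'name14', 'name15']
```

Without the sort, the same offset and limit could select a different window on each evaluation.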

9.6 LIMIT

The LIMIT clause puts an upper bound on the number of solutions returned. If the number of actual solutions is greater than the limit, then at most the limit number of solutions will be returned.

PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE { ?x foaf:name ?name }
LIMIT 20

A limit of 0 would cause no results to be returned. A limit may not be negative.
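The edge cases of LIMIT can be sketched as follows, assuming the solution sequence is a Python list. The helper name `apply_limit` is hypothetical, and raising `ValueError` is only one way a processor might reject a negative limit.

```python
def apply_limit(solutions, limit):
    """Return at most `limit` solutions; a negative limit is rejected."""
    if limit < 0:
        raise ValueError("LIMIT may not be negative")
    return solutions[:limit]


solutions = ["s1", "s2", "s3"]
print(apply_limit(solutions, 20))  # ['s1', 's2', 's3']
print(apply_limit(solutions, 0))   # []
```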

A.8 Grammar

The EBNF notation used in the grammar is defined in Extensible Markup Language (XML) 1.1 [XML11] section 6 Notation.

Keywords are matched in a case-insensitive manner with the exception of the keyword 'a' which, in line with Turtle and N3, is used in place of the IRI rdf:type (in full, http://www.w3.org/1999/02/22-rdf-syntax-ns#type).

Keywords:

BASE    SELECT     ORDER BY  FROM        GRAPH     STR          isURI
PREFIX  CONSTRUCT  LIMIT     FROM NAMED  OPTIONAL  LANG         isIRI
        DESCRIBE   OFFSET    WHERE       UNION     LANGMATCHES  isBLANK
        ASK        DISTINCT              FILTER    DATATYPE     isLITERAL
                   LOOSE                 a         BOUND        REGEX
                                                   sameTERM     true
                                                                false

Escape sequences are case sensitive.

When choosing a rule to match, the longest match is chosen.

[1]   Query   ::=   Prologue
( SelectQuery | ConstructQuery | DescribeQuery | AskQuery )
[2]   Prologue   ::=   BaseDecl? PrefixDecl*
[3]   BaseDecl   ::=   'BASE' Q_IRI_REF
[4]   PrefixDecl   ::=   'PREFIX' QNAME_NS Q_IRI_REF
[5]   SelectQuery   ::=   'SELECT' ( 'DISTINCT' | 'LOOSE' )? ( Var+ | '*' ) DatasetClause* WhereClause SolutionModifier
[6]   ConstructQuery   ::=   'CONSTRUCT' ConstructTemplate DatasetClause* WhereClause SolutionModifier
[7]   DescribeQuery   ::=   'DESCRIBE' ( VarOrIRIref+ | '*' ) DatasetClause* WhereClause? SolutionModifier
[8]   AskQuery   ::=   'ASK' DatasetClause* WhereClause
[9]   DatasetClause   ::=   'FROM' ( DefaultGraphClause | NamedGraphClause )
[10]   DefaultGraphClause   ::=   SourceSelector
[11]   NamedGraphClause   ::=   'NAMED' SourceSelector
[12]   SourceSelector   ::=   IRIref
[13]   WhereClause   ::=   'WHERE'? GroupGraphPattern
[14]   SolutionModifier   ::=   OrderClause? LimitOffsetClauses?
[15]   LimitOffsetClauses   ::=   ( LimitClause OffsetClause? | OffsetClause LimitClause? )
[16]   OrderClause   ::=   'ORDER' 'BY' OrderCondition+
[17]   OrderCondition   ::=   ( ( 'ASC' | 'DESC' ) BrackettedExpression )
| ( Constraint | Var )
[18]   LimitClause   ::=   'LIMIT' INTEGER
[19]   OffsetClause   ::=   'OFFSET' INTEGER
[20]   GroupGraphPattern   ::=   '{' TriplesBlock? ( ( GraphPatternNotTriples | Filter ) '.'? TriplesBlock? )* '}'
[21]   TriplesBlock   ::=   TriplesSameSubject ( '.' TriplesBlock? )?
[22]   GraphPatternNotTriples   ::=   OptionalGraphPattern | GroupOrUnionGraphPattern | GraphGraphPattern
[23]   OptionalGraphPattern   ::=   'OPTIONAL' GroupGraphPattern
[24]   GraphGraphPattern   ::=   'GRAPH' VarOrIRIref GroupGraphPattern
[25]   GroupOrUnionGraphPattern   ::=   GroupGraphPattern ( 'UNION' GroupGraphPattern )*
[26]   Filter   ::=   'FILTER' Constraint
[27]   Constraint   ::=   BrackettedExpression | BuiltInCall | FunctionCall
[28]   FunctionCall   ::=   IRIref ArgList
[29]   ArgList   ::=   ( NIL | '(' Expression ( ',' Expression )* ')' )
[30]   ConstructTemplate   ::=   '{' ConstructTriples? '}'
[31]   ConstructTriples   ::=   TriplesSameSubject ( '.' ConstructTriples? )?
[32]   TriplesSameSubject   ::=   VarOrTerm PropertyListNotEmpty | TriplesNode PropertyList
[33]   PropertyListNotEmpty   ::=   Verb ObjectList ( ';' ( Verb ObjectList )? )*
[34]   PropertyList   ::=   PropertyListNotEmpty?
[35]   ObjectList   ::=   Object ( ',' Object )*
[36]   Object   ::=   GraphNode
[37]   Verb   ::=   VarOrIRIref | 'a'
[38]   TriplesNode   ::=   Collection | BlankNodePropertyList
[39]   BlankNodePropertyList   ::=   '[' PropertyListNotEmpty ']'
[40]   Collection   ::=   '(' GraphNode+ ')'
[41]   GraphNode   ::=   VarOrTerm | TriplesNode
[42]   VarOrTerm   ::=   Var | GraphTerm
[43]   VarOrIRIref   ::=   Var | IRIref
[44]   Var   ::=   VAR1 | VAR2
[45]   GraphTerm   ::=   IRIref | RDFLiteral | NumericLiteral | BooleanLiteral | BlankNode | NIL
[46]   Expression   ::=   ConditionalOrExpression
[47]   ConditionalOrExpression   ::=   ConditionalAndExpression ( '||' ConditionalAndExpression )*
[48]   ConditionalAndExpression   ::=   ValueLogical ( '&&' ValueLogical )*
[49]   ValueLogical   ::=   RelationalExpression
[50]   RelationalExpression   ::=   NumericExpression ( '=' NumericExpression | '!=' NumericExpression | '<' NumericExpression | '>' NumericExpression | '<=' NumericExpression | '>=' NumericExpression )?
[51]   NumericExpression   ::=   AdditiveExpression
[52]   AdditiveExpression   ::=   MultiplicativeExpression ( '+' MultiplicativeExpression | '-' MultiplicativeExpression | NumericLiteralPositive | NumericLiteralNegative )*
[53]   MultiplicativeExpression   ::=   UnaryExpression ( '*' UnaryExpression | '/' UnaryExpression )*
[54]   UnaryExpression   ::=     '!' PrimaryExpression
| '+' PrimaryExpression
| '-' PrimaryExpression
| PrimaryExpression
[55]   PrimaryExpression   ::=   BrackettedExpression | BuiltInCall | IRIrefOrFunction | RDFLiteral | NumericLiteral | BooleanLiteral | Var
[56]   BrackettedExpression   ::=   '(' Expression ')'
[57]   BuiltInCall   ::=     'STR' '(' Expression ')'
| 'LANG' '(' Expression ')'
| 'LANGMATCHES' '(' Expression ',' Expression ')'
| 'DATATYPE' '(' Expression ')'
| 'BOUND' '(' Var ')'
| 'sameTerm' '(' Expression ',' Expression ')'
| 'isIRI' '(' Expression ')'
| 'isURI' '(' Expression ')'
| 'isBLANK' '(' Expression ')'
| 'isLITERAL' '(' Expression ')'
| RegexExpression
[58]   RegexExpression   ::=   'REGEX' '(' Expression ',' Expression ( ',' Expression )? ')'
[59]   IRIrefOrFunction   ::=   IRIref ArgList?
[60]   RDFLiteral   ::=   String ( LANGTAG | ( '^^' IRIref ) )?
[61]   NumericLiteral   ::=   NumericLiteralUnsigned | NumericLiteralPositive | NumericLiteralNegative
[62]   NumericLiteralUnsigned   ::=   INTEGER | DECIMAL | DOUBLE
[63]   NumericLiteralPositive   ::=   INTEGER_POSITIVE | DECIMAL_POSITIVE | DOUBLE_POSITIVE
[64]   NumericLiteralNegative   ::=   INTEGER_NEGATIVE | DECIMAL_NEGATIVE | DOUBLE_NEGATIVE
[65]   BooleanLiteral   ::=   'true' | 'false'
[66]   String   ::=   STRING_LITERAL1 | STRING_LITERAL2 | STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2
[67]   IRIref   ::=   Q_IRI_REF | QName
[68]   QName   ::=   QNAME_LN | QNAME_NS
[69]   BlankNode   ::=   BLANK_NODE_LABEL | ANON
[70]   Q_IRI_REF   ::=   '<' ([^<>'{}|^`]-[#x00-#x20])* '>'
[71]   QNAME_NS   ::=   NCNAME_PREFIX? ':'
[72]   QNAME_LN   ::=   QNAME_NS NCNAME
[73]   BLANK_NODE_LABEL   ::=   '_:' NCNAME
[74]   VAR1   ::=   '?' VARNAME
[75]   VAR2   ::=   '$' VARNAME
[76]   LANGTAG   ::=   '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
[77]   INTEGER   ::=   [0-9]+
[78]   DECIMAL   ::=   [0-9]+ '.' [0-9]* | '.' [0-9]+
[79]   DOUBLE   ::=   [0-9]+ '.' [0-9]* EXPONENT | '.' ([0-9])+ EXPONENT | ([0-9])+ EXPONENT
[80]   INTEGER_POSITIVE   ::=   '+' INTEGER
[81]   DECIMAL_POSITIVE   ::=   '+' DECIMAL
[82]   DOUBLE_POSITIVE   ::=   '+' DOUBLE
[83]   INTEGER_NEGATIVE   ::=   '-' INTEGER
[84]   DECIMAL_NEGATIVE   ::=   '-' DECIMAL
[85]   DOUBLE_NEGATIVE   ::=   '-' DOUBLE
[86]   EXPONENT   ::=   [eE] [+-]? [0-9]+
[87]   STRING_LITERAL1   ::=   "'" ( ([^#x27#x5C#xA#xD]) | ECHAR )* "'"
[88]   STRING_LITERAL2   ::=   '"' ( ([^#x22#x5C#xA#xD]) | ECHAR )* '"'
[89]   STRING_LITERAL_LONG1   ::=   "'''" ( ( "'" | "''" )? ( [^'\] | ECHAR ) )* "'''"
[90]   STRING_LITERAL_LONG2   ::=   '"""' ( ( '"' | '""' )? ( [^"\] | ECHAR ) )* '"""'
[91]   ECHAR   ::=   '\' [tbnrf\"']
[92]   NIL   ::=   '(' WS* ')'
[93]   WS   ::=   #x20 | #x9 | #xD | #xA
[94]   ANON   ::=   '[' WS* ']'
[95]   NCCHAR1P   ::=   [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[96]   NCCHAR1   ::=   NCCHAR1P | '_'
[97]   VARNAME   ::=   ( NCCHAR1 | [0-9] ) ( NCCHAR1 | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040] )*
[98]   NCCHAR   ::=   NCCHAR1 | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[99]   NCNAME_PREFIX   ::=   NCCHAR1P ((NCCHAR|'.')* NCCHAR)?
[100]   NCNAME   ::=   NCCHAR1 ((NCCHAR|'.')* NCCHAR)?

Notes:

  1. The SPARQL grammar is LL(1) when the rules with uppercased names are used as terminals.
  2. In signed numbers, no white space is allowed between the sign and the number. The AdditiveExpression grammar rule allows for this by covering the two cases of an expression followed by a signed number. These produce an addition or subtraction of the unsigned number as appropriate.
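The numeric terminals and the longest-match rule can be illustrated with regular expressions transcribed from rules [77] to [86]. This is a sketch rather than a tokenizer: the function name `longest_numeric_token` is hypothetical, and it only looks at the start of the given text.

```python
import re

# Transcriptions of EXPONENT, INTEGER, DECIMAL and DOUBLE from the grammar.
EXPONENT = r"[eE][+-]?[0-9]+"
INTEGER = r"[0-9]+"
DECIMAL = r"(?:[0-9]+\.[0-9]*|\.[0-9]+)"
DOUBLE = rf"(?:[0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT})"

UNSIGNED = [("INTEGER", INTEGER), ("DECIMAL", DECIMAL), ("DOUBLE", DOUBLE)]


def longest_numeric_token(text):
    """Return (terminal name, matched text) for the longest numeric match, or None."""
    sign = text[0] if text[:1] in ("+", "-") else ""
    body = text[len(sign):]
    best = None
    for name, pattern in UNSIGNED:
        match = re.match(pattern, body)
        if match and (best is None or match.end() > len(best[1])):
            best = (name, match.group())
    if best is None:
        return None  # also covers white space between the sign and the number
    suffix = {"": "", "+": "_POSITIVE", "-": "_NEGATIVE"}[sign]
    return best[0] + suffix, sign + best[1]


print(longest_numeric_token("42"))     # ('INTEGER', '42')
print(longest_numeric_token("1.5e3"))  # ('DOUBLE', '1.5e3')
print(longest_numeric_token("-1"))     # ('INTEGER_NEGATIVE', '-1')
print(longest_numeric_token("- 1"))    # None
```

For "1.5e3", INTEGER matches "1" and DECIMAL matches "1.5", but DOUBLE matches the whole string and is therefore chosen.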

Grammar files for some commonly used tools are available in parsers/.