W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > October to December 2006

Review of "The SPARQL algebra"

From: Fred Zemke <fred.zemke@oracle.com>
Date: Wed, 27 Dec 2006 16:08:23 -0800
Message-ID: <45930AF7.5080107@oracle.com>
To: public-rdf-dawg@w3.org

review of "The SPARQL algebra"

Overall, I am pleased to see this.

substance
---------

X.1 Mapping a SPARQL query to a SPARQL abstract query
I believe this treatment omits what I would call step 0,
removing abbreviations.  In this step, the various abbreviated
syntaxes permitted by SPARQL would be removed.  
 From RQ24 section 3 "SPARQL term syntax" this includes:

1. IRI abbreviations using prefixes and IRIs that are
relative to the base IRI (section 3.1.1 "Syntax for IRIs")

2. abbreviated literals (section 3.1.2 "Syntax for literals")

3. perhaps normalize variables to begin with a question mark
rather than a dollar sign (section 3.1.3 "Syntax for query
variables) though the case can be made that the punctuation
is not part of the variable; the variable is the string
corresponding to rule [90] VARNAME.

4. blank nodes of the form [] and [ :p :v ]
(Section 3.1.4 "Syntax for blank nodes")

5. comma and semicolon delimited lists
(Section 3.2.1 "Predicate-object lists" and 3.2.2 "Object lists")

6. RDF collections (3.2.3 "RDF collections")

7. Keyword "a" (section 3.2.4 "rdf:type")

As a result of removing these abbreviations, the query will
conform to a simpler grammar which does not contain rule [22]
BasicGraphPattern nor rule [32]
TriplesSameSubject and many others that are referenced by
expanding it.  Instead we can substitute the rules

[22.1] ListOfTriplePatterns ::= TriplePattern ('.' TriplePattern )?

[32.1] TriplePattern ::= Subject PureVerb Object
[32.2] Subject ::= VarOrPureIRI
[32.3] PureVerb ::= Q_IRI_REF
[32.4] Object ::= VarOrPureIRI
[32.5] VarOrPureIRI ::= Var | Q_IRI_REF

This transformation is necessary before we can process
Step 1, which talks about "list of triple patterns" but does
not indicate how to get that list of triple patterns.

We might also elect to treat "*" as an abbreviation and remove
it at the outset.  This is probably just a question of taste,
whether to handle it here or later when the Project operator
is introduced.


Step 1: With the foregoing in place, we can replace
"list of triple patterns" by "ListOfTriplePatterns".

Below step 4, below the three bullets, the next sentence
has the words "graph pattern".  I believe you mean "GraphPattern"
as introduced prior to Step 1.


Step 4, indented para in smaller font says "This is the
left-associative operator assignment... can take a filter from
it's left subelement".  I believe you mean right subelement.
Also, spelling correction "it's" should be "its".
Possibly you want to capitalize "filter" (I'm not sure if you
mean "Filter" here as the proper noun introduced by the third
bullet above.)

Line immediately prior to the pseudocode for OPTIONAL:
"Write 'A' for an algebra expression: true for filter(true)".
Moving from left to right in this sentence, I believe:
-- you want a semicolon instead of a colon
-- you want "Filter" instead of "filter"
-- if so, Filter requires two arguments.  The second argument
can be the empty pattern {}
-- possibly you want to capitalize the first "true" to
distinguish it graphically from the second, which I take to be
a Boolean literal.

Pseudocode algorithm for interpreting OPTIONAL:
I think you should adopt and state a convention that distinguishes
assignment from equality testing.  Common conventions for
assignment are "Set x = y" or "x := y".
Common conventions for equality testing are "=" or "==".
The pseudocode currently uses "=" for both meanings.


X.1.2 Examples
Examples do not show removal of prefixes, which I think is
acceptable for space reasons, though I think the algorithm
should expand abbreviations.

Third example, second line, shows a couple
Join({}, BGP(...)).  When I work this example I do not get
such Joins.  I don't see that either Step 1 or Step 2 creates
any {} patterns.  I have not tried the other examples but I
wonder if all the {} that are shown are actually generated by
the algorithm.

I would like to see a worked example of my old bugbear
{...} UNION {...} OPTIONAL {...}

Last example shows FILTER in the right operand of OPTIONAL.
I'd like to see an example with FILTER in the left operand.
Perhaps a single example with FILTER in both would suffice.


X.2.1 Algebra operators
first @@ note says "Cardinality function is not defined for
mu not in Omega".  I am not sure that there would ever be an
attempt to evaluate Card[Omega](mu) if mu is not in Omega,
in which case it is okay for this to be undefined.
However, if desired, you could define Card[Omega](mu) = 0
if mu is not in Omega.

Definition of Diff
the second operand of the set union contains "mu + mu'"
but addition is not defined on mappings. You mean
"mu and mu' are compatible"


Definition of Union
Second sentence of this definition includes "Omega1 + Omega2".
It is not clear if this is intended to define addition on
multisets of mappings, or if it is assumed that addition is
already defined and can be taken as an alternative definition
of Union.  Either way, I think we can do without this.

Definition of ToList
last line, after semicolon, defining card, it should be
card[ToList(Omega)](mu) = card[Omega](mu).

Definintion of Project
the definintion of card is incorrect.  I believe it should be
card[Project(Psi,vars)](mu) =
   sum { card[Psi](mu') | mu' in Psi and mu = mu' restricted to vars }


X.3 Evaluation semantics

Definition of evaluation of a graph pattern.
For eval(D(G), Graph(var,P)) I don't think it is correct to
use set-union; instead we need a multiset-union, the distinction
being that multiset-union preserves cardinalities.  The
notion of multiset-union might be used profitably in section
X.2.1 "Algebra operators" in the definition of LeftJoin
as well.  



typos not noted sprinkled among the substance
---------------------------------------------

X.1.1 Mapping graph patterns
first sentence after second set of bullets:
"The results is" -> "The result is" or "The results are".
I prefer the former.

sentence continues "is a SPARQL abstract query uses..."
Insert "that" before "uses"

After step 4, third para, "a group pattern is mapping..."
change "mapping" to "mapped"

pseudocode for handling OPTIONAL: Filter is not always
capitalized.  I spot the third line "If SP[i] is..."


X.2.1 Algebra operators
definition of substitution function needs a colon between
mu and V.

Third sentence after that definition,
"Write mu(?x->t) for the substitution function mapping x to
RDF term : { (x, t) }"  needs "t" after "RDF term".

first @@ note

Definition of Union: the sentence before this says
"As this are distinct...".  What you mean is
"As these are disjoint".

Definition of OrderBy: second sentence "...and the sequence is
satisfies" - delete "is"

X.3 Evaluation semantics

Definition of evaluation of a union pattern
second occurrence of "union" should be capitalized


Fred
Received on Thursday, 28 December 2006 00:09:57 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:27 GMT