W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > October to December 2006

review of rq24-algebra

From: Lee Feigenbaum <feigenbl@us.ibm.com>
Date: Thu, 14 Dec 2006 14:59:21 -0500
To: public-rdf-dawg@w3.org
Message-ID: <OF9AE0334F.3E8E9CE8-ON85257244.0069394E-85257244.006DCD3D@us.ibm.com>

As promised. I've mainly focused on wording and presentation techniques in 
an effort to make this as clear as possible. The details of the algebra 
per se are being worked out as separate matters. 

Summary:I think it needs a bit more rigor in a few places I've tried to 
explain below. I think it's excellent overall and (IBM hat on) will make 
the specification far more accessible to implementors and those wishing to 
study the query language. When we combine it with the example-driven 
presentation of the rest of rq24, I think we'll be in very good shape 
indeed. 

Lee

--

X SPARQL The SPARQL Algebra

"""
The algebraic expression is derived from a query string by parsing and 
transforming the abstract syntax tree. 
"""

This sentence confuses me a bit as it can be read as "parsing ... the 
abstract syntax tree". Perhaps better as:

"""
Parsing a query string yields an abstract syntax tree which can be 
transformed into an algebraic expression.
"""


X.1.1 Mapping Graph Patterns


Having implemented the mapping based on rq23, this section is very 
appreciated by me as being helpful to implementors looking for guidance in 
associating the syntax of a query with its logical meaning. I think that 
the content in its look fine, but the presentation / elucidation needs 
some work to make the mapping from concrete to abstract syntax more 
formal. In particular, I think that we should completely avoid conflating 
concrete syntax tokens (e.g. GroupGraphPattern) with abstract syntax nodes 
(e.g. Join(...)). So I think that we should list the relevant concrete 
syntax rules (that's there already), list the abstract syntax nodes 
(that's sort of there in the bullet list, but it should explicitly list 
the names of the nodes -- "Join", "LeftJoin", etc.). Then when Step 1 says 
"Replace all BasicGraphPattern  elements by BGP(list of triple patterns)", 
"BGP" will be the name of an abstract syntax node already introduced. 
Similarly for the other steps.

[I read to the end; I think perhaps you're distinguishing between 
"abstract syntax tree" and "abstract query"? If that's the case, I don't 
think I agree with the distinction. I see a concrete syntax tree full of 
OPTIONAL and GroupGraphPattern and the like, and then an abstract syntax 
tree (== abstract query) full of Join() and LeftJoin(). The algebra 
operates over this latter tree.]

Step 4 - I still have a TODO to write up some examples to see if I can 
concerns about this step, but I'd really like to hear someone else's take 
on the algorithm in Step 4.

Step 5 Simplification - While this doesn't hurt, we don't really need to 
specify this sort of trivial transformation -- the evaluation of the 
algebra should yield the same result regardless of how simplified the 
abstract syntax tree is.

X.1.2

At least one example (preferably one involving left associativity) would 
be useful which included the full transform of query string > concrete 
syntax tree > abstract syntax tree

X.1.3

There's a typo - "Slide" should be "Slice"

[There's other typos in the document too, but I'm not enumerating them 
until we've agreed on overall structure / semantics and integrated the 
resulting text into rq24 proper.]

X.1.3 could use the same, formal treatment which maps from concrete syntax 
to abstract expressions that X.1.1 receives.


X.2.1

Filter:
  We don't define what mu(expression) means anywhere before this. It's 
somewhat obvious, but should at least be defined. Also: "boolean effective 
value" --> "effective boolean value". 

Join:
  You combine solutions here using just "x union y" rather than "x 
set-union y" or "merge(x, y)" . Is this purposeful?

X.3

I like the concept of "active graph." If I recall correctly, the current 
spec. is worded such that GRAPH <g> { ... } - basically sets a new default 
graph for the stuff inside { ... }, and then BGP matching is defined to be 
against the default graph. If we adopt this active graph terminology, we 
should use it throughout, and make sure the BGP matching section refers to 
the active graph.
Received on Thursday, 14 December 2006 19:59:38 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:27 GMT