[Fwd: FredZ draft comments] General comments

This mail contains general comments on Fred Zemke's draft
formalization of SPARQL semantics (June 16).

- jorge

---------------------------- Original Message ----------------------------
Subject: draft comments
From:    Jorge  Pérez <jperez@ing.puc.cl>
Date:    Fri, June 16, 2006, 8:19 pm
To:      "Fred Zemke" <fred.zemke@oracle.com>
--------------------------------------------------------------------------

[....]

I think that your treatment of blank nodes in queries greatly simplifies the
treatment given in the current SPARQL spec. This is a big help to those
(including me) who still do not understand well the meaning of a
blank node in query bodies. Do you have an example in which blank nodes in a
query cannot be replaced by a simple use of variables?

I see some issues in the paper but in general they have to do more with
the intended meaning of the language than with your definitions.

The following are some specific comments. They are arranged in two groups:
simple comments (about typos and other minor things), and possible problems
or contentious issues (from my point of view) in the definitions. Some of
the latter comments are personal points of view, or conflicts with what
I understand of the current SPARQL spec.

    simple comments:

3.6.1 when defining S(TP) it says "... S(TP) is an RDF graph ..."
it should say "... S(TP) is an RDF triple ..." (or not?)

I think S(TP) and S(BGP) are treated inconsistently. In 3.6
the definition says that S(TP) replaces each blank node identifier in TP by
the blank node that it denotes. In 3.8 the definition of BGP assumes that
every blank node identifier has already been replaced by the corresponding
blank node, so one cannot apply the definition of S(TP) to define S(BGP).
I think this is a very minor problem.

3.8.1 in the example there is a ':obj1' that disappears in
the translation, and a ':obj2' appears.

3.8.1 in the basic graph pattern (the set) some '(' are
missing in the triples.

3.9.3 it says "G E-entails (G union S(BGP'))"; I think that
BGP' must be BGP (without the ')

I was confused in some parts of the paper by the use of
English wording to denote things and the corresponding strings
in the BNF grammar, for example TriplePattern vs triple pattern, etc.

    contentious issues:

1) unbound vs domain:
In 3.3.1, when defining a mapping, I think that it would be
good to establish the relationship between a variable not
being in the domain of a mapping and a variable being "unbound"
in a mapping. What is the relationship? I assumed when reading
your paper that a variable is not in the domain iff it is
unbound (is that right?). In 3.18.1, when defining OPTIONAL
matching, it says "and S is undefined on all variables that ...".
Do you mean "and S is unbound in all variables that ..."? Or maybe
(if it is not equivalent) "and all variables that ... do not belong
to the domain of S"?
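
The reading assumed in comment 1) can be made concrete with a small sketch.
It models a mapping as a Python dict (a partial function from variables to
terms) and treats "unbound", "undefined" and "not in the domain" as the same
thing. That equivalence is exactly the assumption being asked about, not
something the draft confirms, and the helper names are hypothetical.

```python
# A mapping is modelled as a dict from variable names to terms.
# Assumption: a variable is "unbound" iff it is not in the domain.

def domain(mapping):
    """Return the set of variables on which the mapping is defined."""
    return set(mapping)

def is_unbound(mapping, variable):
    """True when the variable does not belong to the domain of the mapping."""
    return variable not in domain(mapping)

S = {"?x": "s1"}  # hypothetical mapping, defined only on ?x
```

Under this reading, "S is undefined on ?y" and "?y does not belong to the
domain of S" are both expressed by is_unbound(S, "?y").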

2) variables in solutions:
In 3.9, when defining pattern solution S for
BGP, the definition reads "The domain of S includes all variables
occurring in BGP". What about the variables not occurring in BGP?
Must they not belong to the domain of S (must they be unbound or
undefined)? Or could they have arbitrary values? It is extremely
important to be specific in this case in order to define solutions
for more complex patterns. If they could have arbitrary values, then a query
like SELECT ?x ?y WHERE { ?x a b } would have a lot of solutions:
at least one for every term in the dataset bound to ?y, every time
(?x,a,b) matches the dataset. On the other hand, I think that if
one forces the solutions to bind only the variables that appear in
BGP, then your definition for the solution for GraphPattern
doesn't work properly....
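
A minimal sketch of the counting argument in comment 2), under stated
assumptions: the dataset is a hypothetical toy set of triples, mappings are
Python dicts, and "arbitrary values" is read as "any term occurring in the
dataset". None of the names below come from the draft.

```python
from itertools import product

# Hypothetical toy dataset: a set of (subject, predicate, object) triples.
graph = {("s1", "a", "b"), ("s2", "a", "b"), ("s3", "c", "d")}

# Every term that occurs anywhere in the dataset.
terms = sorted({term for triple in graph for term in triple})

def solutions_minimal_domain(graph):
    """Mappings whose domain is exactly {?x}: one per match of (?x, a, b)."""
    return [{"?x": s} for (s, p, o) in sorted(graph) if p == "a" and o == "b"]

def solutions_arbitrary_extension(graph, terms):
    """Mappings that additionally bind ?y to any term of the dataset."""
    return [
        {**mapping, "?y": term}
        for mapping, term in product(solutions_minimal_domain(graph), terms)
    ]

minimal = solutions_minimal_domain(graph)
extended = solutions_arbitrary_extension(graph, terms)
```

Here len(minimal) is the number of matches of (?x,a,b), while len(extended)
is that number multiplied by the number of terms in the dataset.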

3) General Framework vs simple entailment:
The mail about simple entailment that I sent to you shows an issue
that appears when constructing solutions for simple queries following
the "General Framework"

4) Variables in Value Constraints:
In 3.14.3, when defining solutions for FilteredBasicGraphPattern,
what happens if some of the variables in one of the Constraint_j
do not appear in any of the BlockOfTriples_k? Is it defined in
section 11? (I must confess that I have not read section 11
in detail.) What if a variable is unbound? (I think in your
definition this case could not happen because constraints are
applied only to BlockOfTriples.)

5) Solutions for empty pattern:
I think that defining the solutions for {} to be all mappings
is dangerous in a lot of ways. A simple example: for the query
{} UNION { TP }, where TP is a triple pattern, there are a
lot of solutions for the whole pattern, and every solution that
matches TP will have cardinality 2 !! (by the definition of
cardinality of UNION and because every solution for {TP} is also
a solution for {}). Is this the definition of {} in the SPARQL
spec? (Personally, as a user, I expect of every
query language that 'empty' UNION A gives the same result as
simply querying A, whatever 'empty' means in the language.)
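
The cardinality effect described in comment 5) can be sketched as follows.
This is an illustration under assumptions, not the draft's definition:
solution multisets are modelled with collections.Counter, the cardinality of
UNION is taken to be the sum of the cardinalities on both sides, and "all
mappings" is restricted to mappings over the single variable ?x and the
terms of a hypothetical toy dataset so that it can be enumerated.

```python
from collections import Counter
from itertools import product

# Hypothetical toy dataset.
graph = {("s1", "a", "b"), ("s2", "a", "b")}
terms = sorted({term for triple in graph for term in triple})

def freeze(mapping):
    """Make a mapping hashable so it can be counted in a multiset."""
    return frozenset(mapping.items())

def solutions_tp(graph):
    """Solutions of TP = (?x, a, b), each with cardinality 1."""
    return Counter(freeze({"?x": s}) for (s, p, o) in graph if p == "a" and o == "b")

def all_mappings(variables, terms):
    """Every total or partial mapping of the given variables to the terms."""
    result = Counter()
    for values in product([None] + terms, repeat=len(variables)):
        mapping = {v: t for v, t in zip(variables, values) if t is not None}
        result[freeze(mapping)] += 1
    return result

def union(left, right):
    """Assumed UNION semantics: cardinalities of both sides are added."""
    return left + right

tp = solutions_tp(graph)
# Reading questioned in the comment: every mapping is a solution for {}.
broad = union(all_mappings(["?x"], terms), tp)
# For contrast only: a hypothetical reading where {} has just the empty mapping.
narrow = union(Counter([freeze({})]), tp)
```

With the broad reading, each solution of TP appears in the union with
cardinality 2, together with many mappings that have nothing to do with TP.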

6) GraphPattern vs OptionalPattern:
I think (I may be wrong) there are
conflicting definitions for OPTIONAL patterns. If one follows
the definition in 3.15 for GraphPattern, then S is a solution for
FBGP OPTIONAL GGP
if S is a solution for FBGP, and S is a solution for 'OPTIONAL GGP',
because from [20] GP ::= FBGP ( GPNT '.'? GP )? one obtains
'OPTIONAL GGP' from GPNT. But there is no definition in the document
for the solutions of 'OPTIONAL GGP' (!). On the other hand, in section 3.18
you define the solutions for FBGP OPTIONAL GGP in a different way.

7) Optional Pattern solutions and undefined variables:
When you define solutions for FBGP OPTIONAL GGP, suppose S is
a solution that satisfies conditions 1) and 2a). This prevents
S from being a solution for any FBGP2 that shares a variable with GGP but
not with FBGP. A concrete example, for a pattern like
{ ?x a b . OPTIONAL { ?x c ?y }} . ?z d ?y
Suppose that S is a solution for the OPTIONAL pattern that satisfies 2a).
Then it cannot be a solution for ?z d ?y because, by definition,
S must be undefined on ?y, and so any solution for the OPTIONAL
pattern that satisfies 2a) is not a solution for the whole pattern...
Is that correct? If I am not making a mistake, are the solutions
for this pattern compatible with the current SPARQL spec?
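
The example in comment 7) can be traced with a small sketch. It encodes one
possible reading of the conditions, which is an assumption and not a quote
from the draft: 1) S is a solution for FBGP; 2b) S is also a solution for
GGP; 2a) no extension of S matches GGP and S is undefined on the variables
that occur only in GGP. The whole pattern is read as "the same S is a
solution for every part". The dataset and all function names are
hypothetical.

```python
from itertools import product

# Hypothetical toy dataset.
graph = {
    ("s1", "a", "b"), ("s2", "a", "b"),
    ("s1", "c", "o1"),
    ("z1", "d", "o1"), ("z2", "d", "o2"),
}
terms = sorted({term for triple in graph for term in triple})

def variables(pattern):
    """Variables of a triple pattern (terms starting with '?')."""
    return {t for t in pattern if t.startswith("?")}

def matches(mapping, pattern, graph):
    """The domain covers all variables of the pattern and the instance is in graph."""
    if not all(v in mapping for v in variables(pattern)):
        return False
    return tuple(mapping.get(t, t) for t in pattern) in graph

def optional_solution(mapping, required, optional, graph, terms):
    """Assumed reading of conditions 1), 2a) and 2b) described above."""
    if not matches(mapping, required, graph):           # condition 1)
        return False
    if matches(mapping, optional, graph):               # condition 2b)
        return True
    extra = sorted(variables(optional) - variables(required))
    if any(v in mapping for v in extra):                # 2a): must be undefined
        return False
    base = {v: mapping[v] for v in variables(required)}
    return not any(                                     # 2a): no extension matches
        matches({**base, **dict(zip(extra, values))}, optional, graph)
        for values in product(terms, repeat=len(extra))
    )

def whole_pattern_solution(mapping):
    """{ ?x a b . OPTIONAL { ?x c ?y } } . ?z d ?y with a single shared S."""
    return (optional_solution(mapping, ("?x", "a", "b"), ("?x", "c", "?y"), graph, terms)
            and matches(mapping, ("?z", "d", "?y"), graph))

with_optional = {"?x": "s1", "?y": "o1", "?z": "z1"}     # satisfies 2b)
without_optional = {"?x": "s2", "?z": "z2"}              # satisfies 2a)
forced_binding = {"?x": "s2", "?y": "o2", "?z": "z2"}    # binds ?y anyway
```

In this sketch the mapping that satisfies 2a) is accepted by the OPTIONAL
part but rejected by ?z d ?y, and binding ?y to repair that makes it fail the
OPTIONAL part instead, so ?x = s2 never appears in a solution of the whole
pattern.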

8) Semantics vs Syntax:
(a very personal point of view)
I think it is a little dangerous to stay so close to the grammar when
defining semantics. Many different grammars may generate
the same language. Indeed, suppose that the WG decides to
change the grammar completely... it would be a disaster for the
semantics!!! I agree with you that every syntactic construction
must have a clear meaning (semantics), but I prefer to keep the
definitions at an abstract level. In some parts of your paper you
leave the grammar and move up to an abstract level, for example when
defining OPTIONAL and UNION, and the resulting definitions are very
clear. Consider also comment 6) above...

If you could clarify any of my questions in the comments above I could
give you more feedback.

bye.
- jorge

Received on Thursday, 22 June 2006 14:49:39 UTC