[CLOSED] Re: [OK?] Re: Last Call for comments on "SPARQL Query Language for RDF" from Lee Feigenbaum on 2007-05-10 (public-rdf-dawg-comments@w3.org from May 2007)

From: Lee Feigenbaum <feigenbl@us.ibm.com>
Date: Thu, 10 May 2007 14:07:16 -0600
To: axel@polleres.net
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <OF2A34DF71.2C05A13B-ON872572D7.006E706B-872572D7.006E879D@us.ibm.com>
Axel Polleres <axel.polleres@deri.org> wrote on 05/10/2007 01:59:58 PM:

> All,
> 
> I got the chance to discuss the comments directly with Eric here at WWW.
> I see that the WG wants to close this and will try to not cause more 
> trouble ... :-)

:-)
 
> I still hope the comments were helpful, and let me emphasize again what I 
> already said to Eric: If there's a follow-up after the first Rec on 
> issues like aggregates, etc., I'd be very glad to join the WG
> (at the moment I guess I'm busy enough with RIF)!

thanks, Axel. I think that, in the meantime, public-sparql-dev@w3.org is a 
good place for users and implementors to discuss possible "next gen" 
SPARQL features.

Lee

> 
> best,
> Axel
> 
> Lee Feigenbaum wrote:
> > Axel Polleres <axel.polleres@deri.org> wrote on 05/03/2007 07:55:51 PM:
> > 
> > 
> >>Where asked back, replies inline.
> > 
> > 
> > Hello Axel,
> > 
> > Thanks for a timely reply. We've responded to the points that remain
> > unaddressed below inline (and have cut the remaining text for brevity).
> > Please let us know if this response satisfies you. If it does, you can
> > help our comment tracking by replying to this message and adding [CLOSED]
> > in the subject line. Again, in the interests of our schedule, we'd like to
> > ask that you get back to us as soon as possible, and if we do not hear
> > from you in 10 days we will consider these comments closed.
> > 
> > Lee
> > 
> > 
> >>Lee Feigenbaum wrote:
> >>
> >>>Axel Polleres wrote on 04/17/2007 12:53:55 PM:
> >>>
> >>>
> >>>>Dear all,
> >>>>
> >>>>below my review on the current SPARQL draft from
> >>>>
> >>>>http://www.w3.org/TR/rdf-sparql-query/
> >>>>
> >>>>on behalf of W3C member organization DERI Galway.
> > 
> > 
> > ...
> > 
> > 
> >>>>Section 5
> >>>>
> >>>>Section 5 is called Graph patterns and has only subsections
> >>>>5.1 and 5.2 for basic and group patterns, whereas the other types are
> >>>>given separate top-level sections. This structuring seems a bit
> >>>>illogical.
> >>>
> >>>
> >>>In the interest of keeping the document numbering as is, we've decided to
> >>>keep this section as is. If you have a better suggestion for the name of
> >>>the section, we'd be glad to hear it. ("Basic Graph Patterns and Group
> >>>Graph Patterns" does not seem particularly helpful to a reader.)
> >>
> >>
> >>I'd suggest to have two separate top level sections.
> >>"In the interest of keeping the document numbering as is"
> >>is not an argument which seems very logical to me, to be honest.
> > 
> > 
> > We've added a line in the introduction explaining the relationship of the
> > two topics covered in Section 5:
> > 
> > """
> > In this section we describe the two forms that combine patterns by 
> > conjunction: basic graph patterns, which combine triples patterns, and
> > group graph patterns, which combine all other graph patterns.
> > """
> > 
> > There is a close-but-different relationship between the two types of graph
> > patterns and we feel that keeping one section helps keep this clear. The
> > editors are not motivated to split Section 5 into two top-level sections.
> > (Do note, however, that editorial changes are permitted during CR, and
> > anyone submitting proposed changes will of course be given due 
> > consideration.)
> > 
> > ...
> > 
> >>>>Another one about FILTERs: What about this one, ie. a FILTER which
> >>>>refers to the outside scope:
> >>>>
> >>>>?x p o OPTIONAL { FILTER (?x != s) }
> >>>>
> >>>>concrete example:
> >>>>
> >>>>SELECT ?n ?m
> >>>>{ ?x a foaf:Person .  ?x foaf:name ?n .
> >>>>  OPTIONAL { ?x foaf:mbox ?m FILTER (?n != "John Doe") }  }
> > 
> > 
> > Call this query [X].
> > 
> > 
> >>>>Suppresses the email address for John Doe in the output!
> >>>>Note: This one is interesting, since the OPTIONAL part may NOT be
> >>>>evaluated separately!, but carries over a binding from the
> >>>>super-pattern! 
> >>>
> >>>
> >>>A filter in the optional part of an OPTIONAL construct applies to the
> >>>solutions from the required part as (possibly) extended by the optional
> >>>part. In the algebra, the example above becomes:
> >>>
> >>>LeftJoin(
> >>>  BGP(?x a foaf:Person .  ?x foaf:name ?n),
> >>>  BGP(?x foaf:mbox ?m),
> >>>  (?n != "John Doe")
> >>>)
> >>
> >>Hmmm, does this mean that the query would simply be the same as writing
> >>
> >>SELECT ?n ?m
> >>{ ?x a foaf:Person .  ?x foaf:name ?n . FILTER (?n != "John Doe")
> >>   OPTIONAL { ?x foaf:mbox ?m }  }
> > 
> > 
> > Call this query [Y].
> > 
> > 
> >>in this case?
> > 
> > 
> > No. This latter query, [Y], is:
> > 
> > Filter(?n != "John Doe",
> >   LeftJoin(
> >     BGP(?x a foaf:Person .  ?x foaf:name ?n),
> >     BGP(?x foaf:mbox ?m),
> >     true
> >   )
> > )
> > 
> > 
> >>(How) would it be possible then to encode my intended meaning of the 
> >>query, i.e. that I want to give all names, but suppress the email address
> >>of John Doe?
> > 
> > 
> > The original query, [X], has these semantics. The second query, [Y], does
> > not.
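
The difference between [X] and [Y] can be checked on a toy dataset. The
following is only a rough sketch of the two algebra expressions above, not
the specification's evaluation procedure: the sample bindings, the helper
names (compatible, left_join, filter_) and the dict representation of
solution mappings are all assumptions made for illustration, and filter
errors on unbound variables are ignored.

```python
# Solution mappings are plain dicts from variable name to value (assumption).
required = [  # stands in for BGP(?x a foaf:Person . ?x foaf:name ?n)
    {"x": "p1", "n": "John Doe"},
    {"x": "p2", "n": "Jane Roe"},
]
optional = [  # stands in for BGP(?x foaf:mbox ?m)
    {"x": "p1", "m": "mailto:john@example.org"},
    {"x": "p2", "m": "mailto:jane@example.org"},
]


def compatible(mu1, mu2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def left_join(left, right, expr):
    """Keep each left mapping, extended by the right side where the merged
    mapping is compatible and satisfies expr; otherwise keep it unextended."""
    out = []
    for mu1 in left:
        extended = [
            {**mu1, **mu2}
            for mu2 in right
            if compatible(mu1, mu2) and expr({**mu1, **mu2})
        ]
        out.extend(extended if extended else [mu1])
    return out


def filter_(expr, solutions):
    """Drop every solution that does not satisfy expr."""
    return [mu for mu in solutions if expr(mu)]


def not_john(mu):
    return mu["n"] != "John Doe"


# [X]: the filter is the third argument of LeftJoin.
query_x = left_join(required, optional, not_john)
# [Y]: the filter wraps a LeftJoin whose condition is always true.
query_y = filter_(not_john, left_join(required, optional, lambda mu: True))

print(query_x)
print(query_y)
```

Under these assumptions [X] keeps the row for John Doe but leaves ?m
unbound, while [Y] drops the John Doe row entirely.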
> > 
> > ... 
> > 
> >>>>Would it make sense to add some non-well-defined OPTIONAL patterns,
> >>>>following [Perez et al. 2006] in the document? As mentioned before, I
> >>>>didn't yet check section 12, maybe these corner case examples are
> >>>>there.. 
> >>
> >> > We're not motivated to add these
> >> > examples to the document.
> >>
> >>Why? I would object here, but not being part of the WG, I have to leave
> >>this decision to you of course.
> > 
> > 
> > The editors do not believe that such an example would add to the quality
> > of the specification.
> > 
> > ...
> > 
> >>>>Section 9:
> >>>>
> >>>>What is "reduced" good for? I personally would tend to make reduced
> >>>>the default, and instead put a modifier "STRICT" or "WITHDUPLICATES"
> >>>>which enforces that ALL non-unique solutions are displayed.
> >>>
> >>>REDUCED can be used to permit certain optimizations by the SPARQL query
> >>>engine. The WG discussed various design options in this space including
> >>>the design you are suggesting, and decided to add the REDUCED keyword and
> >>>mark the feature at-risk. More information:
> >>>
> >>>http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/att-0194/20-dawg-minutes.html#item02
> >>>http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/0162.html
> >>>http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/0128.html
> >>>
> >>>...and surrounding.
> >>
> >>By making REDUCED the exception and all tuples with duplicates the 
> >>default, you somewhat implicitly single out implementations which use 
> >>per-set rather than per-tuple strategy with that, in my opinion. I find
> >>this limiting.
> > 
> > 
> > This was an intentional choice by the WG, considering all of the 
> > information you have mentioned. I do not see any new information here at
> > this time to ask the WG to reconsider this decision.
> > 
> > ...
> > 
> >>>>Section 9.1
> >>>>
> >>>>The ORDER BY construct allows arbitrary constraints/expressions as
> >>>>parameter, i.e. you could give an arbitrary constraint condition here,
> >>>>right? What is the order of that? TRUE > FALSE? Would be good to add
> >>>>a remark on that. 
> >>>
> >>>
> >>>This is for generality because semantic web data is not as structured (and
> >>>typed) as a database.  It allows the query to proceed without an error
> >>>condition so it generates some defined outcome.
> >>>
> >>>SPARQL doesn't provide total ordering, but the example you asked about is
> >>>specified.
> >>>
> >>>[[
> >>>The "<" operator (see the Operator Mapping) defines the relative order of
> >>>pairs of numerics, simple literals, xsd:strings, xsd:booleans and 
> >>>xsd:dateTimes.
> >>>]]
> >>>
> >>>"<" operator in the Operator Mapping table has an entry for
> >>>  A < B  xsd:boolean  xsd:boolean  op:boolean-less-than(A, B)
> >>>op:boolean-less-than is defined in XPath Functions and Operators
> >>>  http://www.w3.org/TR/xpath-functions/#func-boolean-less-than
> >>>
> >>>[[
> >>>Summary: Returns true if $arg1 is false and $arg2 is true. Otherwise,
> >>>returns false.
> >>>]]
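
Read as a comparison function, the quoted summary puts false before true in
ascending order. A minimal sketch of that reading follows; the Python names
boolean_less_than, compare and order_by_ascending are illustrative
assumptions, not part of either specification, and the sketch covers
booleans only.

```python
from functools import cmp_to_key


def boolean_less_than(arg1, arg2):
    """True only when arg1 is false and arg2 is true, per the quoted summary."""
    return (not arg1) and arg2


def compare(a, b):
    """Three-way comparison derived from boolean_less_than."""
    if boolean_less_than(a, b):
        return -1
    if boolean_less_than(b, a):
        return 1
    return 0


def order_by_ascending(values):
    """Sort booleans the way an ascending ORDER BY would under this reading."""
    return sorted(values, key=cmp_to_key(compare))


print(order_by_ascending([True, False, True]))
```

So for the question "TRUE > FALSE?", this reading gives yes: false sorts
first, true sorts last.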
> >>>
> >>>I think the LC document specifies all the orderings intended by the DAWG,
> >>>but am certainly open to counter-example.
> >>
> >>What I meant to say was that a short clarifying remark would keep 
> >>readers from having to look through the separate spec.
> > 
> > 
> > I believe that the intuitive interpretation of < is enough to give people
> > a good understanding. As SPARQL specifically uses XPath functions and 
> > operators (for user familiarity, library re-use, and to leverage reviewed
> > specifications), we can't replace them with our own definitions. 
> > Bulk-including those sections of the XPath spec would be costly and 
> > confusing.
> > 
> > As frustrating as it may appear, I think this is optimized.
> > 
> > ...
> > 
> >>>ASK doesn't permit a SolutionModifier. Adding ASK there could imply that
> >>>it was allowed and even had some effect (other than syntax error).
> >>
> >>wouldn't that be worth a footnote then, maybe?
> > 
> > 
> > I reluctantly added a sentence to the end:
> > [[
> > Using ORDER BY on a solution sequence for a CONSTRUCT or DESCRIBE query
> > has no direct effect because only SELECT returns a sequence of results.
> > Used in combination with LIMIT and OFFSET, ORDER BY can be used to return
> > results generated from a different slice of the solution sequence. An ASK
> > query does not include ORDER BY, LIMIT or OFFSET.
> > ]]
> > ...
> > 
> > 
> >>>>Sec 9.2:
> >>>>
> >>>>Add somewhere in the prose: "using the SELECT result form"...
> >>>>
> >>>>It is actually a bit weird that you mix SELECT into the solution
> >>>>modifiers. IMO, it would be better to mention SELECT first in section 9
> >>>>and then introduce the solution modifiers.
> >>>
> >>>
> >>>
> >>>SELECT is both an indicator of the query result form and also contains the
> >>>projection.
> >>
> >>yes, that's my point.
> > 
> > 
> > We see no reason to make a change. This has been part of the SPARQL 
> > specification for a long time, and the experience of the community seems
> > to indicate that it is comprehensible.
> > 
> > ... 
> > 
> >>>>Sec 9.5/9.6:
> >>>>
> >>>>OFFSET 0 has no effect, LIMIT 0 obviously makes no sense since the
> >>>>answer is always the empty solution set... So why not simply allow
> >>>>only positive integers for both? I see no benefit in allowing 0 at
> >>>>all. 
> >>>
> >>>
> >>>The WG believes that allowing 0 eases the burden on programmatically 
> >>>generated queries.
> >>
> >>What is the justification for this belief if I may ask?
> > 
> > 
> > The belief arises from implementation experience by various members of the
> > workgroup.
> > 
> > (For example:
> > 
> >     query = sprintf("SELECT ... LIMIT %d OFFSET %d", limit, offset);
> > is easier than
> >     if (offset == 0) {
> >         query = sprintf("SELECT ... ");
> >     } else {
> >         query = sprintf("SELECT ... LIMIT %d OFFSET %d", limit, offset);
> >     }
> > )
> > 
> > Google, for another instance, serves from offset 0:
> >   http://www.google.com/search?q=search&hl=en&start=0&sa=N
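
The same point can be shown with a paging loop. This is a hedged sketch
only: build_query and page_queries are hypothetical helper names, and the
"SELECT * WHERE" wrapper stands in for whatever query a real application
would generate.

```python
def build_query(pattern, limit, offset):
    """Always emit LIMIT and OFFSET; zero needs no special-casing."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return f"SELECT * WHERE {{ {pattern} }} LIMIT {limit} OFFSET {offset}"


def page_queries(pattern, page_size, pages):
    """One query per page; the first page naturally starts at OFFSET 0."""
    return [build_query(pattern, page_size, page * page_size) for page in range(pages)]


for query in page_queries("?s ?p ?o", 10, 3):
    print(query)
```

Because OFFSET 0 is legal, the first page is produced by the same code path
as every later page.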
> > 
> > 
> >>>>Section 10.2
> >>>>
> >>>>CONSTRUCT combines triples "by set union"?
> >>>>So, I need to eliminate duplicate triples if I want to implement
> >>>>CONSTRUCT in my SPARQL engine?
> >>>>Is this really what you wanted? In case of doubt, I'd suggest to
> >>>>remove "by set union", or respectively, analogously to SELECT,
> >>>>introduce a DISTINCT (or alternatively a WITHDUPLICATES)
> >>>>modifier for CONSTRUCT...
> >>>
> >>>
> >>>A set represented with duplicate triples is identical to a representation
> >>>without any duplicates, 
> >>
> >>no, it is not identical if viewed as dataset for another query: if I 
> >>apply another (SELECT) query on the output of the CONSTRUCT  - which 
> >>again is RDF, so why not? - then there is potentially a difference (see
> >>the distinct/reduced issue)
> > 
> > 
> > See below.
> > 
> > 
> >>>so I believe the text is correct as written.  That 
> >>>is, the following are representations of the same graph:
> >>>
> >>><x> <y> <z> .
> >>>
> >>>and
> >>>
> >>><x> <y> <z> .
> >>><x> <y> <z> .
> >>
> >>if I ask a query with solution modifiers on these two graphs, then it is
> >>not the same! Attention!
> > 
> > 
> > No, the above are two representations of the same RDF graph. (A graph with
> > a single triple.) Any SPARQL query against either of these two 
> > representations of the same graph will have the same solutions.
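
Since an RDF graph is a set of triples, this can be illustrated by loading
both serializations into a set. The sketch below assumes a deliberately
naive reader (one whitespace-separated "s p o ." statement per line);
parse_triples and match_all are hypothetical helpers, not a real parser or
query engine.

```python
def parse_triples(text):
    """Naive reader: one 's p o .' statement per line, no literals with spaces."""
    graph = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        s, p, o = line.rstrip(".").split()
        graph.add((s, p, o))  # a set keeps each triple at most once
    return graph


def match_all(graph):
    """Solutions of the pattern { ?s ?p ?o } over the graph."""
    return [{"s": s, "p": p, "o": o} for (s, p, o) in sorted(graph)]


doc_one = "<x> <y> <z> .\n"
doc_two = "<x> <y> <z> .\n<x> <y> <z> .\n"

graph_one = parse_triples(doc_one)
graph_two = parse_triples(doc_two)

print(graph_one == graph_two)
print(len(match_all(graph_one)), len(match_all(graph_two)))
```

Both documents yield the same one-triple graph, so a pattern matched against
either yields the same solutions.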
> > 
> > 
> >>>>BTW, I miss the semantics for CONSTRUCT given formally in Section 12.
> >>>
> >>>
> >>>We do not right now intend to include CONSTRUCT in Section 12. CONSTRUCT
> >>>is defined normatively in section 10.2. ( 
> >>>http://www.w3.org/TR/rdf-sparql-query/#construct ).
> >>
> >>I fail to find a definition of the formal semantics of CONSTRUCT there.
> >>
> >>CONSTRUCT is likely one of the things which people will pick up very 
> >>fast...so it would be good to have this more formal, I think.
> > 
> > 
> > The group has decided not to pursue a rigorous treatment of CONSTRUCT in
> > Section 12 at this time. To do so would require a great deal of new work
> > and review, and would put our schedule in serious jeopardy. We believe
> > that the semantics specified in Section 10.2 sufficiently specify 
> > CONSTRUCT and will lead to interoperable implementations. 
> > 
> > ... 
> > 
> >>>>In the definition of compatible mappings, you might want to change
> >>>>
> >>>>"every variable v in dom(&mu;1) and in dom(&mu;2)"
> >>>>to
> >>>>"every variable v &isin;  dom(&mu;1) &cap; dom(&mu;2)"
> >>>>
> >>>>"Write merge(&mu;1, &mu;2) for &mu;1 set-union &mu;2"
> >>>>
> >>>>Why not use the symbol &cup; here?
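
For readers following the definitions being discussed, compatibility and
merge can be sketched as below. Solution mappings are modelled as Python
dicts, which is an assumption of this sketch; the function names simply
mirror the quoted wording.

```python
def compatible(mu1, mu2):
    """mu1 and mu2 agree on every variable in dom(mu1) intersected with dom(mu2)."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def merge(mu1, mu2):
    """Set-union of two compatible mappings."""
    if not compatible(mu1, mu2):
        raise ValueError("mappings are not compatible")
    return {**mu1, **mu2}


mu1 = {"x": "p1", "n": "John Doe"}
mu2 = {"x": "p1", "m": "mailto:john@example.org"}
mu3 = {"x": "p2"}

print(compatible(mu1, mu2), compatible(mu1, mu3))
print(merge(mu1, mu2))
```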
> >>>
> >>>
> >>>As noted above the reliance on some symbols being available is not safe
> >>>across enough browser and locale setups.  We are striking a balance here.
> > 
> >>and &mu; is safe?
> > 
> > 
> > &mu; is safer.  Correct display of, say, &cup; is less common than
> > &mu;.
> > The W3C document style does not set the font family for display.
> > 
> > ...
> > 
> >>>>12.5
> >>>>
> >>>>The operator List(P) is nowhere defined.
> >>>>It is still not totally clear to me why you need to introduce the ToList
> >>>>operator. 
> >>>
> >>>
> >>>Already discussed.
> >>
> >>Also that "List(P)" is not defined?
> > 
> > 
> > This is already fixed in the editors' working draft.
> > 
> > ... 
> > 
> >>>>A general comment:
> >>>>
> >>>>I miss a section defining the *Semantics of a query* and of different
> >>>>result forms. The Evaluation semantics given here rather is a mix of
> >>>>functions having partly multisets of solution mappings and sequences
> >>>>thereof as result, but all are called "eval()".
> >>>>E.g. eval for BGP returns a multiset, whereas eval returns a list
> >>>>for ToList, etc. 
> >>>>
> >>>>The semantics of a *query* is not really clearly defined yet, it
> >>>>seems. This needs another revision, I guess.
> >>
> >>no response here?
> > 
> > 
> > 
> > Sec. 12 intro says:
> > 
> > """
> > This section defines the correct behavior for evaluation of graph patterns
> > and solution modifiers, given a query string and an RDF dataset. It does not
> > imply a SPARQL implementation must use the process defined here.
> > """
> > 
> > 
> >>>>In the "Notes", item (d):
> >>>>
> >>>>"the current state of the art in OWL-DL querying focusses on the case
> >>>>where answer bindings to blank nodes are prohibited."
> >>>>
> >>>>It would be helpful to give references here.
> >>>
> >>>
> >>>The notes highlight the working assumptions.  I don't think references
> >>>would change that.  This is a difference between an academic paper and a
> >>>specification.
> >>
> >>You mean that a specification shouldn't follow general rules of style 
> >>which make the reader more comfortable (such as for instance 
> >>references)? I disagree, to be honest.
> > 
> > 
> > Technology specifications do not follow the same style rules as academic
> > papers, largely because the two have different goals. The editors do not
> > believe that references to OWL-DL querying work would improve the 
> > specification.
> > 
> > thanks again,
> > Lee
> > 
> > 
> > 
> 
> 
> -- 
> Dr. Axel Polleres
> email: axel@polleres.net  url: http://www.polleres.net/
> 
>
Received on Thursday, 10 May 2007 20:07:31 UTC