W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > April to June 2006

Re: [Fwd: Re: my take on the formal semantics of SPARQL]

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Wed, 21 Jun 2006 09:26:41 +0100
Message-ID: <449902C1.5020403@hp.com>
To: Fred Zemke <fred.zemke@oracle.com>
CC: public-rdf-dawg@w3.org

(conversation placed into the WG email list)

Fred Zemke wrote:
> Attached is my draft on the formal semantics of SPARQL, circulated
> to AndyS, PatH and EricP privately, and now posted to this site.
> Please note that the attached paper is not a personal statement of
> what the formal semantics should be, nor is it a corporate position of 
> Oracle.
> It is merely my attempt to clarify what the document currently says.
> The following list tries to categorize to what extent this paper addresses
> my comments on the current CR.
> editorial comments on CR dated 6 april 2006
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0114.html
> -- editorial, I may have incidentally fixed some of these but that was
> not my objective in this exercise.
> Comments on 2.5.1 "General framework" in CR dated 6 April 2006
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0115.html
> -- addressed by my paper, section 3.9 "Changes to 2.5.1 General framework"
> comment on A.7 "Grammar"
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0117.html
> -- not addressed by my paper
> comment on CR 10.1 "solution sequences and result forms"
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0116.html
> and
> Re: comment on CR 10.1 "solution sequences and result forms"
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0123.html
> -- not addressed, I did not get to the section on result forms.
> Re: Draft response to: Re: major technical: blank nodes
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0128.html
> and
> Comments on 2.1.4 "Syntax for blank nodes" in 6 April 2006 CR
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0129.html
> -- addressed by section 3.2 "Changes to 2.1.4 Syntax for blank nodes",
> section 3.13 "Changes to 2.8.3 Blank nodes", and various examples of
> queries containing blank node identifiers throughout the paper.
> Comments on UNION matching (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0130.html
> -- partially addressed by my paper, section 3.19 "Changes to 6.2 UNION 
> matching -
> formal definition". 
> first part, asking for an exampe of a solution with cardinality greater 
> than 1:
> my paper does not provide such an example
> second part is technically addressed, in that my proposal does provide
> an answer to the question.  My paper actually gives a different answer
> from the one requested by the comment.  I wrote my paper this way because
> it seemed like the most literal interpretation of the existing 
> specification.
> I believe both answers are defensible; this is just a matter that needs 
> to be
> discussed and decided.  As I said, my paper is not staking out a
> personal or corporate position.
> Comments on 10.1 "Solution sequences and result forms" (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0131.html
> -- not addressed yet (I did not get as far as the result forms in my
> proposal)
> Comments on optional pattern matching (CR 6 apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0132.html
> -- first part, proposing the word "widen" instead of "add": my
> proposal did not touch this.  This is kind of editorial rather than
> substantive
> second part, that the current definition is logically equivalent to
> just finding the matches of the first pattern: addressed in my paper
> section 3.18 "Changes to 5.4 Optional matching - formal definition"
> General problem with cardinality of results in CR 6 April 2006
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0133.html
> -- addressed by my paper, which introduces the notion of cardinality
> and defines the cardinality of all solutions.
> Comments on 2.6 "Multiple Matches" (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0134.html
> -- addressed by my paper, by introducing the notion of cardinality
> of a solution (3.5 "New section 2.n, Introduction to graph pattern 
> semantics"),
> and by defining the cardinality for all possible queries in each section.
> The solutions of the empty pattern are specifically addressed in section 
> 3.14
> "Changes to 3.3 Value constraints - definition", last few paragraphs.
> Comment on 11.2 "filter evaluation" (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0135.html
> -- not addressed
> Comments on the proper domain for solutions (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0136.html
> -- addressed by my paper, section 3.3 "changes to 2.1.8 Result descriptions
> used in this document", and section 3.5 "New section 2.n, Introduction 
> to graph
> pattern semantics".
> Comments on 9 "Specifying RDF datasets" (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0137.html
> -- handled by my paper in sections 3.21 "specifying RDF datsets"
> through 3.24 "Combining FROM and FROM NAMED".
> Comment on 4.1 "Group graph patterns" (CR 6 Apr 2006)
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0141.html
> -- Addressed by my paper section 3.14 "Changes to 3.3 Value constraints -
> definition" at the bottom of page 11, with the paragraph that says
> "In terms of the technique to solve simple entailments by binding both 
> variables
> and blank node identifiers, one must note that the bindings of blank node
> identifiers are not global in scope; their scope is always confined to a 
> single
> BlockOfTriples."  As I look at this, I think it would be helpful to move 
> this
> paragraph to the end of SPARQL 2.5.2 "SPARQL basic graph pattern
> matching" for more visibility.
> However, I am not personally happy with this outcome; I wrote this
> paragraph only because the logic of the document seems to compel it.
> The question posed by
> the comment is that the entailment definition of basic graph pattern 
> matching
> implies that the scope of a blank node identifier is a BlockOfTriples, and
> consequently one cannot erase curly braces from { ?x <v1> _:a } { ?x 
> <v2> _:a },
> obtaining  ?x <v1> _:a . ?x <v2> _:a , and have the same semantics, because
> the blank node identifier _:a in the unerased group graph pattern represents
> two separate existentially quantified variables (in math logic terms) 
> whereas
> it represents a single existentially quantified variable in the pattern 
> with no braces. 
> In contrast, the existing paragraph in the current specification
> that describes the semantics of simple entailment as finding a
> function from variables and blank node identifiers to RDF terms can be 
> construed
> as saying that the blank node identifier _:a must be mapped to the same
> RDF term in each subpattern, in which case the erasure is semantically 
> correct.
> In my proposal, I opted to decide in favor of the "two scope"
> interpretation since that seems to be implied by the general framework for
> entailment and basic graph pattern matching.  However,
> I personally think that the ordinary user will find it non-intuitive 
> that he cannot
> erase the curly braces in the first expression without changing the 
> semantics.
> Perhaps the DAWG has already looked at this issue and has an agreed
> answer.  If so, that is fine; I am just pointing it out as an issue to be
> highlighted in the specification.
> Fred

Blank nodes and identifiers:

These form a significant part of the proposal.  It would seem that if we can 
sort out what is happening as given by section 2.5, the other comments about 
blank nodes will be more readily addressed.  Pat has offered to review the 
language on blank nodes/blank node identifiers.

Other sections:

1/ Proposal sections 3.4, 3.6, 3.8

One of the key themes here is the relationship of syntax and definition.  I 
agree that rq23 is weak on this.  Correct me if I'm wrong but you seem to 
propose that the syntax is used to drive the definitions whereas currently 
there is a collection of abstract definitions, then there is a syntax.  Would 
it instead work to add ties from the sections with abstract definitions 
referencing the grammar?

I prefer this because there may be future different syntax forms (e.g. XML or 
RDF): by keeping the abstract definitions central, these can be reused with 
different mapping to syntax.  Also, some communities are interested only in 
the abstract definitions, not the syntax.

2/ Cardinality of a solution.

I do not understand the use of this terminology.  A solution is a mapping of 
variable to RDF terms so that the pattern is true on the dataset.  Is 
cardinality the number of variables mapped or the number of such solutions?

For example in 3.18, if condition 2b hold the cardinality is one.  I don't see 
why the cardinality of S is constrained by condition 2b?  If cardinality is 
counting the number of solutions, I would have thought the cardinalities where:

If 2a) , then cardinality(optional) = cardinality(S) If 2b) , then 
cardinality(optional) = sum over all solutions S * cardinality(T given S).

3/ Section 3.18

The significant change here is where S is required not to be defined on the 
variables (variables introduced by ...) GGP.  This rules out the query:

"Get the name if there is one, else get the email address, else ..."

    { ... ?x .... }
    OPTIONAL { ?x foaf:name ?displayLabel }
    OPTIONAL { ?x foaf:mbox ?displayLabel }

because S may be defined on ?displayLabel yet not match (and hence not 
introduced new bindings for other variables) on the right hand side of the 
optional.  Have I understood correctly?

Received on Wednesday, 21 June 2006 08:26:56 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 15:00:51 UTC