Re: pls consider comments on disjunction

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Fri, 25 Mar 2005 20:31:49 +0000
Message-ID: <42447535.5030209@hp.com>
To: "Thompson, Bryan B." <BRYAN.B.THOMPSON@saic.com>
Cc: "Personick, Michael R." <MICHAEL.R.PERSONICK@saic.com>, "RDF Data Access Working Group" <public-rdf-dawg@w3.org>, "Bebee, Bradley R." <BRADLEY.R.BEBEE@saic.com>



Thompson, Bryan B. wrote:
> Andy,
> 
> Could you clarify your comment that the "semantics are very similar
> to SQL"?  I.e., what are the differences from the expectations of a
> SQL programmer?

In SQL, the two sides of a UNION have to be "union compatible" tables.  To 
get this, each subquery has a SELECT to line up the columns needed and 
exclude unwanted columns.

In SPARQL, there is no typing - RDF isn't typed and we allow solutions to 
have different numbers of bound variables.  Union does not require the SELECT.
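
As an illustration only (not the working group's definition), the point about solutions with different numbers of bound variables can be sketched in Python. The sketch assumes a solution is a plain dict from variable name to value; the helper names `match`, `union`, and `select` and the sample triples are hypothetical.

```python
# Minimal sketch: a solution is a dict from variable name to value.
# The helper names and the sample data are illustrative assumptions.

def match(triples, predicate, var_subject, var_object):
    """Return one solution per triple whose predicate matches."""
    return [{var_subject: s, var_object: o}
            for (s, p, o) in triples if p == predicate]

def union(left, right):
    """Keep every solution from both sides; no column alignment is needed."""
    return left + right

def select(solutions, variables):
    """Project outside the union; an unbound variable shows up as None."""
    return [tuple(sol.get(v) for v in variables) for sol in solutions]

triples = [
    ("doc1", "dc10:title", "SPARQL Tutorial"),
    ("doc2", "dc11:title", "RDF Primer"),
]

solutions = union(
    match(triples, "dc10:title", "x", "title10"),
    match(triples, "dc11:title", "x", "title11"),
)
rows = select(solutions, ["title10", "title11"])
print(rows)
```

Each side binds a different variable, and the union simply carries both kinds of solution forward. An SQL UNION would instead need each branch to SELECT the same columns first.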

The difference between UNION and UNION ALL is not relevant as we have the 
selection of variables outside the union.
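
A rough Python sketch of that last point, assuming solutions are dicts and that any duplicate elimination is requested on the outer selection (for example with DISTINCT). The function names and data are hypothetical.

```python
# Sketch under stated assumptions: the union keeps all solutions, and
# duplicates are removed (or not) by the selection applied outside it.

def union(left, right):
    """Concatenate the solutions of both branches."""
    return left + right

def select(solutions, variables, distinct=False):
    """Project the named variables; optionally drop duplicate rows."""
    rows = [tuple(sol.get(v) for v in variables) for sol in solutions]
    if distinct:
        rows = list(dict.fromkeys(rows))  # keeps first-seen order
    return rows

left = [{"evidence": "e1"}, {"evidence": "e2"}]
right = [{"evidence": "e2"}, {"evidence": "e3"}]

all_rows = select(union(left, right), ["evidence"])
distinct_rows = select(union(left, right), ["evidence"], distinct=True)
print(all_rows)
print(distinct_rows)
```

The union itself has a single behaviour here; whether duplicates survive is decided by the outer selection rather than by a choice between two union operators.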

It's the absence of the SQL-needed SELECT syntax that seems to be at least part 
of the confusion.  I believe that part has been cleared up.

	Andy

> 
> There has been substantial confusion over the semantics of UNION, esp.
> the thread with Bob MacGregor started from [1].  This is a point where
> I think it is very important for the spec to be clear and unambiguous.
> 
> [1]
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/0034.html
> 
> Thanks,
> 
> -bryan
> 
> -----Original Message-----
> From: Personick, Michael R.
> To: 'Seaborne, Andy'
> Cc: 'RDF Data Access Working Group'; Bebee, Bradley R.; Thompson,
> Bryan B.
> Sent: 3/25/2005 10:24 AM
> Subject: RE: pls consider comments on disjunction
> 
> Andy,
> 
> Thanks for your response - I appreciate the clear explanation. 
> 
> The spec has a note under the section for UNION that confused me a bit:
> "The working group decided on this design and closed the disjunction
> issue without reaching consensus. The objection was that adding UNION
> would complicate implementation and discourage adoption. If you have
> input to this aspect of the SPARQL that the working group has not yet
> considered, please send a comment to public-rdf-dawg-comments@w3.org."
> To me that meant that UNION was out.
> 
> From your response, UNION seems the most clean and clear way of writing
> the example query. I also like UNION because to me it most closely
> resembles traditional OR operator semantics. I hope that it makes it
> through into the final spec.
> 
> I was not aware of the value-disjunction approach either - that is good
> to know. Is FILTER a new addition? I cannot find it in
> http://www.w3.org/TR/rdf-sparql-query/.
> 
> thanks,
> Mike
> 
> -----Original Message-----
> From: Seaborne, Andy
> To: Personick, Michael R.
> Cc: 'RDF Data Access Working Group'; Bebee, Bradley R.
> Sent: 3/25/2005 9:54 AM
> Subject: Re: pls consider comments on disjunction
> 
> Personick, Michael R. wrote:
> 
>>I second Bryan's motion and can expand on his point just a bit.
>>
>>A bit of background: The system I am trying to implement with RDF and OWL is
>>an attempt to federate and semantically align multiple legacy relational
>>databases with different schemas. Developers on our team previously
>>transformed (replicated) new data sets into a common relational schema and
>>wrote business tier applications to that particular schema. The only real
>>way to consider multiple data sets simultaneously was to hand-jam them into
>>a single relational instance. I am trying to give them a new data access
>>layer using Semantic Web technologies to: a) avoid replicating everything
>>into the common relational schema and b) make it easier to consider multiple
>>data sources at once.
>>
>>When I bring a new developer onto the team and they start writing queries
>>using an RDF query language (RDQL up to this point), within the first few
>>hours they invariably ask me how to do OR in RDF queries. I tell them that,
>>well, there is no OR. At least not the way they are used to (coming from SQL
>>where it's a simple construct). But you can use "nested optionals" or
>>"De Morgan's theorem" to accomplish the same thing. Blank stares. I explain,
>>well this is how I've been doing it:
>>
>>When you need to do an OR and wish you could just do this:
>>
>>construct *
>>where (?evidence <myns:memberOf> <myns:ThisGroup>) OR
>>      (?evidence <myns:memberOf> <myns:ThatGroup>)
>>
>>Instead do this:
>>
>>construct *
>>where (?evidence <myns:memberOf> ?group)
>>and  !(?group ne <myns:ThisGroup> &&
>>       ?group ne <myns:ThatGroup>)
>>
>>This is a very simple case, but occasionally we end up with extremely
>>complicated and hard-to-debug queries by doing it this way. I was very
>>excited when I saw that the problem might be solved by UNION in SPARQL:
>>
>>construct *
>>where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>>      (?evidence <myns:memberOf> <myns:ThatGroup>)
>>
>>But now I understand that this is no longer in the spec or UNION does not
>>mean what I think it means?
> 
> 
> Mike,
> 
> Thank you for the comments.
> 
> I think the feature you are referring to is either a value test involving ||
> (like programming language ||) or the UNION construct.  Both are in the
> editors' draft at the moment.
> 
> UNION was briefly OR but there was sufficient confusion that it was changed
> to UNION.  The semantics are very similar to SQL e.g.
> 
> The SPARQL pattern:   { pat1 } UNION { pat2 }
> is like an SQL query (possibly involving further subqueries):
> 
>     SELECT FROM/WHERE pat1'
>       UNION
>     SELECT FROM/WHERE pat2'
> 
> where pat1' and pat2' are the SQL rewrites of the SPARQL part and these may
> involve SQL subqueries.
> 
> SPARQL does not need the explicit projection in the subqueries because RDF
> is not typed and so there are no union compatibility rules as there are in
> SQL.  SPARQL union is based on variable names, not column type.
> 
> The editors' text is to be found at:
> http://www.w3.org/2001/sw/DataAccess/rq23/#alternatives
> 
> 
> The example you give
> 
> 
>>where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>>      (?evidence <myns:memberOf> <myns:ThatGroup>)
> 
> 
> becomes
> 
>   where {
>             { ?evidence  myns:memberOf  myns:ThisGroup }
>           UNION
>             { ?evidence  myns:memberOf  myns:ThatGroup }
>        }
> 
> 
> I would note that your particular example query can also be written
> 
> construct { ?evidence myns:memberOf ?group }
> where { ?evidence myns:memberOf ?group .
>          FILTER ?group = myns:ThatGroup || ?group = myns:ThisGroup
>        }
> 
> The "=" operator is general equality and will compare URIs.
> 
> This might be more natural to those coming from SQL or the union form may be
> more natural.  This approach of using value-operator disjunction also works
> in RDQL.  As the patterns in the union increase in size, it can get very
> unwieldy to do it as value disjunction; with several variables in the
> subpatterns it will become very easy to make mistakes in the complex
> expressions needed.
> 
> A query processor may wish to optimize into either the value-operator form
> or into the concatenated subquery form based on its facilities and what it
> can do fast. Allowing the application programmer to write their request
> clearly and succinctly is important.
> 
> 
> The original motivating example for UNION is for variations in use of Dublin
> Core v1.0 and v1.1:
> 
> # application does not need to know which version of the property was used.
> SELECT ?title
> WHERE {
>          { ?x dc10:title ?title } UNION { ?x dc11:title ?title }
>        }
> 
> 
> # application does want to know which version of the property was used.
> SELECT ?title10 ?title11
> WHERE {
>          { ?x dc10:title ?title10 } UNION { ?x dc11:title ?title11 }
>        }
> 
> which is close to your example.
> 
> 
> Separate issue:
> 
> I changed the "construct *" in your example because that isn't currently in
> rq23 because it does not work when a query involves GRAPH.  If you have any
> feedback on that, please let the comments list know, ideally on a new thread.
> 
> 	Andy
> 
> 
>>I also understand that union/disjunction/or can be accomplished by the method
>>I illustrated above and also by nested optionals (although I haven't seen a
>>simple explanation of how). Regardless, why is the burden on me to learn how
>>to do OR a totally new way? If wide acceptance of the language is a goal and
>>it's well understood how to accomplish OR through nested optionals, why not
>>just give the user an OR and then let a query translator/optimizer sort out
>>rewriting the query using nested optionals?
>>
>>Sincerely,
>>Mike Personick
>>Science Applications International Corp.
>>
>>-----Original Message-----
>>From: Thompson, Bryan B.
>>To: 'Dan Connolly'; 'public-rdf-dawg-request@w3.org'; 'RDF Data Access
>>Working Group'
>>Cc: Bebee, Bradley R.; Personick, Michael R.
>>Sent: 3/24/2005 2:12 PM
>>Subject: RE: pls consider comments on disjunction
>>
>>Dan,
>>
>>I am in favor of re-opening this issue.  I think that Bob has made
>>several very good points and there is pretty consistent input from
>>the comments list that we need to respect traditional semantics for
>>core operators (AND, OR, NOT).
>>
>>From our own experience using SPARQL prototypes, we spend a lot of
>>time re-writing queries that require disjunction using a combination
>>of AND and NOT.
>>
>>-bryan
>>
>>-----Original Message-----
>>From: public-rdf-dawg-request@w3.org
>>To: RDF Data Access Working Group
>>Sent: 3/24/2005 2:03 PM
>>Subject: pls consider comments on disjunction
>>
>>
>>Most of the comments continue to get handled by the editors etc.,
>>forwarding to the WG as appropriate. One that I'm not sure
>>what to do with is the thread beginning...
>>
>>Disjunction vs. Optional ... and UNION Bob MacGregor (Sunday, 20 March)
>>http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/0034.html
>>
>>Our decision on the disjunction and nestedOptionals issues...
>>  http://www.w3.org/2001/sw/DataAccess/issues#disjunction
>>  http://www.w3.org/2001/sw/DataAccess/issues#nestedOptionals
>>are binding here... the question is whether this is sufficient
>>new information that I should reopen the issue.
>>
>>My own investigation is inconclusive. I encourage WG members to
>>study it and let me know if you want the issue re-opened or not.
>>
Received on Friday, 25 March 2005 21:32:32 GMT
