Re: pls consider comments on disjunction

Personick, Michael R. wrote:
> I second Bryan's motion and can expand on his point just a bit.
> 
> A bit of background: The system I am trying to implement with RDF and OWL is
> an attempt to federate and semantically align multiple legacy relational
> databases with different schemas. Developers on our team previously
> transformed (replicated) new data sets into a common relational schema and
> wrote business tier applications to that particular schema. The only real
> way to consider multiple data sets simultaneously was to hand-jam them into
> a single relational instance. I am trying to give them a new data access
> layer using Semantic Web technologies to: a) avoid replicating everything
> into the common relational schema and b) make it easier to consider multiple
> data sources at once.
> 
> When I bring a new developer onto the team and they start writing queries
> using an RDF query language (RDQL up to this point), within the first few
> hours they invariably ask me how to do OR in RDF queries. I tell them that,
> well, there is no OR. At least not the way they are used to (coming from SQL
> where it's a simple construct). But you can use "nested optionals" or
> "De Morgan's theorem" to accomplish the same thing. Blank stares. I explain,
> well this is how I've been doing it:
> 
> When you need to do an OR and wish you could just do this:
> 
> construct *
> where (?evidence <myns:memberOf> <myns:ThisGroup>) OR
>       (?evidence <myns:memberOf> <myns:ThatGroup>)
> 
> Instead do this:
> 
> construct *
> where (?evidence <myns:memberOf> ?group)
> and  !(?group ne <myns:ThisGroup> &&
>        ?group ne <myns:ThatGroup>)
> 
> This is a very simple case, but occasionally we end up with extremely
> complicated and hard-to-debug queries by doing it this way. I was very
> excited when I saw that the problem might be solved by UNION in SPARQL:
> 
> construct *
> where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>       (?evidence <myns:memberOf> <myns:ThatGroup>)
> 
> But now I understand that this is no longer in the spec or UNION does not
> mean what I think it means?

Mike,

Thank you for the comments.

I think the feature you are referring to is either a value test involving ||
(like programming language ||) or the UNION construct.  Both are in the
editors' draft at the moment.

UNION was briefly called OR, but that caused enough confusion that it was
renamed UNION.  The semantics are very similar to SQL's UNION, e.g.

The SPARQL pattern:   { pat1 } UNION { pat2 }
is like an SQL query (possibly involving further subqueries):

    SELECT FROM/WHERE pat1'
      UNION
    SELECT FROM/WHERE pat2'

where pat1' and pat2' are the SQL rewrites of the corresponding SPARQL
patterns, which may themselves involve SQL subqueries.

SPARQL does not need the explicit projection in the subqueries because RDF
is not typed and so there are no union compatibility rules as there are in 
SQL.  SPARQL union is based on variable names, not column type.

The editors' text is to be found at:
http://www.w3.org/2001/sw/DataAccess/rq23/#alternatives


The example you give:

> where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>       (?evidence <myns:memberOf> <myns:ThatGroup>)

becomes

  where {
            { ?evidence  myns:memberOf  myns:ThisGroup }
          UNION
            { ?evidence  myns:memberOf  myns:ThatGroup }
       }


I would note that your particular example query can also be written

construct { ?evidence myns:memberOf ?group }
where { ?evidence myns:memberOf ?group .
         FILTER (?group = myns:ThatGroup || ?group = myns:ThisGroup)
       }

The "=" operator is general equality and will compare URIs.

The FILTER form might be more natural to those coming from SQL, or the union
form may be more natural.  Value-operator disjunction also works in RDQL.
As the patterns in the union increase in size, though, it can get very
unwieldy to express them as value disjunction; with several variables in the
subpatterns it becomes very easy to make mistakes in the complex expressions
needed.
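
As a rough illustration of why the union form, the value-disjunction form and
the negated RDQL workaround agree on this small example, here is a minimal
Python sketch.  It is not SPARQL and not how any query processor is
implemented; the triples, the ex: evidence names and the helper functions are
all made up for illustration.

```python
# Hypothetical data: (subject, predicate, object) triples as plain strings.
TRIPLES = [
    ("ex:e1", "myns:memberOf", "myns:ThisGroup"),
    ("ex:e2", "myns:memberOf", "myns:ThatGroup"),
    ("ex:e3", "myns:memberOf", "myns:OtherGroup"),
    ("ex:e4", "myns:createdBy", "myns:ThisGroup"),
]


def members_of(group):
    """Match the pattern { ?evidence myns:memberOf <group> }."""
    return [s for (s, p, o) in TRIPLES if p == "myns:memberOf" and o == group]


def union_form():
    """{ pat1 } UNION { pat2 }: concatenate the solutions of both branches."""
    return members_of("myns:ThisGroup") + members_of("myns:ThatGroup")


def filter_form():
    """Match { ?evidence myns:memberOf ?group }, then apply the value test
    ?group = myns:ThatGroup || ?group = myns:ThisGroup."""
    return [s for (s, p, o) in TRIPLES
            if p == "myns:memberOf"
            and (o == "myns:ThatGroup" or o == "myns:ThisGroup")]


def negated_form():
    """The RDQL workaround: !(?group ne ThisGroup && ?group ne ThatGroup)."""
    return [s for (s, p, o) in TRIPLES
            if p == "myns:memberOf"
            and not (o != "myns:ThisGroup" and o != "myns:ThatGroup")]


if __name__ == "__main__":
    print(sorted(union_form()))
    print(sorted(filter_form()))
    print(sorted(negated_form()))
```

All three functions select the same evidence here.  The sketch only covers a
single triple pattern per branch; with larger subpatterns the value-test
versions need one comparison per variable per branch, which is where the
mistakes creep in.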

A query processor may wish to optimize into either the value-operator form
or into the concatenated subquery form based on its facilities and what it
can do fast. Allowing the application programmer to write their request
clearly and succinctly is important.


The original motivating example for UNION is for variations in use of Dublin
Core v1.0 and v1.1:

# application does not need to know which version of the property was used.
SELECT ?title
WHERE {
         { ?x dc10:title ?title } UNION { ?x dc11:title ?title }
       }


# application does want to know which version of the property was used.
SELECT ?title10 ?title11
WHERE {
         { ?x dc10:title ?title10 } UNION { ?x dc11:title ?title11 }
       }

which is close to your example.
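
To make the "union by variable name" point concrete, here is a minimal Python
sketch of the two Dublin Core queries above.  It only models solutions as
dictionaries keyed by variable name; the two triples, the ex: names and the
helper functions are assumptions for illustration, not part of any SPARQL
implementation.

```python
# Hypothetical data: one resource described with each Dublin Core version.
TRIPLES = [
    ("ex:doc1", "dc10:title", "Old-style title"),
    ("ex:doc2", "dc11:title", "New-style title"),
]


def match(predicate, var):
    """Solutions of { ?x <predicate> ?<var> }, one dict per matching triple."""
    return [{"x": s, var: o} for (s, p, o) in TRIPLES if p == predicate]


def union(left, right):
    """UNION keeps every solution from both branches.  Nothing is lined up by
    position or type; each solution simply carries its own variable names."""
    return left + right


def select(solutions, variables):
    """Project onto the named variables; None stands for an unbound variable."""
    return [tuple(sol.get(v) for v in variables) for sol in solutions]


# Same variable in both branches: the application cannot tell which
# version of the property was used.
same_name = select(
    union(match("dc10:title", "title"), match("dc11:title", "title")),
    ["title"],
)

# Different variables: each row binds exactly one of the two.
split_names = select(
    union(match("dc10:title", "title10"), match("dc11:title", "title11")),
    ["title10", "title11"],
)

print(same_name)
print(split_names)
```

In the second case each row leaves one variable unbound, which is how the
application can see which version of the property matched.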


Separate issue:

I changed the "construct *" in your example: it isn't currently in rq23
because it does not work when a query involves GRAPH.  If you have any
feedback on that, please let the comments list know, ideally on a new thread.

	Andy

> 
> I also understand that union/disjunction/or can be accomplished by the method
> I illustrated above and also by nested optionals (although I haven't seen a
> simple explanation of how). Regardless, why is the burden on me to learn how
> to do OR a totally new way? If wide acceptance of the language is a goal and
> it's well understood how to accomplish OR through nested optionals, why not
> just give the user an OR and then let a query translator/optimizer sort out
> rewriting the query using nested optionals?
> 
> Sincerely,
> Mike Personick
> Science Applications International Corp.
> 
> -----Original Message-----
> From: Thompson, Bryan B.
> To: 'Dan Connolly '; 'public-rdf-dawg-request@w3.org '; 'RDF Data Access
> Working Group '
> Cc: Bebee, Bradley R.; Personick, Michael R.
> Sent: 3/24/2005 2:12 PM
> Subject: RE: pls consider comments on disjunction
> 
> Dan,
> 
> I am in favor of re-opening this issue.  I think that Bob has made
> several very good points and there is pretty consistent input from
> the comments list that we need to respect traditional semantics for
> core operators (AND, OR, NOT).
> 
> From our own experience using SPARQL prototypes, we spend a lot of
> time re-writing queries that require disjunction using a combination
> of AND and NOT.
> 
> -bryan
> 
> -----Original Message-----
> From: public-rdf-dawg-request@w3.org
> To: RDF Data Access Working Group
> Sent: 3/24/2005 2:03 PM
> Subject: pls consider comments on disjunction
> 
> 
> Most of the comments continue to get handled by the editors etc.,
> forwarding to the WG as appropriate. One that I'm not sure
> what to do with is the thread beginning...
> 
> Disjunction vs. Optional ... and UNION Bob MacGregor (Sunday, 20 March)
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/0034.html
> 
> Our decision on the disjunction and nestedOptionals issues...
>   http://www.w3.org/2001/sw/DataAccess/issues#disjunction
>   http://www.w3.org/2001/sw/DataAccess/issues#nestedOptionals
> are binding here... the question is whether this is sufficient
> new information that I should reopen the issue.
> 
> My own investigation is inconclusive. I encourage WG members to
> study it and let me know if you want the issue re-opened or not.
> 

Received on Friday, 25 March 2005 14:55:09 UTC