RE: pls consider comments on disjunction

Andy,

Could you clarify your comment that the "semantics are very similar
to SQL"?  I.e., what are the differences from the expectations of a
SQL programmer?

There has been substantial confusion over the semantics of UNION,
especially in the thread with Bob MacGregor that starts at [1].  This is
a point where I think it is very important for the spec to be clear and
unambiguous.

[1]
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/0034.html

Thanks,

-bryan

-----Original Message-----
From: Personick, Michael R.
To: 'Seaborne, Andy'
Cc: 'RDF Data Access Working Group'; Bebee, Bradley R.; Thompson, Bryan B.
Sent: 3/25/2005 10:24 AM
Subject: RE: pls consider comments on disjunction

Andy,

Thanks for your response - I appreciate the clear explanation. 

The spec has a note under the section for UNION that confused me a bit:
"The working group decided on this design and closed the disjunction
issue without reaching consensus. The objection was that adding UNION
would complicate implementation and discourage adoption. If you have
input to this aspect of the SPARQL that the working group has not yet
considered, please send a comment to public-rdf-dawg-comments@w3.org."
To me that meant that UNION was out.

From your response, UNION seems the cleanest and clearest way of writing
the example query. I also like UNION because to me it most closely
resembles traditional OR operator semantics. I hope that it makes it
through into the final spec.

I was not aware of the value-disjunction approach either - that is good
to know. Is FILTER a new addition? I cannot find it in
http://www.w3.org/TR/rdf-sparql-query/.

thanks,
Mike

-----Original Message-----
From: Seaborne, Andy
To: Personick, Michael R.
Cc: 'RDF Data Access Working Group'; Bebee, Bradley R.
Sent: 3/25/2005 9:54 AM
Subject: Re: pls consider comments on disjunction

Personick, Michael R. wrote:
> I second Bryan's motion and can expand on his point just a bit.
> 
> A bit of background: The system I am trying to implement with RDF and OWL is
> an attempt to federate and semantically align multiple legacy relational
> databases with different schemas. Developers on our team previously
> transformed (replicated) new data sets into a common relational schema and
> wrote business tier applications to that particular schema. The only real
> way to consider multiple data sets simultaneously was to hand-jam them into
> a single relational instance. I am trying to give them a new data access
> layer using Semantic Web technologies to: a) avoid replicating everything
> into the common relational schema and b) make it easier to consider multiple
> data sources at once.
> 
> When I bring a new developer onto the team and they start writing queries
> using an RDF query language (RDQL up to this point), within the first few
> hours they invariably ask me how to do OR in RDF queries. I tell them that,
> well, there is no OR. At least not the way they are used to (coming from SQL
> where it's a simple construct). But you can use "nested optionals" or
> "De Morgan's theorem" to accomplish the same thing. Blank stares. I explain,
> well this is how I've been doing it:
> 
> When you need to do an OR and wish you could just do this:
> 
> construct *
> where (?evidence <myns:memberOf> <myns:ThisGroup>) OR
>       (?evidence <myns:memberOf> <myns:ThatGroup>)
> 
> Instead do this:
> 
> construct *
> where (?evidence <myns:memberOf> ?group)
> and  !(?group ne <myns:ThisGroup> &&
>        ?group ne <myns:ThatGroup>)
> 
> This is a very simple case, but occasionally we end up with extremely
> complicated and hard-to-debug queries by doing it this way. I was very
> excited when I saw that the problem might be solved by UNION in SPARQL:
> 
> construct *
> where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>       (?evidence <myns:memberOf> <myns:ThatGroup>)
> 
> But now I understand that this is no longer in the spec, or UNION does not
> mean what I think it means?

Mike,

Thank you for the comments.

I think the feature you are referring to is either a value test involving ||
(like programming language ||) or the UNION construct.  Both are in the
editors' draft at the moment.

UNION was briefly OR but there was sufficient confusion that it was changed
to UNION.  The semantics are very similar to SQL, e.g.

The SPARQL pattern:   { pat1 } UNION { pat2 }
is like an SQL query (possibly involving further subqueries):

    SELECT FROM/WHERE pat1'
      UNION
    SELECT FROM/WHERE pat2'

where pat1' and pat2' are the SQL rewrites of the SPARQL patterns; these may
involve SQL subqueries.

SPARQL does not need the explicit projection in the subqueries because RDF
is not typed and so there are no union compatibility rules as there are in
SQL.  SPARQL union is based on variable names, not column type.
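
A minimal sketch of the "variable names, not columns" point, written in
Python rather than SPARQL. It assumes a toy model in which a solution is a
dictionary from variable name to value. The data, the myns:hasMember
property, and the helper names match and union are hypothetical and do not
come from rq23. The variable ?evidence sits in a different position in each
branch, yet the two solution lists combine without any alignment step.

```python
# Toy graph of (subject, predicate, object) triples. Hypothetical data;
# myns:hasMember is an invented inverse property used only for illustration.
GRAPH = {
    ("ev1", "myns:memberOf", "myns:ThisGroup"),
    ("myns:ThatGroup", "myns:hasMember", "ev2"),
    ("ev3", "myns:memberOf", "myns:OtherGroup"),
}

def match(pattern, graph):
    """Return one solution (dict of variable -> value) per matching triple.
    Terms starting with '?' are variables; other terms must match exactly."""
    solutions = []
    for triple in sorted(graph):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            solutions.append(binding)
    return solutions

def union(left, right):
    """Solutions of either branch, keyed by variable name."""
    return left + right

this_group = match(("?evidence", "myns:memberOf", "myns:ThisGroup"), GRAPH)
that_group = match(("myns:ThatGroup", "myns:hasMember", "?evidence"), GRAPH)
result = union(this_group, that_group)
```

In SQL the two branches would need explicit projections to line the columns
up; in this model the shared name ?evidence is all that ties them together.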

The editors' text is to be found at:
http://www.w3.org/2001/sw/DataAccess/rq23/#alternatives


The example you give

> where (?evidence <myns:memberOf> <myns:ThisGroup>) UNION
>       (?evidence <myns:memberOf> <myns:ThatGroup>)

becomes

  where {
            { ?evidence  myns:memberOf  myns:ThisGroup }
          UNION
            { ?evidence  myns:memberOf  myns:ThatGroup }
       }


I would note that your particular example query can also be written

construct { ?evidence myns:memberOf ?group }
where { ?evidence myns:memberOf ?group .
         FILTER ?group = myns:ThatGroup || ?group = myns:ThisGroup
       }

The "=" operator is general equality and will compare URIs.

This might be more natural to those coming from SQL, or the union form may be
more natural.  This approach of using value-operator disjunction also works
in RDQL.  As the patterns in the union increase in size, it can get very
unwieldy to do it as value disjunction; with several variables in the
subpatterns it will become very easy to make mistakes in the complex
expressions needed.
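
As a rough illustration of why the three formulations agree for this simple
case, here is a Python sketch (not SPARQL or RDQL). It assumes hypothetical
data in which each piece of evidence belongs to exactly one group; the names
MEMBER_OF, THIS and THAT are invented for the example.

```python
# Hypothetical data: each piece of evidence belongs to exactly one group.
MEMBER_OF = {
    "ev1": "myns:ThisGroup",
    "ev2": "myns:ThatGroup",
    "ev3": "myns:OtherGroup",
}
THIS, THAT = "myns:ThisGroup", "myns:ThatGroup"

# Value disjunction, as in FILTER ?group = ThatGroup || ?group = ThisGroup.
filter_form = sorted(ev for ev, group in MEMBER_OF.items()
                     if group == THAT or group == THIS)

# The negated conjunction from the RDQL workaround: !(g ne X && g ne Y).
negated_form = sorted(ev for ev, group in MEMBER_OF.items()
                      if not (group != THIS and group != THAT))

# UNION: evaluate each branch separately and concatenate the solutions.
union_form = sorted(
    [ev for ev, group in MEMBER_OF.items() if group == THIS]
    + [ev for ev, group in MEMBER_OF.items() if group == THAT]
)
```

With one variable the value tests stay short. Each extra variable shared
between the branches adds another clause to the boolean expression, while
the union form only grows by the size of the new branch.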

A query processor may wish to optimize into either the value-operator form
or into the concatenated subquery form based on its facilities and what it
can do fast. Allowing the application programmer to write their request
clearly and succinctly is important.


The original motivating example for UNION is for variations in use of Dublin
Core v1.0 and v1.1:

# application does not need to know which version of the property was used.
SELECT ?title
WHERE {
         { ?x dc10:title ?title } UNION { ?x dc11:title ?title }
       }


# application does want to know which version of the property was used.
SELECT ?title10 ?title11
WHERE {
         { ?x dc10:title ?title10 } UNION { ?x dc11:title ?title11 }
       }

which is close to your example.
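
To make the difference between the two Dublin Core queries concrete, here is
a small Python sketch of the result rows each would produce. The two titles
and the helpers titles and select are hypothetical, and None stands in for an
unbound variable.

```python
# Hypothetical data: one resource titled with DC 1.0, one with DC 1.1.
TRIPLES = [
    ("book1", "dc10:title", "SPARQL Tutorial"),
    ("book2", "dc11:title", "SPARQL Query Language"),
]

def titles(predicate, variable):
    """Solutions for the pattern { ?x <predicate> <variable> }."""
    return [{"?x": s, variable: o} for (s, p, o) in TRIPLES if p == predicate]

def select(variables, solutions):
    """Project each solution onto the named variables; None means unbound."""
    return [tuple(sol.get(v) for v in variables) for sol in solutions]

# Same variable in both branches: the rows do not reveal the version used.
same_var = select(
    ["?title"],
    titles("dc10:title", "?title") + titles("dc11:title", "?title"),
)

# Different variables: each row binds exactly one of the two.
split_var = select(
    ["?title10", "?title11"],
    titles("dc10:title", "?title10") + titles("dc11:title", "?title11"),
)
```

In the second form the application can tell which property matched by
checking which of the two variables is bound in a given row.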


Separate issue:

I changed the "construct *" in your example because that form isn't currently
in rq23: it does not work when a query involves GRAPH.  If you have any
feedback on that, please let the comments list know, ideally on a new thread.

	Andy

> 
> I also understand that union/disjunction/or can be accomplished by the method
> I illustrated above and also by nested optionals (although I haven't seen a
> simple explanation of how). Regardless, why is the burden on me to learn how
> to do OR a totally new way? If wide acceptance of the language is a goal and
> it's well understood how to accomplish OR through nested optionals, why not
> just give the user an OR and then let a query translator/optimizer sort out
> rewriting the query using nested optionals?
> 
> Sincerely,
> Mike Personick
> Science Applications International Corp.
> 
> -----Original Message-----
> From: Thompson, Bryan B.
> To: 'Dan Connolly'; 'public-rdf-dawg-request@w3.org'; 'RDF Data Access
> Working Group'
> Cc: Bebee, Bradley R.; Personick, Michael R.
> Sent: 3/24/2005 2:12 PM
> Subject: RE: pls consider comments on disjunction
> 
> Dan,
> 
> I am in favor of re-opening this issue.  I think that Bob has made
> several very good points and there is pretty consistent input from
> the comments list that we need to respect traditional semantics for
> core operators (AND, OR, NOT).
> 
> From our own experience using SPARQL prototypes, we spend a lot of
> time re-writing queries that require disjunction using a combination
> of AND and NOT.
> 
> -bryan
> 
> -----Original Message-----
> From: public-rdf-dawg-request@w3.org
> To: RDF Data Access Working Group
> Sent: 3/24/2005 2:03 PM
> Subject: pls consider comments on disjunction
> 
> 
> Most of the comments continue to get handled by the editors etc.,
> forwarding to the WG as appropriate. One that I'm not sure
> what to do with is the thread beginning...
> 
> Disjunction vs. Optional ... and UNION Bob MacGregor (Sunday, 20 March)
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Mar/0034.html
> 
> Our decision on the disjunction and nestedOptionals issues...
>   http://www.w3.org/2001/sw/DataAccess/issues#disjunction
>   http://www.w3.org/2001/sw/DataAccess/issues#nestedOptionals
> are binding here... the question is whether this is sufficient
> new information that I should reopen the issue.
> 
> My own investigation is inconclusive. I encourage WG members to
> study it and let me know if you want the issue re-opened or not.
> 

Received on Friday, 25 March 2005 16:05:49 UTC