Re: 3.6 Optional Match from Bob MacGregor on 2004-08-08 (public-rdf-dawg@w3.org from July to September 2004)

From: Bob MacGregor <macgregor@isi.edu>
Date: Sat, 07 Aug 2004 17:51:24 -0700
To: "Eric Prud'hommeaux" <eric@w3.org>
CC: public-rdf-dawg@w3.org
Message-ID: <4115790C.9090301@isi.edu>
Eric,

Some comments on your comments

Eric Prud'hommeaux wrote:

>On Wed, Aug 04, 2004 at 09:54:10AM -0700, Bob MacGregor wrote:
>  
>
>>I'm responding to the general optional match issue, rather than
>>to the particulars in the last message I saw:
>>
>>Why do we need optional match?  Suppose you want to retrieve
>>all books (or whatever) with some properties, and want to print
>>out what there is to know about each book.  In BRQL, you can
>>use DESCRIBE, but then what you get back and what you
>>don't appears to be at the mercy of the query engine.  If you
>>want control over what attributes you want back, then you
>>need an optional match.
>>
>>Suppose you want attributes title, publisher, and creator to
>>be printed (whenever they are present for a given book).
>>'publisher' and 'creator' are likely to return URIs,
>>so what you would like is to ask for specific attributes about
>>them as well.  These also need to be handled with optional
>>matches, but now the matches are nested inside of the
>>original optional match.
>>
>>Databases get the effect of optional matches using two distinct
>>mechanisms.  Null values in columns correspond to top-level
>>optional matches.  Outer joins provide a second mechanism.
>>If you have a left join, and the right table in the left join has
>>a column containing null values, then you are automatically getting
>>the equivalent of a nested optional match.  So, nested optional
>>matches are in fact quite commonplace in the database world.
>>    
>>
>
>(Aside: the tuple with all NULL attributes (except the foreign key)
> feels like a less useful form of outer join. "Get me the sales
> contacts and any addresses you have for them" isn't well served
> by getting a bunch of all NULL addresses.)
>
>  
>
I wasn't referring above to a tuple with all NULL attributes except the
key.  Suppose a database tuple contains (among other things) a street
number and a city, but the zip code is NULL.  If you executed
     SELECT ?streetnum, ?city, ?zip  WHERE ...
on the table containing that tuple, you would get a NULL zip code for
that particular tuple.  This is basically an optional match, since each
column value in a relational row corresponds to an RDF triple.
My point is that this kind of optional match is VERY prevalent, more so
than outer joins.  When I say that, I'm assuming that the NULL value
stands for "missing value".  That's by far the most common usage of NULL
(but not the only one).
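
A minimal Python sketch of the analogy above, assuming a row is turned
into one triple per non-NULL column. The function names and the sample
row are hypothetical and only illustrate the idea; they are not taken
from BRQL or from any database API.

```python
# Hypothetical helpers illustrating "NULL column = missing triple".

def row_to_triples(subject, row):
    """Emit one (subject, column, value) triple per non-NULL column."""
    return [(subject, col, val) for col, val in row.items() if val is not None]

def select_optional(triples, subject, columns):
    """Treat every column as an optional match: a missing triple gives None."""
    found = {p: o for s, p, o in triples if s == subject}
    return tuple(found.get(col) for col in columns)

# Made-up sample row: the zip code is NULL (None).
row = {"streetnum": 12, "city": "Springfield", "zip": None}
triples = row_to_triples("addr1", row)
result = select_optional(triples, "addr1", ["streetnum", "city", "zip"])
print(result)  # (12, 'Springfield', None)
```

No triple is stored for the NULL zip code, and the query still returns a
row for the address, with the zip slot left empty.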

>>BRQL should be commended in adding the OPTIONAL
>>clause.  However, I see two problems.  First is that it doesn't
>>support nested optional matches (their document acknowledges
>>this issue).
>>    
>>
>
>Phrased as a use case, nested optionals could look like:
>
>Phred wants a list of all the sales contacts, along with any addresses
>that are known for them. Each (optional) address is a multi-arc
>structure. In order to be useful, it must include a city, a state, a
>street name and a street number. It may also include an apartment
>number.
>
>Phred asks for everyone with type SalesContact. In addition, he
>optionally looks for a node with arcs for city, state, street name,
>and street number. For each such address, he optionally looks for an
>apartment number.
>
>  
>
I like your use case.  We have many that are very similar to it.
I'm curious to know what kind of query syntax you would use to capture
the above example.  Here is one possibility, extending BRQL to use
parentheses for grouping.  The OPTIONAL keyword becomes a unary operator,
rather than a top-level keyword.

SELECT ?name, ?city, ?state, ?street, ?streetNum, ?aptNum
FROM ...
WHERE (?contact rdf:type n:SalesContact)
      (?contact n:name ?name)
      OPTIONAL ( (?contact n:address ?addr)
                 (?addr n:city ?city)
                 (?addr n:state ?state)
                 (?addr n:streetNumber ?streetNum)
                 OPTIONAL (?addr n:apartment ?aptNum) )
USING ...
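
One way to read the grouped query above is as a recursive matcher in
which an OPTIONAL group either extends the current bindings or leaves
them untouched. The Python sketch below is an assumption about those
semantics, not a BRQL implementation; the tuple encoding of the query,
the helper names, and the sample graph are all hypothetical.

```python
# Hypothetical evaluator for nested OPTIONAL groups over a list of triples.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match_triple(pattern, graph, binding):
    """Yield extensions of `binding` under which `pattern` matches a triple."""
    for triple in graph:
        new = dict(binding)
        for p, t in zip(pattern, triple):
            if is_var(p):
                if new.setdefault(p, t) != t:   # variable already bound elsewhere
                    break
            elif p != t:                        # constant does not match
                break
        else:
            yield new

def match_group(group, graph, binding):
    """Match items in order; an item is a triple pattern or ("OPTIONAL", group)."""
    if not group:
        yield binding
        return
    head, rest = group[0], group[1:]
    if len(head) == 2 and head[0] == "OPTIONAL":
        extended = list(match_group(head[1], graph, binding))
        for b in extended or [binding]:         # keep the binding if nothing matched
            yield from match_group(rest, graph, b)
    else:
        for b in match_triple(head, graph, binding):
            yield from match_group(rest, graph, b)

# Made-up data: Ann has a full address with apartment, Bo a full address
# without one, Cy an incomplete address (city only).
graph = [
    ("c1", "rdf:type", "n:SalesContact"), ("c1", "n:name", "Ann"),
    ("c1", "n:address", "a1"), ("a1", "n:city", "Oslo"),
    ("a1", "n:state", "NO"), ("a1", "n:streetNumber", "12"),
    ("a1", "n:apartment", "3B"),
    ("c2", "rdf:type", "n:SalesContact"), ("c2", "n:name", "Bo"),
    ("c2", "n:address", "a2"), ("a2", "n:city", "Rome"),
    ("a2", "n:state", "IT"), ("a2", "n:streetNumber", "7"),
    ("c3", "rdf:type", "n:SalesContact"), ("c3", "n:name", "Cy"),
    ("c3", "n:address", "a3"), ("a3", "n:city", "Lima"),
]

query = [
    ("?contact", "rdf:type", "n:SalesContact"),
    ("?contact", "n:name", "?name"),
    ("OPTIONAL", [
        ("?contact", "n:address", "?addr"),
        ("?addr", "n:city", "?city"),
        ("?addr", "n:state", "?state"),
        ("?addr", "n:streetNumber", "?streetNum"),
        ("OPTIONAL", [("?addr", "n:apartment", "?aptNum")]),
    ]),
]

rows = [tuple(b.get(v) for v in ("?name", "?city", "?aptNum"))
        for b in match_group(query, graph, {})]
print(rows)  # [('Ann', 'Oslo', '3B'), ('Bo', 'Rome', None), ('Cy', None, None)]
```

Under this reading, the incomplete address for Cy is dropped as a whole
(the city is not returned), while the missing apartment for Bo only
leaves ?aptNum unbound.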

>>              Second, BRQL makes a distinction between
>>WHERE and AND restrictions.  Both of these should be
>>allowable in the OPTIONAL clause, but they aren't (as far
>>as I can tell). 
>>    
>>
>
>I can think of two reasons to keep your constraints and your graph
>pattern separate:
>  (PROCESSING-REQUIREMENTS) a query that looks for
>    ?a math:muchLessThan ?b
>  might be looking for ground facts with that predicate or for a math
>  function called muchLessThan. Thus a syntactic distinction allows a
>  query service to respond to queries that require functions that it
>  doesn't understand.
>  - There are other extensibility mechanisms but I don't think we can
>    remove the syntactic distinction of constraints without employing
>    one of these mechanisms in its stead.
>  
>
I personally don't think that syntax is the right way to address this.
I'd prefer to use metadata annotations on predicates such as
"math:muchLessThan" to inform the system that they are built-ins.  There
is of course another, more serious problem, which is that literals aren't
supposed to appear in the subject position of triples.

However, that wasn't my point.  My point is that built-ins should be
allowed inside of optional matches, not just outside.  For example,
databases allow inequality joins within an outer join clause (i.e.,
I'm not asking for something that isn't already commonplace).  The BRQL
syntax would need some non-obvious extensions to be able to express this.
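
The database behaviour referred to above can be illustrated with
Python's standard sqlite3 module. The tables, columns, and values below
are made up for the illustration; the point is only the difference
between placing the inequality inside the outer join condition and
placing it outside, in the WHERE clause.

```python
import sqlite3

# In-memory database with made-up tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (contact_id INTEGER, amount INTEGER);
    INSERT INTO contact VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO sale VALUES (1, 500), (1, 50), (2, 20);
""")

# Inequality inside the outer join: the restriction is part of the
# optional match, so contacts without a large sale are still returned.
inside = conn.execute("""
    SELECT c.name, s.amount FROM contact c
    LEFT JOIN sale s ON s.contact_id = c.id AND s.amount > 100
    ORDER BY c.id
""").fetchall()

# Inequality outside the outer join: rows with a NULL amount fail the
# test, so the join no longer behaves as optional.
outside = conn.execute("""
    SELECT c.name, s.amount FROM contact c
    LEFT JOIN sale s ON s.contact_id = c.id
    WHERE s.amount > 100
    ORDER BY c.id
""").fetchall()

print(inside)   # [('Ann', 500), ('Bo', None)]
print(outside)  # [('Ann', 500)]
```

The first query corresponds to a built-in restriction placed inside an
optional match; the second corresponds to the same restriction applied
at the top level.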

>  ......
>  
>
>>BTW, the distinction between WHERE and AND restrictions
>>may appear to be well-motivated, but it's not.  In logic systems,
>>it's perfectly legal to assert triples having predicates like GREATER-THAN
>>and LESS-THAN.
>>    
>>
>
>Prolog has a bunch of built-ins that I think are indistinguishable
>from ground facts. That leaves it unable to distinguish between "you
>got no solutions" and "you got no solutions because i don't have a
>built-in called muchLessThan". It's good form in extensible protocols
>to care about that distinction, but I don't think we've made a
>conscious choice.
>
>  
>
Are you assuming that the 'muchLessThan' built-in has been declared
(e.g., by a
    muchLessThan rdf:type rdf:Property.
statement), or are you assuming that declarations are not a
prerequisite?  If the former, then there should also have been an
assertion like
      muchLessThan n:isBuiltInProperty xsd:true
If the latter (no property declarations), then it's reasonable to think
that the user is on his own regarding the usage of the predicate.

When I write built-in predicates, they are often able to look for database
assertions as well as doing procedural computations.  Are you allowing
for that?
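
A rough Python sketch of the annotation idea, under several assumptions:
built-ins are marked by an n:isBuiltInProperty triple in the graph
itself, a built-in checks asserted facts before computing, and the
meaning given to muchLessThan (a factor of ten) is invented for the
example. All names below are hypothetical.

```python
class UnknownBuiltIn(Exception):
    """A predicate is declared built-in but no implementation is available."""

def much_less_than(a, b):
    # Invented definition for illustration: smaller by at least a factor of ten.
    return a * 10 <= b

IMPLEMENTATIONS = {"math:muchLessThan": much_less_than}

def holds(pred, a, b, graph):
    """True if (a pred b) is asserted, or pred is a declared built-in that
    computes True. An undeclared predicate is matched against facts only."""
    if (a, pred, b) in graph:
        return True                      # asserted ground fact
    if (pred, "n:isBuiltInProperty", "true") in graph:
        impl = IMPLEMENTATIONS.get(pred)
        if impl is None:
            raise UnknownBuiltIn(pred)   # distinct from "no solutions"
        return impl(a, b)
    return False

graph = {
    ("math:muchLessThan", "rdf:type", "rdf:Property"),
    ("math:muchLessThan", "n:isBuiltInProperty", "true"),
    ("math:cubeRootOf", "n:isBuiltInProperty", "true"),  # declared, not implemented
    (1, "math:muchLessThan", 5),                         # asserted fact
}

print(holds("math:muchLessThan", 2, 100, graph))  # True (computed)
print(holds("math:muchLessThan", 1, 5, graph))    # True (asserted)
print(holds("math:muchLessThan", 2, 3, graph))    # False
```

With the declaration present, a missing implementation raises an error
instead of silently producing no solutions; without any declaration, the
predicate is treated like any other and the user is on his own.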

Cheers, Bob
Received on Sunday, 8 August 2004 00:51:56 UTC