Re: 3.6 Optional Match

On Wed, Aug 04, 2004 at 09:54:10AM -0700, Bob MacGregor wrote:
> 
> I'm responding to the general optional match issue, rather than
> to the particulars in the last message I saw:
> 
> Why do we need optional match?  Suppose you want to retrieve
> all books (or whatever) with some properties, and want to print
> out what there is to know about each book.  In BRQL, you can
> use DESCRIBE, but then what you get back and what you
> don't appears to be at the mercy of the query engine.  If you
> want control over what attributes you want back, then you
> need an optional match.
> 
> Suppose you want attributes title, publisher, and creator to
> be printed (whenever they are present for a given book).
> 'publisher' and 'creator' are likely to return URIs,
> so what you would like is to ask for specific attributes about
> them as well.  These also need to be handled with optional
> matches, but now the matches are nested inside of the
> original optional match.
> 
> Databases get the effect of optional matches using two distinct
> mechanisms.  Null values in columns correspond to top-level
> optional matches.  Outer joins provide a second mechanism.
> If you have a left join, and the right table in the left join has
> a column containing null values, then you are automatically getting
> the equivalent of a nested optional match.  So, nested optional
> matches are in fact quite commonplace in the database world.

(Aside: the tuple with all NULL attributes (except the foreign key)
 feels like a less useful form of outer join. "Get me the sales
 contacts and any addresses you have for them" isn't well served
 by getting a bunch of all NULL addresses.)
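
A minimal sketch of the left-join behaviour Bob describes, using Python's
built-in sqlite3 module. The contact and address tables and their rows are
made up for illustration. Carol has no address row at all (the top-level
optional fails), while Bob has an address with a NULL apartment (the
nested optional fails).

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address (contact_id INTEGER, city TEXT, apt TEXT);
    INSERT INTO contact VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO address VALUES (1, 'Boston', '4B'), (2, 'Tokyo', NULL);
""")

# The left join keeps every contact; columns from a missing address come
# back as NULL (None in Python), as does a NULL column in a present address.
rows = conn.execute("""
    SELECT c.name, a.city, a.apt
    FROM contact AS c
    LEFT JOIN address AS a ON a.contact_id = c.id
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Carol's row is the all-NULL tuple that the aside above complains about.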

> BRQL should be commended in adding the OPTIONAL
> clause.  However, I see two problems.  First is that it doesn't
> support nested optional matches (their document acknowledges
> this issue).

Phrased as a use case, nested optionals could look like:

Phred wants a list of all the sales contacts, along with any addresses
that are known for them. Each (optional) address is a multi-arc
structure. In order to be useful, it must include a city, a state, a
street name and a street number. It may also include an apartment
number.

Phred asks for everyone with type SalesContact. In addition, he
optionally looks for a node with arcs for city, state, street name,
and street number. For each such address, he optionally looks for an
apartment number.
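
Phred's query could be sketched over a toy triple set as follows. This is
not BRQL and not any engine's actual algorithm; the "address" arc linking a
contact to its address node, the predicate names, and the data are all
assumptions made for the example, and each arc is assumed to have at most
one value per node. Unbound variables come back as None.

```python
# Hypothetical (subject, predicate, object) triples.
TRIPLES = {
    ("alice", "type", "SalesContact"),
    ("alice", "address", "addr1"),
    ("addr1", "city", "Boston"),
    ("addr1", "state", "MA"),
    ("addr1", "streetName", "Main St"),
    ("addr1", "streetNumber", "12"),
    ("addr1", "aptNumber", "4B"),
    ("bob", "type", "SalesContact"),
    ("bob", "address", "addr2"),
    ("addr2", "city", "Cambridge"),
    ("addr2", "state", "MA"),
    ("addr2", "streetName", "Vassar St"),
    ("addr2", "streetNumber", "32"),
    ("carol", "type", "SalesContact"),
    ("carol", "address", "addr3"),
    ("addr3", "city", "Tokyo"),  # incomplete: no state or street arcs
}

# Arcs that must ALL be present for the optional address to match.
REQUIRED = ("city", "state", "streetName", "streetNumber")


def objects(subject, predicate):
    """All objects of (subject, predicate, ?o), in a stable order."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def one(subject, predicate):
    """The single object of (subject, predicate, ?o), or None if absent."""
    values = objects(subject, predicate)
    return values[0] if values else None


def sales_contacts_with_addresses():
    rows = []
    contacts = sorted(s for s, p, o in TRIPLES
                      if p == "type" and o == "SalesContact")
    for contact in contacts:
        matched = []
        for addr in objects(contact, "address"):
            fields = [one(addr, p) for p in REQUIRED]
            if None in fields:
                continue  # the outer optional matches as a whole or not at all
            apt = one(addr, "aptNumber")  # nested optional: may stay None
            matched.append((*fields, apt))
        if not matched:
            matched.append((None,) * 5)  # contact is still reported
        rows.extend((contact, *m) for m in matched)
    return rows


for row in sales_contacts_with_addresses():
    print(row)
```

Note that carol's partial address contributes nothing: the city is not
reported on its own, because the address group is optional only as a unit.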

>               Second, BRQL makes a distinction between
> WHERE and AND restrictions.  Both of these should be
> allowable in the OPTIONAL clause, but they aren't (as far
> as I can tell). 

I can think of two reasons to keep your constraints and your graph
pattern separate:
  (PROCESSING-REQUIREMENTS) a query that looks for
    ?a math:muchLessThan ?b
  might be looking for ground facts with that predicate or for a math
  function called muchLessThan. Thus a syntactic distinction allows a
  query service to respond to queries that require functions that it
  doesn't understand.
  - There are other extensibility mechanisms but I don't think we can
    remove the syntactic distinction of constraints without employing
    one of these mechanisms in its stead.

  (SHADOWING) SeRQL has a directSubClassOf predicate that can be used
  to find subClass relationships that are not the product of a transitive
  closure inference. For instance, BlueWhales may be a subClass
  of Whales which are in turn a subClass of Mammals.
    ?a directSubClassOf ?b
  would NOT return (BlueWhales, Mammals). But what if the graph that
  I'm querying is the product of a query which states that
    BlueWhales directSubClassOf Whales.
  ? Can I query for that? How, directDirectSubClassOf ?
  - Specifying that directSubClassOf is matched by EITHER an asserted
  subClassOf OR directSubClassOf arc may solve this problem. Dunno.
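
To make the SHADOWING case concrete, here is a rough sketch of the
EITHER/OR idea. It assumes one plausible reading of directSubClassOf (a
subClassOf pair with no intermediate class between the two) and is not
taken from the SeRQL implementation; the function names and both graphs
are invented for the example.

```python
# Hypothetical source graph with plain subClassOf assertions.
ORIGINAL = {
    ("BlueWhales", "subClassOf", "Whales"),
    ("Whales", "subClassOf", "Mammals"),
}

# Hypothetical graph produced by an earlier query: it carries the
# "computed" predicate as an ordinary asserted arc.
RESULT = {
    ("BlueWhales", "directSubClassOf", "Whales"),
}


def closure(pairs):
    """Transitive closure of a set of (sub, super) pairs."""
    result = set(pairs)
    while True:
        new = {(a, d) for a, b in result for c, d in result if b == c} - result
        if not new:
            return result
        result |= new


def direct_subclass_pairs(triples):
    """Pairs matched by directSubClassOf under the EITHER/OR reading."""
    sub = closure({(s, o) for s, p, o in triples if p == "subClassOf"})
    classes = {x for pair in sub for x in pair}
    # EITHER: computed from subClassOf, dropping pairs that have an
    # intermediate class between them.
    computed = {
        (a, b) for a, b in sub
        if a != b and not any(
            (a, c) in sub and (c, b) in sub
            for c in classes if c not in (a, b)
        )
    }
    # OR: arcs that literally assert directSubClassOf.
    asserted = {(s, o) for s, p, o in triples if p == "directSubClassOf"}
    return computed | asserted


print(sorted(direct_subclass_pairs(ORIGINAL)))
print(sorted(direct_subclass_pairs(RESULT)))
```

Under this reading the second graph still answers the query even though it
contains no subClassOf arcs, which is the case the question above is
worried about.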

> BTW, the distinction between WHERE and AND restrictions
> may appear to be well-motivated, but it's not.  In logic systems,
> it's perfectly legal to assert triples having predicates like GREATER-THAN
> and LESS-THAN.

Prolog has a bunch of built-ins that I think are indistinguishable
from ground facts. That leaves it unable to distinguish between "you
got no solutions" and "you got no solutions because I don't have a
built-in called muchLessThan". It's good form in extensible protocols
to care about that distinction, but I don't think we've made a
conscious choice.
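
A small sketch of the distinction I mean, assuming a toy query service in
which graph patterns are matched against ground facts and constraints are
looked up in a table of known functions. The service, the facts, the
function table, and the UnsupportedFunction error are all hypothetical.

```python
import operator


class UnsupportedFunction(Exception):
    """Raised when a constraint names a function the service lacks."""


# Hypothetical ground facts and the functions this service implements.
FACTS = {("x", "size", 3), ("y", "size", 40)}
FUNCTIONS = {"math:lessThan": operator.lt}


def match(predicate):
    """Graph pattern ?s <predicate> ?o: matched against ground facts only."""
    return sorted((s, o) for s, p, o in FACTS if p == predicate)


def run_query(predicate, constraint=None):
    """Match a pattern, then filter ?o by an optional (function, bound)."""
    solutions = match(predicate)
    if constraint is None:
        return solutions
    name, bound = constraint
    if name not in FUNCTIONS:
        # Syntactically a constraint, so the service can tell the client
        # "I can't evaluate this" instead of "there are no solutions".
        raise UnsupportedFunction(name)
    fn = FUNCTIONS[name]
    return [(s, o) for s, o in solutions if fn(o, bound)]


# As a graph pattern: simply no ground facts with that predicate.
print(run_query("math:muchLessThan"))

# As a constraint on a known function: ordinary filtering.
print(run_query("size", ("math:lessThan", 10)))

# As a constraint on an unknown function: an explicit error.
try:
    run_query("size", ("math:muchLessThan", 10))
except UnsupportedFunction as err:
    print("unsupported function:", err)
```

Without the syntactic split, the first and third cases would both come
back as an empty result.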

>                 That we don't do this commonly in RDF is
> an artifact of its youth -- when it gets more sophisticated, with rule-based
> inference becoming commonplace, then there won't be a distinction
> between computed and asserted predicates.
> So the right fix to the second of the problems
> I just mentioned is to eliminate the AND clause in BRQL.
> 
> The database community spent a cycle producing large numbers of
> different kinds of null values.  Eventually, they had so many that it became
> obvious that they were making a mistake.  Inventing a new kind of NULL
> to mean "missing value" is probably starting all over again on that mistaken
> path.  We need one kind of NULL, and that's what a variable should
> bind to when there is no value for it in an optional match.
> 
> Cheers, Bob

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Friday, 6 August 2004 00:48:28 UTC