- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 6 Aug 2004 00:48:12 -0400
- To: Bob MacGregor <macgregor@ISI.EDU>
- Cc: public-rdf-dawg@w3.org
- Message-ID: <20040806044812.GA26860@w3.org>
On Wed, Aug 04, 2004 at 09:54:10AM -0700, Bob MacGregor wrote:
>
> I'm responding to the general optional match issue, rather than
> to the particulars in the last message I saw:
>
> Why do we need optional match? Suppose you want to retrieve
> all books (or whatever) with some properties, and want to print
> out what there is to know about each book. In BRQL, you can
> use DESCRIBE, but then what you get back and what you
> don't appears to be at the mercy of the query engine. If you
> want control over what attributes you want back, then you
> need an optional match.
>
> Suppose you want attributes title, publisher, and creator to
> be printed (whenever they are present for a given book).
> 'publisher' and 'creator' are likely to return URIs,
> so what you would like is to ask for specific attributes about
> them as well. These also need to be handled with optional
> matches, but now the matches are nested inside of the
> original optional match.
>
> Databases get the effect of optional matches using two distinct
> mechanisms. Null values in columns correspond to top-level
> optional matches. Outer joins provide a second mechanism.
> If you have a left join, and the right table in the left join has
> a column containing null values, then you are automatically getting
> the equivalent of a nested optional match. So, nested optional
> matches are in fact quite common place in the database world.

(Aside: the tuple with all NULL attributes (except the foreign key)
feels like a less useful form of outer join. "Get me the sales
contacts and any addresses you have for them" isn't well served by
getting a bunch of all NULL addresses.)

> BRQL should be commended in adding the OPTIONAL
> clause. However, I see two problems. First is that it doesn't
> support nested optional matches (their document acknowledges
> this issue).
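
To make the left-join analogy concrete, here is a rough sketch in Python (not BRQL, and not a proposal for its semantics). The triples, the property names and the helper functions values() and describe() are all made up for illustration. An unmatched optional binds the variable to None, the single kind of NULL, and the nested optional (the publisher's name) is only attempted when the outer optional (the publisher) actually bound.

```python
# Hypothetical data: triples as (subject, predicate, object) tuples.
TRIPLES = [
    ("book1", "title", "Moby Dick"),
    ("book1", "publisher", "pub1"),
    ("pub1", "name", "Harper"),
    ("book2", "title", "Walden"),      # no publisher known
    ("book3", "publisher", "pub2"),    # no title, and pub2 has no name
]


def values(subject, predicate):
    """All objects for (subject, predicate), or [None] when there are none.

    Returning [None] instead of [] is what makes the match optional:
    the row survives with a NULL binding, as in a left outer join.
    """
    found = [o for s, p, o in TRIPLES if s == subject and p == predicate]
    return found or [None]


def describe(book):
    """Rows of (book, title, publisher, publisher_name) for one book."""
    rows = []
    for title in values(book, "title"):              # top-level optional
        for publisher in values(book, "publisher"):  # top-level optional
            # Nested optional: only look inside the publisher if it bound.
            if publisher is None:
                names = [None]
            else:
                names = values(publisher, "name")
            for name in names:
                rows.append((book, title, publisher, name))
    return rows


if __name__ == "__main__":
    for book in ("book1", "book2", "book3"):
        for row in describe(book):
            print(row)
```

Every book yields at least one row, whichever attributes are missing, which is the control over returned attributes that DESCRIBE doesn't give you.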
Phrased as a use case, nested optionals could look like:

  Phred wants a list of all the sales contacts, along with any
  addresses that are known for them. Each (optional) address is a
  multi-arc structure. In order to be useful, it must include a city,
  a state, a street name and a street number. It may also include an
  apartment number.

  Phred asks for everyone with type SalesContact. In addition, he
  optionally looks for a node with arcs for city, state, street name,
  and street number. For each such address, he optionally looks for
  an apartment number.

> Second, BRQL makes a distinction between
> WHERE and AND restrictions. Both of these should be
> allowable in the OPTIONAL clause, but they aren't (as far
> as I can tell).

I can think of two reasons to keep your constraints and your graph
pattern separate:

(PROCESSING-REQUIREMENTS) a query that looks for
  ?a math:muchLessThan ?b
might be looking for ground facts with that predicate or for a math
function called muchLessThan. Thus the syntactic distinction allows a
query service to respond to queries that require functions that it
doesn't understand.
  - There are other extensibility mechanisms but I don't think we can
    remove the syntactic distinction of constraints without employing
    one of these mechanisms in its stead.

(SHADOWING) SeRQL has a directSubClassOf predicate that can be used to
find subClass relationships that are not the product of a transitive
closure inference. For instance, BlueWhales may be a subClass of
Whales which are in turn a subClass of Mammals.
  ?a directSubClassOf ?b
would NOT return (BlueWhales, Mammals). But what if the graph that I'm
querying is the product of a query which states that
  BlueWhales directSubClassOf Whales
? Can I query for that? How, directDirectSubClassOf?
  - Specifying that directSubClassOf is matched by EITHER an asserted
    subClassOf OR directSubClassOf arc may solve this problem. Dunno.
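
One possible reading of that EITHER/OR idea, sketched in Python. This is only an illustration under my own assumptions, not SeRQL's actual behaviour: the two graphs and the function names sub_class_closure() and direct_sub_class_of() are hypothetical. ASSERTED is an ordinary graph; PRODUCT stands in for a graph that was itself produced by an earlier query and so carries a literal directSubClassOf arc.

```python
# Hypothetical graphs: triples as (subject, predicate, object) tuples.
ASSERTED = {
    ("BlueWhales", "subClassOf", "Whales"),
    ("Whales", "subClassOf", "Mammals"),
}
PRODUCT = {
    ("BlueWhales", "directSubClassOf", "Whales"),
}


def sub_class_closure(triples):
    """Transitive closure over the asserted subClassOf arcs."""
    pairs = {(s, o) for s, p, o in triples if p == "subClassOf"}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs


def direct_sub_class_of(triples):
    """Match EITHER an asserted subClassOf OR a directSubClassOf arc.

    Inferred pairs never appear here because only asserted arcs are read.
    """
    wanted = ("subClassOf", "directSubClassOf")
    return {(s, o) for s, p, o in triples if p in wanted}


if __name__ == "__main__":
    print(sorted(sub_class_closure(ASSERTED)))
    print(sorted(direct_sub_class_of(ASSERTED)))
    print(sorted(direct_sub_class_of(PRODUCT)))
```

With this reading (BlueWhales, Mammals) shows up in the closure but not in the direct matches, and the literal arc in PRODUCT stays queryable without inventing a directDirectSubClassOf. Whether treating an asserted subClassOf arc as "direct" is always right is a separate question.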
> BTW, the distinction between WHERE and AND restrictions
> may appear to be well-motivated, but its not. In logic systems,
> its perfectly legal to assert triples having predicates like GREATER-THAN
> and LESS-THAN.

Prolog has a bunch of built-ins that I think are indistinguishable
from ground facts. That leaves it unable to distinguish between "you
got no solutions" and "you got no solutions because I don't have a
built-in called muchLessThan". It's good form in extensible protocols
to care about that distinction, but I don't think we've made a
conscious choice.

> That we don't do this commonly in RDF is
> an artifact of its youth -- when it gets more sophisticated, with rule-based
> inference becoming common place, then there won't be a distinction
> between computed and asserted predicates.
> So the right fix to the second of the problems
> I just mentioned is to eliminate the AND clause in BRQL.
>
> The database community spent a cycle producing large numbers of
> different kinds of null values. Eventually, they had so many that it became
> obvious that they were making a mistake. Inventing a new kind of NULL
> to mean "missing value" is probably starting all over again on that mistaken
> path. We need one kind of NULL, and that's what a variable should
> bind to when there is no value for it in an optional match.
>
> Cheers, Bob

--
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Friday, 6 August 2004 00:48:28 UTC