Re: Booleans as the degenerate case of variable binding results from Kendall Clark on 2004-06-18 (public-rdf-dawg@w3.org from April to June 2004)

From: Kendall Clark <kendall@monkeyfist.com>
Date: Fri, 18 Jun 2004 07:39:10 -0400
To: Simon Raboczi <raboczi@tucanatech.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <20040618113910.GA18539@monkeyfist.com>

On Tue, Jun 15, 2004 at 09:37:55AM -0700, Simon Raboczi wrote:

> In tabular form, the false result has zero columns (variables) and zero 
> rows.  The true result has zero columns and one row.  It doesn't make 
> any sense to have more than one row, because additional rows would 
> necessarily be duplicates of the first.
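
To make the encoding Simon describes concrete, here is a minimal Python
sketch. The names BindingTable, TRUE, FALSE and as_bool are made up for
illustration and come from no DAWG draft; modelling the rows as a set is
also my assumption, chosen because it makes duplicate rows collapse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingTable:
    """Hypothetical variable binding result: named columns and a set of rows."""
    columns: tuple    # variable names, e.g. ("x", "y")
    rows: frozenset   # each row is a tuple with one value per column


# The degenerate cases: zero columns, and either zero rows or one empty row.
FALSE = BindingTable(columns=(), rows=frozenset())
TRUE = BindingTable(columns=(), rows=frozenset({()}))


def as_bool(table):
    """Read a zero-column table as a boolean; reject anything with variables."""
    if table.columns:
        raise ValueError("table has variables, so it is not a boolean result")
    return len(table.rows) == 1


# With zero columns the only possible row is the empty tuple, so any
# additional rows are duplicates and vanish in the set.
assert BindingTable((), frozenset([(), ()])) == TRUE
```

Note that a client has to inspect both the columns and the rows to
recover the boolean, which is part of what a distinguished boolean
result would avoid.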

I think we want to have distinguished, first class boolean answers,
that is, that we don't want to treat forms of variable bindings as
booleans. Here's why:

1. efficiency

a. I assume, eventually, that there will be DAWG engines that are
implemented using the full set of DL tableau algorithms. The OWL
reasoner in my lab, Pellet, may eventually be such a beast.

In that scenario, Pellet's developers assure me that answering boolean
queries may well be dramatically more efficient if done as a single
satisfiability test rather than by calculating a single variable
result. This has a lot to do with things like conjunctive ABox query
and the like, stuff I don't fully understand.

Yes, this isn't how most of us will implement DAWG query processors
initially, or perhaps ever, but it's worth taking into account.

b. The more crucial point is that, absent any streamability design
features, it's more efficient to transfer a single distinguished
boolean value for TRUE than a set of variable binding results in most
cases.

2. policy

I can think of several use cases where what I care about isn't the
particular answers -- the actual string comprising someone's email
address -- but *that* some graph has "an email address". Or, for
example, I don't care about the values of foaf:knows predicates, I
just want to know whether some FOAF resource contains more than 8 of
them. Or consider privacy issues; I set my PDA/phone -- which has both
a DAWG client and server built in...not a fantasy, since I've been
playing with Python on Nokia Series 60 phones and we should have
rdflib running on them soon -- to answer queries from people about my
FOAF profile. I say in a policy rules resource that I don't want
certain details about me revealed, but that I'm okay with boolean
queries about them.

(Imagine another use case I'm working on in my lab: an RSS aggregator
that takes a policy file as input and turns policy rules into DAWG
queries; it then aggregates RSS feeds if the queries are TRUE, not if
FALSE.)

Thus, I don't necessarily want my foaf:knows or :mbox revealed to
anyone who queries me, but I don't mind my server answering 'true' if
someone queries about whether I *have* an :mbox and whether I foaf:eat
more than 3 different kinds of cuisine.
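
A rough Python sketch of the policy idea, assuming a toy in-memory
profile: PROFILE, BOOLEAN_ONLY, ask and select are all hypothetical
names, not part of rdflib or of any DAWG proposal. The server refuses to
return values for protected predicates but still answers boolean
questions about them.

```python
# Toy profile as (subject, predicate, object) triples; the data is invented.
PROFILE = [
    ("me", "foaf:mbox", "mailto:someone@example.org"),
    ("me", "foaf:knows", "alice"),
    ("me", "foaf:knows", "bob"),
]

# Policy rule: values of these predicates must never be revealed.
BOOLEAN_ONLY = {"foaf:mbox", "foaf:knows"}


def ask(predicate, more_than=0):
    """Boolean query: are there more than `more_than` values for predicate?"""
    count = sum(1 for _, p, _ in PROFILE if p == predicate)
    return count > more_than


def select(predicate):
    """Variable binding query: return the values, unless policy forbids it."""
    if predicate in BOOLEAN_ONLY:
        raise PermissionError(f"{predicate} may only be queried as a boolean")
    return [o for _, p, o in PROFILE if p == predicate]
```

Here ask("foaf:mbox") tells the caller that I have an :mbox and
ask("foaf:knows", more_than=8) tells them whether I know more than 8
people, while select("foaf:knows") is refused outright.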

So, Simon, your conceptual analysis seems right, but I think there are
other overriding reasons to have first-class boolean result formats
and a form of query (SELECT BOOL or ASK) that expects boolean results.

Best,
Kendall Clark

Received on Friday, 18 June 2004 07:41:21 UTC