Re: Requirement 3.5: subgraph results

On Wed, May 05, 2004 at 03:59:40PM +0100, Steve Harris wrote:
> On Tue, May 04, 2004 at 01:45:33 -0700, Rob Shearer wrote:
> > > "1.8 Derived Graphs
> > > 
> > > The working group must recognize that RDF graphs are often constructed
> > > by aggregation from multiple sources and through logical 
> > > inference, and
> > > that sometimes the graphs are never materialized. Such graphs may be
> > > arbitrarily large or infinite."
> > >  --
> > 
> > "Logical inference" extends a *lot* farther than just appending triples
> > to an RDF graph.
> > There are a lot of ways to say "at least one of these two edges needs to
> > exist". It can be a consequence of an OWL ontology; it can be a
> > consequence of a rule encoded in a rules language; it can be a
> > consequence of any semantic layer you want to put on top of the basic
> > RDF data model. It does explicitly say something about what RDF graphs
> > are possible and what are not. But such knowledge does not necessarily
> > have any sensible encoding in RDF itself. (The best we've seen is
> > changing the query to little more than "get me the answers" and then
> > adding triples to the source RDF that say "this is an answer"; an
> > approach which is both bizarre and quite impractical in the case of more
> > than one variable which needs to be bound.)
> This is similar to my concern:
> If a query includes some extension function (after 3.3), say a function
> that takes a radius and the URIs for two geo-spatial co-ordinate nodes and
> returns TRUE if one is in the radius of the other. The complete graph used
> to answer that query is not necessarily known to the query engine -
> especially if the function is implemented at a lower level. Asking
> extension functions (for example) to give the triples that it used to
> answer the question seems unnecessarily onerous.
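
To make the quoted concern concrete, here is a minimal sketch of such an extension function. Everything in it is an assumption for illustration: the name within_radius, the example.org URIs, the COORDS lookup table, and the choice of the haversine formula are hypothetical and come from no spec. The point is only that the caller gets back a boolean and never sees which triples (if any) were consulted.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical lookup from node URI to (latitude, longitude) in degrees.
# In a real store this data would live below the query engine.
COORDS = {
    "http://example.org/geo#a": (42.3601, -71.0942),
    "http://example.org/geo#b": (42.3736, -71.1097),
    "http://example.org/geo#c": (41.8781, -87.6298),
}

def within_radius(radius_km, uri_a, uri_b):
    """Return True if the two nodes are at most radius_km apart (haversine)."""
    lat1, lon1 = map(math.radians, COORDS[uri_a])
    lat2, lon2 = map(math.radians, COORDS[uri_b])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
    # Only the truth value leaves the function; no supporting triples do.
    return distance_km <= radius_km
```

A query engine calling within_radius(5, uri_a, uri_b) learns only TRUE or FALSE, which is why asking it to report the subgraph behind that answer would need extra machinery.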

I don't think we will be able to write a specification for this
scenario anyway. How will we specify the semantics of query
operations on data sets that aren't RDF?

Imagine a SQL interface to get the current weather in Chicago. The
part of the SQL spec that says results are some expression of a table
may help us serialize/deserialize the response, but the bulk of the
spec that defines the results in terms of restriction, projection,
union, difference and product on the original data will be useless.
The SQL spec will be useful if the weather data is defined in terms
of a table by some other spec. That's roughly what I'm arguing for:
don't try to define a QL that will handle queries on non-RDF data.
Instead define operator semantics, result sets, test suites in terms
of RDF data and leave the projection of rules into RDF for another

How do we know the client will be as clever as the server? What
happens if the server gives back a result graph (or set of result
graphs) and the client can't see how they answer the query? Not a
problem, I say dismissively. RDF semantics protect the client from
arriving at contradictory results (with some extra think-work for
negation as failure). If the client asked for bindings, the user can
still get his answer. If the client asked only for graph answers, it
probably had a reason to do this, like it's federating parts of a
query to multiple sources. If it isn't clever enough to follow the
logic of some of the federated sources, the user doesn't get that
solution. The user in that case would have done no better if DAWG-QL
didn't have a mechanism for communicating results as subgraphs.
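
A rough sketch of that client-side fallback, under loud assumptions: triples are plain Python tuples, variables are strings starting with "?", and the client does purely syntactic pattern matching with no inference. The names is_var, unify, match and usable_answers are hypothetical and are not part of any DAWG draft. A federating client keeps a returned result graph only if it can find the query pattern in it, and silently drops the ones it cannot follow.

```python
def is_var(term):
    """Treat strings beginning with '?' as query variables (an assumption)."""
    return isinstance(term, str) and term.startswith("?")

def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None on a clash."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended

def match(patterns, graph, binding=None):
    """Yield every binding under which all triple patterns occur in graph."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], graph, extended)

def usable_answers(patterns, result_graphs):
    """Keep only the result graphs in which this client can see an answer."""
    return [g for g in result_graphs
            if next(match(patterns, g), None) is not None]
```

With a query such as [("?city", "ex:weather", "?w")], a returned graph containing ("ex:Chicago", "ex:weather", "ex:Windy") is kept, while one whose answer depends on rules the client does not know simply fails to match and is dropped.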

I think this is at least a consistent world view. Whether it will
be a popular world view remains to be seen.

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741

Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Wednesday, 5 May 2004 16:03:47 UTC