Re: RDF graph query, graph equivalence and inference from Graham Klyne on 2003-06-27 (www-archive@w3.org from June 2003)

From: Graham Klyne <gk@ninebynine.org>
Date: Fri, 27 Jun 2003 17:31:16 +0100
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: Jeremy Carroll <jjc@hpl.hp.com>, www-archive <www-archive@w3.org>, Brian McBride <bwm@hplb.hpl.hp.com>
Message-Id: <5.1.0.14.2.20030627171303.02b12ef8@127.0.0.1>

Hi Dave,

Thanks for your feedback.

At 18:12 26/06/03 +0100, Dave Reynolds wrote:
>...  Datatype reasoning is one of my
> > targets (e.g. having a specific datatype for IP network addresses).
>
>That sounds good. We haven't done much with datatype reasoning yet.

Neither have I, yet ;-)

> > My current strategy is to introduce a variant of my graph forward-chaining
> > query function that matches a query graph with bnodes against a target
> > graph, and use that to match the outputs from backward chaining against
> > input data.
>
>That sounds reasonable. We do it slightly differently but only slightly.
>
>In Jena people often think of bNodes as concrete java objects they can do 
>things
>with - they are "in" the graph and they can go back to the same graph 
>twice and
>see the same bNodes. This makes it hard to implement, say, the RDFS closure
>rules in a way which users won't throw up at.

Yes, I'm juggling a similar problem.  I've a view that bnodes are mostly 
graph-local, and have to be renamed when merging graphs.  But this isn't 
always clear and I do have separate methods for merging graphs (with 
renaming) and adding graphs (making a union).  When doing inference, it 
seems to me that not all bnodes have similar scope:  some are generated 
locally by backchaining and need to be renamed when the results are 
combined with other inference results, while others come from the external 
data and should be preserved through to the result (so that multiple 
statements about the same bnode resource are preserved as such).

I'm toying with the idea of adding a specific variable "binding" to my 
graph representation (think: exists (x,y) : ... ), the normal case being 
all variables bound locally, but some may be allowed to "leak out" of the 
subgraphs processed and generated by inference rules.

>At the moment we solve this by keeping the bNodes-as-variables outside the
>graph, graph access and graph inference machinery. The variables in our rules
>are a different node type disjoint from both Resources and bNodes. For running
>the working group tests we translate the query graph into a Jena query 
>with each
>bNode replaced by a query variable and the run the query machinery on the
>inference graph.

I have a distinct node type for rule variables, which is processed 
specially by inference queries and substitutions.  But that doesn't solve 
all the problems of scope.

>This will only work for small query graphs because the query machinery 
>does not
>have any sophisticated subgraph isomorphism algorithms at present.

I'm not convinced that full subgraph isomorphism tests are needed -- I'm 
hoping Jeremy can comment on my previous message.  In effect, only the 
query graph has "variable nodes" for the purposes of matching, so (I think) 
the matching options are more constrained.

>Both the forward and backward rule systems work on the raw Jena graphs, to 
>them
>a bNode is much like any other node and distinct from a rule variable.

Sounds familiar.  I guess the interesting bit is the "planning procedure" 
-- I haven't really figured mine out yet -- I'm hoping that's where my 
Haskell approach will be useful, allowing me to experiment with different 
approaches.

#g

-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E

Received on Friday, 27 June 2003 14:07:48 UTC