RE: Decentralized RDF Distribution from Bill de hOra on 2001-02-19 (www-rdf-interest@w3.org from February 2001)

From: Bill de hOra <bill@dehora.fsnet.co.uk>
Date: Mon, 19 Feb 2001 13:19:33 -0000
To: "RDF Interest" <www-rdf-interest@w3.org>
Cc: "David Allsopp" <dallsopp@signal.dera.gov.uk>, "Seth Russell" <seth@robustai.net>
Message-ID: <DCEBKOHMHCKKIAAPKLLMMEMICBAA.bill@dehora.fsnet.co.uk>
: > > Aaron Swartz:
: > > The simplistic way to manage this would be to assign each triple a URI
: > > and then publish updated information on each of the statements.
: > > However, by doing this, we run into the tricky problem of both
: > > asserting and reifying a statement at the same time. _@@ reification
: > > experts should let me know how to do this_
: >
: > Seth Russell:
: > I think this is a good idea, have the original author assign a uri to each
: > triple.  The reified statement then is a quadruple [ statementUri,
: > subject, predicate, object].  When we assert triples, that's what goes
: > into the semantic cloud.  When people talk about those statements to the
: > cloud, they refer to them by putting the statementUri in the subject.  I
: > love it, it's simple.
:
: David Allsopp:
: I'm very interested in this sort of usage.  I'm working on software
: agents which exchange information in situations where data change over
: time, and are subject to varying levels of trust over time, so I will
: need to be able to 'tag' the triples with their source, timestamp,
: expiry...  This would allow filtering, or roll-back to earlier data,
: etc.
:
: I think this sort of usage needs to be considered in the design of APIs
: and query languages - we'll need easy manipulation and querying of
: reified statements, e.g:
:
: I would want to be able to easily query for a statement which is reified
: but not asserted, without having to construct a query explicitly
: matching the whole quad.
:
: Some APIs allow 'direct' reification without using quads (allowing
: Statements to be Resources), whilst also allowing the explicit quad
: form.  We shouldn't need to know which implementation is used when
: writing our query, which reinforces the point that we shouldn't have to
: explicitly search for quads - that's the job of the query engine.

I was working on an API using assert/retract semantics, where the
statements were based on Linda tuples. The container (which I called "Model")
allowed direct queries and was to be optimised for answering and setting
questions (since I expect those to be the main use case for RDF). The key
operations over the "Model", Resource and Statement:

Model isa Resource:
  Model assert( Statement s )
  Model retract( Statement template )
  Model ask( Statement template )
  Statement retractAny( Statement template )
  Statement askAny( Statement template )
  Statement takeImage( Statement template )
  boolean isEmpty()
  boolean contains( Statement template )
  boolean mutable()

Resource:
  String getURI()
  String getNS()
  String getName()
  Model model()
  boolean sra( Resource r )
  boolean accept( ResourceVisitor v )
  public Object clone()

Statement isa Resource:
  Resource getSubject()
  Resource getPredicate()
  Resource getObject()
  public Object clone()
  Context context() // placeholder for #gk's work
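
A minimal sketch of how the template operations listed above might behave,
assuming Linda-style matching where a null field in a template is a wildcard.
The class names, the String-valued fields, the name assertStatement ('assert'
is a Java keyword) and the choice that ask/retract return a new Model of the
matched statements are all assumptions made for illustration, not the API
described in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class TupleModelSketch {

    /** A statement, or a template when one or more fields are null. */
    static final class Statement {
        final String subject, predicate, object;

        Statement(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        /** True if every non-null field of this template equals the field in s. */
        boolean matches(Statement s) {
            return (subject == null || subject.equals(s.subject))
                && (predicate == null || predicate.equals(s.predicate))
                && (object == null || object.equals(s.object));
        }

        @Override public String toString() {
            return "(" + subject + " " + predicate + " " + object + ")";
        }
    }

    static final class Model {
        private final List<Statement> statements = new ArrayList<>();

        /** Adds a fully specified statement; duplicates are ignored. */
        Model assertStatement(Statement s) {
            Objects.requireNonNull(s.subject);
            Objects.requireNonNull(s.predicate);
            Objects.requireNonNull(s.object);
            if (!contains(s)) statements.add(s);
            return this;
        }

        /** Assumed semantics: returns every match as a new Model. */
        Model ask(Statement template) {
            Model result = new Model();
            for (Statement s : statements) {
                if (template.matches(s)) result.statements.add(s);
            }
            return result;
        }

        /** Assumed semantics: removes every match and returns what was removed. */
        Model retract(Statement template) {
            Model removed = ask(template);
            statements.removeAll(removed.statements);
            return removed;
        }

        /** Returns the first match, or null if nothing matches. */
        Statement askAny(Statement template) {
            for (Statement s : statements) {
                if (template.matches(s)) return s;
            }
            return null;
        }

        /** Removes and returns the first match, or null if nothing matches. */
        Statement retractAny(Statement template) {
            Statement s = askAny(template);
            if (s != null) statements.remove(s);
            return s;
        }

        boolean contains(Statement template) { return askAny(template) != null; }
        boolean isEmpty() { return statements.isEmpty(); }
        int size() { return statements.size(); }
    }

    public static void main(String[] args) {
        Model m = new Model()
            .assertStatement(new Statement("ex:doc", "dc:creator", "ex:bill"))
            .assertStatement(new Statement("ex:doc", "dc:title", "Notes"));
        // Ask for everything known about ex:doc.
        System.out.println(m.ask(new Statement("ex:doc", null, null)).size());
        // Take one title statement out of the model.
        System.out.println(m.retractAny(new Statement(null, "dc:title", null)));
    }
}
```

A linear scan keeps the sketch short; an implementation optimised for
queries would presumably index on subject, predicate and object.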

The next step was to allow insertion of leased listeners (stay-resident queries)
using distributed events enabled by ModelEventListener/ModelEvent:

ModelEventListener:
  notify( ModelEvent evt )

ModelEvent:
  new ModelEvent(Model source, Statement cause, long eventID, long tstamp)
  // accessors
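
A rough sketch of what a leased, stay-resident query could look like,
assuming a lease is simply an expiry time checked when a statement is
asserted. Statements are plain strings here to keep it short, the query is a
java.util.function.Predicate, and the clock is passed in explicitly so the
behaviour is deterministic; addListener, activeLeases and the Lease class are
hypothetical names, and nothing here is distributed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

class LeasedListenerSketch {

    static final class ModelEvent {
        final Model source;
        final String cause;   // the asserted statement (a String in this sketch)
        final long eventID;
        final long tstamp;

        ModelEvent(Model source, String cause, long eventID, long tstamp) {
            this.source = source;
            this.cause = cause;
            this.eventID = eventID;
            this.tstamp = tstamp;
        }
    }

    interface ModelEventListener {
        void notify(ModelEvent evt);
    }

    static final class Model {
        /** A registered query plus the time at which its lease runs out. */
        private static final class Lease {
            final Predicate<String> query;
            final ModelEventListener listener;
            final long expiresAt;

            Lease(Predicate<String> query, ModelEventListener listener, long expiresAt) {
                this.query = query;
                this.listener = listener;
                this.expiresAt = expiresAt;
            }
        }

        private final List<Lease> leases = new ArrayList<>();
        private long nextEventID = 0;

        /** Registers a stay-resident query that lapses after durationMillis. */
        void addListener(Predicate<String> query, ModelEventListener listener,
                         long now, long durationMillis) {
            leases.add(new Lease(query, listener, now + durationMillis));
        }

        /** Drops expired leases, then notifies listeners whose query matches. */
        void assertStatement(String statement, long now) {
            Iterator<Lease> it = leases.iterator();
            while (it.hasNext()) {
                Lease lease = it.next();
                if (lease.expiresAt <= now) {
                    it.remove();
                    continue;
                }
                if (lease.query.test(statement)) {
                    lease.listener.notify(new ModelEvent(this, statement, nextEventID++, now));
                }
            }
        }

        int activeLeases() { return leases.size(); }
    }

    public static void main(String[] args) {
        Model m = new Model();
        m.addListener(s -> s.contains("dc:creator"),
                      evt -> System.out.println("event " + evt.eventID + ": " + evt.cause),
                      0L, 100L);
        m.assertStatement("ex:doc dc:creator ex:bill", 50L);  // inside the lease
        m.assertStatement("ex:doc dc:creator ex:seth", 150L); // lease has lapsed
    }
}
```

Letting leases lapse means a listener that goes away without deregistering
is cleaned up on its own, which is the usual argument for leasing in
Jini-style systems.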

From that, enough small steps (including objects like a Lease issued by an
extended Model) would provide a forward-chaining distributed rules system
(production rules). So what I *really* wanted to crosscut was Linda and Prolog,
using RDF statements as information and behaviour. Tuple querying can be
inefficient, but I think it uses the same logic (existential/conjunctive) that
Prolog and SQL do: if so, then it can be highly optimised internally without
losing the simplicity. Specifying negation as failure with EC logic gives
universal quantifiers, disjunctions and negation to queries, though that's not
so straightforward in a distributed rules/database (you probably want to
mandate a best-effort as failure, to stop queries blowing up).
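
To make the negation-as-failure point concrete, here is a small sketch of a
universally quantified query ("every instance of a type has some value for a
predicate") built from nothing but existential template matches and negation
as failure, i.e. "for all" rewritten as "there is no x for which the match
fails". It assumes a single local store and a closed world; triples are
String arrays, and the names exists and everyInstanceHas are made up for the
example. The distributed, best-effort case is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

class NegationAsFailureSketch {

    /** Existential match: null in s, p or o is a wildcard. */
    static boolean exists(List<String[]> store, String s, String p, String o) {
        for (String[] t : store) {
            if ((s == null || s.equals(t[0]))
                    && (p == null || p.equals(t[1]))
                    && (o == null || o.equals(t[2]))) {
                return true;
            }
        }
        return false;
    }

    /**
     * Universal query via negation as failure: true unless some instance of
     * 'type' has no triple with the given predicate. Failing to find a match
     * is treated as "false", which only makes sense for a closed store.
     */
    static boolean everyInstanceHas(List<String[]> store, String type, String predicate) {
        for (String[] t : store) {
            boolean isInstance = "rdf:type".equals(t[1]) && type.equals(t[2]);
            if (isInstance && !exists(store, t[0], predicate, null)) {
                return false; // counterexample found
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> store = new ArrayList<>();
        store.add(new String[] {"ex:a", "rdf:type", "ex:Doc"});
        store.add(new String[] {"ex:a", "dc:creator", "ex:bill"});
        store.add(new String[] {"ex:b", "rdf:type", "ex:Doc"});
        // ex:b has no dc:creator triple, so the universal query fails.
        System.out.println(everyInstanceHas(store, "ex:Doc", "dc:creator"));
    }
}
```

In a distributed store the inner exists call could block or never finish,
which is where a time-bounded "best effort counts as failure" rule would
have to come in.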

I considered using JavaSpaces at one point. The API/usage are really simple and
clean. You get all that Linda infrastructure for free, as well as Jini leasing,
join and discovery. But having to hook it up to the web and then figure out how
to knit JavaSpaces together didn't seem worth it. Plus the backend is all
RMI/bytecode (not HTTP/XML!). Anyway, the web really needs a solid open source
Linda spaces implementation that sits on HTTP/XML and allows systems to host and
sandbox tuples.

This has become very much background work now: other things to do and...speaking
of easy manipulation of reifications...I get *really* frustrated trying to handle
reification with assert/retract semantics. Maybe I'm dense, but processing
reification quads feels like playing the three-card trick (now you see it, now
you don't) with objects, where one can remove an accessor as well as change the
value: they suck. Trying to put the words "reification" and "transaction"
together makes me feel grumpier and stupider than usual.
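
One way to see the problem, sketched under the assumption that a reified
statement is stored as the four triples from the RDF reification vocabulary
(rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object): a single
retract leaves a three-legged "statement" behind, so any update has to be
applied as a unit. The helper names and the copy-then-swap in replaceObject
are illustrative only and are not a real transaction mechanism.

```java
import java.util.ArrayList;
import java.util.List;

class ReificationQuadSketch {

    /** Stores the reification of (s, p, o) under the statement URI id. */
    static void reify(List<String[]> store, String id, String s, String p, String o) {
        store.add(new String[] {id, "rdf:type", "rdf:Statement"});
        store.add(new String[] {id, "rdf:subject", s});
        store.add(new String[] {id, "rdf:predicate", p});
        store.add(new String[] {id, "rdf:object", o});
    }

    /** Returns the object of the first triple with subject s and predicate p, or null. */
    static String objectOf(List<String[]> store, String s, String p) {
        for (String[] t : store) {
            if (t[0].equals(s) && t[1].equals(p)) return t[2];
        }
        return null;
    }

    /** Removes every triple with subject s and predicate p. */
    static void retract(List<String[]> store, String s, String p) {
        store.removeIf(t -> t[0].equals(s) && t[1].equals(p));
    }

    /** True only if all four parts of the reification are present. */
    static boolean isCompleteReification(List<String[]> store, String id) {
        return "rdf:Statement".equals(objectOf(store, id, "rdf:type"))
            && objectOf(store, id, "rdf:subject") != null
            && objectOf(store, id, "rdf:predicate") != null
            && objectOf(store, id, "rdf:object") != null;
    }

    /** Changes the object by editing a copy and swapping it in as a whole. */
    static void replaceObject(List<String[]> store, String id, String newObject) {
        List<String[]> copy = new ArrayList<>(store);
        retract(copy, id, "rdf:object");
        copy.add(new String[] {id, "rdf:object", newObject});
        store.clear();
        store.addAll(copy);
    }

    public static void main(String[] args) {
        List<String[]> store = new ArrayList<>();
        reify(store, "ex:s1", "ex:doc", "dc:creator", "ex:bill");
        retract(store, "ex:s1", "rdf:object"); // now you see it, now you don't
        System.out.println(isCompleteReification(store, "ex:s1"));
    }
}
```

Between the retract and the re-assert, any reader of the store sees an
accessor that has vanished rather than a value that has changed.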

Querying is only one part of things: you have to be able to move the metadata in
and out and ideally you want to do this in a standard language. That will
require consensus on how RDF stores are expected to behave in light of updates:
punting that entirely to applications seems questionable.

Bill de hOra
Received on Monday, 19 February 2001 08:19:46 UTC