Datasets, blackboxes and frames

The architecture document section on datasets [*] focusses on how you 
identify and describe datasets.

There is a second, related, dataset issue I'd like to get clearer on.

Will we support dataset-specific queries in the core?

In particular, I would find it useful to be able to map SPARQL-style 
named graph expressions into RIF - e.g. in order to represent CWM rules 
and because that is something we need for our own use cases (which may 
affect how JenaRules evolves).

This could be achieved by having some builtin in the library that can 
query a dataset, such as the SPARQL blackbox we have talked about before:

    SPARQL(dataset-id-list, query-string, var1, ... varn)

However, I wonder whether it would be possible/reasonable to have the 
frame terms include an optional datasource identifier:

    oid{datasource}[p->v, ... p'->v']

N.B. I don't care about the human-readable syntax; this is just to give 
us a way to discuss it.

Thus the facts would be partitioned into a set of fact datasets, one 
default anonymous one and a set of named ones identified by URIs.
A pattern with no explicit datasource ID is matched against the default 
dataset; one with an explicit datasource ID is matched only against the 
corresponding named dataset of facts.
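
To make the intended matching concrete, here is a minimal Python sketch of 
the partitioning, not a proposal for an implementation. The class and method 
names (FactStore, add, match), the use of None as the key for the default 
dataset, and the example URIs are all assumptions made for illustration. It 
handles ground frames only (no variables), and the datasource URI is used 
purely as a lookup key.

```python
from collections import defaultdict

# Hypothetical convention: None names the anonymous default dataset.
DEFAULT = None


class FactStore:
    """Facts partitioned into one default dataset plus named ones keyed by URI."""

    def __init__(self):
        # dataset id -> set of (oid, property, value) facts
        self._datasets = defaultdict(set)

    def add(self, oid, prop, value, datasource=DEFAULT):
        """Assert the fact oid[prop->value] in exactly one dataset."""
        self._datasets[datasource].add((oid, prop, value))

    def match(self, oid, slots, datasource=DEFAULT):
        """Check a ground frame oid{datasource}[p->v, ...] against one dataset.

        Only the selected dataset is consulted; the URI is never dereferenced.
        """
        facts = self._datasets.get(datasource, set())
        return all((oid, p, v) in facts for p, v in slots.items())


store = FactStore()
store.add("ex:dave", "ex:worksFor", "ex:HP")  # default dataset
store.add("ex:dave", "ex:memberOf", "ex:RIF-WG",
          datasource="http://example.org/ds/wg")  # named dataset

# No datasource given: matched against the default dataset only.
in_default = store.match("ex:dave", {"ex:worksFor": "ex:HP"})
# Same pattern shape with an explicit datasource: matched against that dataset only.
in_named = store.match("ex:dave", {"ex:memberOf": "ex:RIF-WG"},
                       datasource="http://example.org/ds/wg")
```

In this sketch a fact asserted in a named dataset is not visible to a pattern 
without a datasource ID, and vice versa; whether the default dataset should 
instead be some merge of the named ones is a separate design question.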

There need be no formal link between the dataset URI and the web. There 
would be no enforced processing model requiring you to dereference the 
URI to fetch the data. The URI is simply a name for a data partition.

(1) Is this a reasonable approach at all?

(2) What other rule languages might need such dataset-specific 
conditions and would this mechanism be useful for them?

(3) Assuming some derivative of this can be made useful, should it go in 
the Core?

Dave

[*] http://www.w3.org/2005/rules/wg/wiki/Arch/Data_Sets
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Friday, 22 June 2007 16:09:50 UTC