Re: [TED] SPARQL, data sources and blackboxes [was (Re: [UCR] ISSUE-12 and ACTION6198)]

Christian de Sainte Marie wrote:
> 
> Dave Reynolds wrote:
>>
>> [...]
>> Let us pick one of the boundary cases to ground the discussion. What 
>> about the set of builtins/functions such as one for access to an 
>> external SPARQL data source.
>>
>> Technically there is nothing stopping us defining such a thing but 
>> where would that go? It can't go in RIF Core [*] because lots of rule 
>> vendors won't want to support such a thing.
> 
> I do not think that the vendors are the main issue, here. Suppose we 
> have good reasons to put such builtins in RIF Core: we would require 
> compliant implementations to *understand* a RIF rule that contains a 
> SPARQL query. But could we require that they (more precisely, the 
> applications that use the retrieved rules) be able to *execute* the 
> query, that is, not only to implement SPARQL, but also to have a 
> SPARQL-able data source?
> 
> The intuitive answer seems to be 'no' (and that applies to SQL queries 
> etc, as well); at least not in RIF Core.

As you might guess my answer is 'yes' :-)

One point of picking this example is that the SPARQL query is not just 
another way of expressing a query; it is a query to a specific web 
information source (given by the URL in the first argument position).

As a random example the rule set might express how to translate a 
purchase order between two formats (e.g. from ebXML to RosettaNet) but 
involves looking up translation of product codes using a web-based 
product code database. The ruleset will work on your local data (a 
purchase order) but needs access to a web resource to work.

This ability to explicitly access web data sources is, to me, part of 
the qualification to be a web rule language, let alone a semantic web 
one. [Which gets us back to why I preferred the "basis for" weasel 
wording for Issue 12.]

> On the other hand, if requiring an implementation to understand a query 
> means that they must be able to translate it into a query that makes 
> sense against their own data sources (or their own access to the 
> specified data sources), why include it as a SPARQL query? Shouldn't it 
> rather be included in a data-source neutral form? That would place the 
> burden of transformation on the publisher rather than on the receiver, 
> which might have some inconvenience (like, making roundtrips more 
> complex), but it would keep RIF Core simpler (specialised dialects would 
> then add specialised builtins ad libitum).

See above, the access is tied to the data source. The existence of 
SPARQL as a standard general purpose remote data access protocol makes 
it relevant to RIF in a way that more application-specific web-services 
would not be.

> But it may not be that simple, e.g., what if the rules I publish are 
> supposed to access my data source, even when you use them - something we 
> could have in UC6, e.g. if pharmaceutical companies published their 
> drugs notices in the form of rulesets? 

Exactly, I guess I should have read ahead further before replying ...

> Allowing RIF to include 
> blackboxes could be a solution for that UC (something like: if the rules 
> I retrieve from you include a blackbox, I can return the blackbox (*) to 
> you along with some specified instance data, and you have to return me 
> the appropriate answer), but what of other use cases? If any?

I don't understand what you mean by "include blackboxes"; I thought we 
were using blackbox to refer to accessing some external oracle. It would 
certainly be infeasible to transmit an entire product code translation 
database as part of the rule set. Assuming you mean the same thing then 
what we would need to allow you to encode blackbox access would be a 
remote access protocol, a general query format and a neutral results 
format for returning variable bindings - i.e. SPARQL.
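
To make those three pieces concrete, here is a rough sketch of what a 
blackbox call could look like from the rule engine's side. It is only 
an illustration under assumptions: the endpoint URL, the ex: vocabulary, 
the product codes and the helper names are all hypothetical, and the 
response is a canned document in the SPARQL JSON results format rather 
than the result of a real HTTP call.

```python
import json
from urllib.parse import urlencode


def build_request_url(endpoint, query):
    """Encode a SPARQL query as an HTTP GET URL using the protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def parse_bindings(results_json):
    """Turn a SPARQL JSON results document into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


# Hypothetical endpoint and vocabulary, standing in for a web-based
# product code database.
ENDPOINT = "http://example.org/product-codes/sparql"
QUERY = """
PREFIX ex: <http://example.org/codes#>
SELECT ?target WHERE { ?entry ex:sourceCode "ABC-123" ; ex:targetCode ?target . }
"""

# Canned response (assumed data) in place of a live HTTP round trip.
SAMPLE_RESPONSE = """{
  "head": {"vars": ["target"]},
  "results": {"bindings": [{"target": {"type": "literal", "value": "XYZ-789"}}]}
}"""

url = build_request_url(ENDPOINT, QUERY)       # remote access protocol
bindings = parse_bindings(SAMPLE_RESPONSE)     # neutral results format
print(url)
print(bindings)
```

The rule set would carry only the endpoint URL and the query text; the 
variable bindings that come back are what the rule body would consume.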

>> Unless we start a semantic-web-friendly dialect there is no dialect to 
>> put it in. We could put it in the proposed library of reusable 
>> components so that future definers of semantic-web-friendly dialects 
>> might reuse the same one. 
> 
> My hypothesis was rather that it (the reusable component library) would 
> work the other way round: if we or some future definers of dialects need 
> to specify such builtins, we or they will put them in the library of 
> reusable components. I think that it would be dangerous to allow the 
> addition in the library of components that are not used in a dialect 
> (under the motto: "make sure that a component is usable before thinking 
> of re-use" :-)

That's what I thought you meant, which completes the argument that there 
is no place for such a thing except in some (semantic-)web-friendly 
dialect.

Dave

Received on Tuesday, 9 January 2007 21:39:19 UTC