Re: Datasets, blackboxes and frames from Chris Welty on 2007-06-25 (public-rif-wg@w3.org from June 2007)

From: Chris Welty <cawelty@gmail.com>
Date: Mon, 25 Jun 2007 10:36:29 -0400
To: Michael Kifer <kifer@cs.sunysb.edu>
CC: Dave Reynolds <der@hplb.hpl.hp.com>, RIF <public-rif-wg@w3.org>
Message-ID: <467FD2ED.5010109@gmail.com>
Michael Kifer wrote:
>>
>> Michael Kifer wrote:
>>> Dave Reynolds <der@hplb.hpl.hp.com> wrote:
>>>> In particular, I would find it useful to be able to map SPARQL-style 
>>>> named graph expressions into RIF - e.g. in order to represent CWM rules 
>>>> and because that is something we need for our own use cases (which may 
>>>> affect how JenaRules evolves).
>>> SPARQL's named graphs is a hack, 
>> SPARQL is a Candidate Recommendation from the W3C.  If you find 
>> something wrong with it, there are open channels (not academic 
>> publications) in which you can state any objection.
> 
> SPARQL is a hack because it does not have a model theory. They decided to
> relegate it to an appendix, and it does not exactly match the graph
> matching algorithm that they use. The algorithm currently used is a hack
> and so is their named graph idea. But all this can be approximated with a
> traditional model theory, which is what we should do in RIF.

Yes, yes, but I'm not talking about SPARQL in general; you had said 
SPARQL named graphs are a hack, and I am asking you to be more 
specific.  What in particular about SPARQL named graphs do you find 
objectionable?

> I do not need to state my objections because this is well-known and a
> number of people in the SPARQL group have already raised it before
> (unsuccessfully). In fact, we discussed this with you and Enrico when you
> were in Bolzano.

We did not talk about named graphs in particular, but anyway *my* 
point is that you should share your technical objections with the group 
(where they impact RIF).

>> If you think there are problems with it that are relevant to RIF, 
>> please state them.  Stating that it is "a hack" doesn't help at all.
> 
> Fortunately, we are not dependent on SPARQL. All we need is to provide some
> kind of interface. Since they refused to give a normal model theory to
> their language, it makes our (RIF) job easy: it is just a built-in with a
> black-box semantics.

Yes, but we seem in agreement that some kind of KB partitioning is in 
order for RIF as well.  Why not named graphs?

>>> which has a clean logical counterpart. It is
>>> called scoped inference. It was described in several places, such as
>>> http://www.springerlink.com/content/f511460n0v3hl61n/
>>> http://www.springerlink.com/content/1kcf7e0eu32kycxr/
>> If you wish people in the group to read it, please provide a link to a 
>> version we can access.  These links require paying a fee.
> 
> OK, did not realize it was not free. But
> one can simply cut and paste the titles and get the links from there. Anyway:
> http://www.inf.unibz.it/~jdebruijn/publications/msa-ruleml05.pdf
> ftp://ftp.cs.sunysb.edu/pub/TechReports/kifer/flora-lpnmr2005.pdf

That's better, thanks.

-Chris

> 
> 
> 
> 	--michael  
> 
>> Anyway, I think we agree *something like this* would be useful.
>>
>> -Chris
>>
>>> It has also been implemented in several systems, such as Flora-2, Triple,
>>> Ontobroker.
>>>
>>> This is all we need to have scoped negation, which is mentioned in the
>>> Charter for phase 2. So, having this in the core will pave the way for
>>> scoped negation in phase 2.
>>>
>>>
>>>> This could be achieved by having some builtin in the library that can 
>>>> query a dataset, such as the SPARQL blackbox we have talked about before:
>>>>
>>>>     SPARQL(dataset-id-list, query-string, var1, ... varn)
>>>>
>>>> However, I wonder whether it would be possible/reasonable to have the 
>>>> frame terms include an optional datasource identifier:
>>>>
>>>>     oid{datasource}[p->v, ... p'->v']
>>>>
>>>> N.B. I don't care about the human readable syntax, this is just to give 
>>>> a way to discuss it.
>>> A conceptually better syntax is
>>>
>>>      oid[p->v, ... p'->v']@datasource
>>>      pred(....)@datasource
>>>
>>> The important point here is not the exact syntax, but an emphasis on the
>>> fact that we are asking queries (the part left of @) against a knowledge base
>>> (a logical theory), which is to the right of @.
>>>
>>> Note that this is not (and should not be) specific to RDF. Scoped inference
>>> is a generally useful facility for distributed (and even non-distributed)
>>> knowledge bases.
>>>
>>>> Thus the facts would be partitioned into a set of fact datasets, one 
>>>> default anonymous one and a set of named ones identified by URIs.
>>> Does not need to be identified by a URI. This facility is also very useful
>>> for modularization of a KB. It is the same issue as global/local Ids for
>>> predicates.
>>>
>>>> A pattern with no explicit datasource ID is matched against the default 
>>>> set, one with an explicit datasource ID is matched against the 
>>>> corresponding dataset of facts.
>>> Yes, this is exactly how it is implemented in FLORA-2.
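The matching rule described just above (a pattern with no datasource ID goes to the default set, one with an ID goes to the named set) can be sketched roughly as follows. This is only an illustration in Python, not FLORA-2 or JenaRules code; the names FactStore, DEFAULT, add and match are hypothetical, and facts are flattened to (oid, property, value) triples for simplicity.

```python
from collections import defaultdict

# Hypothetical key for the default anonymous dataset (an assumption of
# this sketch, not something defined in the thread).
DEFAULT = "<default>"


class FactStore:
    """Facts partitioned into one default dataset and any number of named ones."""

    def __init__(self):
        # dataset name -> set of (oid, property, value) triples
        self._datasets = defaultdict(set)

    def add(self, oid, prop, value, datasource=DEFAULT):
        """Assert oid[prop->value] in the given dataset."""
        self._datasets[datasource].add((oid, prop, value))

    def match(self, oid=None, prop=None, value=None, datasource=DEFAULT):
        """Return the facts of ONE dataset that fit the pattern.

        None acts as a wildcard. The dataset name is treated as an opaque
        key: nothing is dereferenced or fetched.
        """
        pattern = (oid, prop, value)
        facts = self._datasets.get(datasource, set())
        return sorted(
            fact for fact in facts
            if all(p is None or p == x for p, x in zip(pattern, fact))
        )


store = FactStore()
store.add("john", "age", "42")                                    # default dataset
store.add("john", "age", "43", datasource="http://example.org/g1")  # named dataset

# john[age->?v]  is matched against the default dataset only
in_default = store.match(oid="john", prop="age")
# john[age->?v]@<http://example.org/g1>  is matched against the named dataset only
in_named = store.match(oid="john", prop="age", datasource="http://example.org/g1")
```

Here in_default holds only the "42" fact and in_named only the "43" fact; the URI serves purely as a partition name, and a string that is not a URI would work equally well as a key.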
>>>
>>>
>>>> There need be no formal link between the dataset URI and the web. There 
>>>> would be no enforced processing model requiring you to dereference the 
>>>> URI to fetch the data. The URI is simply a name for a data partition.
>>> Right.
>>>
>>>> (1) Is this a reasonable approach at all?
>>> Yes.
>>>
>>>> (2) What other rule languages might need such dataset-specific 
>>>> conditions and would this mechanism be useful for them?
>>> I think every language needs it, but some do not realize it :-)
>>>
>>>> (3) Assuming some derivative of this can be made useful, should it go in 
>>>> the Core?
>>> I believe that this is necessary even just to be able to properly interface
>>> with RDF in the core. The problem is that without such a facility there is
>>> no way to represent RDF/S theories properly. If we just include RDFS axioms
>>> then there is no barrier to people adding other axioms that affect the
>>> inference in imported RDF/S data. Worse, the interaction between the
>>> imported theories and other rules may be (and is likely to be) unintentional.
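The barrier argued for above can be illustrated with a deliberately simplified sketch: each module carries its own facts and rules, and a scoped query is answered from the closure of that one module alone. The assumptions are heavy and mine, not the thread's: rules are ground (propositional) Horn rules, modules do not reference each other, and the names closure, holds and the module names are hypothetical.

```python
def closure(facts, rules):
    """Forward-chain ground Horn rules, given as (body_set, head), to a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


def holds(modules, atom, module):
    """Scoped query atom@module: only that module's facts and rules are used."""
    facts, rules = modules[module]
    return atom in closure(facts, rules)


modules = {
    # An imported theory: some data plus one ground instance of an RDFS-style rule.
    "imported": (
        {"subClassOf(Cat,Animal)", "type(tom,Cat)"},
        [(frozenset({"subClassOf(Cat,Animal)", "type(tom,Cat)"}), "type(tom,Animal)")],
    ),
    # Local rules written later; the empty body means the head always holds here.
    "local": (
        set(),
        [(frozenset(), "type(tom,Dog)")],
    ),
}

entailed_in_import = holds(modules, "type(tom,Animal)", "imported")
leaked_into_import = holds(modules, "type(tom,Dog)", "imported")
entailed_locally = holds(modules, "type(tom,Dog)", "local")
```

In this sketch entailed_in_import and entailed_locally are True while leaked_into_import is False: the extra local rule changes nothing about what the imported theory entails, which is the kind of isolation a single flat rule set does not give.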
>>>
>>>
>>> 	cheers
>>> 	  --michael  
>>>
>>>
>> -- 
>> Dr. Christopher A. Welty                    IBM Watson Research Center
>> +1.914.784.7055                             19 Skyline Dr.
>> cawelty@gmail.com                           Hawthorne, NY 10532
>> http://www.research.ibm.com/people/w/welty
>>
> 
> 
> 

-- 
Dr. Christopher A. Welty                    IBM Watson Research Center
+1.914.784.7055                             19 Skyline Dr.
cawelty@gmail.com                           Hawthorne, NY 10532
http://www.research.ibm.com/people/w/welty
Received on Monday, 25 June 2007 14:37:25 UTC