RE: Test cases: source of a triple

From: Dan Connolly <connolly@w3.org>
Date: Fri, 27 Aug 2004 17:27:49 -0500
To: Rob Shearer <Rob.Shearer@networkinference.com>
Cc: Andy Seaborne <andy.seaborne@hp.com>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <1093645668.2934.5012.camel@dirk>

On Thu, 2004-08-26 at 17:48, Rob Shearer wrote:
[...]
> I don't understand what you're saying here.
> I totally dig on your first example, which just involved aggregation.
> Yes there's an issue when a statement has multiple sources, and yes it
> needs to be addressed. I was trying to avoid addressing that
> problem--inferencing is quite orthogonal. Thus I take issue with the
> very notion of your second and third examples, because they both assume
> a very big piece of functionality (inferencing) that I don't think
> anyone is expecting to be in a query processor.

Hmm... not only do people expect inferencing to be in a query processor
eventually, some people have already built processors that do it.

As to the interaction between inference and source, swap/cwm/N3
has a design that I quite like. The proposals I have seen
in this WG for SOURCE would be a big step backward; I'd
rather we punted until the next version than standardize
anything like them.

I like Andy's test cases; I can illustrate the design we use
in swap/cwm/N3 with them pretty well, I think. I hope this
shows that the issue is separable... that SOURCE can be
added later without big changes to the basic graph
matching stuff...

> On Thu, 2004-08-26 at 03:47, Seaborne, Andy wrote:
[...] 
> > == Test case 2: inference
> > 
> > Data:
> >   a1.rdf:
> >   :x rdf:type :C1 .
> >   :C1 rdfs:subClassOf :C2 .
> >
> > Query:
> >     SELECT * WHERE { ?x rdf:type :C2 }

You replied later that you meant to have a ?src in there...
but I'll show the N3 analog of the query as written
first... (using the all-singing cat presentation tool,
as pioneered by Rob S in Amsterdam... ;-)

swap$ cat test/includes/a.n3
@keywords a.
@prefix : <http://example/vocab#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

x a C1.
C1 rdfs:subClassOf C2.

swap$ cat test/includes/query.n3
@keywords a.
@prefix : <http://example/vocab#>.

{ ?x a C2 } => { (("?x") (?x)) a BindingSet }.

swap$ python cwm.py test/includes/a.n3 util/rdfs-rules.n3 --think --filter=test/includes/query.n3
#Processed by Id: cwm.py,v 1.162 2004/08/08 01:44:49 syosi Exp
     @prefix : <http://example/vocab#> .

      (  (
            "?x"  )
         (
            :x  ) )
         a :BindingSet .

#ENDS



Now... when we consider the source issue...

> > ?x = :x
> > ?src = <a.rdf> maybe.

no... I think ?src is the RDFS-closure of <a.rdf>;
please don't let's conflate <a.rdf> with its RDFS-closure.

In swap/cwm/N3, if you want to query the source of some
data, you have to move the references to that data from
the command line into the query itself. It then
looks like this...


swap$ cat test/includes/query-src.n3
@keywords a.
@prefix : <http://example/vocab#>.
@prefix log:  <http://www.w3.org/2000/10/swap/log#> .

{ (<a.n3>.log:semantics
   <../../util/rdfs-rules.n3>.log:semantics
  ).log:conjunction log:conclusion ?F } => { <a_rdfs_closure> graph ?F
}.

@forAll :X.
{ ?SRC graph [
     log:includes { :X a C2 } ] } =>
   { (("?x" :X) ("?src" ?SRC)) a BindingSet }.

swap$ python cwm.py test/includes/query-src.n3 --think --data
#Processed by Id: cwm.py,v 1.162 2004/08/08 01:44:49 syosi Exp
     @prefix : <http://example/vocab#> .

      (  (
            "?x"
            :x  )
         (
            "?src"
            <a_rdfs_closure>  ) )
         a :BindingSet .

#ENDS
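
For readers who do not run cwm, here is a minimal sketch of the same idea
in Python rather than N3. It is an illustration under stated assumptions,
not how cwm works internally: triples are plain string tuples, only the
two RDFS subclass rules are applied, and the helper names rdfs_closure
and instances_with_source are made up for this example. It shows that
the graph that actually contains ":x a :C2" is the closure, not a.n3
itself.

```python
# Assumption: triples are (subject, predicate, object) tuples of strings.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Return the triples plus subclass entailments, up to a fixed point.

    Only two RDFS rules are sketched here: type propagation along
    rdfs:subClassOf, and transitivity of rdfs:subClassOf.
    """
    closed = set(triples)
    while True:
        new = set()
        for (c, p, d) in closed:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                if o2 == c and p2 == TYPE:
                    new.add((s2, TYPE, d))
                if o2 == c and p2 == SUBCLASS:
                    new.add((s2, SUBCLASS, d))
        if new <= closed:
            return closed
        closed |= new


def instances_with_source(graphs, cls):
    """List (instance, graph name) pairs where the named graph includes
    the triple 'instance rdf:type cls'."""
    return sorted(
        (s, name)
        for name, g in graphs.items()
        for (s, p, o) in g
        if p == TYPE and o == cls
    )


a = {(":x", TYPE, ":C1"), (":C1", SUBCLASS, ":C2")}

# Two distinct named graphs: the document as written, and its closure.
graphs = {"<a.n3>": a, "<a_rdfs_closure>": rdfs_closure(a)}

bindings = instances_with_source(graphs, ":C2")
print(bindings)
```

Keeping the two graphs under separate names is the point of the sketch:
the query pattern matches only in the closure, so that is the only value
?src can take.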


Now on to...
> == Test case 3: Inference by multiple routes:

I had to change the query. I could probably write
one query that would work for cases 2 and 3,
but it would involve hairy-looking list processing
that I think will just distract from the point.

The point is: the source, in this case, is
the RDFS closure of the conjunction (merge)
of a.rdf and b.rdf.

swap$ cat test/includes/a.n3
@keywords a.
@prefix : <http://example/vocab#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

x a C1.

swap$ cat test/includes/b.n3
@keywords a.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://example/vocab#>.

C1 rdfs:subClassOf C2 .

swap$ cat test/includes/query-src.n3
@keywords a.
@prefix : <http://example/vocab#>.
@prefix log:  <http://www.w3.org/2000/10/swap/log#> .

{ (<a.n3>.log:semantics
   <b.n3>.log:semantics
   <../../util/rdfs-rules.n3>.log:semantics
  ).log:conjunction log:conclusion ?F } => { <a_b_rdfs_closure> graph ?F
}.

@forAll :X.
{ ?SRC graph [
     log:includes { :X a C2 } ] } =>
   { (("?x" :X) ("?src" ?SRC)) a BindingSet }.

swap$ python cwm.py test/includes/query-src.n3 --think --data
#Processed by Id: cwm.py,v 1.162 2004/08/08 01:44:49 syosi Exp
     @prefix : <http://example/vocab#> .

      (  (
            "?x"
            :x  )
         (
            "?src"
            <a_b_rdfs_closure>  ) )
         a :BindingSet .

#ENDS
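
A minimal Python sketch of why the source in test case 3 has to be the
closure of the merge: neither file entails ":x a :C2" on its own. The
assumptions are loud ones: triples are string tuples, the merge is plain
set union (fine here only because there are no blank nodes), and
type_closure is a hypothetical helper that applies just the
type-propagation rule.

```python
# Assumption: triples are (subject, predicate, object) tuples of strings.
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"


def type_closure(triples):
    """Add (s rdf:type D) whenever (s rdf:type C) and
    (C rdfs:subClassOf D) are both present; repeat to a fixed point."""
    closed = set(triples)
    changed = True
    while changed:
        inferred = {
            (s, TYPE, d)
            for (s, p, c) in closed if p == TYPE
            for (c2, p2, d) in closed if p2 == SUBCLASS and c2 == c
        }
        changed = not inferred <= closed
        closed |= inferred
    return closed


a = {(":x", TYPE, ":C1")}            # contents of a.n3
b = {(":C1", SUBCLASS, ":C2")}       # contents of b.n3
goal = (":x", TYPE, ":C2")

# Set union stands in for the RDF merge (no blank nodes involved).
found_in = {
    "closure(a)": goal in type_closure(a),
    "closure(b)": goal in type_closure(b),
    "closure(a merged with b)": goal in type_closure(a | b),
}
print(found_in)
```

Only the last entry is true, which is why binding ?src to a.n3 or to
b.n3 alone would misattribute the inferred triple.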

> > Yes.  This is one of my worries about quads - that it is an incomplete
> > solution without chaining.  There is a reasonable position to take
> > that only the last step in the chain matters because publishing a
> > statement is undertaking it to be true.  Hence, only quads are needed.
> > I'd just like to be sure of this if the WG decides that way.
> 
> I think the general solution to this problem is that the fourth element
> of each quad is simply a unique identifier for the statement.

I've been there. I recommend against it. Identifiers for graphs
are much more useful than identifiers for statements, in my experience.

[...]
> My understanding was that RDF was *supposed* to have a reification
> system such that you could actually talk about the "Rob worksFor NI"
> triple as a bNode (even assigning it an explicit identifier such as
> Statement1 if you liked). If that worked, then these SOURCE things could
> really be exposed in the pure RDF model (even if many implementations
> stored them in some more efficient format, like quads or chained quads).

Yes... unfortunately, the rdf:subject/predicate/object reification
design is... somewhere between goofy and useless, in my experience.
I wrote about that separately...

  Subject: how cwm does quoting rather than rdf:subject/predicate/object
  Date: Fri, 27 Aug 2004 11:16:07 -0500

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Friday, 27 August 2004 22:28:21 GMT