RE: Test cases: source of a triple from Rob Shearer on 2004-08-26 (public-rdf-dawg@w3.org from July to September 2004)

From: Rob Shearer <Rob.Shearer@networkinference.com>
Date: Thu, 26 Aug 2004 15:48:03 -0700
To: "Seaborne, Andy" <andy.seaborne@hp.com>, "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
Message-ID: <CFE388CECDDB1E43AB1F60136BEB4973028133@rome.ad.networkinference.com>

> > If you were to query either one of these documents, then you'd be
> > querying RDF, not a "completed" graph with extra inferences. I
> > certainly haven't seen anything to suggest that a query
> > implementation should be able to perform inferencing, and I
> > certainly don't see anything in the BRQL spec to try to get this
> > to happen.
> 
> Firstly - I'm not arguing for one way or another.  You are assuming
> I am advocating a position.  I undertook to write some test cases for
> points I thought needed answering.
> 
> I agree that the query system accesses a graph without regard to
> inference.  I think this should be a headline principle of our work.
> 
> It happens to lead to ?src = <a1/a2.rdf> in each case which seems
> like the natural answer.  So far so good.

I don't understand what you're saying here.
I totally dig on your first example, which just involved aggregation.
Yes, there's an issue when a statement has multiple sources, and yes, it
needs to be addressed. I was trying to avoid addressing that
problem--inferencing is quite orthogonal. Thus I take issue with the
very notion of your second and third examples, because they both assume
a very big piece of functionality (inferencing) that I don't think
anyone is expecting to be in a query processor.

> Yes.  This is one of my worries about quads - that it is an incomplete
> solution without chaining.  There is a reasonable position to take
> that only the last step in the chain matters because publishing a
> statement is undertaking it to be true.  Hence, only quads are needed.
> I'd just like to be sure of this if the WG decides that way.

I think the general solution to this problem is that the fourth element
of each quad is simply a unique identifier for the statement. You can
then use a regular triple with that identifier as its subject to tack
whatever information you want on to that identifier. You can use
multiple triples, and since roles don't need to be functional, you can
have multiple values for the same thing. E.g.:

:Rob :worksFor :NetworkInference :Statement1
:Statement1 :assertedBy :Rob

It can be argued that you don't really want that statement identifier to
be relevant. What matters is that you're talking about the triple
involving Rob, worksFor, and NetworkInference. If this triple ever
appears again (and RDF is built such that such an event should be
mundane and tolerable) then it would be a little nasty if it got a
completely different identifier. For example, if both Rob and Network
Inference asserted the same thing, what you want is:

:Rob :worksFor :NetworkInference :Statement1
:Statement1 :assertedBy :Rob
:Rob :worksFor :NetworkInference :Statement1
:Statement1 :assertedBy :NetworkInference
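
One way to keep the identifier from changing when the same triple shows
up again is to derive it from the triple's content. The following is
only a minimal Python sketch of that idea: the QuadStore class, the
statement_id helper, and the hash-based naming scheme are all
hypothetical and are not part of any proposal on this list.

```python
import hashlib


def statement_id(subject, predicate, obj):
    """Derive an identifier from the triple's content, so the same triple
    maps to the same identifier no matter who asserts it (assumed scheme)."""
    key = "\x1f".join((subject, predicate, obj)).encode("utf-8")
    return ":Statement_" + hashlib.sha1(key).hexdigest()[:12]


class QuadStore:
    """Hypothetical store: quads plus ordinary triples about statement ids."""

    def __init__(self):
        self.quads = set()    # (subject, predicate, object, statement id)
        self.triples = set()  # (statement id, property, value)

    def assert_triple(self, s, p, o, asserted_by):
        sid = statement_id(s, p, o)
        self.quads.add((s, p, o, sid))                       # stored once
        self.triples.add((sid, ":assertedBy", asserted_by))  # one per asserter
        return sid


store = QuadStore()
id1 = store.assert_triple(":Rob", ":worksFor", ":NetworkInference", ":Rob")
id2 = store.assert_triple(":Rob", ":worksFor", ":NetworkInference",
                          ":NetworkInference")
```

Because :assertedBy is not functional, the single statement identifier
simply collects one value per asserter.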

And an aggregation system would probably do something like:

:Rob :worksFor :NetworkInference :Statement1
:Statement1 :assertedInDocument :RobsBusinessCard.rdf
:RobsBusinessCard.rdf :endorsedBy :Rob
:RobsBusinessCard.rdf :printedOn "2004-03-01"^^xsd:date
:Statement1 :assertedInDocument :NIEmployeeListing.rdf
:NIEmployeeListing.rdf :endorsedBy :NetworkInference
:NIEmployeeListing.rdf :currentAsOf "2004-08-26"^^xsd:date
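
As a rough illustration of the aggregation step, here is a small Python
sketch. The aggregate and sources functions and the sequential
":StatementN" naming are assumptions made for the example, not a
description of any existing aggregator. It merges documents, reuses one
statement identifier per distinct triple, and records each source
document as an ordinary triple.

```python
def aggregate(documents):
    """Merge {document name: [triples]} into quads plus provenance triples."""
    ids = {}       # triple -> statement identifier
    quads = []     # (s, p, o, statement id), one per distinct triple
    metadata = []  # (statement id, ":assertedInDocument", document)
    for doc, triples in documents.items():
        for triple in triples:
            if triple not in ids:
                ids[triple] = ":Statement%d" % (len(ids) + 1)
                quads.append(triple + (ids[triple],))
            metadata.append((ids[triple], ":assertedInDocument", doc))
    return quads, metadata


def sources(triple, quads, metadata):
    """Return every document in which the given triple was asserted."""
    sids = {q[3] for q in quads if q[:3] == triple}
    return sorted(o for (s, p, o) in metadata
                  if s in sids and p == ":assertedInDocument")


works_for = (":Rob", ":worksFor", ":NetworkInference")
quads, metadata = aggregate({
    ":RobsBusinessCard.rdf": [works_for],
    ":NIEmployeeListing.rdf": [works_for],
})
found = sources(works_for, quads, metadata)
```

Here found lists both documents, which is the multiple-sources case from
the first test example.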

My understanding was that RDF was *supposed* to have a reification
system such that you could actually talk about the "Rob worksFor NI"
triple as a bNode (even assigning it an explicit identifier such as
Statement1 if you liked). If that worked, then these SOURCE things could
really be exposed in the pure RDF model (even if many implementations
stored them in some more efficient format, like quads or chained quads).
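
For comparison, the same information can be sketched in plain triples
using the standard RDF reification vocabulary (rdf:Statement,
rdf:subject, rdf:predicate, rdf:object). The Python below is only an
illustration: the reify and asserted_by helpers are hypothetical, and
triples are modelled as simple tuples of strings rather than with any
RDF library.

```python
def reify(statement_node, s, p, o):
    """Describe the triple (s, p, o) as a node using the RDF reification terms."""
    return [
        (statement_node, "rdf:type", "rdf:Statement"),
        (statement_node, "rdf:subject", s),
        (statement_node, "rdf:predicate", p),
        (statement_node, "rdf:object", o),
    ]


def asserted_by(graph, s, p, o):
    """Find who asserted (s, p, o) by looking at its reified description."""
    wanted = {("rdf:subject", s), ("rdf:predicate", p), ("rdf:object", o)}
    nodes = {n for (n, pred, val) in graph
             if pred == "rdf:type" and val == "rdf:Statement"}
    matching = {n for n in nodes
                if wanted <= {(pred, val) for (m, pred, val) in graph if m == n}}
    return sorted(val for (n, pred, val) in graph
                  if n in matching and pred == ":assertedBy")


# "_:Statement1" stands in for the bNode that names the statement.
graph = reify("_:Statement1", ":Rob", ":worksFor", ":NetworkInference")
graph.append(("_:Statement1", ":assertedBy", ":Rob"))
who = asserted_by(graph, ":Rob", ":worksFor", ":NetworkInference")
```

The cost is four triples per statement before any provenance is
attached, which is one reason an implementation might prefer quads
internally.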

Received on Thursday, 26 August 2004 22:50:56 UTC