Re: Source and provenance words.

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Tue, 07 Sep 2004 11:00:28 +0100
Message-ID: <413D86BC.1060203@hp.com>
To: Dave Beckett <dave.beckett@bristol.ac.uk>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>


Thanks for the text.

I am including it in the document:

   For the section 6 text (Choosing What to Query)
   it is now the draft text.

   For the section 9 text (SOURCE) I have included it as
   an alternative to the material Eric wrote.  There needs to be
   a stable version today for the F2F so there may not be time
   for a considered integration.

Eric - We don't lose the text this way.  I'm assuming you'll be looking 
at it.


For me, ORIGIN is better than SOURCE (which is ambiguous as to whether it 
is accessing the information or loading it - Alberto's comment) but
it suggests the first place a statement was made, not the place an 
aggregation found the statement (which itself may be an aggregation).

The best I can think of is "GRAPH" (Trix-inspired) which would suggest 
unification with SimonR's ideas on targeted subqueries (my terminology).

"FOUND IN" makes the point but is not nice syntax.

FROM seems appropriate - a change of use but it is accurate.  Need a 
replacement for FROM-query-target but that might be easier than a good 
word for the current SOURCE.


Dave Beckett wrote:

> I volunteered to own this issue recorded as:
>   ACTION: DaveB to repropose source in both results and restrictions.
> written based on the feedback to my earlier email:
>   http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0307.html
> Referring to 
>   http://www.w3.org/2001/sw/DataAccess/rq23/
>   $Revision: 1.52 $ of $Date: 2004/09/03 15:04:38 $
> I have written for section:
>   9. Querying the Origin of Statements
>   http://www.w3.org/2001/sw/DataAccess/rq23/#source
> but I also needed something about data sources so I've got
> some additional words for section 6.  
> There is also some terminology change compared to the above email,
> now using Origin rather than Source to try to distinguish what might
> be a DAWG service access point (DAWG protocol API, WSDL1.x portType,
> WSDL2 instance of interface, target, file or implicit graph) from a
> URI of content.
> Some comments below after the words.
> ----------------------------------------------------------------------
> 6. Choosing What to Query
> ...
> DAWG queries operate against an RDF graph which is given implicitly
> where the graph context is known by the application or known
> externally such as by the DAWG protocol.  A FROM statement can also
> explicitly give a Source URI:
>   FROM <http://www.w3.org/2000/08/w3c-synd/home.rss>
>     (?x ?y ?z)
> The URI is retrieved and the resulting representation should
> represent RDF triples in some syntax, such as RDF/XML; those triples
> provide the query graph.
> Aggregate graphs may also be queried by using multiple source URIs in
> the FROM clause such as:
>   FROM <uri1>, <uri2>
>   SELECT ...
> However this is implemented, the result must be equivalent to
> retrieving the Source URIs and forming the aggregate graph from the
> triples returned.  Implementations may provide a single web service
> target that aggregates multiple Source URIs, accessed by the DAWG
> protocol or some other mechanism.
>   Issue: Referring to the DAWG protocol lots here without checking the
>   requirements for the protocol match.
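The equivalence requirement above (the result must match retrieving each Source URI and forming the aggregate from the returned triples) can be sketched as a plain set union. This is only an illustration: the `SOURCES` table, the example.org URIs and the `aggregate` function are hypothetical stand-ins for real retrieval and parsing, and blank-node handling during merging is ignored.

```python
# Hypothetical stand-in for "retrieve the Source URI and parse the
# representation into triples"; nothing here is fetched over the network.
SOURCES = {
    "http://example.org/feed1.rss": {
        ("ex:item1", "rdf:type", "rss:item"),
    },
    "http://example.org/feed2.rss": {
        ("ex:item1", "rdf:type", "rss:item"),
        ("ex:item2", "rdf:type", "rss:item"),
    },
}


def aggregate(uris, sources=SOURCES):
    """Form the aggregate graph as the set union of each source's triples."""
    graph = set()
    for uri in uris:
        graph |= sources[uri]
    return graph


graph = aggregate(["http://example.org/feed1.rss",
                   "http://example.org/feed2.rss"])
```

Because the graph is a set, a triple that appears in both sources is present once in the aggregate; which source supplied it is lost unless it is recorded separately.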
> The RDF graph may be constructed through inference rather than
> retrieval or never be materialized.
>   Issue: Does the use of Source URI and representation for graphs
>   make sense with this?
> 9. Working with the Origin of Triples
> [Note change of section title]
> The Origin of an RDF triple in a query graph is the RDF URI Reference
> (ref) where a resource representation was retrieved that provided
> that triple, which may be in an aggregate graph.
>   Issue: Could allow blank nodes here which would help with the
>   inferred or non-materialized graphs.
>   TRiX http://www.w3.org/2004/03/trix/ removed this after originally
>   allowing named graphs to be named by blank nodes.  I think this was
>   due to scoping issues.  FIXME Find reference to why it was removed.
>   The ISWC2004 paper?
> A triple in an RDF graph may have zero or more Origins.  A BRQL
> application may optionally not support recording and providing origin
> information.
>   Issue: Making origin information optional does not help
>   interoperability.
> The Origin of a triple may be used in queries with the SOURCE
> clause before a triple.
> Example 9.1:
>   Find all triples in a graph of aggregated RSS 1.0 feeds which were
>   retrieved from the W3C's feed.
> Data:
>   An aggregated graph of RSS 1.0 feeds including the triples
>   retrieved from Source URI http://www.w3.org/2000/08/w3c-synd/home.rss
> Query:
>   SELECT ?x,?y,?z WHERE
>     SOURCE <http://www.w3.org/2000/08/w3c-synd/home.rss> (?x ?y ?z)
> Result:
>   the RDF triples with origin
>     <http://www.w3.org/2000/08/w3c-synd/home.rss> 
> If the application does not support origin information or no origin
> information was recorded when the aggregated graph was created, no
> results are returned.
>   Issue: Change to ORIGIN keyword?
>   Issue: Make RDF triples become non-RDF quads.
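One possible reading of Example 9.1, sketched under assumptions: the aggregator keeps, for each triple, the set of URIs it was retrieved from (zero or more Origins), and a SOURCE clause with a constant URI keeps only triples having that Origin. The names `build_aggregate` and `match_source`, the `OTHER` feed URI and the `ex:` items are hypothetical; this is not the draft's defined semantics.

```python
from collections import defaultdict

W3C = "http://www.w3.org/2000/08/w3c-synd/home.rss"
OTHER = "http://example.org/other.rss"  # hypothetical second feed

# Hypothetical parsed feeds: source URI -> triples retrieved from it.
FEEDS = {
    W3C: {("ex:item1", "rdf:type", "rss:item")},
    OTHER: {("ex:item1", "rdf:type", "rss:item"),
            ("ex:item2", "rdf:type", "rss:item")},
}


def build_aggregate(feeds):
    """Map each triple to the set of Origins it was retrieved from."""
    origins = defaultdict(set)
    for uri, triples in feeds.items():
        for triple in triples:
            origins[triple].add(uri)
    return origins


def match_source(origins, source_uri):
    """SOURCE <source_uri> (?x ?y ?z): triples that have this Origin."""
    return {t for t, uris in origins.items() if source_uri in uris}


from_w3c = match_source(build_aggregate(FEEDS), W3C)
no_origin_info = match_source({}, W3C)  # nothing recorded: no results
```

With an empty origin table the match is empty, which corresponds to the "no results are returned" behaviour described for applications that recorded no origin information.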
> Origin information can be returned by queries using a variable
> with the SOURCE clause.
> Example 9.2:
>   An aggregate graph contains aggregated RSS 1.0 feeds and the query
>   wants to return all items indicating where they were originally
>   retrieved from, even with duplicates:
> Data:
>   an aggregate graph of RSS 1.0 feeds
> Query:
>     SOURCE ?s (?x rdf:type rss:item)
> Results:
>   ?s= ... ?x=...
>   for each RSS 1.0 item in the aggregate graph.
>   The ?s variables bind to the Origin URIs of the triple that matched.
> If an implementation does not support Origin information, the SOURCE
> ?s clause is ignored and no binding value is returned for ?s.
>   ?s=null ?x=...
>   ...
>   Issue: this means adding a null value definition which I know is
>     tricky.  The alternate is to not give a result for ?s:
>       ?x=...
>       ...
>     which means the returned result set is not regular.
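The trade-off in this issue can be made concrete with a small sketch. It assumes result rows are dictionaries keyed by variable name and, as a simplification, keys origins by item rather than by full triple; all function names and the `ex:` and example.org values are hypothetical.

```python
def rows_with_origins(items, origins):
    """Origin supported: one row per (Origin, item), duplicates kept."""
    return [{"s": uri, "x": x}
            for x in items
            for uri in sorted(origins.get(x, ()))]


def rows_with_null(items):
    """Origin unsupported, option 1: bind ?s to a null value."""
    return [{"s": None, "x": x} for x in items]


def rows_without_s(items):
    """Origin unsupported, option 2: give no result for ?s at all."""
    return [{"x": x} for x in items]


def is_regular(rows, variables):
    """A result set is regular if every row has exactly the query's variables."""
    return all(set(row) == set(variables) for row in rows)


items = ["ex:item1", "ex:item2"]
origins = {"ex:item1": {"http://example.org/a.rss", "http://example.org/b.rss"},
           "ex:item2": {"http://example.org/b.rss"}}
supported = rows_with_origins(items, origins)
```

Under these assumptions the null option keeps every row shaped like the query's variable list at the cost of defining a null value, while the omission option avoids null but yields rows that no longer match that list.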
> ----------------------------------------------------------------------
> My opinion on RDF quads for provenance in queries is well known and
> I've discussed this before in depth[1].  I don't like to see them in
> RDF query languages since there is little consensus what the fourth
> item is - that is one thing we are considering.
> They are of course fine as implementation techniques.  What's inside
> your application is up to you.
> Dave
> [1]
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0305.html
Received on Tuesday, 7 September 2004 10:01:51 UTC