Re: Source and provenance words.

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Tue, 07 Sep 2004 11:00:28 +0100
Message-ID: <413D86BC.1060203@hp.com>
To: Dave Beckett <dave.beckett@bristol.ac.uk>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>

Dave,

Thanks for the text.

I am including it in the document:

   For the section 6 text (Choosing What to Query)
   it is now the draft text.

   For the section 9 text (SOURCE) I have included it as
   an alternative to the material Eric wrote.  There needs to be
   a stable version today for the F2F so there may not be time
   for a considered integration.

Eric - We don't lose the text this way.  I'm assuming you'll be looking
at it.

---------

For me, ORIGIN is better than SOURCE (which is ambiguous as to whether it
is accessing the information or loading it - Alberto's comment), but
it suggests the first place a statement was made, not the place an
aggregation found the statement (which itself may be an aggregation).

The best I can think of is "GRAPH" (Trix-inspired) which would suggest 
unification with SimonR's ideas on targeted subqueries (my terminology).

"FOUND IN" makes the point but is not nice syntax.

FROM seems appropriate - a change of use but it is accurate.  Need a 
replacement for FROM-query-target but that might be easier than a good 
word for the current SOURCE.

	Andy

Dave Beckett wrote:

> I volunteered to own this issue recorded as:
>   ACTION: DaveB to repropose source in both results and restrictions.
> written based on the feedback to my earlier email:
>   http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0307.html
> 
> Referring to
>   http://www.w3.org/2001/sw/DataAccess/rq23/
>   $Revision: 1.52 $ of $Date: 2004/09/03 15:04:38 $
> 
> I have written for section:
>   9. Querying the Origin of Statements
>   http://www.w3.org/2001/sw/DataAccess/rq23/#source
> but I also needed something about data sources so I've got
> some additional words for section 6.  
> 
> There is also some terminology change compared to the above email,
> now using Origin rather than Source to try to distinguish what might
> be a DAWG service access point (DAWG protocol API, WSDL1.x portType,
> WSDL2 instance of interface, target, file or implicit graph) from a
> URI of content.
> 
> Some comments below after the words.
> 
> ----------------------------------------------------------------------
> 
> 6. Choosing What to Query
> 
> ...
> 
> DAWG queries operate against an RDF graph which is given implicitly
> where the graph context is known by the application or known
> externally such as by the DAWG protocol.  A FROM statement can also
> explicitly give a Source URI:
> 
>   FROM <http://www.w3.org/2000/08/w3c-synd/home.rss>
>   SELECT * WHERE
>     (?x ?y ?z)
> 
> The URI is retrieved and the resulting representation should
> represent RDF triples in some syntax, such as RDF/XML, which provide
> the query graph.
> 
> Aggregate graphs may also be queried by using multiple source URIs in
> the FROM clause such as:
> 
>   FROM <uri1>, <uri2>
>   SELECT ...
> 
> However this is implemented, the result must be equivalent to
> retrieving the Source URIs and forming the aggregate graph from the
> triples returned.  Implementations may provide a single web service
> target that aggregates multiple Source URIs, accessed by the DAWG
> protocol or some other mechanism.
> 
>   Issue: Referring to the DAWG protocol a lot here without checking
>   that the requirements for the protocol match.
> 
> The RDF graph may be constructed through inference rather than
> retrieval or never be materialized.
> 
>   Issue: Does the use of Source URI and representation for graphs
>   make sense with this?
> 
> 
> 
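
The aggregation rule in the section 6 text can be sketched in Python. This is only an illustration under stated assumptions, not part of the draft: the function name `aggregate`, the `example.org` URIs and the in-memory triples are all hypothetical, and retrieval and parsing of each Source URI are assumed to have already happened. The sketch also records which Source URI each triple came from.

```python
from collections import defaultdict

def aggregate(sources):
    """Form an aggregate graph from already-retrieved sources.

    `sources` maps a Source URI to an iterable of (s, p, o) triples.
    Returns a mapping triple -> set of Source URIs it was retrieved from.
    The keys of the mapping are the aggregate graph itself.
    """
    origins = defaultdict(set)
    for uri, triples in sources.items():
        for triple in triples:
            origins[triple].add(uri)
    return origins

# Hypothetical data standing in for two retrieved RSS 1.0 feeds.
sources = {
    "http://example.org/feed1": [("ex:a", "rdf:type", "rss:item")],
    "http://example.org/feed2": [
        ("ex:a", "rdf:type", "rss:item"),
        ("ex:b", "rdf:type", "rss:item"),
    ],
}
graph = aggregate(sources)
```

A triple that appears in both feeds is stored once in the aggregate graph but keeps two recorded sources.
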
> 9. Working with the Origin of Triples
> [Note change of section title]
> 
> 
> The Origin of an RDF triple in a query graph is the RDF URI Reference
> (ref) where a resource representation was retrieved that provided
> that triple, which may be in an aggregate graph.
> 
>   Issue: Could allow blank nodes here which would help with the
>   inferred or non-materialized graphs.
> 
>   TRiX http://www.w3.org/2004/03/trix/ removed this after originally
>   allowing named graphs to be named by blank nodes.  I think this was
>   due to scoping issues.  FIXME Find reference to why it was removed.
>   The ISWC2004 paper?
> 
> A triple in an RDF graph may have zero or more Origins.  A BRQL
> application may optionally not support recording and providing origin
> information.
> 
>   Issue: Making origin information optional does not help
>   interoperability.
> 
> The Origin of a triple may be used in queries with the SOURCE
> clause before a triple.
> 
> Example 9.1:
>   Find all triples in a graph of aggregated RSS 1.0 feeds which were
>   retrieved from the W3C's feed.
> 
> Data:
>   An aggregated graph of RSS 1.0 feeds including the triples
>   retrieved from Source URI http://www.w3.org/2000/08/w3c-synd/home.rss
> 
> Query:
>   SELECT ?x,?y,?z WHERE
>     SOURCE <http://www.w3.org/2000/08/w3c-synd/home.rss> (?x ?y ?z)
> 
> Results:
>   the RDF triples with origin
>     <http://www.w3.org/2000/08/w3c-synd/home.rss> 
> 
> 
> If the application does not support origin information or no origin
> information was recorded when the aggregated graph was created, no
> results are returned.
> 
>   Issue: Change to ORIGIN keyword?
> 
>   Issue: Make RDF triples become non-RDF quads.
> 
> Origin information can be returned by queries using a variable
> with the SOURCE clause.
> 
> Example 9.2:
> 
>   An aggregate graph contains aggregated RSS 1.0 feeds and the query
>   wants to return all items indicating where they were originally
>   retrieved from, even with duplicates:
> 
> Data:
>   an aggregate graph of RSS 1.0 feeds
> 
> Query:
>   SELECT ?s,?x WHERE
>     SOURCE ?s (?x rdf:type rss:item)
> 
> Results:
>   ?s= ... ?x=...
>   for each RSS 1.0 item in the aggregate graph.
> 
>   The ?s variables bind to the Origin URIs of the triple that matched.
> 
> If an implementation does not support Origin information, the SOURCE
> ?s clause is ignored and no binding value is returned for ?s.
> 
>   ?s=null ?x=...
>   ...
> 
>   Issue: this means adding a null value definition which I know is
>     tricky.  The alternative is to not give a result for ?s:
> 
>       ?x=...
>       ...
>     which means the returned result set is not regular.
> 
> 
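
The SOURCE behaviour in the section 9 text, including the two fallback cases, can be sketched in Python. This is an illustration under stated assumptions, not a definitive implementation: the names `query_source` and `supports_origin`, the `example.org` URIs and the dict-of-sets representation of origins are hypothetical, and the null binding is modelled as Python `None`. The sketch assumes the SOURCE variable does not also appear inside the triple pattern.

```python
def is_var(term):
    """Variables are strings starting with '?', as in the draft examples."""
    return isinstance(term, str) and term.startswith("?")

def match_triple(pattern, triple):
    """Return variable bindings if `triple` matches `pattern`, else None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            if bindings.setdefault(p, t) != t:
                return None  # same variable bound to two different values
        elif p != t:
            return None
    return bindings

def query_source(origins, source, pattern, supports_origin=True):
    """Evaluate `SOURCE source (pattern)` over a triple -> origins mapping."""
    results = []
    for triple, uris in origins.items():
        bindings = match_triple(pattern, triple)
        if bindings is None:
            continue
        if not supports_origin:
            # Constant SOURCE: no results. Variable SOURCE: null binding.
            if is_var(source):
                results.append({**bindings, source: None})
            continue
        for uri in sorted(uris):
            if is_var(source):
                results.append({**bindings, source: uri})
            elif source == uri:
                results.append(bindings)
    return results

# Hypothetical aggregate graph with recorded origins.
origins = {
    ("ex:a", "rdf:type", "rss:item"): {"http://example.org/feed1",
                                       "http://example.org/feed2"},
    ("ex:b", "rdf:type", "rss:item"): {"http://example.org/feed2"},
}
rows = query_source(origins, "?s", ("?x", "rdf:type", "rss:item"))
```

With a variable SOURCE, an item found in two feeds yields two rows, which corresponds to the "even with duplicates" wording of Example 9.2.
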
> ----------------------------------------------------------------------
> 
> My opinion on RDF quads for provenance in queries is well known and
> I've discussed this before in depth[1].  I don't like to see them in
> RDF query languages since there is little consensus what the fourth
> item is - that is one thing we are considering.
> 
> They are of course fine as implementation techniques.  What's inside
> your application is up to you.
> 
> Dave
> 
> [1]
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0305.html
> 
Received on Tuesday, 7 September 2004 10:01:51 GMT
