- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Mon, 6 Sep 2004 17:32:53 +0100
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
I volunteered to own this issue, recorded as:

  ACTION: DaveB to repropose source in both results and restrictions.

The words below are based on the feedback to my earlier email:
  http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0307.html

Referring to
  http://www.w3.org/2001/sw/DataAccess/rq23/
  $Revision: 1.52 $ of $Date: 2004/09/03 15:04:38 $

I have written words for section:
  9. Querying the Origin of Statements
  http://www.w3.org/2001/sw/DataAccess/rq23/#source
but I also needed something about data sources, so I have some additional
words for section 6.

There is also a terminology change compared to the above email: I now use
Origin rather than Source, to try to distinguish what might be a DAWG
service access point (DAWG protocol API, WSDL1.x portType, WSDL2 instance
of interface, target, file or implicit graph) from a URI of content.

Some comments follow after the words.

----------------------------------------------------------------------

6. Choosing What to Query

...

DAWG queries operate against an RDF graph, which is given implicitly
where the graph context is known by the application or known externally,
such as by the DAWG protocol.

A FROM clause can also explicitly give a Source URI:

  FROM <http://www.w3.org/2000/08/w3c-synd/home.rss>
  SELECT *
  WHERE (?x ?y ?z)

The URI is retrieved, and the resulting representation should encode RDF
triples in some syntax such as RDF/XML; those triples provide the query
graph.

Aggregate graphs may also be queried by using multiple Source URIs in the
FROM clause, such as:

  FROM <uri1>, <uri2>
  SELECT ...

However this is implemented, the result must be equivalent to retrieving
the Source URIs and forming the aggregate graph from the triples
returned. Implementations may provide a single web service target that
aggregates multiple Source URIs, accessed by the DAWG protocol or some
other mechanism.

  Issue: This refers to the DAWG protocol a lot without checking that the
  requirements for the protocol match.
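
As an illustration only (not part of the proposed words), the equivalence
requirement for multiple Source URIs can be pictured as a set union of the
triples from each source. The sketch below assumes a hypothetical
in-memory table in place of real URI retrieval and parsing, uses plain
string tuples for triples, and ignores blank node handling entirely; the
names SOURCES, retrieve, aggregate and match_all are made up for this
example.

```python
# Hypothetical stand-in for "retrieve the Source URI and parse its triples".
# Real retrieval and RDF/XML parsing are deliberately left out.
SOURCES = {
    "uri1": {("a", "p", "b"), ("a", "q", "c")},
    "uri2": {("a", "p", "b"), ("d", "p", "e")},
}


def retrieve(source_uri):
    """Return the set of triples for a Source URI (no network access)."""
    return SOURCES[source_uri]


def aggregate(source_uris):
    """Form the aggregate graph as the union of triples from every source."""
    graph = set()
    for uri in source_uris:
        graph |= retrieve(uri)
    return graph


def match_all(graph):
    """Rough equivalent of SELECT * WHERE (?x ?y ?z): every triple matches."""
    return sorted(graph)


# FROM <uri1>, <uri2> SELECT * WHERE (?x ?y ?z)
graph = aggregate(["uri1", "uri2"])
rows = match_all(graph)
```

Because the graph is a set, a triple that appears in both sources occurs
once in the aggregate. That is why the aggregate graph alone cannot say
where a triple came from.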
The RDF graph may be constructed through inference rather than retrieval,
or may never be materialized.

  Issue: Does the use of Source URI and representation for graphs make
  sense in this case?

9. Working with the Origin of Triples

[Note change of section title]

The Origin of an RDF triple in a query graph is the RDF URI Reference
(ref) where a resource representation was retrieved that provided that
triple, which may be in an aggregate graph.

  Issue: Could allow blank nodes here, which would help with the inferred
  or non-materialized graphs. TRiX http://www.w3.org/2004/03/trix/
  removed this after originally allowing named graphs to be named by
  blank nodes. I think this was due to scoping issues.
  FIXME Find reference to why it was removed. The ISWC2004 paper?

A triple in an RDF graph may have zero or more Origins.

A BRQL application may choose not to support recording and providing
origin information.

  Issue: Making origin information optional does not help
  interoperability.

The Origin of a triple may be used in queries with the SOURCE clause
before a triple.

Example 9.1: Find all triples in a graph of aggregated RSS 1.0 feeds
which were retrieved from the W3C's feed.

Data:
  An aggregated graph of RSS 1.0 feeds including the triples retrieved
  from Source URI http://www.w3.org/2000/08/w3c-synd/home.rss

Query:
  SELECT ?x,?y,?z
  WHERE SOURCE <http://www.w3.org/2000/08/w3c-synd/home.rss> (?x ?y ?z)

Result:
  the RDF triples with origin
  <http://www.w3.org/2000/08/w3c-synd/home.rss>

If the application does not support origin information, or no origin
information was recorded when the aggregated graph was created, no
results are returned.

  Issue: Change to ORIGIN keyword?

  Issue: Make RDF triples become non-RDF quads.

Origin information can be returned by queries using a variable with the
SOURCE clause.
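
As an illustration only (not part of the proposed words), the two uses of
SOURCE described above can be sketched over a store that records zero or
more Origins per triple. Everything here is an assumption made for the
example: the OriginGraph class, its method names, the item identifiers and
the second feed URI http://example.org/feed.rss are hypothetical, and
triples are plain string tuples.

```python
from collections import defaultdict

W3C_FEED = "http://www.w3.org/2000/08/w3c-synd/home.rss"
OTHER_FEED = "http://example.org/feed.rss"  # hypothetical second feed


class OriginGraph:
    """Aggregate graph that records zero or more Origin URIs per triple."""

    def __init__(self):
        self.origins = defaultdict(set)  # triple -> set of Origin URIs

    def add(self, triple, origin=None):
        """Add a triple; with origin=None no Origin is recorded for it."""
        recorded = self.origins[triple]
        if origin is not None:
            recorded.add(origin)

    def source_const(self, origin):
        """SOURCE <origin> (?x ?y ?z): triples that have this Origin."""
        return sorted(t for t, found in self.origins.items() if origin in found)

    def source_var(self):
        """SOURCE ?s (?x ?y ?z): one (origin, triple) row per Origin."""
        return sorted((o, t) for t, found in self.origins.items() for o in found)


g = OriginGraph()
g.add(("item1", "rdf:type", "rss:item"), W3C_FEED)
g.add(("item1", "rdf:type", "rss:item"), OTHER_FEED)  # same triple, two Origins
g.add(("item2", "rdf:type", "rss:item"), OTHER_FEED)
g.add(("item3", "rdf:type", "rss:item"))              # no Origin recorded

from_w3c = g.source_const(W3C_FEED)
with_origins = g.source_var()
```

In this sketch a triple with no recorded Origin never matches a SOURCE
pattern, which corresponds to the "no results are returned" case for a
constant Origin.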
Example 9.2: An aggregate graph contains aggregated RSS 1.0 feeds, and
the query wants to return all items, indicating where they were
originally retrieved from, even with duplicates.

Data:
  an aggregate graph of RSS 1.0 feeds

Query:
  SELECT ?s
  WHERE SOURCE ?s (?x rdf:type rss:item)

Results:
  ?s= ... ?x=...
  for each RSS 1.0 item in the aggregate graph.

The ?s variable binds to the Origin URIs of the triple that matched.

If an implementation does not support Origin information, the SOURCE ?s
clause is ignored and no binding value is returned for ?s:

  ?s=null ?x=...
  ...

  Issue: This means adding a null value definition, which I know is
  tricky. The alternative is to not give a result for ?s:

    ?x=...
    ...

  which means the returned result set is not regular.

----------------------------------------------------------------------

My opinion on RDF quads for provenance in queries is well known, and I
have discussed it before in depth[1]. I don't like to see them in RDF
query languages since there is little consensus on what the fourth item
is - that is one thing we are considering. They are of course fine as
implementation techniques. What's inside your application is up to you.

Dave

[1] http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0305.html
Received on Monday, 6 September 2004 16:35:02 UTC