Re: PAQ document update, target renamed as context from Olaf Hartig on 2011-08-21 (public-prov-wg@w3.org from August 2011)

From: Olaf Hartig <hartig@informatik.hu-berlin.de>
Date: Sun, 21 Aug 2011 18:15:03 +0200
To: public-prov-wg@w3.org
Message-Id: <201108211815.05141.hartig@informatik.hu-berlin.de>
Hey,

On Friday 19 August 2011 19:25:19 Graham Klyne wrote:
> Olaf,
> 
> Many thanks for your feedback.  It is most valuable.  Most of your comments
> have been actioned in the draft version at
> http://dvcs.w3.org/hg/prov/raw-file/cf79b13c1217/paq/provenance-access.html

The changes look good. Thanks!

My responses to your comments inline:

> There are a couple of issues you raise here which I am treating as
> unresolved:
> 
> (1) the subject of HTTP/HTML links specifying provenance-URI, context-URI. 
> This was already raised as ISSUE 68, so I've added a note there.

see below

> (2) the section on adding links to RDF original data remains incomplete.  I
> briefly discuss it below, but I have not yet updated the document other
> than to add some comments.

ditto
 
> Olaf Hartig wrote:
> [...]
> > --Section 3.1--
> > *) Why do we begin with the POWDER mechanism here? I would propose to
> [...]
> > *) Regarding the first Issue (i.e. a separate Link header field for
> > anchor or anchor as a parameter): We should pick the second because it
> > is precise about which provenance-URI is associated with which
> > context-URI.
> 
> That is true.  But that precision cannot be achieved using the alternative
> mechanisms, especially the HTML <link> element, so I'm actually leaning the
> other way.

I would consider HTTP Link header fields more important than HTML link elements 
because they serve a more general use case. In other words, we shouldn't 
introduce an unnecessary limitation on the precision of the HTTP Link based 
mechanism that we propose only because the (more specific) HTML link based 
mechanism isn't expressive enough.
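
To illustrate the precision argument, here is a minimal sketch in Python. It 
assumes a link relation named "provenance" and made-up example.org URIs; the 
function names are hypothetical and the parser only handles simple header 
values. With anchor as a parameter, every provenance-URI is paired with 
exactly one context-URI. The fallback to the request URI when anchor is 
missing is the generic Web Linking default, not something the PAQ document 
has settled.

```python
import re

def parse_link_header(value):
    """Split one Link header field value into (target, params) pairs."""
    links = []
    link_re = r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
    param_re = r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))'
    for match in re.finditer(link_re, value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for name, quoted, bare in re.findall(param_re, raw_params):
            params[name.lower()] = quoted or bare
        links.append((target, params))
    return links

def provenance_pairs(value, request_uri):
    """Return (context-URI, provenance-URI) pairs found in a Link header."""
    pairs = []
    for target, params in parse_link_header(value):
        # "provenance" as the rel value is an assumption for this sketch.
        if params.get("rel") == "provenance":
            # Assumption: without an anchor, the context is the request URI.
            pairs.append((params.get("anchor", request_uri), target))
    return pairs

header = (
    '<http://example.org/prov/a>; rel="provenance"; '
    'anchor="http://example.org/ctx/a", '
    '<http://example.org/prov/b>; rel="provenance"; '
    'anchor="http://example.org/ctx/b"'
)
pairs = provenance_pairs(header, "http://example.org/doc")
```

With a separate header field that carries only the context-URI, a consumer 
that receives two provenance links and two context links could not tell 
which belongs to which.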

> I've updated ISSUE 68
> (http://www.w3.org/2011/prov/track/issues/68) to mention this problem. 
> I'm treating it as currently unresolved.

I don't see how this question (i.e. using a separate Link header field for 
anchor or anchor as a parameter) is _directly_ related to ISSUE 68. Most of 
the note that you added to ISSUE 68 is irrelevant to ISSUE 68, which should 
not be conflated with ISSUE 73 and the corresponding first Issue in Sec.3.1 of 
the PAQ document. ISSUE 68 is about the case where the anchor parameter is 
missing (and there is no Link of the additional type that we may or may not 
introduce). For that reason, I propose that you remove the irrelevant parts 
from the note that you added to ISSUE 68.
 
> [...]
> > --Section 3.2--
> > *) Same issue with POWDER as raised for Sec.3.1.
> 
> Agreed.  Reference removed as noted above.

It seems you forgot that one. It's fine in Sec.3.1 now, but Sec.3.2 still 
begins with the POWDER related introduction.
 
> [...]
> > --Section 3.3--
> > *) For prov:hasProvenance triples I still don't understand how the
> > subject is associated to the set of RDF triples that contains the
> > corresponding prov:hasProvenance triple. To put it differently, what URI
> > do I as a publisher use in the subject position of a prov:hasProvenance
> > triple if I want to say that the object resource represents provenance
> > information about that very set of triples which currently represent the
> > resource in question?
> 
> You use the URI of the containing RDF.

What exactly is this URI?
Let's use the following example to clarify my confusion: In order to retrieve 
data about

   <http://dbpedia.org/resource/Berlin>

I retrieve a representation of the Web resource identified by URI

  <http://dbpedia.org/data/Berlin>

I parse this _representation_ and obtain some RDF triples. Obviously, this set 
might be different today than it was yesterday, because the data in DBpedia 
changes and, thus, I get different representations. Now, my question is, what 
URI should the DBpedia guys use as subject in a prov:hasProvenance triple that 
may occur in the representation served today, if they want to refer to 
provenance information about _today's_ data about Berlin? It cannot be the URI

  <http://dbpedia.org/data/Berlin>

because tomorrow that URI might not identify today's data about Berlin 
anymore.

(This discussion is in some sense related to ISSUE-68)

> For RDF documents, this is sometimes written as an empty URI-reference; e.g.
> 
>    <rdf:Description rdf:about="">
>      <prov:hasProvenance rdf:resource="(provenance_URI)"/>
>    </rdf:Description>

No. At least not exactly ;-)  The subject of the RDF triple encoded in this 
RDF/XML snippet is the base URI (without any fragment part) of the 
corresponding RDF/XML document (see Sec.5.3 in the RDF/XML spec [1]).
If such a base URI is not explicitly defined in the document, then the rules 
from Sec.4.1 of the XML Base spec [2] apply. For the DBpedia example this 
means that the base URI is

  <http://dbpedia.org/data/Berlin>

because DBpedia serves RDF/XML serializations without an xml:base attribute. 
As mentioned before, I wouldn't consider that URI suitable as the subject of 
prov:hasProvenance triples.

[1] http://www.w3.org/TR/rdf-syntax-grammar/#section-baseURIs
[2] http://www.w3.org/TR/2001/REC-xmlbase-20010627/#rfc2396
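
The resolution rule can be sketched in a few lines of Python using only the 
standard library. This is a simplified illustration, not a full RDF/XML 
parser: resolve_subject is a hypothetical helper, it assumes that any 
xml:base value is already absolute, and the snapshot URI in the usage note 
below is made up.

```python
from urllib.parse import urldefrag, urljoin

def resolve_subject(retrieval_uri, about, xml_base=None):
    """Resolve an rdf:about value against the document's base URI."""
    # Without xml:base, the base URI is the URI the document was retrieved from.
    base = xml_base if xml_base is not None else retrieval_uri
    # The fragment part of the base URI is dropped before resolving.
    base, _fragment = urldefrag(base)
    # An empty reference resolves to the base URI itself.
    return urljoin(base, about)

subject = resolve_subject("http://dbpedia.org/data/Berlin", "")
```

For the DBpedia case, subject is <http://dbpedia.org/data/Berlin>, which is 
the URI I consider unsuitable. A publisher that served an explicit, 
version-specific xml:base (e.g. a hypothetical 
http://example.org/snapshots/2011-08-21/Berlin) would get that URI as the 
subject instead.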

> (If publishing the RDF in a named graph, then use the URI of the graph.)

I would agree here iff everybody understood Named Graphs as immutable
sets of RDF triples (which is not the case, I guess).
 
> I agree this section needs fleshing out still.  I guess I was waiting for
> the dust to settle on the provenance model and vocabulary.

Okay.

Thanks,
Olaf
Received on Sunday, 21 August 2011 16:15:50 UTC