Re: Review Prov-AQ document

Sam,

Thanks for your comments.  I respond to some of them below.  I hope 
my responses adequately address your concerns, but if you feel they don't, 
please feel free to pursue them further.

#g
--

On 20/04/2012 17:21, Sam Coppens wrote:
> Hello all,
>
> Below, you can find my review of the PROV-AQ document. It is a good document and
> I consider it ready for publication as public working draft. Thumbs up for the
> editors.
>
> --------------------
> PROV-AQ Review
>
> The document gives a good description of the access and query mechanisms for
> provenance information. It is well structured and easily understandable,
> including the service specification. I propose to make it available as public
> working draft.
>
> I have some concerns and remarks, but they should not stop the publication of
> the document as public working draft. I raise no issues, but if the remarks are
> considered relevant, feel free to do so.
>
> I propose one addition:
>
> I would consider the ability to do round-trips (Going from the resource to its
> provenance information and back to the resource.) When provenance information is
> accessed using the HTTP protocol, the response of the accessed provenance
> information must then also include an HTTP header denoting the subject of
> the provenance information. E.g. Link: target-URI; rel="isProvenanceFor";
> anchor="provenance-uri". The same can be done for provenance information
> accessed via REST services or resources represented in HTML or RDF. Maybe there
> is a good reason not to do this, but then I would include this motivation into
> the document.

If I understand you correctly, using a Link: header for this would be redundant 
as the target-URI would in any case be present in an RDF rendering of the 
provenance information.  At this stage, I don't think that defining another 
link relation type is helpful (though it remains possible to do this in a 
separate spec if there's a compelling use for it).
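
For illustration only, a client could read the kind of header proposed in the 
review roughly as sketched below.  The relation name "isProvenanceFor" is the 
reviewer's suggestion and is not defined by PROV-AQ, the example.org URIs are 
placeholders, and the parser handles only the simple single-link form shown 
here, not the full Link header grammar.

```python
import re


def parse_link_header(value):
    """Parse one simple Link header value into (uri, params).

    Handles only the form <uri>; key="value"; ... for a single link.
    It is not a full implementation of the Link header grammar.
    """
    match = re.match(r'\s*<([^>]*)>\s*(.*)$', value)
    if match is None:
        raise ValueError("not a simple Link header value: %r" % value)
    uri, rest = match.group(1), match.group(2)
    params = {}
    # Each parameter is ;name="quoted value" or ;name=token
    pattern = r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;\s]+))'
    for key, quoted, bare in re.findall(pattern, rest):
        params[key.lower()] = quoted or bare
    return uri, params


# Hypothetical values: the link target is the resource, and the anchor is
# the provenance URI, following the shape suggested in the review.
header = ('<http://example.org/resource>; rel="isProvenanceFor"; '
          'anchor="http://example.org/provenance"')
uri, params = parse_link_header(header)
```

The same resource URI would also appear as a subject in an RDF rendering of 
the provenance, which is why the header looks redundant in that case.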

> I propose some modifications:
>
> Section 1.1: The term resource needs some clarification. I would indicate that a
> resource can be: an information resource or a non-information resource. (This
> already implies that the resource URI can be dereferencable or not.) This makes
> explicit that provenance can be recorded for non-information resources (e.g. a
> person) and for information resources (e.g. an RDF representation of that person
> or an HTML representation of that person, etc.)

I'd really like to avoid the terms "information resource" and "non-information 
resource" as these terms are sometimes rather controversial, but I'm happy to 
expand the description slightly to emphasize this point.  I propose:

[[
             <dt><dfn>Resource</dfn></dt>
             <dd>also referred to as <dfn>resource on the Web</dfn>: a resource 
  in the general sense of "whatever might be identified by a URI", as described 
by the Architecture of the World Wide Web [[WEBARCH]], <a 
href="http://www.w3.org/TR/webarch/#id-resources">section 2.2</a>. A resource 
may be associated with multiple instances or views (<a 
class="internalDFN">constrained resource</a>s) with differing provenance.</dd>
]]

> Section 3.4: Composite object-packaging formats. ORE and MPEG-21 DIDL are
> usually not packaged into ZIP archives, their datastreams sometimes are for
> storage reasons. BagIt is a sort of `self-descriptive` ZIP archive by
> specification, meant to be transmitted over the Web (e.g. it includes checksum
> information of the included datastreams for validation after transmission). Also
> METS might be considered more relevant these days than MPEG-21 DIDL in the
> digital library and archive community.

I've added METS and dropped the reference to ZIP implementation (though I know 
of ORE implementations that are combined with ZIP packaging).

> Section 4.2: ..., defined by the provenance ontology [PROV-O]. The specified RDF
> object properties, e.g., prov:ProvenanceService, are at this moment not
> specified by PROV-O. Thus, PROV-O and PROV-AQ are out of sync.

Yes indeed.  I've added TODO flags in the document to ensure these don't get 
forgotten.

> Section 7: ... secure HTTP (https) should be used. Why `should`? Shouldn`t this
> be `may`, and if not, why? Now it seems provenance information should always be
> retrieved using https.

The general point of this section is to alert developers that there might be 
serious security/trust concerns around provenance, and to point them in the 
direction of some helpful practices.

I accept there are times when https is not needed, and that there are other 
possibilities.

I propose to keep the SHOULD for now, but expand the context a little:
[[
       <p>
         Secure HTTP (https) SHOULD be used across unsecured networks when 
accessing provenance information that may be used as a basis for trust 
decisions, or to obtain a provenance URI for same.
       </p>
]]
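
As a rough sketch of how a client might act on this recommendation, the 
check below flags a provenance URI that is not https when the provenance will 
feed a trust decision over an unsecured network.  The function and parameter 
names are hypothetical, and because the wording is SHOULD rather than MUST the 
result is meant as a warning, not a hard failure.

```python
from urllib.parse import urlsplit


def uses_https(uri):
    """Return True if the URI's scheme is https."""
    return urlsplit(uri).scheme == "https"


def https_recommended(uri, used_for_trust, network_secured):
    """Return True when the proposed SHOULD applies but https is not in use.

    used_for_trust:  the provenance may be a basis for trust decisions.
    network_secured: the network path is already secured by other means.
    Both flags are assumptions supplied by the caller; nothing here
    detects them automatically.
    """
    return used_for_trust and not network_secured and not uses_https(uri)


# Placeholder URI for illustration only.
warn = https_recommended("http://example.org/provenance", True, False)
```

A caller could log a warning when `warn` is true and carry on, which matches 
the intent of alerting developers without forbidding plain HTTP outright.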

#g
--

> Some spelling corrections:
>
> Section 3.2: The target-uri given by the anchor link element specifies an
> identifier for the document ... instead of ...specifies an specifies an
> identifier ...
>
>
>

Received on Thursday, 26 April 2012 14:03:37 UTC