- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Thu, 17 Jan 2013 11:35:14 +0000
- To: Provenance Working Group <public-prov-wg@w3.org>
On Thu, Jan 10, 2013 at 2:56 PM, Provenance Working Group Issue Tracker <sysbot+tracker@w3.org> wrote:
> PROV-ISSUE-613 (prov-aq-draft-review): Review paq for release as last call working draft [Accessing and Querying Provenance]
> https://dvcs.w3.org/hg/prov/raw-file/b3f397c7b15c/paq/prov-aq.html

Here is my partial review of the above document PROV-AQ. Due to travelling and sick days I have not been able to review sections 4, 5, 6, or the appendices.

1) Could we have a more detailed "Changes since last version" appendix, like in our other documents?

1.1 Concepts

2) Why the term "Target-URI"? As far as I can understand, this is "Entity-URI". It is only vaguely hinted that this is the identifier for the prov:Entity I should be looking for.

1.2 Provenance and resources

3) These paragraphs talk about 'revisions' and 'versions' interchangeably. In terms of provenance this can get a bit confusing. I would use only the term "revision".

4) "must be persistent and not themselves dependent on context" --> "must be persistent and must not themselves be dependent on context"

5) "In summary, a provenance description may be not universally applicable to a resource, but may be expressed with respect to that resource in a restricted context (e.g. at a particular time). This restriction is itself just another resource (e.g. the weather forecast for a give date as opposed to the current weather forecast), with its own URI for referring to it within a provenance description."
- this summary is, I'm afraid, more confusing than the previous 3 paragraphs. Could this be written in lighter language?

1.4 URI types and dereferencing

6) "Service-URI A provenance query service (i.e. a resource of type prov:ProvenanceQueryService)."
You can't use "i.e." here - we've never heard about prov:ProvenanceQueryService before. I don't think the type should be listed here, as that is specific to section 4 (and possibly 3.3, although it is not mentioned there).

7) "Provenance-URI A provenance description in the sense described by [PROV-DM] (PROV Overview)."
I am uncertain as to what this means. Does this mean a PROV structure description - as given in PROV-DM - or any odd provenance description? From the feeling of the rest of the document I understand it is any kind of provenance description, so I find the reference to PROV-DM odd here. (I do recognize that we should say strongly that a PROV format SHOULD be one of the formats - but this table is not the right place for it.)

2. Accessing provenance descriptions

8) "There is no requirement that a bundle identifier can be dereferenced to access the corresponding provenance, but where practical it is RECOMMENDED that matters be arranged so this is possible."
- although this is not a formal specification, I don't think we need to write in 1850's legal English, so I would kindly request the honourable gentlemen to provide a more directly specified recommendation than "matters to be arranged".

9) "One possible realization of a bundle is that it is published as part of an RDF Dataset [RDF-CONCEPTS11] or similar composite structure containing multiple RDF graphs in a single document. To access such a bundle would require accessing the RDF Dataset and then extracting the identified component; this in turn would require knowing a URI or some other way to retrieve the dataset. This specification does not describe a specific mechanism for extracting components from a document containing multiple graphs."
- this all sounds very speculative and I don't see why it belongs in here at all. The various PROV serializations already define, to a larger or smaller extent, how to represent PROV bundles.

3. Locating provenance descriptions

10) "If a provenance description is a resource that can be accessed using web retrieval, one needs to know its provenance-URI to dereference. If this is known in advance, there is nothing more to specify.
If a provenance-URI is not known then a mechanism to discover one must be based on information that is available to the would-be accessor."
- I don't understand this, and I don't understand why this is in the document. Could we try to write the document more like a specification rather than a philosophical "what-if" paper?

11) "provider is an agent that collects or constructs some information and makes it available. The nature of the information or the means by which it is made available are not constrained, but the following discussion focuses on provenance descriptions made available by HTTP transactions (i.e. where the provenance provider is an HTTP server),"
-- Just simplify this to the same style as consumer: "provider is an agent that makes available provenance descriptions". I don't think we need to mention HTTP at all here, as only one of the 3 mechanisms deals with HTTP.

12) "We consider here mechanisms for a provider to indicate a provenance-URI or service-URI along with a target-URI."
This document is not a paper that considers things and reports results - this is a specification on how to do things. Change to "We here define".

13) "primary current web protocol and data formats" -> "current primary web protocol and data formats"

14) "While a provider should avoid giving spurious information, there are no fixed semantics, particularly when multiple resources are indicated, and a client should not assume that a specific given provenance-URI will yield information about a specific given target-URI. In the general case, a client presented with multiple provenance-URIs and multiple target-URIs should look at all of the provenance-URIs for information about any or all of the target-URIs."
- this paragraph sounds out of place - and it's anyway too early, as we have not yet seen a single way to get to this information. Delete it here and keep it only in the appendix "Security Considerations".

15) "In the general case, a client presented with multiple provenance-URIs and multiple target-URIs should look at all of the provenance-URIs for information about any or all of the target-URIs."
- this is very low-level detail, and I don't understand it at this point (I've not seen my first target-URI yet!), so it's simply too heavy and too early to start with all the exceptions and edge-cases before I have even read about how to do it in the first place. Move all such considerations to the end.

16) "does not preclude the possibility that other publishers may"
- not heard about "publisher" before - perhaps "provider"?

17) "Provenance indicated in this way is not guaranteed to be authoritative. Trust in the linked provenance descriptions must be determined separately from trust in the original resource. Just as in the web at large, it is a user's responsibility to determine an appropriate level of trust in any other linked resource; e.g. based on the domain that serves it, or an associated digital signature. (See also section 6. Security considerations.)"
- this is just repeated blurb from half a screen up - although I think this is a slightly better place to mention it, so I am OK to leave it here as long as the previous blurb goes.

18) The document talks about URIs - but generally these days specifications talk about IRIs. Any reason for this (like HTTP Link headers must be URIs), and could we clarify this in an appendix?

19) "There may be multiple hasQueryService link header fields, and these may appear in an HTTP response together with hasProvenance link header fields (though, in simple cases, we anticipate that hasProvenance and hasQueryService link relations will not be used together)."
- I think both 'may' should be 'MAY' - to correspond with equivalent section in 3.2.

20) Can the Link: <pre> blocks be broken into several lines? On my printout it is cut out just after #hasProvenance.
I suggest:

    Link: <provenance-service-URI>;
        rel="http://www.w3.org/ns/prov#hasQueryService";
        anchor="target-URI"

This should also be valid HTTP (and is used in the 3.1.2 example).

21) Can we have an example of the two Link headers in use here? I find it confusing due to the <two> "styles" of URIs.

3.1.2 Content negotiation

22) The example seems to use HTTP 0.9. Could it be updated for HTTP 1.1?

3.2 Resource represented as HTML

23) Can the two <link> header lines be <b>old in both examples?

24) "The provenance-URI given by the hasProvenance link element" ... "The target-URI given by the hasAnchor link element"
- I found these confusing, because I could not easily find "hasProvenance" and "hasAnchor" above - as they are bits of the URI. If you don't want to repeat the full URIs here, then highlight the two terms more (super-bold?) in the pre above.

This is particularly confusing for hasAnchor - because in this style you have two <link> entries, while in the HTTP example this was just a single link entry with an optional parameter.

I don't like the approach here with the anchors disconnected from the hasProvenance - especially not as it is not consistent with the approach of 3.1. I would have preferred the two approaches to be equivalent. As it stands, I can't construct the Link headers of 3.1 based on the HTML in 3.2 or the RDF in 3.3.

Although I don't particularly like it, I might recommend changing 3.1 to also have a separate 'hasAnchor' relation, to make it consistent. (Also it would allow the off-spec use of hasAnchor without provenance links.)

3.2.1 specifying provenance query service

25) "(though, in simple cases, we anticipate that hasProvenance and hasQueryService link relations would not be used together)."
- I would drop this sentence. I thought hasProvenance was for simple cases.

26) " (These terms may be used to indicate provenance of arbitrary other resources too, but discussion of such usage is beyond the scope of this section.)
" - so where is the section where I can read about this? It sounds important and useful.

27) "The RDF property prov:hasProvenance is a relation between two resources, where the object of the property is a resource that presents a provenance description of the subject resource."
- I would add the term provenance-URI here.

28) "This property corresponds to a hasProvenance link relation used with an HTTP Link header field, or HTML <link> element (see above)."
and
"This corresponds to use of the anchor parameter in an HTTP provenance Link header field, or a hasAnchor link relation in an HTML <link> element, which similarly indicate a URI used by the provenance description to refer to the described document.",
"This property corresponds to a hasQueryService link relation used with an HTTP Link header field, or HTML <link> element."
- I would totally drop these sentences - as long as you specify in funny font that it is target-URI and provenance-URI you are defining, it's OK. Section 3.2 doesn't have an equivalent statement, and reads quite easily.

29) Example
Add "Turtle syntax [TURTLE]" somewhere near this example.

30) Example
Remove the use of invalid and confusing ":" for continuation - if anything use
    # .. RDF data ...

31) Why are the provenance relations long URIs, rather than registered Link Types? I might have missed something, because earlier we suggested to register such link types as "provenance".

32) According to http://tools.ietf.org/html/rfc5988#section-4.2

    When extension relation types are compared, they MUST be compared as
    strings (after converting to URIs if serialised in a different format,
    such as a Curie [W3C.CR-curie-20090116]) in a case-insensitive fashion,
    character-by-character. Because of this, all-lowercase URIs SHOULD be
    used for extension relations.

Should we not have relation URIs that are all lowercase to avoid problems? i.e.

    Link: <http://acme.example.org/provenance/super-widget>; rel="http://www.w3.org/ns/prov#hasprovenance"

33) Section 5 - Link examples don't have appropriate quoting of rel and anchor.

NOTE: I have not reviewed sections 4, 5, 6, A, B, C due to time constraints. I might try to finish that tomorrow.

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Thursday, 17 January 2013 11:36:03 UTC