Re: prov-aq review for release as working draft (ISSUE-613)

From: Timothy Lebo <lebot@rpi.edu>
Date: Wed, 16 Jan 2013 22:53:08 -0500
Cc: Provenance Working Group WG <public-prov-wg@w3.org>
Message-Id: <68F10501-FFCD-42C0-86B1-9E62AB29B14D@rpi.edu>
To: Paul Groth <p.t.groth@vu.nl>, Graham Klyne <graham.klyne@zoo.ox.ac.uk>
Paul, Graham,

Thanks for such a great document!

My review is below.


On Jan 10, 2013, at 10:13 AM, Paul Groth <p.t.groth@vu.nl> wrote:

> Dear all,
> PROV-AQ is now ready for review. This should be considered as a "last call" working draft version. 
> You can find the draft to review at:
> https://dvcs.w3.org/hg/prov/raw-file/b3f397c7b15c/paq/prov-aq.html
> Tim, Simon, Luc, Dong and Stian agreed to review but all comments are appreciated.
> Questions for reviewers
> - Can this be released as a last call working draft?

Most definitely yes.

> - Is the name provenance access and query appropriate for the document?

I see no problems with the current name.

> - If not, where are the blocking issues?


> - If yes, are there other issues to work on?

Detailed comments below; they are not much more than nits.

> We particularly encourage reviewers to look at Section 5 Forward provenance as this is a new section.
> In your review please include ISSUE-613

copy/paste error ("this document" should not be attached to PROV-DM):

PROV-DM (Candidate Recommendation), the PROV data model for provenance (this document);

Should "PROV-AQ" be linked in:

PROV-AQ (To be published as Note), the mechanisms for accessing and querying provenance (this document);


I wonder if
"B. Names added to prov: namespace"
should become
"B. _Terms_ added to prov: namespace"


"The Provenance Data Model [PROV-DM], Provenance Ontology [PROV-O] and related specifications (see the [PROV-OVERVIEW]) define how to represent provenance in the World Wide Web."
suggest change to:
"The Provenance Data Model [PROV-DM], Provenance Ontology [PROV-O] and related specifications define how to represent provenance in the World Wide Web (see the [PROV-OVERVIEW])."


The reference to "(see section 1.2 for discussion)" in the definition of Target-URI #dfn-target-uri seems like too much of a crutch. And, going there does not seem to clarify the concept as much as the reference seems to suggest.

I suggest removing the "(see section 1.2 for discussion)" from the definition of Target-URI and, if something is "missing" from the definition, simply adding it there.


Provenance description
refers to provenance represented in some fashion.
Why the need for "description"? The DM's definition for provenance is that it is a "record" [1], which _is_ a description and "represents in some fashion." Using more terms for the same concept seems like it could confuse readers.

[1] http://www.w3.org/TR/prov-dm/#introduction


Provenance query service
a query service that provides a provenance description given a target-URI or other information about the desired provenance.
Service-URI
the URI of a provenance query service.

These two definitions do not seem to be aligned. A provenance query service is a query service, but the URI that denotes it is simply a "Service URI"? A service is more general than a query service (and than a provenance query service).

If this is done for brevity, I guess it's reasonable. But it reads oddly in isolation.


Similar to 5) above (the section reference in the Target-URI definition), why use a section reference as a crutch in a definition? Or, why don't they _all_ have a section reference?

"(see section 5. Forward provenance)"


Section 1.2 Provenance and resources spends a few paragraphs describing "prov:specialization" without actually talking about prov:specialization.
It leaves a sense that "resources are slippery" without much promise that it will be adequately addressed by PROV.

Suggest to add a line at the end of 1.2 foreshadowing how this kind of situation will be addressed.


1.4 URI types and dereferencing

What does "actual" in 
"actual URIs for accessing provenance descriptions are determined via the query service description." mean?
It seems to imply that the Service-URI is _not_ an actual URI.


It seems to me that "individual" could be dropped from:

"When there is no easy way to associate a provenance-URI with individual resources"

since the "smallness", or "distinctness" is not really part of the condition for when to use a provenance query service.

(I see that "multiple" is used later in that sentence, but I'm not sure why the multiplicity matters)


The following seems out of place in section 2:

"When publishing provenance descriptions, corresponding provenance-URIs or service-URIs should be discoverable using one or more of the mechanisms described in section 3. Locating provenance descriptions."

Perhaps if it were combined with the previous para-sentence, it would make sense (i.e., provenance-uris and service-uris should be included in "return enough")?


By the time I get to:

"The consumer of a provenance description will generally need to isolate information about some specific target resource or resources. These may be constrained resources identified by separate target-URIs than the original resource. In such circumstances, a provenance consumer will need to know the target-URI used by a provenance description."

I'm still not sure what a Target-URI is. Is it the URI of a provenance description? (I don't think so) Is it a possibly distinct URI that is used to denote the *same* resource about which we originally asked for provenance? (probably not "same", but "closeish" -- but what does "closeish" mean?) Hopefully some examples will come along to help clarify. If I request URI X and I'm given a Target-URI of Y, how do X and Y relate? I know I should find Y described in the given provenance-uri, but once I find it how do I know what it means w.r.t. URI X?



The presence of a hasProvenance link in an HTTP response does not preclude the possibility that other publishers may offer provenance descriptions about the same resource. In such cases, discovery of the additional provenance descriptions must use other means (e.g. see section 4. Provenance query services).

Should a reference to the forward provenance section be included, too?
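
To make the HTTP case concrete, here is a rough sketch of how a client might pull the provenance-URI and the optional anchor (the target-URI) out of a Link header. The relation URI, the example URIs, and the function name are assumptions for illustration, not text from the draft, and the parsing is deliberately naive (it assumes no commas inside URIs or quoted values).

```python
import re

# Assumed relation URI for the hasProvenance link; check the draft for the exact value.
HAS_PROVENANCE = "http://www.w3.org/ns/prov#hasProvenance"


def provenance_links(link_header, request_uri):
    """Return (provenance_uri, target_uri) pairs from a Link header value.

    If a link carries no anchor parameter, the target-URI is taken to be
    the URI that was requested.
    """
    pairs = []
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]*)>(.*)', part)
        if not match:
            continue
        uri, params = match.group(1), match.group(2)
        # Collect quoted parameters such as rel="..." and anchor="..."
        found = dict(re.findall(r';\s*(\w+)\s*=\s*"([^"]*)"', params))
        if HAS_PROVENANCE in found.get("rel", "").split():
            pairs.append((uri, found.get("anchor", request_uri)))
    return pairs
```

With this, a request for URI X that comes back with an anchor of Y yields the pair (provenance-URI, Y); how Y relates to X is exactly what the text still needs to say.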



"An HTML document header may include multiple hasProvenance link elements, indicating a number of different provenance descriptions that are known to the creator of the document, each of which may provide provenance about the document."

Suggest adding another sentence about the ambiguity of multiple anchors and multiple hasProvenance links. It could refer back to section 3 where this is mentioned: "a client presented with multiple provenance-URIs and multiple target-URIs should look at all of the provenance-URIs for information about any or all of the target-URIs."
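
A small sketch of what "look at all of the provenance-URIs for any or all of the target-URIs" could mean for a client reading an HTML header. Both rel values and all names here are assumptions for illustration (the draft's exact relation names may differ), and a document with no anchor link is not handled.

```python
from html.parser import HTMLParser

# Assumed rel values; the draft's exact relation names may differ.
REL_PROVENANCE = "http://www.w3.org/ns/prov#hasProvenance"
REL_ANCHOR = "http://www.w3.org/ns/prov#hasAnchor"


class ProvenanceLinkCollector(HTMLParser):
    """Collect provenance-URIs and target-URIs from <link> elements."""

    def __init__(self):
        super().__init__()
        self.provenance_uris = []
        self.target_uris = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel, href = attributes.get("rel"), attributes.get("href")
        if href is None:
            return
        if rel == REL_PROVENANCE:
            self.provenance_uris.append(href)
        elif rel == REL_ANCHOR:
            self.target_uris.append(href)


def lookups_to_try(html_text):
    """Pair every provenance-URI with every target-URI found in the markup."""
    collector = ProvenanceLinkCollector()
    collector.feed(html_text)
    return [(p, t) for p in collector.provenance_uris for t in collector.target_uris]
```

Two hasProvenance links and two anchors give four lookups, which is the ambiguity worth spelling out in the text.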



"It is recommended that this convention be used only when the document is static and has a stable URI that is reasonably expected to be available to anyone accessing the document (e.g. when delivered from a web server"

Couldn't one embed RDFa to give the URI of the document? If so, then we could relax the recommendation to keep it at a fixed location.


Similar to 15) above (multiple hasProvenance links in an HTML header), it would be helpful to reinforce the "multiplicity problem" by referring back to the beginning of section 3.


"returning a provenance for a particular resource"
"returning a provenance description for a particular resource"


fantastic! I greatly appreciate how these are now two of many possible options.

"if a recognized query mechanism is described, extract information needed to use that mechanism (e.g. a URI template or a SPARQL service endpoint URI); and"


The suggestion to use a blank node should be avoided in 4.1.1:

"where service-URI is the URI of the provenance query service, and query_option_node is any distinct RDF subject node (i.e. a blank node or a URI)."

I suggest stating that the RDF subject node is the URI of the service (which could be chosen by the server).


"variable uri set to the target-URI for which provenance is required"
"variable uri set to the target-URI for which provenance is requested"

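For illustration, setting the variable uri to a target-URI might look like the sketch below. The template string and function name are hypothetical, and this handles only a plain {uri} variable rather than the full URI template syntax.

```python
from urllib.parse import quote


def expand_template(template, target_uri):
    """Substitute a percent-encoded target-URI for the {uri} variable."""
    # safe="" makes quote() encode reserved characters such as ':', '/', '?' and '='
    return template.replace("{uri}", quote(target_uri, safe=""))
```

Encoding the whole target-URI keeps its own query string from being confused with the service's query parameters.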

Similar to 20) above (the blank node in 4.1.1), I recommend not encouraging a bnode in 4.1.2:

"query_option_node is any distinct RDF subject node (i.e. a blank node or a URI)."


Should the domains in "acme.example.com, and is subsequently used by wile-e.example.org" be in a different font style?


"is the provenance-URI that has been described previously." - cite the section with a link?


perhaps add to "Relates a resource to a provenance ping back service" so that it becomes:
"Relates a resource to a provenance pingback service that may receive provenance about the resource."

All done. Great job!


> Thank you,
> Paul and Graham
> -- your friendly prov-aq editors
> -- 
> --
> Dr. Paul Groth (p.t.groth@vu.nl)
> http://www.few.vu.nl/~pgroth/
> Assistant Professor
> - Knowledge Representation & Reasoning Group | 
>   Artificial Intelligence Section | Department of Computer Science
> - The Network Institute
> VU University Amsterdam

