Re: PAQ document update, target renamed as context from Graham Klyne on 2011-08-20 (public-prov-wg@w3.org from August 2011)

From: Graham Klyne <GK@ninebynine.org>
Date: Sat, 20 Aug 2011 10:34:21 +0100
To: Yogesh Simmhan <simmhan@usc.edu>
CC: 'Paul Groth' <p.t.groth@vu.nl>, 'W3C provenance WG' <public-prov-wg@w3.org>
Message-ID: <4E4F7F9D.30202@ninebynine.org>
Hi Yogesh,

Mainly by way of acknowledgment, some comments inline...

Yogesh Simmhan wrote:
> Hi Graham,
> 
> Thanks for your responses. Some comments below.
> 
> | -----Original Message-----
> | From: public-prov-wg-request@w3.org [mailto:public-prov-wg-request@w3.org]
> | On Behalf Of Graham Klyne
> | Sent: Friday, August 19, 2011 7:05 AM
> | To: Yogesh Simmhan
> | Cc: 'Paul Groth'; 'W3C provenance WG'
> | Subject: Re: PAQ document update, target renamed as context
> | 
> |  <snip/>
> |
> | > - I believe there is no guarantee that the provenance-URI will provide
> | > provenance information about the context-URI. Suggest we use *should* rather
> | > than (implicitly) *must* to state that the returned provenance-uri should
> | > have provenance information about the resource view identified by the
> | > context-uri.
> | 
> | I think I see your point, but I am concerned that making that possibility
> | explicit here might be confusing for a reader.  I wonder if this would be
> | better served by a new sub-section in sect 2 about interpreting provenance
> | information?
> | 
> | I've tagged this as an issue in the document for now.
> | 
> I agree. For a more detailed discussion, do you think a (non-normative) appendix
> section can deal with some of these issues and in addition provide several
> concrete use cases? Maybe even using the examples from the F2F1 scenario present
> in the wiki?


I propose that I work on something, then we can figure out whether it makes
sense and where it belongs.

I agree it would be good to see the notions exposed through the example as well.


> | > *) "An HTTP response may include multiple provenance link headers...
> | > Likewise, an HTTP response may include... "
> | > - Besides the above issue of the provenance being related to the resource
> | > being accessed (rather than the context-uri), I would like some clarity on
> | > what the multiple "anchor" values mean. I would expect when multiple
> | > provenance-URIs and context-URIs are returned through multiple "Link:"
> | > headers, then one or all the provenance-URIs *may* describe one or all the
> | > context-URIs. It is up to the accessor to access each of the provenance-URIs
> | > to determine which of them describe which context-URIs. If this is indeed
> | > the intention, can it be made clearer? Also, it is not clear what resource
> | > you mean by "the resource may": the provenance resource or the resource
> | > being accessed by the HTTP GET/HEAD?
> | 
> | Yes, this is a point that needs clarifying.  It is also a (small) difference
> | between using "Link anchor=..." vs two separate link headers.
> |
> Yes, that is true. 


From thinking about Olaf's comments, I'm starting to think the "anchor" and
provenance context-URI might be subtly different.  Still thinking.
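
To make the "Link anchor=..." case concrete, here is a minimal sketch of how a
consumer might group provenance-URIs by context. It assumes the anchor
parameter can simply be read as the context-URI (which, as noted above, may not
be quite right), and that a link without an anchor applies to the requested
resource, which is the Link header default. The header values, URIs and
function names are all made up for illustration.

```python
import re

# Hypothetical Link header values; the URIs are invented for this example.
LINK_HEADERS = [
    '<http://example.org/prov/a>; rel="provenance"; '
    'anchor="http://example.org/doc/v1"',
    '<http://example.org/prov/b>; rel="provenance"',
]


def parse_link(value):
    """Parse one simple Link header value into (target, params)."""
    match = re.match(r'\s*<([^>]*)>(.*)', value)
    if match is None:
        raise ValueError("not a Link header value: %r" % value)
    target, rest = match.group(1), match.group(2)
    params = {}
    # Handles only simple name="value" or name=value parameters.
    for name, val in re.findall(r';\s*([A-Za-z*]+)\s*=\s*"?([^";]*)"?', rest):
        params[name.lower()] = val
    return target, params


def provenance_by_context(link_values, request_uri):
    """Group provenance-URIs by the context they are attached to.

    Assumption: 'anchor' is treated as the context-URI; without an
    anchor, the context defaults to the requested resource.
    """
    result = {}
    for value in link_values:
        target, params = parse_link(value)
        if "provenance" not in params.get("rel", "").lower().split():
            continue
        context = params.get("anchor", request_uri)
        result.setdefault(context, []).append(target)
    return result
```

Even with this grouping, the accessor would still have to retrieve each
provenance-URI to find out what it actually describes.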


> | > == Sec 3.2 ==
> | > *) The "Appendix A. Notes on Using the Link Header with the HTML4 Format"
> | > suggests three possible ways of serializing extension relationship types
> | > (such as "provenance") into HTML4: an absolute URI, using the HEAD element's
> | > profile attribute prefix, or an RDFa namespace prefix. We seem to be using
> | > none of the three and the "provenance" relationship we use in the "rel"
> | > attribute is not a URI. Should we instead adopt an absolute URI for the
> | > relationship type (e.g. "http://www.w3.org/2011/prov/linktype/provenance")
> | > or reuse the RDFa's prov:hasProvenance that we introduce? Or is my reading
> | > of that appendix entry incorrect and does not apply to extension relation
> | > types that are registered with IETF? Ditto for the "anchor" relation.
> | 
> | I was in two minds about leaving that reference in.  The reason I did was that
> | it discusses the correspondence between HTTP link headers and HTML <link>
> | elements, and provides some general background information.  In view of your
> | comment, I'm inclined to remove the note.
> | 
> | Separately, using an absolute URI would have an advantage of making the HTML
> | more directly aligned with RDF usage (if we use the RDF property URI), but the
> | disadvantage of requiring a harder-to-remember URI rather than a fairly
> | intuitive name.  My intuition is that it would be more approachable for
> | authors and developers creating HTML to just use the name.
> |
> I see your point and agree on the need for making it easy for authors. But do
> you see us breaking compatibility with the appendix, especially since it states:
> "Surveys of existing HTML content have shown that unregistered link
>    relation types that are not URIs are (perhaps inevitably) common.
>    Consuming HTML implementations should not consider such unregistered
>    short links to be errors, but rather relation types with a local
>    scope (i.e., their meaning is specific and perhaps private to that
>    document)."


I think I've now dropped that reference as causing more confusion than helpful 
information.  Responding to the substantive comment here, if we register the 
"provenance" relation type, I think the concern is addressed.


> Should we say that in HTML, authors should serialize the "provenance" relation
> as (e.g.) "http://www.w3.org/2011/prov/rel/provenance" or "provenance", but the
> former is preferred? Note that this does not apply to the HTTP relation that can
> continue to be just "provenance" since it is going to be a registered extension.
> On the other hand, if we do get "provenance" registered as a HTTP web link
> extension relation, even using "provenance" in HTML is credible.


I'd rather there not be alternatives - not good for interoperability.  I think 
we already have too many alternative ways of presenting things, which makes more 
work for consumers of provenance.
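
As a rough illustration of the cost to consumers, here is a sketch of
extracting provenance links from HTML. The sample markup, URIs and function
names are hypothetical. With a single agreed relation name the consumer checks
one token; every alternative spelling we allow is another value each consumer
has to know about.

```python
from html.parser import HTMLParser

# Hypothetical documents; the href values are invented for this example.
SHORT_FORM = ('<html><head>'
              '<link rel="provenance" href="http://example.org/prov/doc1">'
              '<link rel="stylesheet" href="style.css">'
              '</head><body></body></html>')
URI_FORM = ('<html><head>'
            '<link rel="http://www.w3.org/2011/prov/rel/provenance" '
            'href="http://example.org/prov/doc2">'
            '</head><body></body></html>')


class ProvenanceLinkFinder(HTMLParser):
    """Collect href values of <link> elements with an accepted rel token."""

    def __init__(self, rel_names):
        super().__init__()
        self.rel_names = {name.lower() for name in rel_names}
        self.provenance_uris = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated list of tokens, compared case-insensitively.
        rels = {r.lower() for r in (attr.get("rel") or "").split()}
        if rels & self.rel_names and attr.get("href"):
            self.provenance_uris.append(attr["href"])


def find_provenance_links(html, rel_names=("provenance",)):
    finder = ProvenanceLinkFinder(rel_names)
    finder.feed(html)
    finder.close()
    return finder.provenance_uris
```

A consumer that only knows the short name finds nothing in the second
document unless it is also told about the URI form.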


> | > *) "The provenance-URI given by the provenance link element identifies the
> | > provenance-URI for the document. ..."
> | > - I have concerns as before for HTTP Web Links on whether the context-URI
> | > *must* describe the document (or prior views) and the provenance-URI *must*
> | > have provenance information about the resource views identified by the
> | > context-URIs, or they *should* in both cases. I prefer the latter, with the
> | > possibility of providing context-URIs for resources other than the current
> | > document, and provenance-URIs of resource views other than the current
> | > document.
> | 
> | (In case I wasn't clear before, I agree with your position of this being
> | SHOULD rather than MUST.)
> | 
> Thanks, that helps.
> 
> | 
> | > == Sec 3.2.1 ==
> | > *) Any reason why provenance service URI relation has not been added to the
> | > HTTP Web Linking section as a new relation type? Is it just to finish
> | > discussions about the relation before migrating its use to HTTP Web
> | > Linking?
> | 
> | This is a new section, pending wider review.  It's a fairly radical change
> | from what I did before, so I guess I was waiting to see if people were happy
> | with the general approach, before fully integrating it.
> |
> In the call yesterday, there were no issues raised with the way provenance
> service was being used. I may have some comments on it (will send separate email
> on Sec 4+). We may want to bring this up in the next call?


I'm sure it could use some refinement, at least :)

(I think the telecon is a difficult environment within which to articulate or 
respond to specific technical issues, but more useful as a way to draw people's 
attention to areas that could benefit from input.  I'd prefer shorter telecons 
and more time to work on the actual text.  But this is a process issue I'll 
leave to the chairs)


> | > *) One confusion I had was that a provenance service URI may also be used to
> | > dereference a provenance URI (e.g. if the provenance URI were a
> | > urn:uuid:NNN). Re-reading the Concepts, it was not the case and it was only
> | > used to query by context-URI. Not sure if an explicit disambiguation is
> | > required.
> | 
> | I'm not sure I understand the concern/confusion here.  I am guessing this
> | refers to the fact that, as defined, a provenance-service can return either
> | provenance-URIs, provenance information or both.  Cf. section 4.1.  I'm still
> | not sure that both these cases are needed, as they do represent a degree of
> | overlapping functionality, but I can't say which one (if any) might be
> | eliminated.
> | 
> | If this is not it, can you give a more concrete example?
> | 
> The provenance service is currently a provenance discovery service (i.e. mapping
> from context URI to provenance URI) rather than a provenance URI dereferencing
> service (i.e. mapping from provenance URI to a provenance URL). So currently, it
> does not return provenance information, only provenance URI(s). 


As currently drafted, it can do either, or both, depending on what templates it 
exposes in the service description.  In practice, I'd expect a service to do one 
or the other.  Originally, I was contemplating two distinct services to deal 
with the different use-cases as presented, but using the HATEOAS-inspired 
service description model, it seemed easier to combine them.
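
A sketch of what I mean, with everything invented for illustration: the
template strings, the "{uri}" variable name and the service URIs are
assumptions, not the syntax in the draft. The same client code serves both
cases; which one applies depends only on which templates the service
description exposes.

```python
from urllib.parse import quote

# Hypothetical service description: one template per capability.
# A real service might expose only one of these.
SERVICE_DESCRIPTION = {
    "provenance-locations": "http://example.org/prov-service/locations?uri={uri}",
    "provenance-information": "http://example.org/prov-service/provenance?uri={uri}",
}


def expand_template(template, context_uri):
    """Substitute the percent-encoded context-URI for the {uri} variable."""
    return template.replace("{uri}", quote(context_uri, safe=""))


def request_uri_for(description, capability, context_uri):
    """Return the URI to GET for a capability, or None if not offered."""
    template = description.get(capability)
    if template is None:
        return None
    return expand_template(template, context_uri)
```

A client asking for provenance-URIs and one asking for provenance information
differ only in the key they look up, which is why combining the two in one
description seemed simpler than defining two services.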

I'm a bit uncomfortable that we have two ways here to achieve nearly the same 
thing, but it was necessary in order to address all the scenario requirements. 
(This is making me question the application of scenario-driven specification 
development, but that too is another process issue that I don't intend to raise 
now.)


> Provenance providers may wish to assign a URN as the provenance URI and have the
> provenance service map it to a URL. In such a case, having a "template" that
> gives the provenance URL for a provenance URI may be useful. In fact, the
> provenance URL may be present in a different host from the provenance service
> itself. E.g. the very successful dx.doi.org for dereferencing doi URNs comes to
> mind.


This (i.e. a context-URI resolution service) could be added easily enough, I 
think, but I'm really reluctant to add yet another way to achieve nearly the 
same thing.  I believe that every new alternative added will dramatically reduce 
the eventual number of complete implementations.

Also, we don't have to specify *everything*, just enough to provide a common 
baseline for interoperability.  It would be better to under-specify and add more 
later in the light of experience.


> | > - An additional option may be to embed the provenance information directly
> | > within the metadata. I know Yolanda brought this up earlier
> | > (http://www.w3.org/2011/prov/wiki/F2F1_Access_and_Query_Proposal#Issues_beyond_scope)
> | 
> | Revised that para to read "For formats which have provision for including
> | metadata within the file (e.g. JPEG images, PDF documents, etc.), use the
> | format-specific metadata to include a context-URI, provenance-URI and/or
> | service-URI. Format-specific metadata provision might also be used to include
> | provenance information directly in the resource"
> |
> I wonder if we should have an equivalent of this for HTML too (i.e. embedding
> provenance directly into the document). Luc did have a comment yesterday on the
> call about provenance by value and by reference. This may be another issue to
> consider, whether or not it is in scope.


Similar comment to above.  This section is discursive, and to that extent it 
*can* apply to HTML too.  If there's a real need for this, another group can 
specify it.  I think we have more than enough coverage for now.

#g
--
Received on Saturday, 20 August 2011 14:15:54 UTC