- From: Graham Klyne <GK@ninebynine.org>
- Date: Thu, 04 Aug 2011 08:48:23 +0100
- To: Simon Miles <simon.miles@kcl.ac.uk>
- CC: Provenance Working Group WG <public-prov-wg@w3.org>
I think I've seen enough push-back to agree this is a problem for which a
proposal should be drafted. If it turns out to be too complex, we can fall
back to plan B (out of scope), but I'm hopeful a simple answer can be found.

I'll think about lightweight mechanisms to include the additional data. I
like the "anchor" URI possibility for the Link: header; it's a shame that is
not an option for the <link> element. I don't have firm ideas for this yet,
and am seeking advice.

#g
--

Simon Miles wrote:
> Graham, Luc,
>
> I would also vote that this is a real problem to be addressed (which
> is why it is part of the scenario).
>
> It does seem vitally important that the client has the identifier of
> the bob/thing of which they want to find the provenance *exactly as
> used in the provenance data they access*. Otherwise, they haven't
> really accessed the provenance of anything at all, as it can't be
> interpreted - it is just a block of data describing the past and not
> something's provenance.
>
> We could say that obtaining that identifier is out of scope, but I
> can't see an argument for why an access proposal would say how a
> provenance URI is obtained (e.g. embedded in the HTML), but not the
> identifier of the thing as used in the provenance. The client needs
> both.
>
> In the case that the client creates the HTML itself, then it doesn't
> need the BOB-URI, but it quite possibly also doesn't need the
> provenance URI - it is just a link to the storage it used to document
> the page's generation. If we do say obtaining the identifier is out of
> scope, then obtaining the provenance URI should also be out of scope, and
> the proposal should be very brief :-)
>
> There are alternatives to embedding the bob/thing URI in its own data
> content (e.g. HTML). We could require that, on resolving the
> provenance URI, you obtain not just provenance data but also the URI
> of the thing as used in that provenance.
> That could only work if every
> provenance URI was unique to one thing/bob.
>
> Thanks,
> Simon
>
> On 28 July 2011 15:54, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>> I thought this was an *explicit* use case in the scenario crafted in Boston.
>>
>> Luc
>>
>> On 07/28/2011 03:47 PM, Graham Klyne wrote:
>>> Given that we're here to create standards and supporting documents, I
>>> think one of the key principles should be: does it address a
>>> sufficiently compelling need with sufficient simplicity that
>>> developers will implement it? And thus, the implementation really
>>> does matter - I don't think it's reasonable to divorce implementation
>>> concerns from the solution.
>>>
>>> To my mind, we're in danger of solving a non-problem here. If someone
>>> gives me a USB stick with an HTML file on it, why should I take notice
>>> of metadata in the HTML file about its origin when it has just been
>>> handed to me by someone I presumably trust? Of course, you can always
>>> find edge cases, but standardization isn't about solving edge cases
>>> (except if you do security standards), but primarily about addressing
>>> common cases, where the effort (i.e. cost) of standardization is
>>> amortized by scale of usage.
>>>
>>> That said, if there's a real consensus that this is a real problem
>>> worth solving, then I'd suggest simply defining a second link
>>> relation type for the purpose. That is lightweight enough that some
>>> developers might just implement it even if they don't perceive much
>>> value.
>>>
>>> #g
>>> --
>>>
>>> Luc Moreau wrote:
>>>> Let's look at the problem conceptually first, and agree on the
>>>> principles, and in a second phase, let's see how to implement this.
>>>>
>>>> Yes, I consider the case where we control the generation of the HTML.
>>>>
>>>> I think we MAY embed in the HTML
>>>> - provenance-URI: the location for the provenance of this document
>>>> - BOB-URI: the identifier of the BOB that represents this document
>>>>
>>>> Note 1: this may be BOB-URIs (since this document may be described by
>>>> multiple BOBs)
>>>> Note 2: this may be provenance-URIs (since there may be multiple
>>>> sources for the provenance)
>>>>
>>>> If we are in agreement, we can look at ways of encoding this
>>>> information.
>>>>
>>>> Luc
>>>>
>>>>
>>>> On 07/28/2011 02:15 PM, Graham Klyne wrote:
>>>>> In the general case, if you don't control the generation of the
>>>>> HTML, it's the same problem as an image. There's nothing more we
>>>>> can do.
>>>>>
>>>>> If you do control generation of the HTML, then <link> can give you
>>>>> the provenance resource.
>>>>>
>>>>> But I see that you may also need an identifier for the HTML to
>>>>> interpret that provenance (and the rest of this response addresses
>>>>> just that issue).
>>>>>
>>>>> ...
>>>>>
>>>>> I see two, maybe three possibilities:
>>>>>
>>>>> (a) rely on some unspecified mechanism here - i.e. don't specify a
>>>>> specific mechanism for this case.
>>>>> (b) add something to the HTML to identify the resource it
>>>>> represents. (Off the top of my head, this could be <meta>, <link>,
>>>>> or RDFa - I'm sure there are other options.)
>>>>> (c) adopt a packaging mechanism that can combine arbitrary data and
>>>>> metadata.
>>>>> (d) ... maybe something else.
>>>>>
>>>>> I think the scenario alone isn't enough information to make a
>>>>> sensible choice here - which to be useful has to be one that
>>>>> developers will actually implement.
>>>>>
>>>>> If I were forced to make a choice now, I'd go with (a) or maybe (b)
>>>>> with a <link> element and a new link relation roughly for "self".
>>>>>
>>>>> The packaging approach would solve more problems generally, but I
>>>>> don't think we know enough to make a call on a specific mechanism
>>>>> that would effectively promote interoperability, and there's enough
>>>>> defined mechanism (cf.
>>>>> http://dvcs.w3.org/hg/prov/raw-file/tip/paq/provenance-access.html#gap-analysis)
>>>>> for developers to do something that would work right now.
>>>>>
>>>>> #g
>>>>> --
>>>>>
>>>>> Luc Moreau wrote:
>>>>>>
>>>>>> Hi Graham,
>>>>>>
>>>>>> I guess that D7 is the case I was after.
>>>>>> Note that D7 is not an image but an html file. Where do I find its
>>>>>> identifier?
>>>>>>
>>>>>> Luc
>>>>>>
>>>>>> On 07/28/2011 11:26 AM, Graham Klyne wrote:
>>>>>>> I've added a scenario analysis appendix to the PAQ document at
>>>>>>> http://dvcs.w3.org/hg/prov/raw-file/be3b7e1f2518/paq/provenance-access.html
>>>>>>>
>>>>>>>
>>>>>>> The short answer to this issue is that I believe there are some
>>>>>>> matters that are beyond the scope of a W3C specification
>>>>>>> document. The mechanisms described (or with placeholders for
>>>>>>> fuller description) could form the basis for applications that
>>>>>>> need to deal with, say, data provided on a USB drive, but a
>>>>>>> complete specification would IMO be inappropriate.
>>>>>>>
>>>>>>> e.g.
>>>>>>> [[
>>>>>>> S: this scenario effectively calls for this: given an arbitrary
>>>>>>> data resource, implement a general purpose application to
>>>>>>> discover, retrieve and analyze provenance about that resource. At
>>>>>>> the present time, this is a matter for experimental development,
>>>>>>> which could be based substantially on the mechanisms described for
>>>>>>> provenance discovery and access via third party services.
>>>>>>> ]]
>> --
>> Professor Luc Moreau
>> Electronics and Computer Science   tel:   +44 23 8059 4487
>> University of Southampton          fax:   +44 23 8059 2865
>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
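
To make the "anchor" idea for the Link: header concrete, here is a rough
sketch of how a client might read both the provenance URI and the
identifier of the thing it describes from a single header. It is only an
illustration under stated assumptions: the relation name "provenance", the
example.org URIs and the function names are all hypothetical and are not
taken from any agreed proposal. The parser is deliberately simplified and
does not handle commas or semicolons inside quoted parameter values.

```python
import re

# Hypothetical relation name, chosen for illustration only; the thread does
# not fix the name of the link relation.
PROVENANCE_REL = "provenance"


def parse_link_header(value):
    """Parse an HTTP Link header value into a list of (target, params) pairs.

    Simplified: assumes no ',' or ';' appears inside quoted parameter values.
    """
    links = []
    # Each link-value looks like: <uri>; name="value"; name=value
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)', value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for name, val in re.findall(
            r';\s*([\w-]+)\s*=\s*("[^"]*"|[^;,\s]+)', raw_params
        ):
            params[name.lower()] = val.strip('"')
        links.append((target, params))
    return links


def provenance_links(value, rel=PROVENANCE_REL):
    """Return (provenance_uri, anchor_uri_or_None) for each matching link.

    The anchor parameter, when present, names the resource the link is
    about, i.e. the identifier of the thing as used in the provenance.
    """
    result = []
    for target, params in parse_link_header(value):
        # rel may carry several space-separated relation types
        if rel in params.get("rel", "").split():
            result.append((target, params.get("anchor")))
    return result


if __name__ == "__main__":
    # Made-up header value for illustration.
    header = (
        '<http://example.org/provenance/doc1>; rel="provenance"; '
        'anchor="http://example.org/bob/doc1"'
    )
    for prov_uri, thing_uri in provenance_links(header):
        print(prov_uri, "describes", thing_uri)
```

A link without an anchor parameter yields None in the second position, which
corresponds to the case where the client has to obtain the identifier some
other way.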
Received on Thursday, 4 August 2011 08:41:53 UTC