Re: PROV-ISSUE-46 (where-is-D-in-provenance): Where do I find document D in provenance [Accessing and Querying Provenance] from Luc Moreau on 2011-07-28 (public-prov-wg@w3.org from July 2011)

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Thu, 28 Jul 2011 15:53:24 +0100
To: Graham Klyne <GK@ninebynine.org>
CC: public-prov-wg@w3.org
Message-ID: <EMEW3|4adc2f540878531d8b7a5d4309758161n6RFrU08L.Moreau|ecs.soton.ac.uk|4E3177E4>
I thought this was an *explicit* use case in the scenario crafted in Boston.

Luc

On 07/28/2011 03:47 PM, Graham Klyne wrote:
> Given that we're here to create standards and supporting documents, I 
> think one of the key principles should be:  does it address a 
> sufficiently compelling need with sufficient simplicity that 
> developers will implement it?  And thus, the implementation really 
> does matter - I don't think it's reasonable to divorce implementation 
> concerns from the solution.
>
> To my mind, we're in danger of solving a non-problem here. If someone 
> gives me a USB stick with an HTML file on it, why should I take notice 
> of metadata in the HTML file about its origin when it has just been 
> handed to me by someone I presumably trust?  Of course, you can always 
> find edge cases, but standardization isn't about solving edge cases 
> (except if you do security standards), but primarily about addressing 
> common cases, where the effort (i.e. cost) of standardization is 
> amortized by scale of usage.
>
> That said, if there's a real consensus that this is a real problem 
> worth solving, then I'd suggest simply defining a second link 
> relation type for the purpose.  That is lightweight enough that some 
> developers might just implement it even if they don't perceive much 
> value.
>
> #g
> -- 
>
> Luc Moreau wrote:
>>
>> Let's look at the problem conceptually first, and agree on the 
>> principles, and in a second phase, let's see how to implement this.
>>
>> Yes, I consider the case where we control the generation of the HTML.
>>
>> I think we MAY embed in the HTML
>> - provenance-URI: the location for the provenance of this document
>> - BOB-URI: the identifier of the BOB that represents this document
>>
>> Note 1: there may be several BOB-URIs (since this document may be 
>> described by multiple BOBs)
>> Note 2: there may be several provenance-URIs (since there may be 
>> multiple sources for the provenance)
>>
>> If we are in agreement, we can look at ways of encoding this 
>> information.
>>
>> Luc
>>
>>
>> On 07/28/2011 02:15 PM, Graham Klyne wrote:
>>> In the general case, if you don't control the generation of the 
>>> HTML, it's the same problem as an image.  There's nothing more we 
>>> can do.
>>>
>>> If you do control generation of the HTML, then <link> can give you 
>>> the provenance resource.
>>>
>>> But I see that you may also need an identifier for the HTML to 
>>> interpret that provenance (and the rest of this response addresses 
>>> just that issue).
>>>
>>> ...
>>>
>>> I see two, maybe three possibilities:
>>>
>>> (a) rely on some unspecified mechanism here - i.e. don't specify a 
>>> specific mechanism for this case.
>>> (b) add something to the HTML to identify the resource it 
>>> represents. (Off the top of my head, this could be <meta>, <link>, 
>>> or RDFa - I'm sure there are other options.)
>>> (c) adopt a packaging mechanism that can combine arbitrary data and 
>>> metadata.
>>> (d) ... maybe something else.
>>>
>>> I think the scenario alone isn't enough information to make a 
>>> sensible choice here - which to be useful has to be one that 
>>> developers will actually implement.
>>>
>>> If I were forced to make a choice now, I'd go with (a) or maybe (b) 
>>> with a <link> element and a new link relation roughly for "self".
>>>
>>> The packaging approach would solve more problems generally, but I 
>>> don't think we know enough to make a call on a specific mechanism 
>>> that would effectively promote interoperability, and there's enough 
>>> defined mechanism (cf. 
>>> http://dvcs.w3.org/hg/prov/raw-file/tip/paq/provenance-access.html#gap-analysis) 
>>> for developers to do something that would work right now.
>>>
>>> #g
>>> -- 
>>>
>>> Luc Moreau wrote:
>>>>
>>>>
>>>> Hi Graham,
>>>>
>>>> I guess that D7 is the case I was after.
>>>> Note that D7 is not an image but an html file. Where do I find its 
>>>> identifier?
>>>>
>>>> Luc
>>>>
>>>> On 07/28/2011 11:26 AM, Graham Klyne wrote:
>>>>> I've added a scenario analysis appendix to the PAQ document at 
>>>>> http://dvcs.w3.org/hg/prov/raw-file/be3b7e1f2518/paq/provenance-access.html 
>>>>>
>>>>>
>>>>> The short answer to this issue is that I believe there are some 
>>>>> matters that are beyond the scope of a W3C specification 
>>>>> document.  The mechanisms described (or with placeholders for 
>>>>> fuller description) could form the basis for applications that 
>>>>> need to deal with, say, data provided on a USB drive, but a 
>>>>> complete specification would IMO be inappropriate.
>>>>>
>>>>> e.g.
>>>>> [[
>>>>> S: this scenario effectively calls for this: given an arbitrary 
>>>>> data resource, implement a general purpose application to 
>>>>> discover, retrieve and analyze provenance about that resource. At 
>>>>> the present time, this is a matter for experimental development, 
>>>>> which could be based substantially on the mechanisms described for 
>>>>> provenance discovery and access via third party services.
>>>>> ]]
>>>>
>>>
>>
>
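
As a rough illustration of the <link>-based option discussed in the quoted thread (a provenance-URI plus one or more BOB-URIs embedded in the HTML), here is a minimal sketch of how a consumer might read them back out. The relation names "provenance" and "provenance-subject", the example.org URIs, and the helper names are all hypothetical placeholders; the thread does not settle on any encoding.

```python
from html.parser import HTMLParser

# Hypothetical link relation names; nothing in the thread fixes these.
PROVENANCE_REL = "provenance"        # where provenance for the document can be retrieved
SUBJECT_REL = "provenance-subject"   # identifier of a BOB that represents the document


class ProvenanceLinkParser(HTMLParser):
    """Collects href values of <link> elements carrying the two relations."""

    def __init__(self):
        super().__init__()
        self.provenance_uris = []  # there may be several provenance sources
        self.bob_uris = []         # the document may be described by several BOBs

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel is a space-separated list of relation types
        rels = (attributes.get("rel") or "").lower().split()
        if PROVENANCE_REL in rels:
            self.provenance_uris.append(href)
        if SUBJECT_REL in rels:
            self.bob_uris.append(href)


def find_provenance_links(html_text):
    """Return (provenance_uris, bob_uris) found in the given HTML text."""
    parser = ProvenanceLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.provenance_uris, parser.bob_uris


# Made-up sample document standing in for an HTML file such as D7.
SAMPLE = """<html><head>
<link rel="provenance" href="http://example.org/provenance/d7">
<link rel="provenance-subject" href="http://example.org/bob/d7-v1">
<link rel="provenance-subject" href="http://example.org/bob/d7-v2">
<link rel="stylesheet" href="style.css">
</head><body>D7</body></html>"""

if __name__ == "__main__":
    provenance_uris, bob_uris = find_provenance_links(SAMPLE)
    print("provenance:", provenance_uris)
    print("BOBs:", bob_uris)
```

Both results are lists so that several BOB-URIs and several provenance-URIs can be carried by one document, in line with Notes 1 and 2 in the quoted message.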

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Thursday, 28 July 2011 14:53:57 UTC