Re: PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]

Hi James,
I don't think it's the clearest.
Are both bundle and instance named sets of statements?
I wouldn't know which of these terms to use in prov-n.

prov-n has:
- statements, e.g. entity(e), wasGeneratedBy(a,e)
- a construct bundle, which gives a name to a set of statements
- a construct toplevel-bundle, which combines a set of statements and
bundles
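
To make the three constructs concrete, here is a rough sketch of how they could be modelled. This is only an illustration: the class names, the field names and the ex:b1 identifier are hypothetical and do not come from prov-n, and the check that bundle names are distinct is an assumption borrowed from the "with distinct names" wording in the definitions quoted below.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical model; none of these names are defined by prov-n.
# A statement is kept as plain text, e.g. "entity(e)".
Statement = str


@dataclass(frozen=True)
class Bundle:
    """A construct that gives a name to a set of statements."""
    name: str
    statements: FrozenSet[Statement]


@dataclass(frozen=True)
class ToplevelBundle:
    """A construct that combines a set of statements and bundles."""
    statements: FrozenSet[Statement]
    bundles: Tuple[Bundle, ...] = ()

    def __post_init__(self) -> None:
        # Assumption: bundle names must be distinct within one toplevel.
        names = [b.name for b in self.bundles]
        if len(names) != len(set(names)):
            raise ValueError("bundle names must be distinct")


# Usage: one named bundle nested inside an unnamed toplevel.
b1 = Bundle("ex:b1", frozenset({"entity(e)"}))
top = ToplevelBundle(frozenset({"wasGeneratedBy(a,e)"}), (b1,))
```

In this sketch the toplevel carries no name of its own, which is the point where it differs from a bundle.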

Are you suggesting to rename toplevel-bundle to dataset?



Isn't it the case that an instance (which is a prov-constraint concept
and not a prov-n concept)
is a set of statements, a bundle, or a toplevel-bundle/dataset?

Luc


On 09/08/12 18:03, James Cheney wrote:
> OK.  I have done a quick pass to use the term "PROV dataset" and changed all occurrences of "toplevel bundle" to "toplevel instance".  I think it's a lot better this way!
>
> instance = named set of statements.  (Excluding "bundle" constructs, which are not statements.)
> bundle = named set of statements ~= named graph of PROV-O (hopefully!)
> dataset = an instance and zero or more bundles (with distinct names).
> toplevel instance = the set of statements at the toplevel of a dataset
>
> Modulo typos/snags, does this look OK?  If so I will close.
>
> Perhaps this terminology would be useful in other documents (Luc pointed out PROV-N uses "toplevel bundle" too...).
>
> --James
>
> On Aug 9, 2012, at 5:41 PM, Miles, Simon wrote:
>
>> Hello James,
>>
>> I strongly agree with the suggested general solution. I have no objection to "dataset" as a term. If you do still need to talk about bundles at all in PROV-Constraints, I think it should be made clear that the "toplevel" does not need to be named (does not need to be a bundle) to avoid confusion of concepts for different purposes.
>>
>> As said on the IRC, I don't think this is a blocking issue, just a matter of text clarification.
>>
>> thanks,
>> Simon
>>
>> Dr Simon Miles
>> Senior Lecturer, Department of Informatics
>> Kings College London, WC2R 2LS, UK
>> +44 (0)20 7848 1166
>>
>> Evolutionary Testing of Autonomous Software Agents:
>> http://eprints.dcs.kcl.ac.uk/1370/
>> ________________________________________
>> From: James Cheney [jcheney@inf.ed.ac.uk]
>> Sent: 09 August 2012 17:21
>> To: Provenance Working Group
>> Subject: Re: PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]
>>
>> We discussed this in the teleconference and it sounded like it would be appropriate to find better terminology for the following three things, which are currently not clearly distinguished:
>>
>> - "the whole PROV instance, including set of toplevel statements and bundles"
>> - "a particular set of statements, either the toplevel one or one within a bundle"
>> - bundle = "a named set of provenance statements"
>>
>> My initial proposal is "PROV dataset", "PROV instance", and "bundle".  I believe "PROV dataset" is roughly analogous to what people call "dataset" in the context of SPARQL; if anyone knows different (or has objections or better suggestions), let me know.
>>
>> I'll send another message on this when this is ready for review.
>>
>> --James
>>
>> On Aug 9, 2012, at 3:45 PM, Provenance Working Group Issue Tracker wrote:
>>
>>> PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]
>>>
>>> http://www.w3.org/2011/prov/track/issues/474
>>>
>>> Raised by: Simon Miles
>>> On product: prov-dm-constraints
>>>
>>> As requested, I'm submitting an issue where I feel a PROV-Constraints review comment of mine is not completely answered.
>>>
>>> My original comment:
>>>> Bundles
>>>> -------
>>>> F. Section 6.1 seems a bit out of the blue. "The definitions
>>>> [etc.]... assume a PROV instance with exactly one bundle", and then
>>>> multiple bundles are handled as exactly the same number of
>>>> instances. Why? Why is there a connection between number of instances
>>>> and number of bundles? Why would a bundle be considered to be only one
>>>> instance? I thought a bundle was an identified set of statements,
>>>> allowing for provenance of provenance, which seems a distinct matter
>>>> from whether a set of statements are valid. It seems fine for a user
>>>> to treat one bundle as one instance if they want to, but there's no
>>>> reason given why this is the general case.
>>> Response from editors:
>>>> I am not sure I understand this comment.  However, I have rewritten
>>>> slightly the intro of section 6.1.
>>>>
>>>> "The definitions, inferences, and constraints, and the resulting notions of normalization, validity and equivalence, assume a PROV instance that consists of exactly one bundle, the toplevel bundle, containing all PROV statements in the top level of the bundle (that is, not enclosed in a named bundle). In this section, we describe how to deal with PROV instances consisting of multiple named bundles. Briefly, each bundle is handled independently; there is no interaction between bundles from the perspective of applying definitions, inferences, or constraints, computing normal forms, or checking validity or equivalence."
>>> I agree this is clearer, but I don't feel it answers the key questions in my comment. To put my comment another way: you have explained checking validity where an instance consists of one bundle and of multiple bundles. The two other possibilities I see are:
>>> (a) A bundle containing multiple instances;
>>> (b) An instance that is a collection of PROV descriptions with no identifier and so is not a bundle, e.g. a provenance service query result.
>>>
>>> How do we deal with each of these cases? Or, if they cannot occur, why not?
>>>
>>> Thanks,
>>> Simon
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Friday, 10 August 2012 04:27:29 UTC