Re: PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]

Please ignore my prior comment... After that long email thread, I'm
very happy with "PROV document" :-)

cheers
Paul

On Thu, Aug 16, 2012 at 2:41 PM, James Cheney <jcheney@inf.ed.ac.uk> wrote:
> Hearing no objections to using "PROV document" for now, I will close the issue.
>
> --James
>
> On Aug 14, 2012, at 11:49 AM, Ivan Herman wrote:
>
>>
>> On Aug 13, 2012, at 10:20 , Graham Klyne wrote:
>>
>>> James,
>>>
>>> Mainly, I wanted to say that it will be very helpful if a PROV Dataset is structurally and semantically aligned with a SPARQL/RDF 1.1 Dataset.  (SPARQL defines no dataset semantics, but I understand the RDF 1.1 group have adopted the structure for "named graphs" in RDF, so will hopefully also define appropriate RDF semantics.)
>>
>> Graham: that is the point (alas!): the RDF group have not yet really adopted anything:-( And the 'appropriate RDF Semantics' is one of the stumbling blocks, in fact.
>>
>> Ivan
>>
>>
>>>
>>> From this email, I find the distinction between "instance" and "bundle" to be unclear.  Also, when you say a "bundle" is not a "statement", what do you mean here by "statement" - I'm offline, can't check the source right now, so my apologies if this is covered in the document.  [later] I see that was a typo, but I'm still left wondering what you mean by "not a statement"
>>>
>>> #g
>>> --
>>>
>>> On 09/08/2012 18:03, James Cheney wrote:
>>>> OK.  I have done a quick pass to use the term "PROV dataset" and changed all occurrences of "toplevel bundle" to "toplevel instance".  I think it's a lot better this way!
>>>>
>>>> instance = named set of statements.  (Excluding "bundle" constructs, which are not statements.)
>>>> bundle = named set of statements ~= named graph of PROV-O (hopefully!)
>>>> dataset = an instance and zero or more bundles (with distinct names).
>>>> toplevel instance = the set of statements at the toplevel of a dataset
>>>>
>>>> Modulo typos/snags, does this look OK?  If so I will close.
>>>>
>>>> Perhaps this terminology would be useful in other documents (Luc pointed out PROV-N uses "toplevel bundle" too...).
>>>>
>>>> --James
>>>>
>>>> On Aug 9, 2012, at 5:41 PM, Miles, Simon wrote:
>>>>
>>>>> Hello James,
>>>>>
>>>>> I strongly agree with the suggested general solution. I have no objection to "dataset" as a term. If you do still need to talk about bundles at all in PROV-Constraints, I think it should be made clear that the "toplevel" does not need to be named (does not need to be a bundle) to avoid confusion of concepts for different purposes.
>>>>>
>>>>> As I said on IRC, I don't think this is a blocking issue, just a matter of text clarification.
>>>>>
>>>>> thanks,
>>>>> Simon
>>>>>
>>>>> Dr Simon Miles
>>>>> Senior Lecturer, Department of Informatics
>>>>> King's College London, WC2R 2LS, UK
>>>>> +44 (0)20 7848 1166
>>>>>
>>>>> Evolutionary Testing of Autonomous Software Agents:
>>>>> http://eprints.dcs.kcl.ac.uk/1370/
>>>>> ________________________________________
>>>>> From: James Cheney [jcheney@inf.ed.ac.uk]
>>>>> Sent: 09 August 2012 17:21
>>>>> To: Provenance Working Group
>>>>> Subject: Re: PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]
>>>>>
>>>>> We discussed this in the teleconference and it sounded like it would be appropriate to find better terminology for the following three things, which are currently not clearly distinguished:
>>>>>
>>>>> - "the whole PROV instance, including set of toplevel statements and bundles"
>>>>> - "a particular set of statements, either the toplevel one or one within a bundle"
>>>>> - bundle = "a named set of provenance statements"
>>>>>
>>>>> My initial proposal is "PROV dataset", "PROV instance", and "bundle".  I believe "PROV dataset" is roughly analogous to what people call "dataset" in the context of SPARQL; if anyone knows different (or has objections or better suggestions), let me know.
>>>>>
>>>>> I'll send another message on this when this is ready for review.
>>>>>
>>>>> --James
>>>>>
>>>>> On Aug 9, 2012, at 3:45 PM, Provenance Working Group Issue Tracker wrote:
>>>>>
>>>>>> PROV-ISSUE-474 (instances-and-bundles): Bundles and valid instances [prov-dm-constraints]
>>>>>>
>>>>>> http://www.w3.org/2011/prov/track/issues/474
>>>>>>
>>>>>> Raised by: Simon Miles
>>>>>> On product: prov-dm-constraints
>>>>>>
>>>>>> As requested, I'm submitting an issue where I feel a PROV-Constraints review comment of mine is not completely answered.
>>>>>>
>>>>>> My original comment:
>>>>>>> Bundles
>>>>>>> -------
>>>>>>> F. Section 6.1 seems a bit out of the blue. "The definitions
>>>>>>> [etc.]... assume a PROV instance with exactly one bundle", and then
>>>>>>> multiple bundles are handled as exactly the same number of
>>>>>>> instances. Why? Why is there a connection between number of instances
>>>>>>> and number of bundles? Why would a bundle be considered to be only one
>>>>>>> instance? I thought a bundle was an identified set of statements,
>>>>>>> allowing for provenance of provenance, which seems a distinct matter
>>>>>>> from whether a set of statements are valid. It seems fine for a user
>>>>>>> to treat one bundle as one instance if they want to, but there's no
>>>>>>> reason given why this is the general case.
>>>>>>
>>>>>> Response from editors:
>>>>>>> I am not sure I understand this comment.  However, I have rewritten
>>>>>>> slightly the intro of section 6.1.
>>>>>>>
>>>>>>> "The definitions, inferences, and constraints, and the resulting notions of normalization, validity and equivalence, assume a PROV instance that consists of exactly one bundle, the toplevel bundle, containing all PROV statements in the top level of the bundle (that is, not enclosed in a named bundle). In this section, we describe how to deal with PROV instances consisting of multiple named bundles. Briefly, each bundle is handled independently; there is no interaction between bundles from the perspective of applying definitions, inferences, or constraints, computing normal forms, or checking validity or equivalence."
>>>>>>
>>>>>> I agree this is clearer, but I don't feel it answers the key questions in my comment. To put my comment another way: you have explained checking validity where an instance consists of one bundle and of multiple bundles. The two other possibilities I see are:
>>>>>> (a) A bundle containing multiple instances;
>>>>>> (b) An instance that is a collection of PROV descriptions with no identifier and so is not a bundle, e.g. a provenance service query result.
>>>>>>
>>>>>> How do we deal with each of these cases? Or, if they cannot occur, why not?
>>>>>>
>>>>>> Thanks,
>>>>>> Simon
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>>
>>>>
>>>>
>>>
>>
>>
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>>
>>
>>
>>
>>
>>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>



-- 
Dr. Paul Groth (p.t.groth@vu.nl)
http://www.few.vu.nl/~pgroth/
Assistant Professor
- Knowledge Representation & Reasoning Group |
  Artificial Intelligence Section | Department of Computer Science
- The Network Institute
VU University Amsterdam

Received on Wednesday, 29 August 2012 09:05:15 UTC