
Re: PROV-ISSUE-311 (clarify-optionals): Clarify optional arguments in DM [prov-dm]

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Thu, 19 Apr 2012 10:30:00 +0100
Message-ID: <EMEW3|567350d6ed73da0084e37d092cd3363do3IAU408L.Moreau|ecs.soton.ac.uk|4F8FDB18.4090404@ecs.soton.ac.uk>
To: public-prov-wg@w3.org, James Cheney <jcheney@inf.ed.ac.uk>
Hi James,

For some reason, your cut and paste does not seem to include some of the
'?' symbols.

Here is the rule in my ANTLR parser, where each identifier has been given a name.

     :    'wasDerivedFrom' '(' ((id0=identifier | '-') ',')?
          id2=identifier ',' id1=identifier
          (',' (a=identifier | '-') ',' (g2=identifier | '-') ',' (u1=identifier | '-') )?
          optionalAttributeValuePairs ')'

Notice that either:
- you have *three* arguments + optional attributes following id1
- or you don't have these arguments but simply optional attributes.

So, the following are examples of valid expressions:
wasDerivedFrom(id2,id1,a,g2,u1)
wasDerivedFrom(id2,id1,a,-,u1)
wasDerivedFrom(id2,id1,-,-,-)
wasDerivedFrom(id2,id1)

If id0 is present, then, likewise:
wasDerivedFrom(id0, id2,id1,a,g2,u1)
wasDerivedFrom(id0, id2,id1,a,-,u1)
wasDerivedFrom(id0, id2,id1,-,-,-)
wasDerivedFrom(id0, id2,id1)
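
The rule and the examples above can be checked with a minimal sketch written as a Python regular expression. This is not the ANTLR parser itself. The names IDENT, OPT, DERIVATION and is_valid are hypothetical, identifiers are simplified to plain word characters, and optionalAttributeValuePairs is left out.

```python
import re

# Assumption: identifiers are simplified to word characters; real PROV-N
# identifiers are richer than this.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
OPT = rf"(?:{IDENT}|-)"  # an identifier or the explicit '-' marker

# Mirrors the rule: an optional leading (id0 | '-'), two mandatory
# identifiers, then either all three of (a, g2, u1) or none of them.
# Attribute-value pairs are omitted from this sketch.
DERIVATION = re.compile(
    rf"wasDerivedFrom\(\s*(?:{OPT}\s*,\s*)?"
    rf"{IDENT}\s*,\s*{IDENT}"
    rf"(?:\s*,\s*{OPT}\s*,\s*{OPT}\s*,\s*{OPT})?\s*\)"
)

def is_valid(expr: str) -> bool:
    """Return True if expr matches the simplified derivation rule."""
    return DERIVATION.fullmatch(expr) is not None
```

Under these assumptions, all eight expressions listed above are accepted, while a four-argument form such as wasDerivedFrom(id2,id1,a,g2) is rejected, because the trailing group is all-or-nothing.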

Note that in the above, '-' appears explicitly in the textual 
representation.

So, to me,

wasDerivedFrom(e2, e1,x)

can only be parsed in a single way:
wasDerivedFrom(id0, id2,id1)
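
The single-parse claim can be illustrated with a short Python sketch that enumerates every binding the rule allows for a given argument list. The function name parses and the dictionary representation are hypothetical, and attribute-value pairs are again ignored.

```python
def parses(args):
    """Return every way the rule can bind a list of positional arguments."""
    results = []
    for has_id0 in (False, True):
        for has_tail in (False, True):
            # Build the slot names for this combination of optional parts.
            names = (["id0"] if has_id0 else []) + ["id2", "id1"]
            if has_tail:
                names += ["a", "g2", "u1"]
            if len(names) != len(args):
                continue
            binding = dict(zip(names, args))
            # id2 and id1 are mandatory identifiers, so '-' is not allowed.
            if binding["id2"] == "-" or binding["id1"] == "-":
                continue
            results.append(binding)
    return results
```

The four combinations have arities 2, 3, 5 and 6, so no argument count fits more than one of them. In this sketch, parses(["e2", "e1", "x"]) yields only the binding id0=e2, id2=e1, id1=x, and a four-argument list yields no binding at all.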


As far as the unknown/absent discussion is concerned, I am not trying to
argue for the cases I enumerated. I am just saying that 'unknown', as you
suggested, is not clear. Unknown by whom? What is unknown?

Luc


On 04/19/2012 10:00 AM, James Cheney wrote:
> On Apr 19, 2012, at 5:35 AM, Luc Moreau wrote:
>
>    
>> Hi James,
>>
>> I don't think your description of the problem is accurate.
>> The production [1] is not ambiguous (LL grammar); it definitely does not
>> require multiple passes over the document to recognise types.
>>
>>      
> Sorry, I don't understand how a grammar containing the rule [1] can *possibly* be unambiguous.
>
> derivationExpression ::= wasDerivedFrom ( ( identifier | - ) , eIdentifier , eIdentifier , ( aIdentifier | - ) , ( gIdentifier | - ) , ( uIdentifier | - ) optional-attribute-values )
>
> By "unambiguous", I mean what people normally mean: for each string there is at most one parse tree (not that one can find *some* parse without backtracking.)  There are three parse trees for:
>
> wasDerivedFrom(e2, e1,x)
> - one where x is parsed as an aIdentifier,
> - one where x is parsed as a gIdentifier,
> - one where x is parsed as a uIdentifier.
>
> The grammar may be LL, but an LL parser will always pick the leftmost derivation, i.e. the aIdentifier one.  This is *not* the same as unambiguity.
>
> If this is the *required* way to disambiguate then the grammar spec should say so, and the rule "you have to use - for the first few omitted arguments" should be made explicit.  This seems at least as complicated as my alternative suggestion.
>
> I haven't seen the current version of PROV-N so maybe this is explained better there, but it should also be explained in PROV-DM(-CONSTRAINTS).
>
>
>
>    
>> I think the confusion may have come from the description of the grammar but Paolo has reworked it.
>>
>> As far as the reading of - is concerned, I would even say that we have the following cases:
>> - value exists and is known but not expressed (say, because not deemed important)
>> - value existence is known but actual value is unknown
>> - value does not exist
>> - value existence is not known
>> So, your suggested split absent/unknown may not be the clearest.
>>
>> I believe your Proposal 0 is implemented in the grammar.
>>
>> I considered variants of Proposal 1 but ruled them out because the grammar was not ambiguous.
>>
>>      
> I would argue that the proliferation of different cases above is a strong motivation for cutting down on the number of cases.  Even if the grammar happens to be unambiguous (though I can't see how it can be), we are currently asking a lot of readers especially since the grammar is the last of the three documents they'll see.
>
> In an open world setting (I think!) we shouldn't distinguish between "value does not exist" and "value existence is not known".  Combining provenance records could fill in unknown values.  In any case, we currently have no way to express this distinction - and we don't say anywhere what should happen if we somehow learn the value of a "value that does not exist".
>
> I also see no reason to distinguish between "value exists and is known but not expressed" and "value existence is known but actual value is unknown" - from the point of view of a consumer of provenance, what would I do differently?  In any case there is no way for the producer to express this difference.
>
> At the end of the day, what matters is what people will implement, and it's unclear to me what someone should actually implement when doing inference/validation/equivalence checking on provenance descriptions.
>
> If the consensus is that the existing way is fine, at least it should be explained clearly; especially we should explain how the "short" forms of expressions expand into the long forms.  Right now, this is not explained clearly anywhere.   I plan, at least, to expand all of the expressions used in constraints so that there is no ambiguity.
>
> --James
>
>    
>> [1] http://dvcs.w3.org/hg/prov/raw-file/default/model/prov-n.html#Derivation-Relation
>>
>> Professor Luc Moreau
>> Electronics and Computer Science
>> University of Southampton
>> Southampton SO17 1BJ
>> United Kingdom
>>
>> On 19 Apr 2012, at 00:33, "James Cheney"<jcheney@inf.ed.ac.uk>  wrote:
>>
>>      
>>> OK, I've posted my thoughts on this, and a proposal, at:
>>>
>>> http://www.w3.org/2011/prov/wiki/Optional_arguments
>>>
>>> (Sorry this is a bit long, but I think it is worth being a little pedantic here).
>>>
>>> I'd like to keep this open for discussion, but don't think it's a blocking issue.
>>>
>>> --James
>>>
>>> On Apr 18, 2012, at 10:43 AM, James Cheney wrote:
>>>
>>>        
>>>> Hi,
>>>>
>>>> I have been working on the optional arguments in part 2, and I am still not sure what to write based on what is in part 1 now.  I am trying to formulate a proposal to see if I am on the right track.  So I think this should be kept open for now (maybe it should be reassigned to prov-dm-constraints).
>>>>
>>>> --James
>>>>
>>>>
>>>> On Apr 18, 2012, at 7:51 AM, Luc Moreau wrote:
>>>>
>>>>          
>>>>> Hi Stian,
>>>>> Can we close this issue now?
>>>>> Regards,
>>>>> Luc
>>>>>
>>>>> On 04/02/2012 03:58 PM, Luc Moreau wrote:
>>>>>            
>>>>>> Hi Stian,
>>>>>>
>>>>>> If you follow [1] below, you will now find our proposed answer to optional arguments.
>>>>>> It contains explicit links to prov-dm part 2.
>>>>>>
>>>>>> I propose to close this issue pending your review.
>>>>>> Regards,
>>>>>> Luc
>>>>>>
>>>>>>
>>>>>> On 03/30/2012 04:12 PM, Luc Moreau wrote:
>>>>>>              
>>>>>>> Hi Stian,
>>>>>>>
>>>>>>> I have been thinking about your suggestion on optional arguments.
>>>>>>> I looked at all the optional arguments [1] in prov-dm.
>>>>>>>
>>>>>>> Most of them, I believe, imply  existential quantification.
>>>>>>>
>>>>>>> It would be nice to have this confirmed, and then we can write it up in part 2.
>>>>>>>
>>>>>>> Luc
>>>>>>>
>>>>>>> [1] http://dvcs.w3.org/hg/prov/raw-file/default/model/optional.html
>>>>>>>
>>>>>>> On 13/03/2012 11:05, Provenance Working Group Issue Tracker wrote:
>>>>>>>                
>>>>>>>> PROV-ISSUE-311 (clarify-optionals): Clarify optional arguments in DM [prov-dm]
>>>>>>>>
>>>>>>>> http://www.w3.org/2011/prov/track/issues/311
>>>>>>>>
>>>>>>>> Raised by: Stian Soiland-Reyes
>>>>>>>> On product: prov-dm
>>>>>>>>
>>>>>>>> There seems to be some confusion over any of the 'optional' arguments in
>>>>>>>> PROV-DM/PROV-N.
>>>>>>>>
>>>>>>>> It is unclear if this means that the argument is *implied* (ie.
>>>>>>>> existential quantification/bnodes in OWL/RDF) or not applicable/not present (NIL).
>>>>>>>>
>>>>>>>> It might be good to go through all of the optionals in PROV-DM and make sure they make that clear.
>>>>>>>>
>>>>>>>> For instance:
>>>>>>>>                  
>>>>>>>>> Generation, written wasGeneratedBy(id,e,a,t,attrs) in PROV-N, has the following components:
>>>>>>>>> id: an optional identifier for a generation;
>>>>>>>>> entity: an identifier for a created entity;
>>>>>>>>> activity: an optional identifier for the activity that creates the entity;
>>>>>>>>> time: an optional "generation time", the time at which the entity was completely created;
>>>>>>>>> attributes: an optional set of attribute-value pairs that describes the modalities of generation of this entity by this activity.
>>>>>>>>>                    
>>>>>>>> Change to:
>>>>>>>>
>>>>>>>>
>>>>>>>>                  
>>>>>>>>> Generation, written wasGeneratedBy(id,e,a,t,attrs) in PROV-N, has the following components:
>>>>>>>>> id: an optional identifier for a generation, if unspecified the identifier is not known;
>>>>>>>>> entity: an identifier for a created entity;
>>>>>>>>> activity: an optional identifier for the activity that creates the entity, if unspecified activity is still implied, but unknown;
>>>>>>>>> time: an optional "generation time", the time at which the entity was completely created, if unspecified the time is unknown or not applicable;
>>>>>>>>> attributes: an optional set of attribute-value pairs that describes the modalities of generation of this entity by this activity, if unspecified an empty set is implied.
>>>>>>>>>                    
>>>>>>>>
>>>>>>>>
>>>>>>>>                  
>>>>>>>                
>>>>>>              
>>>>> -- 
>>>>> Professor Luc Moreau
>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>>
>>>>>
>>>>>
>>>>>            
>>>>
>>>> -- 
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>>>
>>>>
>>>>          
>>>
>>>
>>>        
>>      
>
>    

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Thursday, 19 April 2012 09:30:45 GMT
