
Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenanceIn relation is over-complicated [prov-dm]

From: Paul Groth <p.t.groth@vu.nl>
Date: Thu, 31 May 2012 16:15:28 +0300
Message-ID: <CAJCyKRqp99RCrhC6NGCvP0BD82zvYftiuP6S_7rHNn+9UT0=FA@mail.gmail.com>
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Cc: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>, "public-prov-wg@w3.org" <public-prov-wg@w3.org>

Hi Luc,

Thanks for the explanations.

I was wondering if the third argument is optional or not?

Thanks
Paul

On Thu, May 31, 2012 at 3:13 PM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
> Hi Paul,
>
> I don't think we can break up this ternary relation into two binary relations.
>
> 1. One of the use cases I suggested was:
>
> hasProvenanceIn(e1, b1, e)
> hasProvenanceIn(e2, b2, e)
>
> You would end up with
>
> hasProvenanceIn(e1, b1)
> alternate(e1, e)
> hasProvenanceIn(e2, b2)
> alternate(e2, e)
>
> But e1 is not in b1, it's e which is a topic in b1.
> Likewise for e2 and b2.  How do we know what to look for in b1?
>
> Furthermore, e itself may not be a topic at all in the current bundle.
>
> What if, in addition, I have another alternate relation
> alternate(e1, e3)?
> How do I know what is the alias for e1 in b1?
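
To make the ambiguity in use case 1 concrete, here is a minimal Python sketch. It is only an illustration under assumptions: facts are modelled as plain tuples, alternate is treated as symmetric, and decompose, candidate_names and alias_from_ternary are hypothetical helper names, not PROV-DM syntax or any existing API.

```python
# Hypothetical model: each ternary fact is (entity, bundle, alias).
ternary = {("e1", "b1", "e"), ("e2", "b2", "e")}


def decompose(facts):
    """Split each ternary fact into a binary link plus an alternate pair."""
    links = {(entity, bundle) for entity, bundle, _ in facts}
    # alternate is assumed symmetric, so each pair is stored unordered.
    alternates = {
        frozenset((entity, alias))
        for entity, _, alias in facts
        if entity != alias
    }
    return links, alternates


def candidate_names(entity, alternates):
    """Names under which `entity` might be a topic, judging by alternates only."""
    names = {entity}
    for pair in alternates:
        if entity in pair:
            names |= pair
    return names


def alias_from_ternary(entity, bundle, facts):
    """With the ternary form, the alias to look for is stated directly."""
    return {alias for ent, bun, alias in facts if ent == entity and bun == bundle}


links, alternates = decompose(ternary)
alternates.add(frozenset(("e1", "e3")))  # the extra alternate(e1, e3)

print(sorted(candidate_names("e1", alternates)))      # ['e', 'e1', 'e3']
print(alias_from_ternary("e1", "b1", ternary))        # {'e'}
```

After decomposition, nothing ties any of the three candidate names to b1, whereas the ternary fact names 'e' outright.
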
>
>
> 2. Another use case I suggested was:
>
> hasProvenanceIn(e, b1, e)    // provenance of e found in b1, no aliasing
> hasProvenanceIn(e2, b2, e)   // I had to choose alias e to e2 in the current
>                              // bundle because I specialize it differently
>
>
> hasProvenanceIn(e, b1)
> hasProvenanceIn(e2, b2)
> alternate(e2, e)
>
> Given alternate(e2, e) implies alternate(e, e2)
>
> We are faced with a similar problem, which one is aliased, which one is not?
>
>
> 3. Furthermore, I definitely would like to subtype this relation.
> The only possibility in the context of prov-dm is to allow for prov:type to
> be expressed.
>
>  hasProvenanceIn(e1, b1, e, [ prov:type= XXX ])
>
> 4.  I do not know whether the following is a way out, but it may help.
>
> If we consider that the third argument of
>
>    hasProvenanceIn(e1, b1, e)
>
> is an alias, then when we write "e", we really mean the identifier "e" and
> not the entity denoted by "e".
> As opposed to "e1" which denotes an entity with a name.
>
>
> So, an alternative is to write it as follows:
>
>    hasProvenanceIn(e1, b1, [ prov:alias='e'])
>
> We are stating that entity e1 has provenance in b1 (is a topic in b1), but we
> need to look for it under the name 'e'. The value associated with prov:alias
> must be a URI or a qualified name.
>
>
> Luc
>
>
>
> On 05/31/2012 12:43 PM, Paul Groth wrote:
>
> Luc, Graham:
>
> I wonder if one can see
>
> hasProvenanceIn(entity, bundle, alias-for-entity)
>
> as a proxy for
>
> hasProvenanceIn(entity, bundle)
> alternate(entity, alias-for-entity)
>
> Is this a possible interpretation?
>
> Thanks
> Paul
>
>
> On Thu, May 31, 2012 at 11:09 AM, Graham Klyne
> <Graham.Klyne@zoo.ox.ac.uk> wrote:
>
>
> On 30/05/2012 10:41, Luc Moreau wrote:
>
>
> In my approach, retrieve bob:bundle4 to access provenance about
> alice:report1,
>
>
> But how can you, since alice:report1 is not in bob:bundle4?
>
>
>
> *then* use the specializationOf information to infer that provenance for
> ex:report1 is also true for alice:report1.
>
>
> OK, I should have said "retrieve bob:bundle4 to access provenance about
> ex:report1" there. The rest stands.
>
>
> But again, how do you know where to find provenance for ex:report1?
>
> You seem to have
> hasProvenanceIn(alice:report1, bob:bundle4, -)
> specializationOf(alice:report1, ex:report1)
>
> This does not say in which bundle I can find provenance for ex:report1.
>
>
> Maybe I was right first time... I've lost the context of this example.
>
> Moving on...
>
>
>
> What I think we should *not* do is add things to PROV-DM purely to support
> operational concerns (like incremental discovery). That would be to have the
> tail wagging the dog.
>
>
> I think we may put too much emphasis on 'incremental discovery' as per PAQ.
>
> The PROV data model in effect specifies a distributed graph structure
> (distributed across bundles I mean here, services are a side issue).
> To me, it is essential for the model to provide accurate linking across
> bundles so that the data structure can be navigated.
>
>
> I think this is a reasonable and appropriate way to frame the problem.
>
>
>
> By accurate, I mean that I want to be able to link an entity in a bundle
> with another entity in another specific bundle.
>
>
> This seems to me to be a different requirement, viz. "to link an entity ...
> with *another* entity". If that's a requirement, I think it should be
> orthogonal to the cross-bundle linking.
>
> What I do not argue against is the construct:
>
>   hasProvenanceIn(entity, bundle)
>
> What I am questioning is the purpose of
>
>   hasProvenanceIn(entity, bundle, alias-for-entity)
>
> Why is the former construct alone insufficient to "provide accurate linking
> across bundles so that the data structure can be navigated"?
>
> (At this point, we probably need to revisit the examples, but I'm out of
> time right now.)
>
>
>
> Without it, bundles are effectively not usable, and very quickly we will see
> constructs like the one I suggest, to aid this navigation
> of the graph, and we will have failed in achieving interoperability.
>
>
> <aside>
> "interoperability" is not a simple binary property.  No specification can
> reasonably underpin total interoperability.  What a spec can do is provide a
> basis for interoperability for a particular set of activities (scope).  Any
> application will build upon a spec with additional elements (which may or
> may not be interoperable with other applications).
>
> Of itself, by virtue of being technology neutral, PROV-DM *cannot* be
> regarded as achieving interoperability - additional technology layers will
> always be needed for that. It's a fairly common problem in standards
> development to try and "boil the ocean", rather than focus on documenting a
> clear consensus. Each standard is just part of a bigger ecosystem, and should
> focus (like good software products) on achieving some clear goals really well
> and play well with other components.
>
> None of this is arguing against the desirability of adequately describing a
> "distributed graph structure" - I think that is a reasonable goal here - but
> not because not doing so would mean "we will have failed in achieving
> interoperability" - in these terms, we will always fail to achieve
> interoperability.
> </aside>
>
> #g
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm



-- 
Dr. Paul Groth (p.t.groth@vu.nl)
http://www.few.vu.nl/~pgroth/
Assistant Professor
Knowledge Representation & Reasoning Group
Artificial Intelligence Section
Department of Computer Science
VU University Amsterdam
Received on Thursday, 31 May 2012 13:16:12 GMT
