Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenanceIn relation is over-complicated [prov-dm]

On 30/05/2012 10:41, Luc Moreau wrote:
>> >> In my approach, retrieve bob:bundle4 to access provenance about alice:report1,
>> >
>> > But how can you, since alice:report1 is not in bob:bundle4?
>> >
>> >> *then* use the specializationOf information to infer that provenance for
>> >> ex:report1 is also true for alice:report1.
>>
>> OK, I should have said "retrieve bob:bundle4 to access provenance about
>> ex:report1" there. The rest stands.
>
> But again, how do you know where to find provenance for ex:report1?
>
> You seem to have
> hasProvenanceIn(alice:report1, bob:bundle4, -)
> specializationOf(alice:report1, ex:report1)
>
> This does not say in which bundle I can find provenance for ex:report1.

Maybe I was right first time... I've lost the context of this example.

Moving on...

>> What I think we should *not* do is add things to PROV-DM purely to support
>> operational concerns (like incremental discovery). That would be to have the
>> tail wagging the dog.
>
> I think we may put too much emphasis on 'incremental discovery' as per PAQ.
>
> The PROV data model in effect specifies a distributed graph structure
> (distributed across bundles I mean here, services are a side issue).
> To me, it is essential for the model to provide accurate linking across bundles
> so that the data structure can be navigated.

I think this is a reasonable and appropriate way to frame the problem.

> By accurate, I mean that I want to be able to link an entity in a bundle with
> another entity in another specific bundle.

This seems to me to be a different requirement, viz. "to link an entity ... with
*another* entity".  If that's a requirement, I think it should be orthogonal to
the cross-bundle linking.

What I do not argue against is the construct:

   hasProvenanceIn(entity, bundle)

What I am questioning is the purpose of

   hasProvenanceIn(entity, bundle, alias-for-entity)

Why is the former construct alone insufficient to "provide accurate linking 
across bundles so that the data structure can be navigated"?
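
To make the question concrete, here is a rough sketch of the two forms side by side. It is only an illustration under stated assumptions: the dictionaries, the function names, and the wasGeneratedBy statement placed in bob:bundle4 are hypothetical and come from neither PROV-DM nor the earlier examples. It also assumes the specializationOf statement is already available to whoever follows the link.

```python
# Illustrative sketch only: the data structures and function names are
# hypothetical and are not defined by PROV-DM.
from typing import Dict, List, Set, Tuple

# Contents of each bundle: entity identifier -> provenance statements about it.
BUNDLES: Dict[str, Dict[str, List[str]]] = {
    "bob:bundle4": {
        # Placeholder statement; the thread does not say what bob:bundle4 holds.
        "ex:report1": ["wasGeneratedBy(ex:report1, ex:activity1)"],
    },
}

# specializationOf(specific, general), assumed to be known to the consumer.
SPECIALIZATION_OF: Set[Tuple[str, str]] = {("alice:report1", "ex:report1")}


def lookup_two_arg(entity: str, links: Set[Tuple[str, str]]) -> List[str]:
    """Follow hasProvenanceIn(entity, bundle), using specializationOf separately."""
    # The entity itself, plus every entity it is declared a specialization of.
    names = {entity} | {gen for spec, gen in SPECIALIZATION_OF if spec == entity}
    found: List[str] = []
    for linked_entity, bundle in sorted(links):
        if linked_entity == entity:
            for name in sorted(names):
                found.extend(BUNDLES.get(bundle, {}).get(name, []))
    return found


def lookup_three_arg(entity: str, links: Set[Tuple[str, str, str]]) -> List[str]:
    """Follow hasProvenanceIn(entity, bundle, alias); the alias names the entity in the bundle."""
    found: List[str] = []
    for linked_entity, bundle, alias in sorted(links):
        if linked_entity == entity:
            found.extend(BUNDLES.get(bundle, {}).get(alias, []))
    return found


two = lookup_two_arg("alice:report1", {("alice:report1", "bob:bundle4")})
three = lookup_three_arg(
    "alice:report1", {("alice:report1", "bob:bundle4", "ex:report1")}
)
print(two == three)
```

In this toy case both lookups reach the same statement in bob:bundle4. Whether the assumption about specializationOf being available is a fair one is part of what the examples need to settle.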

(At this point, we probably need to revisit the examples, but I'm out of time
right now.)

> Without it, bundles are effectively not usable, and very quickly we will see
> constructs like the one I suggest, to aid this navigation
> of the graph, and we will have failed in achieving interoperability.

<aside>
"interoperability" is not a simple binary property.  No specification can 
reasonably underpin total interoperability.  What a spec can do is provide a 
basis for interoperability for a particular set of activities (scope).  Any 
application will build upon a spec with additional elements (which may or may 
not be interoperable with other applications).

Of itself, by virtue of being technology neutral, PROV-DM *cannot* be regarded 
as achieving interoperability - additional technology layers will always be 
needed for that.  It's a fairly common problem in standards development to try
to "boil the ocean", rather than focus on documenting a clear consensus.  Each
standard is just part of a bigger ecosystem, and should focus (like good
software products) on achieving some clear goals really well and on playing
well with other components.

None of this argues against the desirability of adequately describing a
"distributed graph structure" - I think that is a reasonable goal here - but the
reason is not that failing to do so would mean "we will have failed in achieving
interoperability": in those terms, we will always fail to achieve
interoperability.
</aside>

#g

Received on Thursday, 31 May 2012 08:12:01 UTC