Re: Dublin Core - PROV Mapping, Call for Feedback (until June 5th) from Kai Eckert on 2012-06-08 (public-prov-wg@w3.org from June 2012)

From: Kai Eckert <kai@informatik.uni-mannheim.de>
Date: Fri, 08 Jun 2012 09:06:31 +0200
To: public-prov-wg@w3.org
Message-ID: <4FD1A477.9040204@informatik.uni-mannheim.de>
Hi all,

here is feedback from Antoine Isaac regarding the Dublin Core mapping. 
Unfortunately, I was not able to work on the mapping this week; I will 
get back to it next week.

Cheers,

Kai

-------- Original Message --------
Subject: Fwd: Re: Dublin Core - PROV Mapping, Call for Feedback (until 
June 5th)
Date: Wed, 6 Jun 2012 11:56:12 +0200
From: Antoine Isaac <aisaac@few.vu.nl>
To: Kai Eckert <kai@informatik.uni-mannheim.de>, "Panzer,Michael" 
<panzerm@oclc.org>

Hi Kai, Michael,

Something I forgot to state in my hurry: of course, the only 
document I've really reviewed carefully is the Primer. In particular, I 
did not check the complex mapping document. I think for the Direct 
Mapping and the PROV specialization, this is less of a problem: they're 
much shorter, and my comments on the corresponding sections in the Primer 
would apply to the content of these two documents as well...

I sincerely hope these late comments won't be a problem for you,

Antoine



-------- Original Message --------
Subject: Re: Dublin Core - PROV Mapping, Call for Feedback (until June 5th)
Date: Wed, 6 Jun 2012 11:50:47 +0200
From: Antoine Isaac <aisaac@FEW.VU.NL>
Reply-To: DCMI Architecture Forum <DC-ARCHITECTURE@JISCMAIL.AC.UK>
To: <DC-ARCHITECTURE@JISCMAIL.AC.UK>

Hi Kai, Daniel, Michael and Simon,

Thanks a lot for the work on this. In fact I found it a great help for 
an outsider (from the Dublin Core community) to have a first glance into 
the work of the PROV working group.

I have some comments that I'm hastily putting together below. My 
apologies if they are unclear at times, and of course sorry for the late 
feedback. You gave just one week... hopefully it's still June 5 
somewhere on Earth.

Best,

Antoine


====== PROV reference

It seems that you are using really "fresh" documents from the PROV 
working group. E.g. the property prov:generatedAtTime can be found in
http://dvcs.w3.org/hg/prov/raw-file/default/ontology/Overview.html
but not in the latest official working draft
http://www.w3.org/TR/prov-o/
Putting the reference to the latest draft in your docs could be handy!

====== Dublin Core as a Simple Provenance Vocabulary

I'm uncomfortable with the strict categorization of elements into 
"descriptive" and "provenance" metadata. For some elements it is 
questionable whether they belong to one or the other. You've already 
addressed many doubts, but maybe you should acknowledge that your 
categorization is not "hard" or, if it is, give more rationale for the 
questionable elements...
My personal list:
- hasPart, isPartOf: Perhaps isPartOf has indeed often a provenance 
flavor, especially when it's used from one element of a collection to 
that collection. But I'd argue many of their uses can be descriptive, 
especially hasPart. Unless you consider a mereological description of 
objects (typical example of a car having wheels) to be always about 
provenance?
- conformsTo, rights and accessRights may reflect provenance info 
(though it is "derived")
- accrual properties: I wonder whether all should be in (accrualPolicy 
seems interesting for provenance) or out (accrualMethod could be 
questioned). But a mixed position seems strange.

By the way, method-wise, should there be a strict correspondence between 
the elements in the "provenance" category and the ones that are mapped 
to a PROV element in the direct mapping?
What does it say about a given element if it's in the "provenance" 
category but is not mapped to PROV?

Other comment:
[
It can be questioned if a resource changes by being published, however, 
we consider the publication as an action that affects the state of the 
resource and therefore it is relevant for the provenance.
]
-> if provenance is about "where does an object come from", then this 
one is a no-brainer!


====== Basic considerations

[
if a specialization of a document is generated by one activity and a 
specialization is used by a different activity later in time,
]
-> What does "specialization" mean, in practice? I know that it is a 
notion from PROV, but the word is highly ambiguous; the Primer would 
benefit from some (short) explanation here.

By the way, you yourselves are using "specialization" for something else 
(the extension of PROV for handling DC "nuances").

====== What is ex:doc1?

[
it is semantically incorrect to have several activities that all 
generate the same entity at different points in time.
]
-> Please cite the PROV context explicitly here!
Many people (I'd expect most) will gladly accept that several activities 
contribute to the realization of one and the same resource. Even in a 
FRBR or CIDOC-CRM context, which are already seen as (too) fine-grained 
models by many.
By the way, I think later you do indeed try to relate to simpler 
approaches, so that must mean you think it is *not* semantically 
incorrect ;-)
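
To make the discussion concrete, here is a minimal Python sketch (not 
taken from the mapping documents) that lists the entities generated by 
more than one activity in a set of triples, i.e. the situation the quoted 
sentence calls incorrect. Only ex:doc1 and prov:wasGeneratedBy come from 
the discussion; ex:doc2, the ex:act* identifiers and the function name are 
assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical triples: ex:doc2 and the ex:act* names are illustrative only.
triples = [
    ("ex:doc1", "prov:wasGeneratedBy", "ex:act1"),
    ("ex:doc1", "prov:wasGeneratedBy", "ex:act2"),
    ("ex:doc2", "prov:wasGeneratedBy", "ex:act3"),
]


def entities_with_several_generations(triples):
    """Return {entity: sorted activities} for entities generated more than once."""
    generated_by = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "prov:wasGeneratedBy":
            generated_by[subject].add(obj)
    return {e: sorted(a) for e, a in generated_by.items() if len(a) > 1}


flagged = entities_with_several_generations(triples)
print(flagged)
```

Whether such a case should be rejected, or simply accepted as several 
activities contributing to one resource, is exactly the modeling question 
raised above; the sketch only makes the pattern visible.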

====== Direct mappings

dct:date rdfs:subPropertyOf prov:generatedAtTime .
seems dubious. dct:valid is a sub-property of dct:date, which means that 
it is also a sub-property of prov:generatedAtTime. You correctly 
represent this in the mapping document, btw. But I'm quite sure this 
relation does not hold in general.
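
The entailment at stake can be sketched in a few lines of Python. This is 
only an illustration of RDFS sub-property reasoning under the two axioms 
named above; the ex:doc1 statement, the date value and the function names 
are assumptions, not content of the mapping documents.

```python
# Sub-property axioms: dct:date -> prov:generatedAtTime is the proposed mapping,
# dct:valid -> dct:date is the existing Dublin Core relation mentioned above.
sub_property_of = {
    "dct:date": {"prov:generatedAtTime"},
    "dct:valid": {"dct:date"},
}


def super_properties(prop, axioms):
    """Return every property reachable from prop via rdfs:subPropertyOf."""
    found = set()
    stack = [prop]
    while stack:
        current = stack.pop()
        for parent in axioms.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def infer(triples, axioms):
    """Add the triples entailed by the sub-property axioms."""
    inferred = set(triples)
    for s, p, o in triples:
        for parent in super_properties(p, axioms):
            inferred.add((s, parent, o))
    return inferred


# Hypothetical statement: a validity date given for ex:doc1.
data = {("ex:doc1", "dct:valid", "2012-06-01")}
entailed = infer(data, sub_property_of)
for triple in sorted(entailed):
    print(triple)
```

Under these axioms a reasoner would conclude that a validity date is also 
a generation time of ex:doc1, which is the consequence questioned here.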

dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
This also seems strange at first sight. Looking at the definition for 
dct:rightsHolder:
"A person or organization owning or managing rights over the resource." 
This may include some institution who manages/stores a resource on 
behalf of its creator, or anyone who "owns" the resource.
I think this is compatible with PROV's super-vague meaning of attribution 
("Attribution is the ascribing of an entity to an agent.", 
http://www.w3.org/TR/prov-dm/). But that's quite a stretch from what 
many Dublin Core readers will understand by "attribution". Perhaps you 
could give some explanation!

======= PROV Specializations (and rationale for complex mappings)

The constructs introduced and their mapping to PROV seem ok.
But I think you could say one sentence about the rationale of these 
specializations. I understand the need to "properly reflect the meaning 
of the Dublin Core terms". Yet, do we need to go for a solution that 
results in having the complexity of the PROV patterns next to the 
semantic distinctions made in DC? We could as well just keep the 
granularity of DC, in terms of patterns, i.e., use the simple mappings 
between DC properties and the related "short-cut properties" in the PROV 
patterns (e.g., prov:wasAttributedTo).

This of course relates to the rationale for having complex mappings in 
the first place. PROV offers several options in terms of granularity, 
especially more or less fine distinctions for linking agents to 
entities. For the same "creation" data, PROV can represent direct links 
between persons and the created resource (prov:wasAttributedTo), links 
between persons and resources via an Activity (prov:wasAssociatedWith), 
and links between persons and Activities via Roles.
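
As a rough illustration of these three levels, the Python sketch below 
writes the same "creation" statement in each style and checks that the 
same agent can be recovered from all of them. All ex: identifiers and the 
helper function are hypothetical, and the role pattern follows PROV-O as 
later published (prov:qualifiedAssociation, prov:agent, prov:hadRole), 
which may differ in detail from the drafts under discussion.

```python
# All ex: names are hypothetical; the role pattern is an assumption based on
# the published PROV-O vocabulary, not on the mapping documents.

# Level 1: direct link from the resource to the person.
attribution = [
    ("ex:doc1", "prov:wasAttributedTo", "ex:alice"),
]

# Level 2: the link goes through an activity.
via_activity = [
    ("ex:doc1", "prov:wasGeneratedBy", "ex:creation1"),
    ("ex:creation1", "prov:wasAssociatedWith", "ex:alice"),
]

# Level 3: a qualified association adds the role the person played.
via_role = via_activity + [
    ("ex:creation1", "prov:qualifiedAssociation", "ex:assoc1"),
    ("ex:assoc1", "prov:agent", "ex:alice"),
    ("ex:assoc1", "prov:hadRole", "ex:creatorRole"),
]


def agents_of(entity, triples):
    """Agents linked to entity, directly or through a generating activity."""
    direct = {o for s, p, o in triples
              if s == entity and p == "prov:wasAttributedTo"}
    activities = {o for s, p, o in triples
                  if s == entity and p == "prov:wasGeneratedBy"}
    indirect = {o for s, p, o in triples
                if s in activities and p == "prov:wasAssociatedWith"}
    return direct | indirect


for name, graph in [("attribution", attribution),
                    ("activity", via_activity),
                    ("role", via_role)]:
    print(name, len(graph), sorted(agents_of("ex:doc1", graph)))
```

In this sketch the same fact costs one, two or five triples depending on 
the level chosen, which gives a feel for the overhead of using all levels 
at once.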

Having all of these levels of granularity at once is probably more 
harmful than beneficial, given the complexity of the PROV pattern in 
general (especially with "specializations"!). Or are the complex 
mappings just an *option* you provide? If yes, a small paragraph 
elaborating on this would be useful for your primer. In fact, it may be 
enough to gather some sentences you already have scattered in different 
sections.

======= Complex mappings, Stage 1

[
A lot of blank nodes are created, however, keep in mind that we envision 
a second stage that relates them and provides stable URIs for the entities.
]
-> Not everyone will be ready to create and maintain URIs for all the 
entity/activity/role splitting in the PROV pattern, certainly. What is 
the application scenario for this? I guess it would depend. So maybe at 
this stage it's safer to say that some applications would create URIs, 
some would keep to blank nodes. And of course many others won't use the 
more complex mappings.

Other comments:

- I don't get why you opted for a simpler mapping pattern for 
"Entity/Entity (How)". It's quite equivalent to the sub-property 
mappings you have in the "Direct mappings" section. According to the 
PROV model, for a simple "version" link you can create one or several 
creation activities, as well as half a dozen "in" and "out" 
views/specializations of the document, each of which plays a different 
role in these activities.
I understand you would want a simple mapping (so do I) but in this 
Primer perhaps you should explain a bit more clearly why you made 
that choice here, as opposed to the more complex mappings that are 
listed before this one.

- Is prov:Entity provided with any specific semantics? If not, then 
perhaps you can remove the explicit rdf:type that links to it. That 
would make the example graphs simpler.


====== Conflating PROV specializations

I understand that stage 2 of the complex mapping will "merge" a lot 
of the "in" and "out" nodes of individual activities. This should 
already be progress compared to the extreme atomization that is currently 
carried out. I'm looking forward to seeing the details!

However, it seems this will still result in one entity being specialized 
into at least as many "versions" as there will be activities. I expect 
many in our community will just hate having that. In fact that could be 
smartly related to modeling distinctions such as the ones made in FRBR.
But then (or even without it) we run into the kind of problems denounced 
here: http://blogs.ecs.soton.ac.uk/webteam/2010/09/02/the-modeler/ ;-)

In this respect, it would be a good idea to at least make these 
specialization distinctions *optional*. Is it really not possible to 
have several activities carried out on a single instance of entity, say, 
the ex:doc1 in your example?


======= [end]


> Hello everyone,
>
> in the Dublin Core Metadata Provenance Task Group (with the help of Simon Miles), we have released an initial DC to PROV mapping draft.
>
> The work has been divided in several documents to improve readability:
>
> - The mapping primer [1] explains the process followed to do the mapping, the main rationale of our decisions and our next steps.
>
> - The Direct Mappings document [2] shows the direct mappings found between DC and PROV (e.g., subPropertyOf relations).
>
> - The PROV Specializations document [3] extends PROV-O with some basic roles and properties to be able to perform the complex mappings.
>
> - Finally, the Complex-Mappings document [4] infers PROV statements from DC statements that are not covered by the direct mappings.
>
> Please give us your feedback on our approach and the documents within one week (until Tuesday, June 5th).
>
> We sent this mail both to the relevant DCMI mailinglists and the PROV mailinglist in order to reach consensus.
>
> We are on quite a strict timetable now and aim to finish the mapping (Stage 2, and the mapping back from PROV to DC) by the end of June to reach the state of a public draft.
>
> Daniel will briefly present the current state in the PROV call tomorrow. If you have any questions or comments, please don't hesitate to contact us.
>
> Thanks,
> Kai, Daniel, Michael and Simon.
>
> [1] https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-primer
> [2] https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings
> [3] https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations
> [4] https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
>
Received on Friday, 8 June 2012 07:07:35 UTC