Re: Forwarding a comment from SemTech... from Reza B'Far (Oracle) on 2012-06-10 (public-prov-wg@w3.org from June 2012)

From: Reza B'Far (Oracle) <reza.bfar@oracle.com>
Date: Sat, 09 Jun 2012 23:33:41 -0700
To: public-prov-wg@w3.org
Message-ID: <4FD43FC5.9020501@oracle.com>

Ivan (et al.)

I tried to answer the gentleman who asked the question by saying that
there was an effort going on in providing the mapping. Nevertheless, I
think the discussion is worthwhile as Ivan is suggesting. Also, for
Ivan's benefit, we should probably merge this discussion with the thread
between Kai and other folks (which I'm cutting-pasting below -- alas,
there is no good way to merge threads that I know of).

--------
Kai,
I have added Antoine's feedback as Issue 405.

Best,
Daniel

2012/6/8 Kai Eckert <kai@informatik.uni-mannheim.de>

Hi all,

here is feedback from Antoine Isaac regarding the Dublin Core
mapping. Unfortunately I was not able to work on the mapping this
week, I will get back to it next week.

Cheers,

Kai

-------- Original Message --------
Subject: Fwd: Re: Dublin Core - PROV Mapping, Call for Feedback
(until June 5th)
Date: Wed, 6 Jun 2012 11:56:12 +0200
From: Antoine Isaac <aisaac@few.vu.nl>
To: Kai Eckert <kai@informatik.uni-mannheim.de>, "Panzer,Michael"
<panzerm@oclc.org>

Hi Kai, Michael,

Something that I've forgotten to state in my hurry. Of course the
only document I've really reviewed carefully is the Primer.
In particular, I did not check the complex mapping document. I think
for the Direct Mapping and the PROV specialization, this is less of a
problem: they're much shorter, and my comments on the corresponding
sections in the Primer would apply to the content of these two
documents as well...

I sincerely hope these late comments won't be a problem for you,

Antoine

-------- Original Message --------
Subject: Re: Dublin Core - PROV Mapping, Call for Feedback (until
June 5th)
Date: Wed, 6 Jun 2012 11:50:47 +0200
From: Antoine Isaac <aisaac@FEW.VU.NL>
Reply-To: DCMI Architecture Forum <DC-ARCHITECTURE@JISCMAIL.AC.UK>
To: <DC-ARCHITECTURE@JISCMAIL.AC.UK>

Hi Kai, Daniel, Michael and Simon,

Thanks a lot for the work on this. In fact I found it a great help
for an outsider (from the Dublin Core community) to have a first
glance into the work of the PROV working group.

I have some comments that I'm hastily putting together below. My
apologies if they are unclear in places, and of course sorry for the
late feedback. You just gave one week... hopefully it's still June 5
somewhere on Earth.

Best,

Antoine

====== PROV reference

It seems that you are using really "fresh" documents from the PROV
working group. E.g. the property prov:generatedAtTime can be found in
http://dvcs.w3.org/hg/prov/raw-file/default/ontology/Overview.html
but not in the latest official working draft
http://www.w3.org/TR/prov-o/
Putting the reference to the latest draft in your docs could be handy!

====== Dublin Core as a Simple Provenance Vocabulary

I'm uncomfortable with the strict categorization of elements into
"descriptive" and "provenance" metadata. For some elements it is
questionable whether they belong to one or the other. You've already
addressed many doubts, but maybe you should acknowledge that your
categorization is not "hard" or, if it is, give more rationale for
the questionable elements...
My personal list:
- hasPart, isPartOf: Perhaps isPartOf has indeed often a provenance
flavor, especially when it's used from one element of a collection
to that collection. But I'd argue many of their uses can be
descriptive, especially hasPart. Unless you consider a mereological
description of objects (typical example of a car having wheels) to
be always about provenance?
- conformsTo, rights and accessRights may reflect provenance info
(though it is "derived")
- accrual properties: I wonder whether all should be in
(accrualPolicy seems interesting for provenance) or out
(accrualMethod could be questioned). But a mixed position seems strange.

By the way, method-wise, should there be a strict correspondence
between the elements in the "provenance" category and the ones that
are mapped to a PROV element in the direct mapping?
What does it say about a given element if it's in the "provenance"
category but is not mapped to PROV?

Other comment:
[
It can be questioned if a resource changes by being published,
however, we consider the publication as an action that affects the
state of the resource and therefore it is relevant for the provenance.
]
-> if provenance is about "where does an object come from", then
this one is a no-brainer!

====== Basic considerations

[
if a specialization of a document is generated by one activity and a
specialization is used by a different activity later in time,
]
-> What does "specialization" mean, in practice? I know that it is a
notion from PROV, but the word is highly ambiguous; a Primer would
benefit from some (short) explanation here.

By the way, you yourselves are using "specialization" for something
else (the extension of PROV for handling DC "nuances").

====== What is ex:doc1?

[
it is semantically incorrect to have several activities that all
generate the same entity at different points in time.
]
-> Please cite the PROV context explicitly here!
Many people (I'd expect most) will gladly accept that several
activities contribute to the realization of one same resource. Even
in a FRBR or CIDOC-CRM context, which are already seen as (too)
fine-grained models by many.
By the way, I think later you do indeed try to relate to simpler
approaches, so that must mean you think it is *not* semantically
incorrect ;-)

====== Direct mappings

dct:date rdfs:subPropertyOf prov:generatedAtTime .
seems dubious. dct:valid is a sub-property of dct:date, which means
that it is also a sub-property of prov:generatedAtTime. You
correctly represent this in the mapping document, btw. But I'm quite
sure this relation does not hold in general.
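
The entailment described above can be sketched in a few lines of plain
Python. This is only an illustration, assuming the two sub-property
axioms named in the comment; the dictionary layout and the function
name super_properties are hypothetical and not part of any mapping
document.

```python
# Sketch: rdfs:subPropertyOf is transitive, so the proposed direct
# mapping also makes dct:valid a sub-property of prov:generatedAtTime.
SUB_PROPERTY_OF = {
    "dct:valid": {"dct:date"},             # as stated in the comment above
    "dct:date": {"prov:generatedAtTime"},  # the proposed direct mapping
}


def super_properties(prop, axioms):
    """Return every property reachable from `prop` via rdfs:subPropertyOf."""
    seen = set()
    stack = [prop]
    while stack:
        current = stack.pop()
        for parent in axioms.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(super_properties("dct:valid", SUB_PROPERTY_OF)))
# prints ['dct:date', 'prov:generatedAtTime']
```

Under these assumptions, any dct:valid statement would also count as a
prov:generatedAtTime statement, which is the consequence being
questioned.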

dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
This also seems strange at first sight. Looking at the definition
for dct:rightsHolder:
"A person or organization owning or managing rights over the
resource." This may include some institution who manages/stores a
resource on behalf of its creator, or anyone who "owns" the resource.
I think this is compatible with PROV's super-vague meaning of attribution
("Attribution is the ascribing of an entity to an agent.",
http://www.w3.org/TR/prov-dm/). But that's quite a stretch from what
many Dublin Core readers will understand for "attribution". Perhaps
you could give some explanation!
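
To make the consequence concrete, here is a minimal sketch of what the
sub-property axiom would let a reasoner infer. The resources ex:doc1
and ex:archive and the function name entail are hypothetical, and only
a single inference step is applied.

```python
# Sketch: a sub-property axiom lets every statement made with the
# sub-property be restated with the super-property.
MAPPING = {"dct:rightsHolder": "prov:wasAttributedTo"}  # proposed direct mapping

# Hypothetical data: an archive holds the rights to a document it did not create.
TRIPLES = {("ex:doc1", "dct:rightsHolder", "ex:archive")}


def entail(triples, mapping):
    """Add (s, super, o) for each (s, sub, o) whose predicate is mapped."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in mapping:
            inferred.add((s, mapping[p], o))
    return inferred


for triple in sorted(entail(TRIPLES, MAPPING)):
    print(triple)
# prints ('ex:doc1', 'dct:rightsHolder', 'ex:archive')
# then   ('ex:doc1', 'prov:wasAttributedTo', 'ex:archive')
```

The inferred statement attributes ex:doc1 to the archive, which is the
reading that may surprise Dublin Core users.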

======= PROV Specializations (and rationale for complex mappings)

The constructs introduced and their mapping to PROV seem ok.
But I think you could say one sentence about the rationale of these
specializations. I understand the need to "properly reflect the
meaning of the Dublin Core terms". Yet, do we need to go for a
solution that results in having the complexity of patterns of PROV
next to the semantic distinctions made in DC? We could as well just
keep the granularity of DC, in terms of patterns. I.e., using the
simple mappings between DC properties and the related "short-cut
properties" in the PROV patterns (e.g., prov:wasAttributedTo).

This of course relates to the rationale for having complex mappings
in the first place. There are several options that PROV offers, in
terms of granularity, especially more or less fine distinctions for
linking agents to entities. For the same "creation data", PROV can
represent direct links between persons and created
resource (prov:wasAttributedTo), links between persons and resources
via Activity (prov:wasAssociatedWith) and links between persons and
Activity via Roles.
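
As a rough illustration of these three levels, the same creation fact
can be written as three different sets of triples. This is a sketch
only: the ex: resources are hypothetical, and the property names
follow the PROV-O vocabulary as later published, which may differ from
the drafts under discussion in this thread.

```python
# Sketch: one "creation" fact at three levels of PROV granularity.
EX_DOC, EX_ALICE, EX_WRITING = "ex:doc1", "ex:alice", "ex:writing"

LEVELS = {
    "direct attribution": [
        (EX_DOC, "prov:wasAttributedTo", EX_ALICE),
    ],
    "via an activity": [
        (EX_DOC, "prov:wasGeneratedBy", EX_WRITING),
        (EX_WRITING, "prov:wasAssociatedWith", EX_ALICE),
    ],
    "via an activity, with a role": [
        (EX_DOC, "prov:wasGeneratedBy", EX_WRITING),
        (EX_WRITING, "prov:qualifiedAssociation", "_:assoc"),
        ("_:assoc", "prov:agent", EX_ALICE),
        ("_:assoc", "prov:hadRole", "ex:author"),
    ],
}


def triple_counts(levels):
    """Number of triples needed at each level of granularity."""
    return {name: len(triples) for name, triples in levels.items()}


for name, count in triple_counts(LEVELS).items():
    print(f"{name}: {count} triple(s)")
```

Each finer level needs more statements and at least one extra node for
the same fact, which is the complexity cost raised in this section.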

Having all of these levels of granularity at once is probably more
harmful than beneficial, given the complexity of the PROV pattern in
general (especially with "specializations"!). Or are the complex
mappings just an *option* you provide? If yes, a small paragraph
elaborating on this would be useful for your primer. In fact, it may
be enough to gather some sentences you already have scattered in
different sections.

======= Complex mappings, Stage 1

[
A lot of blank nodes are created, however, keep in mind that we
envision a second stage that relates them and provides stable URIs
for the entities.
]
-> Certainly not everyone will be ready to create and maintain URIs for
all the entity/activity/role splitting in the PROV pattern. What
is the application scenario for this? I guess it would depend. So
maybe at this stage it's safer to say that some applications would
create URIs, some would keep to blank nodes. And of course many
others won't use the more complex mappings.

Other comments:

- I don't get why you opted for a simpler mapping pattern for
"Entity/Entity (How)". It's quite equivalent to the sub-property
mappings you have in the "Direct mappings" sections. According to
the PROV model, for a simple "version" link you can create one or
several creation activities, as well as half a dozen "in" and
"out" views/specializations of the document, which each play a
different role in these activities.
I understand you would want a simple mapping (so do I) but in this
Primer perhaps you should explain a bit more clearly why you
made that choice here, as opposed to the more complex mappings that
are listed before this one.

- Is prov:Entity provided with any specific semantics? If not, then
perhaps you can remove the explicit rdf:type that links to it. That
would make the example graphs simpler.

====== Conflating PROV specializations

I understand that the stage 2 of the complex mapping will "merge" a
lot of the "ins" and "outs" nodes of individual activities. This
should already be progress compared to the extreme atomization that
is currently carried out. I'm looking forward to seeing the details!

However, it seems this will still result in one entity being
specialized into at least as many "versions" as there will be
activities. I expect many in our community will just hate having
that. In fact that could be smartly related to modeling distinctions
such as the ones made in FRBR.
But then (or even without it) we run into the kind of problems
denounced here:
http://blogs.ecs.soton.ac.uk/webteam/2010/09/02/the-modeler/ ;-)

In this respect, it would be a good idea to at least make these
specialization distinctions *optional*. Is it really not possible to
have several activities carried out on a single instance of entity,
say, the ex:doc1 in your example?

======= [end]

Hello everyone,

in the Dublin Core Metadata Provenance Task Group (with the help
of Simon Miles), we have released an initial DC to PROV mapping
draft.

The work has been divided in several documents to improve
readability:

- The mapping primer [1] explains the process followed to do the
mapping, the main rationale of our decisions and our next steps.

- The Direct Mappings document [2] shows the direct mappings
found between DC and PROV (e.g., subPropertyOf relations).

- The PROV Specializations document [3] extends PROV-O with some
basic roles and properties to be able to perform the complex
mappings.

- Finally, the Complex-Mappings document [4] infers PROV
statements from DC statements that are not covered by the direct
mappings.

Please give us your feedback on our approach and the documents
within one week (until Tuesday, June 5th).

We sent this mail both to the relevant DCMI mailing lists and the
PROV mailing list in order to reach consensus.

We are on a quite strict timetable now and aim to finish the
mapping (Stage 2, and the mapping back from PROV to DC) by the
end of June to reach the state of a public draft.

Daniel will briefly present the current state in the PROV call
tomorrow. If you have any questions or comments, please don't
hesitate to contact us.

Thanks,
Kai, Daniel, Michael and Simon.

[1] https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-primer
[2] https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings
[3] https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations
[4] https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1

On 6/9/12 11:32 PM, Ivan Herman wrote:
> There was a presentation on Prov at SemTech, as you all know, done by Reza. There was a brief discussion regarding the relationship of Prov to Dublin Core, essentially asking why Prov uses its own Agent or Person and not the Dublin Core one. We told the commenter that there is a document coming up that makes connections and equivalences between the two. However, the people asking (I think both were from governmental organizations) were not entirely happy. Essentially, they said that having, say, an OWL equivalence set up between dct:Agent and prov:Agent is all good and works on a theoretical level, but when a government agency has to choose which terms to use, it is very disturbing to have two formally defined ones, and RDF environments do not necessarily handle OWL equivalences. I.e., they would prefer that Prov simply use, say, dct:Agent outright rather than having its own term. (I have not checked whether there are more such 'equivalences' set up between DC and Prov, or whether all the others are subclasses/subproperties.)
>
> The argument is compelling, we should probably consider this.
>
> Cheers
>
> Ivan
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
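
The practical concern quoted above, that RDF environments do not
necessarily handle OWL equivalences, can be sketched as follows. This
assumes, purely for illustration, an equivalence axiom between
dct:Agent and prov:Agent; the data and the function name instances
are hypothetical.

```python
# Sketch: a plain type lookup ignores class equivalences unless the
# application expands them itself.
TYPE = "rdf:type"
DATA = {
    ("ex:alice", TYPE, "prov:Agent"),
    ("ex:bob", TYPE, "dct:Agent"),
}
# Assumed equivalence axiom, stored in both directions.
EQUIVALENT = {"dct:Agent": {"prov:Agent"}, "prov:Agent": {"dct:Agent"}}


def instances(cls, data, equivalences=None):
    """Resources typed as `cls`, optionally including equivalent classes."""
    classes = {cls}
    if equivalences:
        classes |= equivalences.get(cls, set())
    return sorted(s for s, p, o in data if p == TYPE and o in classes)


print(instances("dct:Agent", DATA))              # prints ['ex:bob']
print(instances("dct:Agent", DATA, EQUIVALENT))  # prints ['ex:alice', 'ex:bob']
```

A store without reasoning behaves like the first call, so data
published with one term stays invisible to queries written with the
other.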

Received on Sunday, 10 June 2012 06:34:17 UTC