PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]

Dear All,

Following the phone-conference on May 26, let me repeat some thoughts:

The definition of a Resource that has the potential to have a provenance
should be specific enough, in a way relevant to the Semantic Web, that we can
clearly identify a set of properties that are relevant, and connect in a relevant
way, to answering provenance queries. (Following Guarino and Gruber, ontologies
describe possible states of affairs as precisely as possible.)

From our background I suggest that the distinction between a Physical Resource consisting of matter
(for instance crm:E18 Physical Thing) and an InformationResource (irw:InformationResource
or crm:E73 Information Object) is necessary and fundamental, because

1) A physical thing undergoes a linear sequence of states and changes, because any change destroys
the previous form. It can only be at one place at a time. From this we infer most of our
common-sense logic of provenance and identity. Even splitting or merging an object destroys all
predecessors. Identity can be based on continuity of custody (sequence of all ID cards), or on
essential properties (fingerprints etc.).

2) An Information Object can reside on multiple carriers (or "realizations", "copies", "items") at the same time.
Without complete world knowledge, the state of change of any one copy cannot be related to that of
the other copies, because we cannot know what may happen on the other side of the world.
Therefore the Information Object itself, in its nature as data, has no well-defined or verifiable states of change.
Changes to Information Objects are therefore better described as creations of new ones, for any change however minimal.
Identity can be based on content; for provenance reasoning it is best based on bit or character identity.
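
The two identity conditions above can be sketched in a few lines of Python. This is
only an illustration under my own assumptions: the class names PhysicalThing and
InformationObject, the method names, and the use of a SHA-256 digest as a stand-in
for "bit identity" are hypothetical and are not taken from CRM, IRW or any
provenance vocabulary.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PhysicalThing:
    """A material object: one place at a time, one linear history (hypothetical model)."""
    label: str
    history: list = field(default_factory=list)  # ordered (event, place) pairs

    def record(self, event: str, place: str) -> None:
        # Each change supersedes the previous state; the history only grows.
        self.history.append((event, place))

    @property
    def current_place(self):
        # The thing is wherever its most recent recorded event put it.
        return self.history[-1][1] if self.history else None


@dataclass(frozen=True)
class InformationObject:
    """Immaterial content, identified purely by its bits (hypothetical model)."""
    content: bytes

    @property
    def identity(self) -> str:
        # Assumption: a SHA-256 digest serves as a practical proxy for bit identity.
        return hashlib.sha256(self.content).hexdigest()

    def modified(self, new_content: bytes) -> "InformationObject":
        # A change never mutates the object; it creates a new one.
        return InformationObject(new_content)
```

In this sketch two InformationObject instances with the same bytes share one
identity wherever they were created, while a PhysicalThing is identified by the
continuity of its single history.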

As a consequence, analogue photographic material in the film industry etc. is better traced as material objects,
because there is no convention for defining identity of content across different copies of analogue photographs.

Using provenance for authenticity reasoning on information objects will rely, among other things, on the fate of multiple copies.
Not being able to distinguish the behavior of the carriers from that of the actual data would make
such reasoning impossible.
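
As a rough illustration of why carriers and data must be kept apart, the following
Python sketch groups carriers by the content they currently hold. The function
name, the carrier identifiers and the use of SHA-256 as a proxy for identity of
content are my own assumptions for the example, not part of any agreed model.

```python
import hashlib
from collections import defaultdict


def group_carriers_by_content(carriers):
    """Map each content digest to the sorted ids of the carriers holding that content.

    `carriers` is a dict of carrier id -> bytes currently found on that carrier.
    """
    groups = defaultdict(list)
    for carrier_id, data in carriers.items():
        # Carriers with bit-identical data realize the same information object.
        groups[hashlib.sha256(data).hexdigest()].append(carrier_id)
    return {digest: sorted(ids) for digest, ids in groups.items()}


# Hypothetical observations: two carriers agree, a third has diverged.
observed = {
    "disk-A": b"report v1",
    "disk-B": b"report v1",
    "tape-C": b"report v2",
}
groups = group_carriers_by_content(observed)
```

Here each carrier keeps its own custody history as a material object, while the
grouping only says which carriers hold the same data at the time of observation.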

Further, universals ("Concept", crm:E55 Type), such as "man" or "dog", behave differently again, because the
IsA relations and the often fuzzy boundaries of concepts create yet other identity conditions and much
more confused states. I propose to exclude provenance of universals from the discussion until we have understood
the other two.

I maintain that no more distinctions need be made for this PROVENANCE discussion.

FRBR entities have been mentioned in the discussion. In the CRM-FRBR Harmonization Group we
concluded (together with the IFLA FRBR Review Group) that the identification of Work, Expression, Manifestation
is in practice done by selecting "representative" existing realizations, which have a clear identity by content,
be they fragments or copies of copies of lost works. Therefore the "conceptual nature" of a Work should not confuse
us. The provenance would still be based on realizations.




  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Research Director             |  Fax:+30(2810)391638        |
                                |  Email: |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
  Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
          Web-site:               |

Received on Sunday, 29 May 2011 13:00:19 UTC