Re: OA and provenance

Antoine,
replying to the last two paragraphs...

The current definition of pav:createdBy starts with:
"An agent primary responsible for making the digital artifact or resource
representation."
Given my knowledge of English (meaning: I may be wrong), I think that is
pretty close to:

"the creator of digital artifact that represents the resource"
I feel we can swap the beginning for your suggestion, which seems more
direct, without really changing what we meant.

Regarding the name, we have been discussing the change for a while. One
option is to deprecate pav:createdBy (but still keep it, as we have data
going back to 2007) in favor of something that expresses the fact that it
relates to the digital representation. We have a meeting scheduled for next
week to address this issue. But at this point it is more 'terminological',
as the meaning of the definition is not going to change.

In any case, as I am not entirely sure this topic is of interest to others
on this mailing list, I would discuss these specific aspects separately if
you wish.

Best,


On Fri, Aug 16, 2013 at 8:42 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi Paolo,
>
> I agree that the decisions on scenario 1-2-3 are entirely up to you. And
> the more fundamental decisions on having one annotation or one annotation
> and something else (another annotation or a PROV entity) also. But as long
> as your short-cuts have a sound grounding! If others can't understand
> clearly what you've done then it's a big problem ;-)
>
> As for the mappings (pav:authoredBy is a sub-property of oa:annotatedBy,
> pav:curatedBy a sub-property of oa:annotatedBy) I agree with you. I used
> "sub-property" but I meant it in a "local" context. I.e for all triples in
> your annotation case, you should generate another triple of the more
> general property. I think what you suggest is ok.
>
>
> As for pav:createdBy, sentences like:
>
> "In PAV, pav:createdBy is used for the digital artifact only. While
> pav:authoredBy, pav:curatedBy and so on...  are for the content of the
> artifact. So you can have both pav:createdBy (person that created the
> artifact) and pav:curatedBy (person that collected and curated its
> content)."
> and:
>
> "pav:createdBy is more specific than dct:creator as it refers only to the
> digital artifact."
> are really confusing.
> In RDF there is one resource that the statements are about, and it
> denotes ONE entity. You can't have two properties applied to one resource
> where the resource is sometimes interpreted as the annotation and at
> other times as the digital representation of the annotation.
>
> If you want to use createdBy as a shortcut, and still subject it to the
> oa:Annotation resource, then your definition should rather be :
> "pav:createdBy is used for the creator of digital artifact that represents
> the resource". And then the name should be changed because it doesn't
> reflect the short-cut at all. It should be something like
> pav:hasDigitalRepresentationCreatedBy...
>
> If you want to use createdBy on a resource that is a digital
> representation of the oa:Annotation (so not the oa:Annotation itself) then
> the wording is better. But in this case, well, you can't subject it to the
> resource you wanted to subject it on. And there's no point in minting a
> property that is not dc:creator... (of course dc:creator can be applied to
> any resource, be it a conceptual one or a representation).
>
> Cheers,
>
> Antoine
>
>
>  Hi Antoine,
>>
>>
>> On Fri, Aug 16, 2013 at 7:12 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>
>>     Hi,
>>
>>     Wow, the return of a serious discussion, in an even more complex
>> form, awesome ;-)
>>
>>     First a remark on Stian's comment:
>>
>>     "and the conclusion seemed to have been that it is simpler to merge
>> the conceptual annotation with the formalized annotation as a
>>     datastructure."
>>
>>     Yes, and this was about the data structure only. The annotation is
>> really of conceptual nature. We just allow for attributes (e.g.
>> oa:serializedBy) that shortcut some provenance info. A full, correct
>> representation has the serialization appear as a fully-fledged (PROV)
>> entity, distinct from the oa:Annotation, as pictured at
>>     http://www.openannotation.org/spec/core/appendices.html#ProvMapping
>>
>>
>>     Based on this indeed pav:authoredBy is a sub-property of
>> oa:annotatedBy (or an equivalent, in the specific context).
>>
>>
>> I would say it is more an equivalent as pav:authoredBy is not only for
>> annotations.
>>
>>
>>     The question next is how to handle the extra level of "digital
>> annotation" - the guy who captures the annotation in the system (I'll just
>> focus on the "creator" aspect, the discussion is long enough, let's ignore
>> "with" "at" and "on").
>>
>>     I like Jacco's and Stian's suggestions of double annotations (whether
>> one is the target or the body or the other...). It is complex, but it
>> represents the situation quite well. In this case both are annotators.
>>
>>
>> This would complicate the implementation though. In some sense, I see the
>> "extra level of "digital annotation"" more as extra level of provenance so
>> I can stay 'compact'.
>> We had a similar issue with Claims representation. You have the
>> conceptual Claim and then multiple embodiments of that claim in text. It
>> is a tough problem.
>>
>> But sure, you could think of both of them as annotators. Not sure Darwin
>> would like that but I cannot talk for him :)
>>
>>     An alternative is to create one annotation (oa:annotatedBy Darwin)
>> and another non-annotation resource. Something similar to the PROV entity
>> we have for the serialization. It would represent the act of capturing an
>> annotation in the system, where the student plays the creator role.
>>
>>     In any case, that's two resources.
>>
>>     But as for the serialization case, you may want to have only one
>> resource in a 'core' solution. Two options here:
>>
>>     1. Consider that the oa:Annotation is the result of the intellectual
>> work of both Darwin and the student. In this case both are the object of an
>> oa:annotatedBy. I think this choice is borderline, but in a specific
>> application context, where students spend hard work
>> deciphering/interpreting an annotation, why not?
>>     If you want to use pav:curatedBy still, then you would need to have
>> it a sub-property of oa:annotatedBy
>>
>>
>> We cannot really do that as pav:curatedBy is also used for objects that
>> are not annotations.
>> What I could think of doing now is:
>>
>> <ann1>
>>        oa:annotatedBy <Darwin> ;
>>        oa:annotatedBy <Student> ;
>>        pav:authoredBy <Darwin> ;
>>        pav:curatedBy <Student> .
>>
>> It is redundant but that way the semantics is clear for both OA and PAV
>> and it allows OA clients to get to the provenance.
>> What do you think of it?
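
A minimal sketch of how the redundant oa:annotatedBy triples above could be
generated mechanically, following the "local" sub-property reading (for each
PAV triple on an annotation, add a triple with the more general OA property).
This is only an illustration: triples are plain Python tuples rather than a
real RDF store, and the names LOCAL_SUPER_PROPERTY, materialize_annotated_by
and doc1 are hypothetical, not part of OA or PAV.

```python
# Hypothetical "local" mapping: it is only applied to resources typed as
# oa:Annotation, because the PAV properties are also used on other objects.
LOCAL_SUPER_PROPERTY = {
    "pav:authoredBy": "oa:annotatedBy",
    "pav:curatedBy": "oa:annotatedBy",
}


def materialize_annotated_by(triples):
    """Return the input triples plus the implied oa:annotatedBy triples."""
    # Collect every subject explicitly typed as an annotation.
    annotations = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "oa:Annotation"
    }
    result = set(triples)
    for s, p, o in triples:
        general = LOCAL_SUPER_PROPERTY.get(p)
        # Only annotations get the extra, more general triple.
        if general is not None and s in annotations:
            result.add((s, general, o))
    return result


data = {
    ("ann1", "rdf:type", "oa:Annotation"),
    ("ann1", "pav:authoredBy", "Darwin"),
    ("ann1", "pav:curatedBy", "Student"),
    ("doc1", "pav:curatedBy", "Student"),  # not an annotation: left as-is
}

expanded = materialize_annotated_by(data)
for triple in sorted(expanded):
    print(triple)
```

Restricting the rule to resources typed oa:Annotation keeps the mapping
local, so pav:curatedBy on non-annotation objects does not pick up an
oa:annotatedBy statement.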
>>
>>
>>     2. Consider that the role of the student is minor. In this case, I
>> think a property with a name like pav:curatedBy still makes sense. But it
>> would be a specialization of something more general, maybe dc:contributor.
>> And its semantics would in fact be that of a "short-cut" for the more
>> complex situation where a second annotation (or a PROV entity) exists to
>> represent the situation at the right granularity.
>>
>>
>> In PAV most of the properties are short-cuts. The idea is to have a
>> single object rather than a series of them. It does not solve everything,
>> but it works for many use cases.
>> At the moment pav:curatedBy is a sub-property of prov:wasAttributedTo and
>> also dct:contributor. So I think we are on the same page.
>>
>>
>>     3. Consider that the role of Darwin is minor (very borderline maybe).
>> In this case the student is the oa:annotatedBy, and Darwin a mere
>> dc:contributor.
>>
>>
>> Frankly I feel uncomfortable with this approach. But it is true, it
>> depends on how you intend the annotation.
>>
>>
>>     In any case I don't think you can do anything practical with a
>> solution that would only have one resource of type oa:Annotation and a
>> short-cut property with a name like pav:createdBy. The name and intuitive
>> semantics are really too close to dc:creator and oa:annotatedBy (as the
>> creator of the Annotation)! In fact pav:curatedBy is much better, which is
>> why I think it could be defined as a short-cut in option 2 above.
>>
>>
>> In PAV, pav:createdBy is used for the digital artifact only. While
>> pav:authoredBy, pav:curatedBy and so on...  are for the content of the
>> artifact.
>> So you can have both pav:createdBy (person that created the artifact) and
>> pav:curatedBy (person that collected and curated its content).
>>
>>
>>     Note that PAV mentions dct:createdBy as the super-property of
>> pav:createdBy, which to my knowledge does not exist. In fact I really
>> believe PAV would benefit from removing pav:createdBy. If you need it,
>> re-introduce it with a better name, and clearer semantics!
>>
>>
>> That is just a typo in the description; it should be dct:creator.
>> pav:createdBy is more specific than dct:creator as it refers only to the
>> digital artifact.
>>
>> Best,
>> Paolo
>>
>>


-- 
Dr. Paolo Ciccarese
http://www.paolociccarese.info/
Biomedical Informatics Research & Development
Instructor of Neurology at Harvard Medical School
Assistant in Neuroscience at Mass General Hospital
Member of the MGH Biomedical Informatics Core
+1-857-366-1524 (mobile)   +1-617-768-8744 (office)

Received on Friday, 16 August 2013 13:10:17 UTC