Re: OA and provenance

Hi Paolo,

I agree that the decisions on scenarios 1-2-3 are entirely up to you, and so are the more fundamental decisions on having one annotation, or one annotation plus something else (another annotation or a PROV entity). But only as long as your short-cuts have a sound grounding! If others can't clearly understand what you've done, then it's a big problem ;-)

As for the mappings (pav:authoredBy is a sub-property of oa:annotatedBy, pav:curatedBy a sub-property of oa:annotatedBy), I agree with you. I used "sub-property" but I meant it in a "local" context, i.e. for every such triple in your annotation case, you should generate another triple with the more general property. I think what you suggest is ok.
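
To make the "local" reading concrete, here is a minimal sketch in plain Python, with triples as tuples rather than a real RDF library. The names LOCAL_SUPER, materialize and the ex: identifiers are made up for illustration. The rule only fires for subjects typed oa:Annotation, so PAV properties on non-annotation resources are left alone.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
# LOCAL_SUPER and materialize are illustrative names, not part of OA or PAV.
LOCAL_SUPER = {
    "pav:authoredBy": "oa:annotatedBy",
    "pav:curatedBy": "oa:annotatedBy",
}

def materialize(triples):
    """Return the input triples plus the more general triple for each
    PAV triple whose subject is typed oa:Annotation."""
    triples = set(triples)
    annotations = {s for (s, p, o) in triples
                   if p == "rdf:type" and o == "oa:Annotation"}
    inferred = {(s, LOCAL_SUPER[p], o) for (s, p, o) in triples
                if s in annotations and p in LOCAL_SUPER}
    return triples | inferred

data = {
    ("ex:ann1", "rdf:type", "oa:Annotation"),
    ("ex:ann1", "pav:authoredBy", "ex:Darwin"),
    ("ex:ann1", "pav:curatedBy", "ex:Student"),
    ("ex:dataset1", "pav:curatedBy", "ex:Student"),  # not an annotation: untouched
}
result = materialize(data)
```

An OA-only client reading result finds both Darwin and the student via oa:annotatedBy on ex:ann1, while ex:dataset1 gets no oa:annotatedBy triple.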


As for pav:createdBy, sentences like:
"In pav:createdBy is used for the digital artifact only. While pav:authoredBy, pav:curatedBy and so on...  are for the content of the artifact. So you can have both pav:createdBy (person that created the artifact) and pav:curatedBy (person that collected and curated its content)."
and:
"pav:createdBy is more specific than dct:creator as it refers only to the digital artifact."
are really confusing.
In RDF there is one resource that the statements are about, and it denotes ONE entity. You can't have two properties applied to the same resource where the resource is sometimes to be interpreted as the annotation and at other times as the digital representation of the annotation.

If you want to use createdBy as a short-cut, and still attach it to the oa:Annotation resource, then your definition should rather be: "pav:createdBy is used for the creator of the digital artifact that represents the resource". And then the name should be changed, because it doesn't reflect the short-cut at all. It should be something like pav:hasDigitalRepresentationCreatedBy...

If you want to use createdBy on a resource that is a digital representation of the oa:Annotation (so not the oa:Annotation itself) then the wording is better. But in this case, well, you can't attach it to the resource you wanted to attach it to. And there's no point in minting a property that is not dc:creator... (of course dc:creator can be applied to any resource, be it a conceptual one or a representation).
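
As a rough sketch of that second reading, again with triples as plain Python tuples: the annotation and its digital representation are two distinct resources, each with its own creator statement. The ex: identifiers and the ex:hasRepresentation link are placeholders I made up for the example, not properties defined by OA or PAV. The point is only that asking "who created X" has one unambiguous answer per resource.

```python
# Hypothetical ex: identifiers; ex:hasRepresentation is a placeholder link,
# not a property defined by OA or PAV.
triples = {
    ("ex:ann1", "rdf:type", "oa:Annotation"),
    ("ex:ann1", "oa:annotatedBy", "ex:Darwin"),
    ("ex:ann1", "ex:hasRepresentation", "ex:ann1-record"),
    ("ex:ann1-record", "dc:creator", "ex:Student"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the toy graph."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}

annotators = objects("ex:ann1", "oa:annotatedBy")
record_creators = objects("ex:ann1-record", "dc:creator")
```

Here the annotation itself carries no dc:creator statement at all, so nothing has to be read sometimes as the annotation and sometimes as its digital record.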

Cheers,

Antoine


> Hi Antoine,
>
> On Fri, Aug 16, 2013 at 7:12 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>
>     Hi,
>
>     Wow, the return of a serious discussion, in an even more complex form, awesome ;-)
>
>     First a remark on Stian's comment:
>
>     "and the conclusion seemed to have been that it is simpler to merge the conceptual annotation with the formalized annotation as a
>     datastructure."
>
>     Yes, and this was about the data structure only. The annotation is really of conceptual nature. We just allow for attributes (e.g. oa:serializedBy) that shortcut some provenance info. A full, correct representation has the serialization appear as a fully-fledged (PROV) entity, distinct from the oa:Annotation, as pictured at
>     http://www.openannotation.org/spec/core/appendices.html#ProvMapping
>
>     Based on this indeed pav:authoredBy is a sub-property of oa:annotatedBy (or an equivalent, in the specific context).
>
>
> I would say it is more an equivalent as pav:authoredBy is not only for annotations.
>
>
>     The question next is how to handle the extra level of "digital annotation" - the guy who captures the annotation in the system (I'll just focus on the "creator" aspect, the discussion is long enough, let's ignore "with" "at" and "on").
>
>     I like Jacco's and Stian's suggestions of double annotations (whether one is the target or the body or the other...). It is complex, but it represents the situation quite well. In this case both are annotators.
>
>
> This would complicate the implementation though. In some sense, I see the "extra level of "digital annotation"" more as extra level of provenance so I can stay 'compact'.
> We had a similar issue with Claims representation. You have the conceptual Claim and then multiple embodiments of that claim in text. It is a tough problem.
>
> But sure, you could think of both of them as annotators. Not sure Darwin would like that but I cannot talk for him :)
>
>     An alternative is to create one annotation (oa:annotatedBy Darwin) and another non-annotation resource. Something similar to the PROV entity we have for the serialization. It would represent the act of capturing an annotation in the system, where the student plays the creator role.
>
>     In any case, that's two resources.
>
>     But as for the serialization case, you may want to have only one resource in a 'core' solution. Two options here:
>
>     1. Consider that the oa:Annotation is the result of the intellectual work of both Darwin and the student. In this case both are the object of an oa:annotatedBy. I think this choice is borderline, but in a specific application context, where students spend hard work deciphering/interpreting an annotation, why not?
>     If you still want to use pav:curatedBy, then you would need to make it a sub-property of oa:annotatedBy
>
>
> We cannot really do that as pav:curatedBy is also used for objects that are not annotations.
> What I could think of doing now is:
>
> <ann1>
>        oa:annotatedBy <Darwin> ;
>        oa:annotatedBy <Student> ;
>        pav:authoredBy <Darwin> ;
>        pav:curatedBy <Student> .
>
> It is redundant but that way the semantics is clear for both OA and PAV and it allows OA clients to get to the provenance.
> What do you think of it?
>
>
>     2. Consider that the role of the student is minor. In this case, I think a property with a name like pav:curatedBy still makes sense. But it would be a specialization of something more general, maybe dc:contributor. And its semantics would in fact be that of a "short-cut" for the more complex situation where a second annotation (or a PROV entity) exists to represent the situation at the right granularity.
>
>
> In PAV most of the properties are short-cuts. The idea is to have a single object rather than a series of them. It does not solve everything, but it works for many use cases.
> At the moment pav:curatedBy is sub-property of prov:wasAttributedTo and also dct:contributor. So I think we are on the same page.
>
>
>     3. Consider that the role of Darwin is minor (very borderline maybe). In this case the student is the oa:annotatedBy, and Darwin a mere dc:contributor.
>
>
> Frankly I feel uncomfortable with this approach. But it is true, it depends on how you intend the annotation.
>
>
>     In any case I don't think you can do anything practical with a solution that would only have one resource of type oa:Annotation and a short-cut property with a name like pav:createdBy. The name and intuitive semantics are really too close to dc:creator and oa:annotatedBy (as the creator of the Annotation)! In fact pav:curatedBy is much better, which is why I think it could be defined as a short-cut in option 2 above.
>
>
> In PAV, pav:createdBy is used for the digital artifact only, while pav:authoredBy, pav:curatedBy and so on... are for the content of the artifact.
> So you can have both pav:createdBy (person that created the artifact) and pav:curatedBy (person that collected and curated its content).
>
>
>     Note that PAV mentions dct:createdBy as the super-property of pav:createdBy, which to my knowledge does not exist. In fact I really believe PAV would benefit from removing pav:createdBy. If you need it, re-introduce it with a better name, and clearer semantics!
>
>
> That is just a typo in the description; it is dct:creator. pav:createdBy is more specific than dct:creator as it refers only to the digital artifact.
>
> Best,
> Paolo
>

Received on Friday, 16 August 2013 12:43:43 UTC