- From: Paolo Ciccarese <paolo.ciccarese@gmail.com>
- Date: Fri, 16 Aug 2013 08:18:08 -0400
- To: Antoine Isaac <aisaac@few.vu.nl>
- Cc: public-openannotation <public-openannotation@w3.org>
- Message-ID: <CAFPX2kAd3z_GQVWVhxCUVpxFNwgsHzBX9dsWu8y2d+P7EhFf3A@mail.gmail.com>
Hi Antoine,

On Fri, Aug 16, 2013 at 7:12 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi,
>
> Wow, the return of a serious discussion, in an even more complex form,
> awesome ;-)
>
> First a remark on Stian's comment:
>
> "and the conclusion seemed to have been that it is simpler to merge the
> conceptual annotation with the formalized annotation as a
> datastructure."
>
> Yes, and this was about the data structure only. The annotation is really
> of conceptual nature. We just allow for attributes (e.g. oa:serializedBy)
> that shortcut some provenance info. A full, correct representation has the
> serialization appear as a fully-fledged (PROV) entity, distinct from the
> oa:Annotation, as pictured at
> http://www.openannotation.org/spec/core/appendices.html#ProvMapping
>
> Based on this indeed pav:authoredBy is a sub-property of oa:annotatedBy
> (or an equivalent, in the specific context).
>

I would say it is more of an equivalent, as pav:authoredBy is not only
for annotations.

>
> The question next is how to handle the extra level of "digital annotation"
> - the guy who captures the annotation in the system (I'll just focus on the
> "creator" aspect, the discussion is long enough, let's ignore "with" "at"
> and "on").
>
> I like Jacco's and Stian's suggestions of double annotations (whether one
> is the target or the body or the other...). It is complex, but it
> represents the situation quite well. In this case both are annotators.
>

This would complicate the implementation though. In some sense, I see
the extra level of "digital annotation" more as an extra level of
provenance, so I can stay 'compact'. We had a similar issue with Claims
representation. You have the conceptual Claim and then multiple
embodiments of that claim in text. It is a tough problem. But sure, you
could think of both of them as annotators. Not sure Darwin would like
that but I cannot speak for him :)

> An alternative is to create one annotation (oa:annotatedBy Darwin) and
> another non-annotation resource. Something similar to the PROV entity we
> have for the serialization. It would represent the act of capturing an
> annotation in the system, where the student plays the creator role.
>
> In any case, that's two resources.
>
> But as for the serialization case, you may want to have only one resource
> in a 'core' solution. Two options here:
>
> 1. Consider that the oa:Annotation is the result of the intellectual work
> of both Darwin and the student. In this case both are the object of an
> oa:annotatedBy. I think this choice is borderline, but in a specific
> application context, where students spend hard work
> deciphering/interpreting an annotation, why not?
> If you want to use pav:curatedBy still, then you would need to have it a
> sub-property of oa:annotatedBy
>

We cannot really do that, as pav:curatedBy is also used for objects that
are not annotations. What I could think of doing now is:

    <ann1> oa:annotatedBy <Darwin> ;
           oa:annotatedBy <Student> ;
           pav:authoredBy <Darwin> ;
           pav:curatedBy <Student> .

It is redundant, but that way the semantics are clear for both OA and
PAV, and it allows OA clients to get to the provenance. What do you
think of it?

>
> 2. Consider that the role of the student is minor. In this case, I think a
> property with a name like pav:curatedBy still makes sense. But it would be
> a specialization of something more general, maybe dc:contributor. And its
> semantics would in fact be the one of "short-cut" for the more complex
> situation where a second annotation (or a PROV entity) exists to represent
> the situation at the right granularity.
>

In PAV most of the properties are short-cuts. The idea is to have a
single object rather than a series of them. It does not solve
everything, but it works for many use cases. At the moment pav:curatedBy
is a sub-property of both prov:wasAttributedTo and dct:contributor. So I
think we are on the same page.

>
> 3. Consider that the role of Darwin is minor (very borderline maybe). In
> this case the student is the oa:annotatedBy, and Darwin a mere
> dc:contributor.
>

Frankly I feel uncomfortable with this approach. But it is true, it
depends on how you intend the annotation.

>
> In any case I don't think you can do anything practical with a solution
> that would only have one resource of type oa:Annotation and a short-cut
> property with a name like pav:createdBy. The name and intuitive semantics
> are really too close to dc:creator and oa:annotatedBy (as the creator of
> the Annotation)! In fact pav:curatedBy is much better, which is why I think
> it could be defined as a short-cut in option 2 above.
>

In PAV, pav:createdBy is used for the digital artifact only, while
pav:authoredBy, pav:curatedBy and so on are for the content of the
artifact. So you can have both pav:createdBy (the person that created
the artifact) and pav:curatedBy (the person that collected and curated
its content).

>
> Note that PAV mentions dct:createdBy as the super-property of
> pav:createdBy, which to my knowledge does not exist. In fact I really
> believe PAV would benefit from removing pav:createdBy. If you need it,
> re-introduce it with a better name, and clearer semantics!
>

That is just a typo in the description; it should be dct:creator.
pav:createdBy is more specific than dct:creator as it refers only to the
digital artifact.

Best,
Paolo

>
> Dear all,
>> I would like to share a solution that I am currently implementing in
>> Domeo in relation to provenance and a question related to it. Apologies in
>> advance for the length of the email.
>>
>> Use Case: I am dealing with an existing annotation that is written on
>> paper. The author of the annotation can be the author of the original
>> manuscript or a third party (let's assume the latter for this example). The
>> annotation is anchored in a specific location of the original text. My user
>> is transforming that annotation into an OA annotation. It is very similar to
>> Darwin's annotation in the specs [1] but I got to a slightly different
>> conclusion.
>>
>> I would like to keep track of:
>> - the agent that creates the OA annotation
>> - the application the agent used to create the annotation (could be
>> different than the application that serialized the annotation)
>> - the author of the body of the annotation (third party)
>> - the author of the original association of the annotation with the
>> original text
>>
>> In Domeo I use PAV (Provenance Authoring and Versioning ontology) [2][3]
>> and I append to the oa:Annotation the following properties
>>
>> 1) pav:createdBy -> Domeo user
>> An agent primarily responsible for encoding the digital artifact or
>> resource representation. This creation is distinct from forming the
>> content, which is indicated with pav:contributedBy or its subproperties.
>> It is more specific than dct:createdBy - which might or might not be
>> interpreted to also cover the creation of the content of the artifact.
>>
>> 2) pav:createdOn -> When the Domeo user created the digital object
>> The date of creation of the digital artifact or resource representation.
>> The agents responsible can be indicated with pav:createdBy.
>>
>> 3) pav:createdAt -> Where the user created the digital object
>> The geo-location of the agent that created the annotation.
>>
>> 4) pav:createdWith -> In my case the Domeo tool
>> The software/tool used by the creator (pav:createdBy) when making the
>> digital resource, for instance a word processor or an annotation tool. A
>> more independent software agent that creates the resource without direct
>> interactions by a human creator should instead be indicated using
>> pav:createdBy.
>>
>> 5) pav:authoredBy -> The author of the original annotation on paper
>> Indicates an agent that originated or gave existence to the work that is
>> expressed by the digital resource. The author of the content of a resource
>> may be different from the creator of that resource representation
>> (pav:createdBy), although they are often the same. The author is usually
>> not a software agent (which would be indicated with pav:createdWith,
>> pav:createdBy or pav:importedBy), unless the software actually authored the
>> content itself; for instance an artificial intelligence algorithm which
>> authored a piece of music or a machine learning algorithm that authored a
>> classification of a tumor sample
>>
>> 6) pav:authoredOn -> The date of the original annotation
>> Indicates the date this resource was authored by the agents given by
>> pav:authoredBy. Note that pav:authoredOn is different from pav:createdOn,
>> although their values are often the same.
>>
>> In summary I have something like:
>>
>> <ann1> a oa:Annotation
>>     pav:createdBy -Paolo-
>>     pav:createdOn -today-
>>     pav:createdWith -Domeo-
>>     pav:createdAt -Boston location-
>>     pav:authoredBy -Annotation’s author-
>>     pav:authoredOn -Date of the original annotation-
>>
>> In other words, using PAV I can keep the distinction between the creator
>> of the digital artifact and the author of the original content/association.
>>
>> However, there are possibly a couple of overlaps with the current OA
>> properties. As I would like to provide the OA provenance as well, I am
>> wondering which of the following applies:
>> <ann1> a oa:Annotation ;
>>     oa:annotatedBy <Paolo> .
>> or
>> <ann1> a oa:Annotation ;
>>     oa:annotatedBy <OriginalAuthor> .
>>
>> Or compared to PAV:
>> - pav:createdBy =? oa:annotatedBy --or--
>> - pav:authoredBy =? oa:annotatedBy
>>
>> Looking at Darwin’s example in the specs, if the student is
>> digitizing a note from Darwin on his own content I would say:
>> <ann2> a oa:Annotation
>>     pav:createdBy -Student-
>>     pav:createdOn -2013-
>>     pav:createdWith -Domeo-
>>     pav:createdAt -Boston location-
>>     pav:authoredBy -Darwin-
>>     pav:authoredOn -Date of the original annotation-
>>
>> Then of course the ‘body’ of the annotation can be also authored by the
>> original author of the annotation. But, as pointed out above, it is
>> important for me to attribute also the association of body and target to
>> the original author as that represents the historical provenance of it.
>>
>> What this comes down to is basically what an oa:Annotation really is: “an
>> Annotation expresses the relationship between two or more resources, and
>> their metadata, using an RDF graph”. We talked about this before - my
>> question here becomes if oa:annotatedBy indicates who formed the
>> relationship (the ‘author’ of the conceptual annotation); or the person who
>> (using some OA aware tools) formalized this as an oa:Annotation data
>> structure (the RDF structure)?
>>
>> Best,
>> Paolo
>>
>>
>> [1] http://www.openannotation.org/spec/core/core.html#Provenance
>> [2] http://arxiv.org/abs/1304.7224
>> [3] http://code.google.com/p/pav-ontology/
>>
>>
>> --
>> Dr. Paolo Ciccarese
>> http://www.paolociccarese.info/
>> Biomedical Informatics Research & Development
>> Instructor of Neurology at Harvard Medical School
>> Assistant in Neuroscience at Mass General Hospital
>> Member of the MGH Biomedical Informatics Core
>> +1-857-366-1524 (mobile) +1-617-768-8744 (office)
>>
>> CONFIDENTIALITY NOTICE: This message is intended only for the
>> addressee(s), may contain information that is considered
>> to be sensitive or confidential and may not be forwarded or disclosed to
>> any other party without the permission of the sender.
>> If you have received this message in error, please notify the sender
>> immediately.
>>
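
The redundant assertion proposed in the reply (oa:annotatedBy for both Darwin and the student, plus pav:authoredBy and pav:curatedBy) can be sketched in plain Python as below. This is only an illustration and not part of Domeo, OA or PAV: the `ex:` names, the tuple-based graph and the helper functions `objects` and `with_super_properties` are all assumptions made for the example. The only sub-property links encoded are the ones stated in the message for pav:curatedBy.

```python
# Triples are plain (subject, predicate, object) tuples. All prefixed
# names are abbreviated, illustrative identifiers (assumptions).
ANN = "ex:ann1"

graph = {
    (ANN, "rdf:type", "oa:Annotation"),
    (ANN, "oa:annotatedBy", "ex:Darwin"),
    (ANN, "oa:annotatedBy", "ex:Student"),
    (ANN, "pav:authoredBy", "ex:Darwin"),
    (ANN, "pav:curatedBy", "ex:Student"),
}

# Sub-property links as described in the message for pav:curatedBy.
SUB_PROPERTY_OF = {
    "pav:curatedBy": {"prov:wasAttributedTo", "dct:contributor"},
}


def objects(triples, subject, predicate):
    """Return the sorted objects asserted for (subject, predicate)."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def with_super_properties(triples):
    """Copy the graph and add one triple per declared super-property."""
    inferred = set(triples)
    for s, p, o in triples:
        for parent in SUB_PROPERTY_OF.get(p, ()):
            inferred.add((s, parent, o))
    return inferred


# An OA-only client sees both agents but cannot tell their roles apart.
print("OA view:", objects(graph, ANN, "oa:annotatedBy"))

# A PAV-aware client can separate the author from the curator.
print("author:", objects(graph, ANN, "pav:authoredBy"))
print("curator:", objects(graph, ANN, "pav:curatedBy"))

# A generic client can still reach the curator via dct:contributor.
print("contributor:", objects(with_super_properties(graph), ANN, "dct:contributor"))
```

The sketch makes the trade-off visible: the OA properties alone lose the author/curator distinction, while the PAV properties keep it at the cost of stating each agent twice.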
Received on Friday, 16 August 2013 12:18:36 UTC