Doppelgänger fallacy; Was Re: OA and provenance

Antoine-

I take no position on whether Paolo is indeed introducing a confusion
between an annotation and the digital representation of the
annotation. But my experience has been that such confusion is very
common and not just about annotations, but, as you point out, about
any kind of resource. The most typical case I find among natural
scientists is eagerness to assign the same identifier to a physical
object as to a digital description of it.  Paolo himself tends to warn
about this in talks about annotation, using the Eiffel Tower as an
example.  It's a natural thing for humans to do, because in human
dialogue, the context often makes clear which is under discussion.
But when the context doesn't, confusion ensues.  One venue where
confusion is \likely/ is when informaticians are discussing a case in
which both require treatment.
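
A minimal sketch of how that confusion shows up in data, assuming plain
(subject, property, value) tuples rather than any real RDF library. The
ex: identifiers and property names are made up for illustration, and
treating ex:creator and ex:created as single-valued is an assumption of
the example, not something any vocabulary mandates.

```python
from collections import defaultdict

# Assumption for this sketch: each of these properties has one value per thing.
SINGLE_VALUED = {"ex:creator", "ex:created"}


def conflicting_values(triples, single_valued):
    """Return {(subject, property): values} where a single-valued
    property ended up with more than one value on the same subject."""
    seen = defaultdict(set)
    for subject, prop, value in triples:
        seen[(subject, prop)].add(value)
    return {
        key: values
        for key, values in seen.items()
        if key[1] in single_valued and len(values) > 1
    }


# One identifier used for both the tower and a web page describing it.
merged = [
    ("ex:eiffel", "ex:creator", "Gustave Eiffel's company"),
    ("ex:eiffel", "ex:created", "1889"),
    ("ex:eiffel", "ex:creator", "a webmaster"),  # hypothetical page author
    ("ex:eiffel", "ex:created", "2013"),         # hypothetical page date
]

# Two identifiers, with an explicit link from the description to the thing.
separate = [
    ("ex:eiffel-tower", "ex:creator", "Gustave Eiffel's company"),
    ("ex:eiffel-tower", "ex:created", "1889"),
    ("ex:eiffel-page", "ex:creator", "a webmaster"),
    ("ex:eiffel-page", "ex:created", "2013"),
    ("ex:eiffel-page", "ex:describes", "ex:eiffel-tower"),
]

print(conflicting_values(merged, SINGLE_VALUED))
print(conflicting_values(separate, SINGLE_VALUED))
```

With one identifier, nothing in the data says which creator belongs to
the tower and which to the page. With two, each statement has an
unambiguous subject.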

I think the phenomenon is so pervasive that we need a short, memorable
name for it, especially one I can use to bludgeon my natural science
colleagues who think it's \helpful/ to have a single identifier for a
physical thing and its digital description.

I propose to call what you remark upon a "doppelgänger fallacy."

I especially relish the prospect of hopping up in a talk and saying
"Paolo! You, of all people, have introduced a doppelgänger fallacy on
slide 23!"  :-)

Bob Morris


On Fri, Aug 16, 2013 at 8:42 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
> Hi Paolo,
>
> I agree that the decisions on scenario 1-2-3 are entirely up to you. And the
> more fundamental decisions on having one annotation or one annotation and
> something else (another annotation or a PROV entity) also. But as long as
> your short-cuts have a sound grounding! If others can't understand clearly
> what you've done then it's a big problem ;-)
>
> As for the mappings (pav:authoredBy is a sub-property of oa:annotatedBy,
> pav:curatedBy a sub-property of oa:annotatedBy) I agree with you. I used
> "sub-property" but I meant it in a "local" context. I.e., for all triples in
> your annotation case, you should generate another triple of the more general
> property. I think what you suggest is ok.
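
A rough sketch of that "local" sub-property reading, assuming triples
are plain Python tuples and that the caller already knows which subjects
are annotations. The function name, the mapping table, and the doc1 and
Editor identifiers are hypothetical. Nothing here is part of OA or PAV.

```python
# Local mapping: applied only to subjects known to be annotations,
# not asserted as a global rdfs:subPropertyOf axiom.
LOCAL_SUBPROPERTIES = {
    "pav:authoredBy": "oa:annotatedBy",
    "pav:curatedBy": "oa:annotatedBy",
}


def materialize(triples, annotations):
    """Return the input triples plus, for each annotation subject, the
    more general triple implied by the local mapping (no duplicates)."""
    result = list(triples)
    present = set(triples)
    for subject, prop, value in triples:
        general = LOCAL_SUBPROPERTIES.get(prop)
        if general is None or subject not in annotations:
            continue
        implied = (subject, general, value)
        if implied not in present:
            present.add(implied)
            result.append(implied)
    return result


triples = [
    ("ann1", "pav:authoredBy", "Darwin"),
    ("ann1", "pav:curatedBy", "Student"),
    ("doc1", "pav:curatedBy", "Editor"),  # hypothetical non-annotation resource
]

for triple in materialize(triples, annotations={"ann1"}):
    print(triple)
```

Because the mapping is applied only to subjects in the annotations set,
pav:curatedBy on other kinds of resource is left alone.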
>
>
> As for pav:createdBy, sentences like:
>
> "In PAV, pav:createdBy is used for the digital artifact only. While
> pav:authoredBy, pav:curatedBy and so on...  are for the content of the
> artifact. So you can have both pav:createdBy (person that created the
> artifact) and pav:curatedBy (person that collected and curated its
> content)."
> and:
>
> "pav:createdBy is more specific than dct:creator as it refers only to the
> digital artifact."
> are really confusing.
> In RDF there is one resource that is the subject of the statements, and it
> denotes ONE entity. You can't apply two properties to one resource and have
> that resource interpreted sometimes as the annotation and at other times as
> the digital representation of the annotation.
>
> If you want to use createdBy as a shortcut, and still subject it to the
> oa:Annotation resource, then your definition should rather be:
> "pav:createdBy is used for the creator of the digital artifact that represents
> the resource". And then the name should be changed because it doesn't
> reflect the short-cut at all. It should be something like
> pav:hasDigitalRepresentationCreatedBy...
>
> If you want to use createdBy on a resource that is a digital representation
> of the oa:Annotation (so not the oa:Annotation itself) then the wording is
> better. But in this case, well, you can't subject it to the resource you
> wanted to subject it on. And there's no point in minting a property that is
> not dc:creator... (of course dc:creator can be applied to any resource, be
> it a conceptual one or a representation).
>
> Cheers,
>
> Antoine
>
>
>> Hi Antoine,
>>
>>
>> On Fri, Aug 16, 2013 at 7:12 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>
>>     Hi,
>>
>>     Wow, the return of a serious discussion, in an even more complex form,
>> awesome ;-)
>>
>>     First a remark on Stian's comment:
>>
>>     "and the conclusion seemed to have been that it is simpler to merge
>> the conceptual annotation with the formalized annotation as a
>>     datastructure."
>>
>>     Yes, and this was about the data structure only. The annotation is
>> really of conceptual nature. We just allow for attributes (e.g.
>> oa:serializedBy) that shortcut some provenance info. A full, correct
>> representation has the serialization appear as a fully-fledged (PROV)
>> entity, distinct from the oa:Annotation, as pictured at
>>
>> http://www.openannotation.org/spec/core/appendices.html#ProvMapping
>>
>>
>>     Based on this indeed pav:authoredBy is a sub-property of
>> oa:annotatedBy (or an equivalent, in the specific context).
>>
>>
>> I would say it is more an equivalent as pav:authoredBy is not only for
>> annotations.
>>
>>
>>     The question next is how to handle the extra level of "digital
>> annotation" - the guy who captures the annotation in the system (I'll just
>> focus on the "creator" aspect, the discussion is long enough, let's ignore
>> "with" "at" and "on").
>>
>>     I like Jacco's and Stian's suggestions of double annotations (whether
>> one is the target or the body of the other...). It is complex, but it
>> represents the situation quite well. In this case both are annotators.
>>
>>
>> This would complicate the implementation though. In some sense, I see the
>> "extra level of "digital annotation"" more as extra level of provenance so I
>> can stay 'compact'.
>> We had a similar issue with Claims representation. You have the conceptual
>> Claim and then multiple embodiments of that claim in text. It is a tough
>> problem.
>>
>> But sure, you could think of both of them as annotators. Not sure Darwin
>> would like that but I cannot talk for him :)
>>
>>     An alternative is to create one annotation (oa:annotatedBy Darwin) and
>> another non-annotation resource. Something similar to the PROV entity we
>> have for the serialization. It would represent the act of capturing an
>> annotation in the system, where the student plays the creator role.
>>
>>     In any case, that's two resources.
>>
>>     But as for the serialization case, you may want to have only one
>> resource in a 'core' solution. Two options here:
>>
>>     1. Consider that the oa:Annotation is the result of the intellectual
>> work of both Darwin and the student. In this case both are the object of an
>> oa:annotatedBy. I think this choice is borderline, but in a specific
>> application context, where students spend hard work deciphering/interpreting
>> an annotation, why not?
>>     If you want to use pav:curatedBy still, then you would need to make it
>> a sub-property of oa:annotatedBy.
>>
>>
>> We cannot really do that as pav:curatedBy is also used for objects that
>> are not annotations.
>> What I could think of doing now is:
>>
>> <ann1>
>>        oa:annotatedBy <Darwin> ;
>>        oa:annotatedBy <Student> ;
>>        pav:authoredBy <Darwin> ;
>>        pav:curatedBy <Student> .
>>
>> It is redundant but that way the semantics is clear for both OA and PAV
>> and it allows OA clients to get to the provenance.
>> What do you think of it?
>>
>>
>>     2. Consider that the role of the student is minor. In this case, I
>> think a property with a name like pav:curatedBy still makes sense. But it
>> would be a specialization of something more general, maybe dc:contributor.
>> And its semantics would in fact be that of a "short-cut" for the more
>> complex situation where a second annotation (or a PROV entity) exists to
>> represent the situation at the right granularity.
>>
>>
>> In PAV most of the properties are short-cuts. The idea is to have a single
>> object rather than a series of them. It does not solve everything, but it
>> works for many use cases.
>> At the moment pav:curatedBy is a sub-property of prov:wasAttributedTo and
>> also dct:contributor. So I think we are on the same page.
>>
>>
>>     3. Consider that the role of Darwin is minor (very borderline maybe).
>> In this case the student is the oa:annotatedBy, and Darwin a mere
>> dc:contributor.
>>
>>
>> Frankly I feel uncomfortable with this approach. But it is true, it
>> depends on how you intend the annotation.
>>
>>
>>     In any case I don't think you can do anything practical with a
>> solution that would only have one resource of type oa:Annotation and a
>> short-cut property with a name like pav:createdBy. The name and intuitive
>> semantics are really too close to dc:creator and oa:annotatedBy (as the
>> creator of the Annotation)! In fact pav:curatedBy is much better, which is
>> why I think it could be defined as a short-cut in option 2 above.
>>
>>
>> In PAV, pav:createdBy is used for the digital artifact only. While
>> pav:authoredBy, pav:curatedBy and so on...  are for the content of the
>> artifact.
>> So you can have both pav:createdBy (person that created the artifact) and
>> pav:curatedBy (person that collected and curated its content).
>>
>>
>>     Note that PAV mentions dct:createdBy as the super-property of
>> pav:createdBy, which to my knowledge does not exist. In fact I really
>> believe PAV would benefit from removing pav:createdBy. If you need it,
>> re-introduce it with a better name, and clearer semantics!
>>
>>
>> That is just a typo in the description; it is dct:creator. pav:createdBy is
>> more specific than dct:creator as it refers only to the digital artifact.
>>
>> Best,
>> Paolo
>>
>

Received on Friday, 16 August 2013 14:15:38 UTC