Re: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Robert Sanderson on 2015-08-18 (public-annotation@w3.org from August 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Tue, 18 Aug 2015 12:57:02 -0700
To: Doug Schepers <schepers@w3.org>
Cc: t-cole3 <t-cole3@illinois.edu>, Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CABevsUGzoQT+4YcwBxqrGHNkJNSazT+RGwGY_gXFm1eAG6ctCA@mail.gmail.com>
On Tue, Aug 18, 2015 at 12:01 PM, Doug Schepers <schepers@w3.org> wrote:

> Hi, Rob–
> On 8/17/15 2:41 PM, Robert Sanderson wrote:
>
>> On Mon, Aug 17, 2015 at 10:13 AM, Timothy Cole wrote:
>> Now that resource has two roles, tagging and commenting.
>>
>
> Can you please describe again (I feel you've mentioned it before) the use
> case for this 'body' reuse? Especially in the case where the body is a text
> literal?
>

Sure.

As a reviewer using an annotation tool to comment on a paper, I want my
review to be persistent and referenced. It might refer to other papers
beyond the one I'm commenting on, for example to point out plagiarism or to
suggest other sources, but the review is of the target paper.

As a paper author, I want to link that review to my cited paper as a
justification for its value.  I.e. the same content reviews one paper and
provides support for another paper.

The review starts off as a block of text in an annotation client.  It is
then transferred via the protocol to a server.  The server creates a URI
for it.
The second annotation takes that URI and uses it as the body, with a
different role.
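
A minimal sketch of that flow, assuming a toy in-memory store in place of
the server.  The helper names, the target URIs, and the role names
"reviewing" and "supporting" are hypothetical and used only for
illustration; the body URI base is borrowed from the example further down.

```python
def mint_body_uri(store, text, base="http://repo.org/bodies/"):
    """Store embedded text and return the URI a (hypothetical) server assigns."""
    uri = f"{base}{len(store) + 1}"
    store[uri] = text
    return uri


def annotate(body_uri, role, target):
    """Build an annotation whose body carries the role and points at the URI."""
    return {"body": {"role": role, "source": body_uri}, "target": target}


store = {}

# The review starts as plain text in the client; the server gives it a URI.
review_uri = mint_body_uri(store, "Review text goes here")

# First annotation: the review of the target paper.
first = annotate(review_uri, "reviewing", "http://example.org/papers/reviewed")

# Second annotation: the same body, reused with a different role.
second = annotate(review_uri, "supporting", "http://example.org/papers/cited")
```

Both annotations point at the same stored text, so its authorship stays
attached to the one resource while the role lives on each annotation.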


And another:

I post on Twitter a comment noting a typo on a Wikipedia page.
A system then uses that more specifically as the justification for an
annotation that also suggests the change, both using more specific
motivations.


And another:

I post on Medium my thoughts about a particular politically charged topic.
It's a comment on a Wikipedia page.
People on both sides of the topic take the same post different ways and use
it as support for their view and a dismissal of the opposition.


And another:

I transcribe a quote from a book as part of a crowd-sourcing platform.
I then use that quote as a comment on the museum exhibit that it is talking
about.

I can go on if needed.



> Is this use case common, or is it an edge case?
>

Common.


> I'm having a hard time imagining a large-scale annotation application that
> would reuse body literals, rather than simply having multiple instances of
> similar bodies, each contained in its own annotation. The user experience
> and workflow aren't clear to me.
>

When the authorship of the body is important. Which is almost always.  Note
that the author of the body is not necessarily the author of the
annotation, as per the examples above, bar the last one.


> I totally understand that multiple annotations might use the same external
> resource (e.g. a picture or video) as a body, but that's a different case
> with a different object structure (and a different UX/workflow).
>

All of the above *start* as plain text, so the same UX for the first part.
For the second annotation the user doesn't re-type the text but selects
existing content instead.  So I think I agree that there is a different
workflow, even if the same UI might allow both.

However I disagree that there must be a different structure.  Having a
consistent structure for both uses -of the same body- seems important, as
clients and servers will otherwise need to implement both, depending on the
otherwise arbitrary order in which the annotations were created.



> At some point, if you're pointing to 2 different external resources, it
> seems like it would be hard to delineate between an annotation with
> multiple targets (or bodies), rather than a clear body-target relationship,
> and I don't see what kind of annotation client would structure things that
> way.
>

I don't understand this, sorry.


> I assume that your annotation client does something like this… can you
> tell us how that works?


And I'm not sure what you're asking for here.



>> Rather than consistently using the Specific resource pattern:
>>
>> "body": {
>>    "role": "tagging",
>>    "source": {
>>      "id": "http://repo.org/bodies/1",
>>      "value": "+1"
>>    }
>> }
>>
>> Which will always work at the (IMO minimal) cost of slightly more
>> structure.
>> It's also clearer without the explicit types, as role can only be on
>> SpecificResource.
>>
>
> Is this structure allowed, or required? If it's simply allowed, then we
> agree. If it's required, then I'm a bit less comfortable.
>

It would be required for resources with URIs.  I would prefer to require it
also for Embedded content for consistency, and to keep the separation of
concerns per my response to Tim.


> When we extrapolate to multiple bodies (which is really what we're talking
> about), the extra code becomes more obvious:
>
> "body" : [
>   { "role" : "tagging", "value" : "+1"},
>   { "role" : "commenting", "value" : "This reminds me of a meme…" },
>   { "role" : "linking", "source" : "http://example.com/image.png" }
> ]

(Fixed and compacted inline)

> versus:
>
> "body" : [
>   {
>     "role" : "tagging",
>     "source" : { "value" : "+1" }
>   },
>   {
>     "role" : "commenting",
>     "source" : { "value" : "This reminds me of a meme…" }
>   },
>   {
>     "role" : "linking",
>     "source" :  "http://example.com/image.png"
>   }
> ]
>
(Fixed and compacted inline)

But in most cases:

"body" : [
  {
    "role" : "tagging",
    "source" : { "value" : "+1" }
  },
  {
    "role" : "commenting",
    "source" : { "type" : "text", "value" : "This reminds me of a meme…" }
  },
  {
    "role" : "linking",
    "source" :  {  "type": "Image", "id": "http://example.com/image.png" }
  }
]



> At that point, it's not clear what this structure buys us, though I'll
> admit that it adds a uniformity of structure between constructs of
> different types, which might make it easier to always do the right thing.


Uniformity in data structures is good, rather than constantly having to
test for the existence of different structures.  Also in terms of making it
easier to do the right thing, and the actual complexity of the structure,
if you have to explain one thing well, that's easier than explaining two
things well plus when you would choose to use one or the other.

Especially when the choice is between having to understand and implement
both anyway, or just one.
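
To make the cost of testing for different structures concrete, here is a
rough sketch of what a consumer might need if both the flat form and the
nested form quoted above were allowed.  The function name and the
(role, value, uri) return shape are assumptions for illustration, not
anything from the model.

```python
def normalize_body(body):
    """Return (role, value, uri) for one body entry in either shape."""
    source = body.get("source")
    if source is None:
        # Flat form: role and value sit directly on the body.
        return body.get("role"), body.get("value"), body.get("id")
    if isinstance(source, str):
        # The source is given as a bare URI string.
        return body.get("role"), None, source
    # Nested form: the source is an object with its own value and/or id.
    return body.get("role"), source.get("value"), source.get("id")


flat = [
    {"role": "tagging", "value": "+1"},
    {"role": "commenting", "value": "This reminds me of a meme…"},
    {"role": "linking", "source": "http://example.com/image.png"},
]

nested = [
    {"role": "tagging", "source": {"value": "+1"}},
    {"role": "commenting", "source": {"value": "This reminds me of a meme…"}},
    {"role": "linking", "source": "http://example.com/image.png"},
]

normalized_flat = [normalize_body(b) for b in flat]
normalized_nested = [normalize_body(b) for b in nested]
```

With only the nested form, the first branch goes away; every client and
server that accepts both has to carry it.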


>> That is why I'm +0, rather than -1.  I can live with it if needed, but I
>> think there's a better way that separates the two concerns:
>>
>> EmbeddedContent:  Transfer content of any type for any resource, URI or
>> no, in the serialized annotation.  (Which is why we talked about it in
>> the Serialization section in the CG docs)
>> SpecificResource:  Make annotation specific assertions about a Body or
>> Target resource. (Until now, that has been selector, state, style and
>> scope ... we're just adding another specifier of role)
>>
>
> Perhaps the terms "EmbeddedContent" and "SpecificResource" are throwing me
> off a bit. Are those terms used in LD/RDF, or are they terms we've
> introduced?


We introduced both.

We (the WG) introduced EmbeddedContent to replace the defunct ContentAsText
work, after many failed efforts to get the people responsible for it to
take it forwards.
    http://www.w3.org/TR/Content-in-RDF10/

And (as earlier in the thread) we (the Open Annotation Collaboration, pre
CG) introduced Specific Resource based on Tim Berners-Lee's notion of
Specific vs Generic resources in the web architecture, previously called
Constrained resources.


Hope that helps,

Rob

-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Tuesday, 18 August 2015 19:57:31 UTC