Re: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Doug Schepers on 2015-08-18 (public-annotation@w3.org from August 2015)

From: Doug Schepers <schepers@w3.org>
Date: Tue, 18 Aug 2015 19:26:51 -0400
To: Robert Sanderson <azaroth42@gmail.com>
Cc: t-cole3 <t-cole3@illinois.edu>, Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <55D3BF3B.1070803@w3.org>

Hi, Rob–

I'm going to propose a compromise. I don't think either of us will love 
it, but I hope we can both live with it.

I'm going to use "object"/"property" terminology, but feel free to 
reformulate into RDF terminology.


1) We allow (but don't require) an "id" property on each object, to make 
it addressable;


2) We strive for a single consistent structure that applies equally to 
Body, Target, Tag, and so on, modulo the proposal in #3.


3) We make the nested "source" object a SHOULD, while the empty-node 
construct is only a MAY, and only allowed for text-literal resources, 
and (maybe?) only in a non-Linked Data context; we define a clear 
equivalence mapping.

For example, these statements would be equivalent:

   "body" : {
     "role" : "tagging",
     "source" : {
       "value" : "+1"
     }
   }

   "body" : {
     "role" : "tagging",
     "value" : "+1"
   }

But these would not:

   "body" : {
     "role" : "linking",
     "source" : {
       "type": "Image",
       "id": "http://example.com/image.png"
     }
   }

   "body" : {
     "role" : "linking",
     "type": "Image",
     "id": "http://example.com/image.png"
   }
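
As a rough sketch of the equivalence mapping in #3, the short form could be
expanded into the nested form like this. The function name `expand_body` is
hypothetical, and the rule that only "value" moves into "source" is an
assumption made for illustration, not something the proposal settles.

```python
def expand_body(body: dict) -> dict:
    """Return the nested-"source" form of a body object (illustrative only)."""
    # Already nested, or not a text literal: there is nothing to expand.
    # Assumption: a short-form object without "value" (e.g. the "linking"
    # example above) has no defined equivalent and is returned unchanged.
    if "source" in body or "value" not in body:
        return dict(body)
    # Assumption: only "value" moves into "source"; "role" stays on the body.
    expanded = {key: val for key, val in body.items() if key != "value"}
    expanded["source"] = {"value": body["value"]}
    return expanded


short_form = {"role": "tagging", "value": "+1"}
nested_form = {"role": "tagging", "source": {"value": "+1"}}
linking = {"role": "linking", "type": "Image", "id": "http://example.com/image.png"}

print(expand_body(short_form) == nested_form)
print(expand_body(linking) == linking)
```

A consumer that runs every body through such a function only has to handle
the nested form afterwards.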



4) We define the default type of "value" to be "text". We require 
explicit "type" values for all other datatypes.

For example, these statements would be equivalent:

   "body" : {
     "role" : "commenting",
     "source" : {
       "type" : "text",
       "value" : "This reminds me of a meme…"
     }
   }

   "body" : {
     "role" : "commenting",
     "source" : {
       "value" : "This reminds me of a meme…"
     }
   }

   "body" : {
     "role" : "commenting",
     "value" : "This reminds me of a meme…"
   }
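
A consumer could apply the default from #4 along these lines. This is only a
sketch: `effective_type` and `DEFAULT_TYPE` are hypothetical names, and
returning None for anything that is not a text literal is an assumption.

```python
from typing import Optional

DEFAULT_TYPE = "text"  # the default proposed in #4


def effective_type(body: dict) -> Optional[str]:
    """Return the content type of a body, applying the "text" default."""
    source = body.get("source")
    if source is None and "value" in body:
        # Short form: the text literal sits directly on the body.
        source = body
    if not isinstance(source, dict):
        # Assumption: a bare URI string or a missing source has no type here.
        return None
    if "type" in source:
        return source["type"]
    # An untyped object gets the default only if it carries a "value".
    return DEFAULT_TYPE if "value" in source else None


explicit = {"role": "commenting",
            "source": {"type": "text", "value": "This reminds me of a meme…"}}
implicit = {"role": "commenting",
            "source": {"value": "This reminds me of a meme…"}}
short_form = {"role": "commenting", "value": "This reminds me of a meme…"}

print(effective_type(explicit), effective_type(implicit), effective_type(short_form))
```

All three forms resolve to the same type, which is the equivalence described
above.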



5) We continue to discuss property names that might be more intuitive. 
For example, I find "source" less clear than "content", and I'd like to 
see different proposals for the terms "EmbeddedContent" and 
"SpecificResource".


Thoughts?

Regards–
–Doug


On 8/18/15 3:57 PM, Robert Sanderson wrote:
>
>
> On Tue, Aug 18, 2015 at 12:01 PM, Doug Schepers <schepers@w3.org
> <mailto:schepers@w3.org>> wrote:
>
>     Hi, Rob–
>     On 8/17/15 2:41 PM, Robert Sanderson wrote:
>
>         On Mon, Aug 17, 2015 at 10:13 AM, Timothy Cole wrote:
>         Now that resource has two roles, tagging and commenting.
>
>
>     Can you please describe again (I feel you've mentioned it before)
>     the use case for this 'body' reuse? Especially in the case where the
>     body is a text literal?
>
>
> Sure.
>
> As a reviewer using an annotation tool to comment on a paper, I want my
> review to be persistent and referenced. It might refer to other papers
> beyond the one I'm commenting on, for example to point out plagiarism or
> to suggest other sources, but the review is of the target paper.
>
> As a paper author, I want to link that review to my cited paper as a
> justification for its value.  E.g. the same content reviews one paper
> and provides support for another paper.
>
> The review starts off as a block of text in an annotation client.  It is
> then transferred via the protocol to a server.  The server creates a URI
> for it.
> The second annotation takes that URI and uses it as the body, with a
> different role.
>
>
> And another:
>
> I post a comment on Twitter noting a typo on a Wikipedia page.
> A system then uses that more specifically as the justification for an
> annotation that also suggests the change, both using more specific
> motivations.
>
>
> And another:
>
> I post on Medium my thoughts about a particular politically charged
> topic.  It's a comment on a Wikipedia page.
> People on both sides of the topic take the same post different ways and
> use it as support for their view and a dismissal of the opposition.
>
>
> And another:
>
> I transcribe a quote from a book as part of a crowd-sourcing platform.
> I then use that quote as a comment on the museum exhibit that it is
> talking about.
>
> I can go on if needed.
>
>     Is this use case common, or is it an edge case?
>
>
> Common.
>
>     I'm having a hard time imagining a large-scale annotation
>     application that would reuse body literals, rather than simply
>     having multiple instances of similar bodies, each contained in its
>     own annotation. The user experience and workflow aren't clear to me.
>
>
> When the authorship of the body is important. Which is almost always.
> Note that the author of the body is not necessarily the author of the
> annotation, as per the examples above, bar the last one.
>
>
>     I totally understand that multiple annotations might use the same
>     external resource (e.g. a picture or video) as a body, but that's a
>     different case with a different object structure (and a different
>     UX/workflow).
>
>
> All of the above *start* as plain text, so the same UX for the first
> part.  The second annotation doesn't need to re-type the text; it
> selects existing content instead.  So I think I agree that there is a
> different workflow, even if the same UI might allow both.
>
> However I disagree that there must be a different structure.  Having a
> consistent structure for both uses -of the same body- seems important,
> as clients and servers will otherwise need to implement both, depending
> on the otherwise arbitrary order in which the annotations were created.
>
>
>     At some point, if you're pointing to 2 different external resources,
>     it seems like it would be hard to delineate between an annotation
>     with multiple targets (or bodies), rather than a clear body-target
>     relationship, and I don't see what kind of annotation client would
>     structure things that way.
>
>
> I don't understand this, sorry.
>
>     I assume that your annotation client does something like this… can
>     you tell us how that works?
>
>
> And I'm not sure what you're asking for here.
>
>
>
>         Rather than consistently using the Specific resource pattern:
>
>         "body": {
>             "role": "tagging",
>             "source": {
>               "id": "http://repo.org/bodies/1",
>               "value": "+1"
>             }
>         }
>
>         Which will always work at the (IMO minimal) cost of slightly
>         more structure.
>         It's also clearer without the explicit types, as role can only be on
>         SpecificResource.
>
>
>     Is this structure allowed, or required? If it's simply allowed,
>     then we agree. If it's required, then I'm a bit less comfortable.
>
>
> It would be required for resources with URIs.  I would prefer to require
> it also for Embedded content for consistency, and to keep the separation
> of concerns per my response to Tim.
>
>
>     When we extrapolate to multiple bodies (which is really what we're
>     talking about), the extra code becomes more obvious:
>
>     "body" : [
>        { "role" : "tagging", "value" : "+1"},
>        { "role" : "commenting", "value" : "This reminds me of a meme…" },
>        { "role" : "linking", "source" : "http://example.com/image.png" }
>     ]
>
> (Fixed and compacted inline)
>
>     versus:
>
>     "body" : [
>        {
>          "role" : "tagging",
>          "source" : { "value" : "+1" }
>        },
>        {
>          "role" : "commenting",
>          "source" : { "value" : "This reminds me of a meme…" }
>        },
>        {
>          "role" : "linking",
>          "source" :  "http://example.com/image.png"
>        }
>     ]
>
> (Fixed and compacted inline)
>
> But in most cases:
>
> "body" : [
>    {
>      "role" : "tagging",
>      "source" : { "value" : "+1" }
>    },
>    {
>      "role" : "commenting",
>      "source" : { "type" : "text", "value" : "This reminds me of a meme…" }
>    },
>    {
>      "role" : "linking",
>      "source" :  {  "type": "Image", "id": "http://example.com/image.png" }
>    }
> ]
>
>     At that point, it's not clear what this structure buys us, though
>     I'll admit that the uniformity of structure between constructs
>     of different types might make it easier to always do the right thing.
>
>
> Uniformity in data structures is good, rather than constantly having to
> test for the existence of different structures.  Also in terms of making
> it easier to do the right thing, and the actual complexity of the
> structure, if you have to explain one thing well, that's easier than
> explaining two things well plus when you would choose to use one or the
> other.
>
> Especially when the choice is between having to understand and
> implement both anyway, or just one.
>
>
>         That is why I'm +0, rather than -1.  I can live with it if
>         needed, but I
>         think there's a better way that separates the two concerns:
>
>         EmbeddedContent:  Transfer content of any type for any resource,
>         URI or
>         no, in the serialized annotation.  (Which is why we talked about
>         it in
>         the Serialization section in the CG docs)
>         SpecificResource:  Make annotation specific assertions about a
>         Body or
>         Target resource. (Until now, that has been selector, state,
>         style and
>         scope ... we're just adding another specifier of role)
>
>
>     Perhaps the terms "EmbeddedContent" and "SpecificResource" are
>     throwing me off a bit. Are those terms used in LD/RDF, or are they
>     terms we've introduced?
>
>
> We introduced both.
>
> We (the WG) introduced EmbeddedContent to replace the defunct
> ContentAsText work, after many failed efforts to get the people
> responsible for it to take it forwards.
> http://www.w3.org/TR/Content-in-RDF10/
>
> And (as earlier in the thread) we (the Open Annotation Collaboration,
> pre CG) introduced Specific Resource based on Tim Berners-Lee's notion
> of Specific vs Generic resources in the web architecture, previously
> called Constrained resources.
>
>
> Hope that helps,
>
> Rob
>
> --
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305
Received on Tuesday, 18 August 2015 23:26:55 UTC