Re: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Doug Schepers on 2015-08-18 (public-annotation@w3.org from August 2015)

From: Doug Schepers <schepers@w3.org>
Date: Tue, 18 Aug 2015 15:01:09 -0400
To: Robert Sanderson <azaroth42@gmail.com>, t-cole3 <t-cole3@illinois.edu>
Cc: Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <55D380F5.9070604@w3.org>
Hi, Rob–

On 8/17/15 2:41 PM, Robert Sanderson wrote:
> On Mon, Aug 17, 2015 at 10:13 AM, Timothy Cole wrote:
>
>     Now that EmbeddedContent is in our namespace (having replaced our
>     prior reliance on the now defunct Representing Content in RDF
>     effort), I'm not seeing that we have meaningful distinctions between
>     these classes that would make one more suitable than the other when
>     it comes to attaching role.  Personally I would be +1 for both of
>     these patterns in JSON:
>
>     "body" :  {
>              "type" : "Specific",
>              "source" : "http://example.org/body1.html" ,
>               "role" : "describing"
>        }
>
>     "body" :  {
>              "type" : "Embedded",
>              "value" : "I would be +1 for this." ,
>               "role" : "commenting"
>        }
>
>
>
> My concern is when the Embedded resource has a URI:
>
> "body": {
>    "id": "http://repo.org/bodies/1",
>    "value": "+1",
>    "role": "commenting"
> }
>
> And then someone else decides that id should be used as a tag:
>
> "body": {
>    "id" : "http://repo.org/bodies/1",
>    "value": "+1",
>    "role": "tagging"
> }
>
> Now that resource has two roles, tagging and commenting.
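
To make the concern quoted above concrete, here is a minimal sketch. It assumes a hypothetical store that merges properties for body nodes sharing the same id, roughly the way an RDF graph pools statements about one URI. The merge_bodies helper is an illustration only and is not part of the model or of any client.

```python
from collections import defaultdict


def merge_bodies(bodies):
    """Merge body nodes by "id", collecting every value seen per property.

    Hypothetical helper: mimics how statements about one URI pool together.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for body in bodies:
        node = merged[body["id"]]
        for key, value in body.items():
            if key != "id":
                node[key].add(value)
    # Convert the sets to sorted lists so the result is deterministic.
    return {
        node_id: {key: sorted(values) for key, values in props.items()}
        for node_id, props in merged.items()
    }


# The two bodies from the quoted example, published in separate annotations.
comment_body = {"id": "http://repo.org/bodies/1", "value": "+1", "role": "commenting"}
tag_body = {"id": "http://repo.org/bodies/1", "value": "+1", "role": "tagging"}

graph = merge_bodies([comment_body, tag_body])
print(graph["http://repo.org/bodies/1"]["role"])  # ['commenting', 'tagging']
```

Once merged, nothing records which annotation asserted which role, which is the ambiguity being described.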

Can you please describe again (I feel you've mentioned it before) the 
use case for this 'body' reuse? Especially in the case where the body is 
a text literal?

Is this use case common, or is it an edge case?

I'm having a hard time imagining a large-scale annotation application 
that would reuse body literals, rather than simply having multiple 
instances of similar bodies, each contained in its own annotation. The 
user experience and workflow aren't clear to me.

I totally understand that multiple annotations might use the same 
external resource (e.g. a picture or video) as a body, but that's a 
different case with a different object structure (and a different 
UX/workflow).

At some point, if you're pointing to 2 different external resources, it 
seems like it would be hard to distinguish an annotation with multiple 
targets (or bodies) from one with a clear body-target relationship, and 
I don't see what kind of annotation client would structure things that 
way.

I assume that your annotation client does something like this… can you 
tell us how that works?


> Rather than consistently using the Specific resource pattern:
>
> "body": {
>    "role": "tagging",
>    "source": {
>      "id": "http://repo.org/bodies/1",
>      "value": "+1"
>    }
> }
>
> Which will always work at the (IMO minimal) cost of slightly more structure.
> It's also clearer without the explicit types, as role can only be on
> SpecificResource.

Is this structure allowed, or required? If it's simply allowed, then we 
agree. If it's required, then I'm a bit less comfortable.

Ivan used a structure with blank nodes, and I conjecture that this may 
be a common case (perhaps even the most common case).

Framing it using Ivan's structure, your examples would look like this 
(taking a few liberties with property names):

"body" : {
   "role" : "tagging",
   "value" : "+1"
}

vs:

"body" : {
   "role" : "tagging",
   "content" : {
     "value" : "+1"
   }
}


When we extrapolate to multiple bodies (which is really what we're 
talking about), the extra code becomes more obvious:

"body" : [
   {
     "role" : "tagging",
     "value" : "+1"
   },
   {
     "role" : "commenting",
     "value" : "This reminds me of a meme…"
   },
   {
     "role" : "linking",
     "content" : {
       "type" : "url",
       "value" : "http://example.com/image.png"
     }
   }
]

versus:

"body" : [
   {
     "role" : "tagging",
     "source" : {
       "value" : "+1"
     }
   },
   {
     "role" : "commenting",
     "source" : {
       "value" : "This reminds me of a meme…"
     }
   },
   {
     "role" : "linking",
     "content" : {
       "type" : "url",
       "value" : "http://example.com/image.png"
     }
   }
]

At that point, it's not clear what this structure buys us, though I'll 
admit that the uniformity of structure it adds between constructs of 
different types might make it easier to always do the right thing.
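
One way to weigh that trade-off is to note that a consumer could normalize the shorter form into the uniform one. The following is a minimal sketch, assuming the property names used in the examples above ("role", "value", "source", "content"). The to_uniform helper is hypothetical and is not something the model defines.

```python
def to_uniform(body):
    """Return a body in the nested form, wrapping flat content under "source".

    Hypothetical helper: bodies that already nest their content under
    "source" or "content" are returned as an unchanged copy.
    """
    if "source" in body or "content" in body:
        return dict(body)
    # Keep the role on the outer node; everything else moves inside "source".
    role = {"role": body["role"]} if "role" in body else {}
    content = {key: value for key, value in body.items() if key != "role"}
    return {**role, "source": content}


flat_bodies = [
    {"role": "tagging", "value": "+1"},
    {"role": "commenting", "value": "This reminds me of a meme…"},
    {"role": "linking",
     "content": {"type": "url", "value": "http://example.com/image.png"}},
]

uniform_bodies = [to_uniform(body) for body in flat_bodies]
print(uniform_bodies[0])  # {'role': 'tagging', 'source': {'value': '+1'}}
```

If the conversion really is this mechanical, the shorter form could be allowed as a convenience without losing the uniform structure for consumers that want it.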


>     My rationale (FWIW): I see as the key characteristic of both classes
>     the ability to create and give identity (as needed) to a resource
>     required to create a specific annotation -- which is to my mind what
>     makes them both suitable objects to which to attach properties
>     specific to the annotation.
>
>
> We don't ever say that you can't embed non-annotation-specific resources
> within the annotation, using EmbeddedContent.
>
> Note also that Embedded could be used to embed Stylesheets, per example
> 55 in the model:
> http://www.w3.org/TR/annotation-model/#css-style
> And that stylesheet could have a URI.  (And wouldn't have a role, I expect)
>
> It could also embed the SVG for a selector, as per:
> http://www.w3.org/TR/annotation-model/#svg-selector.
> And ditto, regarding URI and role.
>
> So my concern comes from a different perspective on the use of
> EmbeddedContent. Yes, it solves the body issue, but it's not just
> solving that.  It's really a minimal-viable-product drop-in replacement
> for the defunct Content in RDF work.
>
>
>        The main substantive distinction is that one is limited to
>     resources that can be expressed as strings (rdf:value) and the other
>     is always derived from an existing resource (oa:hasSource). But
>     though we introduce SpecificResource in the context of using only a
>     segment or portion of a resource, SpecificResource can also be
>     effectively used as a kind of proxy for resource in its entirety (as
>     we are discussing in connection with Role).
>
>
> Right.  I would prefer one pattern rather than two.

I find this argument more compelling than other claims.



>     And similarly though we introduce EmbeddedContent in connection with
>     text/plain bodies, this class can also be used for embedding
>     text/html, text/xml, application/xml,  image/svg+xml, etc. anything
>     that can be expressed as a string -- e.g., use XML to create an SVG
>     meme and it can serve as the body of your annotation.
>
>
> Yup. Or, as above, in other non-body uses.
>
>     Both may appear as blank nodes in an Annotation, but both may also
>     be assigned a URI (though I tend to think this would not be the
>     norm), which does mean, as you point out for EmbeddedContent
>     resources, that we would be allowing role to be assigned to a
>     resource that could be reused.
>
>
> That is why I'm +0, rather than -1.  I can live with it if needed, but I
> think there's a better way that separates the two concerns:
>
> EmbeddedContent:  Transfer content of any type for any resource, URI or
> no, in the serialized annotation.  (Which is why we talked about it in
> the Serialization section in the CG docs)
> SpecificResource:  Make annotation specific assertions about a Body or
> Target resource. (Until now, that has been selector, state, style and
> scope ... we're just adding another specifier of role)

Perhaps the terms "EmbeddedContent" and "SpecificResource" are throwing 
me off a bit. Are those terms used in LD/RDF, or are they terms we've 
introduced?



>     But I think the same is true for SpecificResource, even more so
>     given current language, "If the Specific Resource has an HTTP URI,
>     then the exact segment of the Source resource that it identifies,
>     and only the segment, must be returned when the URI is
>     dereferenced." So if associating a role directly with an
>     EmbeddedContent meme is wrong because it could be created with or
>     subsequently given a de-referenceable URI, then I think the same is
>     true for SpecificResource.
>
>
> We would need to clarify that *all* of the properties of the
> SpecificResource are to be taken into account when considering re-use.
> I don't think that's a fundamental change, just a clarification.

Again, I don't see what this "reuse" requirement buys us.

Regards–
–Doug


>     As an aside, if we do decide that SpecificResource and
>     EmbeddedContent are together the right direction to go to resolve
>     the role issue (and my main concern here is that I don't like the
>     idea of implicit typing in JSON-LD -- I think we need to include
>     type explicitly in this situation),
>
>
> +1
>
>     I think we should consider introducing EmbeddedContent and
>     SpecificResources together in the data model. This would mean first
>     introducing SpecificResource prior to its use in Section 4.1 where
>     we begin talking about Specifiers. I think it would also be a good
>     idea not to make EmbeddedContent so much about Textual Bodies, but
>     rather make clear that it can be used for just about any resource
>     that can be expressed as a string.
>
>
> I would prefer to create a new 4.2 that describes the role use case, but
> to leave the initial description where it is.  However, the structure
> might change dramatically with the changes we could make to tags using
> this approach.
>
> HTH,
>
> Rob
>
>
> --
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305
Received on Tuesday, 18 August 2015 19:01:13 UTC