RE: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Timothy Cole on 2015-08-16 (public-annotation@w3.org from August 2015)

From: Timothy Cole <t-cole3@illinois.edu>
Date: Sun, 16 Aug 2015 13:11:31 -0500
To: "'Ivan Herman'" <ivan@w3.org>, "'W3C Public Annotation List'" <public-annotation@w3.org>
Message-ID: <029201d0d84e$fc32c9e0$f4985da0$@illinois.edu>
Good summary, and I agree with most of your points.

Regarding source vs. item, I have no objection to the substitution (though we probably will need to talk more about it at some point). My (weak) rationale for item was that currently, in addition to xsd:string and generic resource (i.e., a URI), the data model supports the following specialized oa classes as bodies / targets: Tag (has value as a property), SemanticTag (has either foaf:page or skos:related as a property), SpecificResource (has oa:hasSource as a property), Composite (has oa:item as a property), and Choice & List (have oa:members as a property). Assuming we create oa classes (instead of reusing oa:Motivation terms) for body and target roles, I thought oa:item would be the most innocuous property to borrow from the existing classes (i.e., to avoid minting, for purposes of illustration, an additional property specially for potential role classes). oa:item had the advantage of already being used in a scenario where it was repeatable within its domain Class, something that seemed desirable for JSON serialization, whereas oa:hasSource as currently used in the data model is not repeatable within a SpecificResource. But this rationale was not fully baked, so I have no problem with your change. It may be that we need to mint a new property to use for Role Classes if we go with that solution.

To your summary I would add the following observation and a couple of additional thoughts.

-- We are trying here to satisfy 3 masters and the inherent conflicts in their needs are most obvious in thinking about the multi-body role issue. We need to satisfy
  1.  those who want to aggregate annotation descriptions as RDF (graphs, non-linear, relational) presumably so they can do inferencing, discover additional relations between resources, etc.; 
  2.  those who want to create and consume annotation descriptions serialized in JSON (trees, linear, acyclic), in order to facilitate programming in JSON-friendly languages;
  3.  those who want to query stores of annotation descriptions very efficiently (regardless of how serialized), e.g., to identify all annotation bodies in the store, or all annotation targets, etc.

We've talked a lot about the desires of groups 1 and 2, but less recently about group 3, who prefer to flatten hierarchy and want to be able to ignore attributes like role (at least sometimes). Some of the proposals, including generating a sub-property of oa:hasBody for every role, make it harder to gather together all bodies irrespective of role. We can't satisfy everyone. I just want to make sure we don't forget about agents falling into category 3.
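
To make the group 3 concern concrete, here is a minimal Python sketch, not anything proposed on the wiki. It assumes two hypothetical serializations: one where the role is an attribute on each body object, and one where every role has its own key (the names commentingBody, describingBody and taggingBody are invented for illustration). A consumer that only wants all bodies, irrespective of role, has to know every sub-property name in the second case.

```python
# Hypothetical sub-property names known to this consumer; the thread does not
# fix an actual list, so these are assumptions for illustration only.
KNOWN_ROLE_SUBPROPERTIES = ("commentingBody", "describingBody")


def as_list(value):
    """JSON values may be a single item or a list; always return a list."""
    return value if isinstance(value, list) else [value]


def bodies_role_attached(annotation):
    """Roles are attributes on the bodies: one key holds every body."""
    return as_list(annotation.get("body", []))


def bodies_subproperty(annotation, known=KNOWN_ROLE_SUBPROPERTIES):
    """Roles are separate keys: the consumer must enumerate each known key."""
    found = []
    for key in ("body", *known):
        found.extend(as_list(annotation.get(key, [])))
    return found


role_attached = {
    "body": [
        {"source": "http://example.org/body1", "role": "commenting"},
        {"value": "this is a description", "role": "describing"},
    ]
}

by_subproperty = {
    "commentingBody": ["http://example.org/body1"],
    "describingBody": ["this is a description"],
    # A role added later that this consumer has never heard of.
    "taggingBody": ["http://example.org/tag1"],
}

all_attached = bodies_role_attached(role_attached)
all_subproperty = bodies_subproperty(by_subproperty)
```

In this sketch the role-attached form returns both bodies without looking at role at all, while the sub-property form silently misses the taggingBody entry unless the consumer also implements the sub-property relationship.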

Personally I like adding roles to SpecificResources and EmbeddedContent, but recognize that this makes it seem like these classes each have two orthogonal purposes. To avoid minting new classes and to keep roles as terms rather than classes, we could create in our @context additional aliases for SpecificResource and EmbeddedContent,

"body" : [
  { "source" : "http://example.org/body1",
    "type" : "sourceWithRole",
    "role" : "commenting" },
  { "value" : "this is a description ",
    "type" : "valueWithRole",
    "role" : "describing" }
]

but this would still require us to use type, would create issues for round-tripping, and would create the appearance of a different model for JSON than for RDF in a way that might be objectionable. The disconnect between id and source is also potentially jarring. (The multiple aliases for SpecificResource and EmbeddedContent are not a problem in themselves, even if you attach both aliases to the same object. But they do create a problem for round-tripping, I would think.)
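
The round-tripping worry can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alias table below is hypothetical (it is not an actual @context from the wiki), and the oa: prefixed class names are written as plain strings. Going from JSON to RDF each alias resolves to exactly one class, but going back from RDF to JSON a class with two aliases no longer determines which alias to emit.

```python
from collections import defaultdict

# Hypothetical alias table standing in for part of a JSON-LD @context.
# "sourceWithRole" and "valueWithRole" are the extra aliases from the example;
# the mapping itself is an assumption made for illustration.
CONTEXT_ALIASES = {
    "SpecificResource": "oa:SpecificResource",
    "sourceWithRole": "oa:SpecificResource",
    "EmbeddedContent": "oa:EmbeddedContent",
    "valueWithRole": "oa:EmbeddedContent",
}


def aliases_by_class(context):
    """Invert alias -> class into class -> sorted list of candidate aliases."""
    inverse = defaultdict(list)
    for alias, cls in context.items():
        inverse[cls].append(alias)
    return {cls: sorted(names) for cls, names in inverse.items()}


def ambiguous_classes(context):
    """Classes for which serializing back to JSON has more than one choice."""
    return sorted(cls for cls, names in aliases_by_class(context).items()
                  if len(names) > 1)


candidates = aliases_by_class(CONTEXT_ALIASES)
ambiguous = ambiguous_classes(CONTEXT_ALIASES)
```

Here both classes end up with two candidate aliases, so a tool converting RDF back to JSON would need some extra rule (for example, the presence of a role) to pick one.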

Tim Cole
  

-----Original Message-----
From: Ivan Herman [mailto:ivan@w3.org] 
Sent: Sunday, August 16, 2015 5:51 AM
To: W3C Public Annotation List <public-annotation@w3.org>
Subject: My thoughts on the multi-body alternatives (as shown on Tim's wiki page)

Guys,

Reviewing the multi-body annotation pages triggered some comments/thoughts for me. Hopefully these will help in closing this issue soon.

As a reminder, we have four different patterns on that page:

1. role assignments (originally proposed by Ray)
2. role attached to resources (specific or not), originally proposed by Rob
3. roles as subproperties
4. roles as classes, ie, typed bodies or targets

Here are my (random) thoughts:


- I believe that the pattern

  "a" : {
     "b" : "something",
     "c" : "something else"
  }

is a fairly natural pattern in JSON. To be specific, the fact that

"body" : "This image is worth viewing on my desktop."

is transformed into something like

"body" : {
 "source": "This image is worth viewing on my desktop.",
 "role" : "commenting"
}

is not, as far as I can judge, shocking for a JSON user.

Except when "b" or "c" is "id", this pattern translates perfectly well through JSON-LD to RDF: it is an anonymous blank node, ie, one where there is not even an attempt to provide an identifier. (This is the equivalent of the [...] idiom in Turtle, and I changed the Turtle code to make this analogy very visible.) That usage of blank nodes is (should be...) perfectly all right even for the most fervent defenders of the Linked Data principles. (There have even been proposals to define the concept "Well behaved RDF"[1] using that pattern to 'tame' blank nodes...)

In other words, I believe that using that pattern should be something we embrace.
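
As a rough illustration of how little this pattern asks of a JSON programmer, here is a minimal Python sketch. The function names are hypothetical, and the keys "source", "role" and "id" simply follow the examples above; nothing here is a defined API.

```python
def with_role(body, role):
    """Turn a plain string body into the nested form, or add a role to an
    existing body object. No "id" is ever introduced."""
    if isinstance(body, str):
        return {"source": body, "role": role}
    return {**body, "role": role}


def is_anonymous(node):
    """A nested object without an "id" key corresponds to an anonymous
    blank node once it goes through JSON-LD to RDF."""
    return "id" not in node


plain = "This image is worth viewing on my desktop."
nested = with_role(plain, "commenting")
tagged = with_role({"source": "http://example.org/body1"}, "describing")
```

The nested object carries the role without the user ever having to mint a URI or a blank node identifier.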

- However, if the blank node is not anonymous, ie, we *must* add an "id", then I think that is a problem. It forces the user either to mint a (fairly artificial) URI (eg, a urn:XXXX) or to use the _:XXX pattern for a blank node ID. That makes the structure more complex, and forces a JSON user to use a notion (the blank node id) which is far from obvious. I believe we should try to avoid that.

This is the reason that I have to agree with Doug that the 'role assignment' approach is probably way too complex for a JSON user, and we should drop it. This in spite of the fact that, from a Semantics point of view, it is certainly attractive (that is why I was in favour of it, originally). Sorry Ray:-)

- The subproperty approach seems to be very simple; the JSON structure (see, eg, [2]) is structurally very close to the serialization without any role assignment (eg, [3]). What worries me the most is the proliferation of additional predicates, and the fact that the environment (including in JSON) has to, in effect, implement the subproperty relationship. It looks a bit like spaghetti code, and may not be obvious to extend.

- The 'role attached to a resource', and the 'role as a class' have a very similar structure when serialized (eg, [4] and [5]). In fact, as I said, the current SemanticTag notion is already a representation of the 'role as a class' pattern. I must admit that I cannot see a big difference between the two; they look fairly similar to me, and I am not sure how I would choose between the two. I can live with both.

A side issue, though: we should align, imho, the Semantic tagging structure to whichever we choose. If we go for the 'role attached to a resource' approach, having semantic tagging as it is now seems to conflate different models; never a good thing...

- As far as I am concerned, the issue of when to use typing[6] is also part of this discussion. Again, trying to make life easier for JSON users, I would like to reduce the usage of "type" to a strict minimum. Following a separate thread, I did not use the embedded resource approach for something like

{
  "value": "This image is worth viewing on my desktop.",
  "role" : "commenting"
}

which I would find superfluous. I am also not sure that explicitly typing the oa:SpecificResource is necessary and useful, and it would nevertheless pollute the serialization (again, trying to prune the JSON encoding as much as I can to hide the Linked Data aspects).

- I was a bit mixed up by the presence of oa:item and oa:source. Tim's original examples used oa:item for the target (third scenario) and it was unclear to me why not use, uniformly, oa:source. I changed this to oa:source, but there may be some misunderstanding on my part. Nevertheless, we may want to unify this, too.

I hope these thoughts may be useful...

Ivan

[1] http://dbooth.org/2013/well-behaved-rdf/Booth-well-behaved-rdf.pdf
[2] https://www.w3.org/annotation/wiki/Expressing_Role_in_Multi-Body_Annotations#Role_as_Subproperty_of_hasBody.2FhasTarget
[3] https://www.w3.org/annotation/wiki/Expressing_Role_in_Multi-Body_Annotations#Current_Model_.28no_role_descriptions.29
[4] https://www.w3.org/annotation/wiki/Expressing_Role_in_Multi-Body_Annotations#Role_Attached_to_resources_.28e.g.2C_SpecificResource.29
[5] https://www.w3.org/annotation/wiki/Expressing_Role_in_Multi-Body_Annotations#Role_as_Class.2FTyped_Bodies_and_Targets
[6] https://github.com/w3c/web-annotation/issues/61




----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Sunday, 16 August 2015 18:12:06 UTC