Re: My thoughts on the multi-body alternatives (as shown on Tim's wiki page) from Robert Sanderson on 2015-08-16 (public-annotation@w3.org from August 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Sun, 16 Aug 2015 12:08:19 -0700
To: Ivan Herman <ivan@w3.org>
Cc: W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CABevsUF_CmB+WMVGztqFDN6LeGfYK-BhW2uJgsE3oWqW9KHvOQ@mail.gmail.com>
Thanks Ivan!  Replies inline.

On Sun, Aug 16, 2015 at 3:50 AM, Ivan Herman <ivan@w3.org> wrote:


> Here are my (random) thoughts:
> - I believe that the pattern
>   "a" : {
>      "b" : "something",
>      "c" : "something else"
>   }
> is a fairly natural pattern in JSON. To be specific, the fact that
>
> "body" : "This image is worth viewing on my desktop."
> is transformed into something like
> "body" : {
>         "source": "This image is worth viewing on my desktop.",
>         "role" : "commenting"
> }
> is not, as far as I can judge, shocking for a JSON user.
>

That actually doesn't work as is (oa:hasSource must be a URI), but yes,
something like that should not be too disturbing.

> Except when "b" or "c" is "id", this pattern translates perfectly well
> through JSON-LD to RDF: it is an anonymous blank node, ie, where there is
> even no attempt to provide an identifier.


Until such time as a server assigns a URI to a resource that was formerly a
blank node, via some sort of skolemization routine. Per:
http://www.w3.org/TR/rdf11-concepts/#section-blank-nodes

This is recommended as a pattern by Bizer and Heath:
http://linkeddatabook.com/editions/1.0/#htoc16

"The scope of blank nodes is limited to the document in which they appear,
meaning it is not possible to create RDF links to them from external
documents, reducing the potential for interlinking between different Linked
Data sources. In addition, it becomes much more difficult to merge data
from different sources when blank nodes are used, as there is no URI to
serve as a common key. Therefore, all resources in a data set should be
named using URI references."


And by David Wood, Michael Hausenblas (et al), in their Linked Data book:

"You should note that many people avoid using blank nodes. Blank nodes can
cause some difficulty when you get them back in query results because you
can’t query them later. They don’t have a name, so you can’t resolve them.
For this reason, many people just make up URIs whenever they need to and
avoid blank nodes altogether."



So unless we propose that blank nodes MUST NOT be given URIs (and a very
quick -1 to that, unless we also intend to require LDPatch, and another -1
to that), relying on resources staying blank nodes is a dangerous
assumption, in my opinion.
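
A minimal sketch of the kind of skolemization routine described above, assuming
a server that stores annotations as plain JSON trees. The base URI, the "text"
key and the example annotation are hypothetical; the /.well-known/genid/ path
follows the suggestion for Skolem IRIs in RDF 1.1 Concepts.

```python
import uuid

# Hypothetical base for minted identifiers (example.org is a placeholder).
GENID_BASE = "http://example.org/.well-known/genid/"


def skolemize(node):
    """Return a copy of a JSON tree in which every object lacking an "id" gets one."""
    if isinstance(node, list):
        return [skolemize(item) for item in node]
    if not isinstance(node, dict):
        return node  # strings, numbers, etc. are left untouched
    result = {key: skolemize(value) for key, value in node.items()}
    if "id" not in result:
        # The former blank node now has a URI that other documents can link to.
        result["id"] = GENID_BASE + uuid.uuid4().hex
    return result


# Illustrative annotation as a client might send it: the body has no "id".
annotation = {
    "id": "http://example.org/anno1",
    "body": {
        "text": "This image is worth viewing on my desktop.",  # hypothetical key
        "role": "commenting",
    },
    "target": "http://example.org/image.jpg",
}

stored = skolemize(annotation)
```

After this step the body is no longer anonymous, so a client that assumed it
would stay a blank node sees a different structure when it fetches the
annotation back.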


> In other words, I believe that using that pattern should be something we
> embrace.
> - However, if the blank node is not anonymous, ie, we *must* add an "id":
> that I think is a problem. It forces the user either to mint a (fairly
> artificial) URI (eg, a urn:XXXX) or use the _:XXX pattern for a blank node
> ID. Something that makes the structure more complex, and forces a JSON user
> to use a notion (the blank node id) which is far from obvious. I believe we
> should try to avoid that.
>

Agreed on not *requiring* an id, but I'd also stress that we shouldn't
require that it never have one.

With -1 meaning cannot live with, +0 being can live with if that's the
general consensus, and +1 being strongly prefer...


> This is the reason that I have to agree with Doug that the 'role
> assignment' approach is probably way too complex for a JSON user, and we
> should drop it. This in spite of the fact that, from a Semantics point of
> view, it is certainly attractive (that is why I was in favour of it,
> originally). Sorry Ray:-)
>

Agreed. While the role assignment approach can be ignored when it
doesn't apply and doesn't make assertions that aren't always true, it
limits the generalization of the approach to tags, semantic tags and other
situations where annotation specific information must be associated with a
resource.  It's also more complex and surfaces the RDF blank node issue.
So, I'm also not in favor of the approach, but it's better than others.

*Role Assignment:  +0*



> - The subproperty approach seems to be very simple; the JSON structure
> (see, eg, [2]) is structurally very close to the serialization without any
> role assignment (eg, [3]). What worries me the most is the proliferation of
> additional predicates, and the fact that the environment (including in
> JSON) has to, in effect, implement the subproperty relationship. Looks a
> bit as a spaghetti code, and may not be obvious to extend
>

Agreed.  Again, it doesn't break the RDF framework, and it might be argued
that creating subproperties is in fact best practice. But we're trying to
solve the problem for pure JSON clients, not clients with a full RDF stack
that could determine that xxx:hasReplacement is a subPropertyOf
oa:hasBody.  So again, in terms of fulfilling *all* of the requirements
(must not break RDF, must be friendly to developers), it's not great.

*SubProperties: +0*
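
To illustrate the burden on a pure JSON client, here is a rough sketch in which
the client has to carry its own list of body subproperties. The table, the
yyy:hasReview key and the example annotation are made up for illustration;
only xxx:hasReplacement comes from the discussion above.

```python
# Hypothetical table a plain JSON client would have to ship and keep current,
# because it cannot ask an RDF stack which keys are subproperties of the body.
BODY_SUBPROPERTIES = {"body", "xxx:hasReplacement"}


def find_bodies(annotation):
    """Collect the values of every key the client knows to be a kind of body."""
    return [value for key, value in annotation.items() if key in BODY_SUBPROPERTIES]


annotation = {
    "id": "http://example.org/anno2",
    "xxx:hasReplacement": "http://example.org/fixed-text",
    # Made-up extension property: not in the table, so it is silently missed.
    "yyy:hasReview": "http://example.org/review",
    "target": "http://example.org/page",
}

bodies = find_bodies(annotation)
```

Every new subproperty means another entry in every client's table, which is
the extensibility concern raised in the quoted text.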


> - The 'role attached to a resource', and the 'role as a class' have a very
> similar structure when serialized (eg, [4] and [5]). In fact, as I said,
> the current SemanticTag notion is already a representation of the 'role as
> a class' pattern. I must admit that I cannot make a big difference between
> the two; they look fairly similar to me, and I am not sure how I would
> choose among the two. I can live with both.
>

The JSON pattern is the one we want to adopt, I agree, but the devil is in
the details.

Both, as stated, generate broken RDF when used with resources that have
identity.  We explicitly made a change to the CG model to fix this exact
issue for Semantic Tags in the FPWD, and this would revert that fix.  A
video must not be given a class of oa:Comment in one Annotation and a class
of oa:Question in another, which this model would require.

*Role as a Class:  -1*
*Role attached to _any_ Resource: -1*
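
A small sketch of the conflict, assuming two annotations that use the same
video as a body and express the role as a class. Triples are modelled as plain
Python tuples and the example.org URIs are placeholders.

```python
VIDEO = "http://example.org/video.mp4"

# Each annotation contributes a set of (subject, predicate, object) triples.
anno1 = {
    ("http://example.org/anno1", "oa:hasBody", VIDEO),
    (VIDEO, "rdf:type", "oa:Comment"),
}
anno2 = {
    ("http://example.org/anno2", "oa:hasBody", VIDEO),
    (VIDEO, "rdf:type", "oa:Question"),
}

# Merging graphs is a set union: the video now carries both classes globally.
merged = anno1 | anno2
video_types = {o for (s, p, o) in merged if s == VIDEO and p == "rdf:type"}

# Nothing in the merged graph ties a class to the annotation that asserted it.
type_triple_subjects = {s for (s, p, o) in merged if p == "rdf:type"}
```

In the merged graph the video is both an oa:Comment and an oa:Question, and
the type triples mention only the video, never the annotation.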


The embedded content resource, while it is a blank node, does not suffer
from having its role conflated with it.  However (as above) when it gets
given a URI, it falls into the same pattern as the video. As Ivan has
already demonstrated (by putting a literal into hasSource), the confusion
that this would generate would be huge, particularly if we also remove
types from the representation. We would need to explain when to use one
pattern and when to use the other, thereby defeating the purpose of making
the developer's life simpler.

So, I don't think it really meets the requirements of making things
easier.  As soon as a server receives and transforms the pattern into the
one needed for the non-blank-node resources, the client needs to now
understand two patterns anyway.

*Role attached to EmbeddedContent or SpecificResource: +0*


The role of the resource in the annotation is not a property of the
resource, it is a relationship between the Annotation and the Resource.
Given that we don't want to do subproperties, there is only one possible
method to use, which is to reify that relationship into a resource and a
role.  This is (IMO) what Specific Resources and Motivations are,
respectively.

A Specific Resource is the body or target -as it relates to the
annotation-.  It's not the entire image, it's the segment identified by the
Specific Resource and described by the selector.  It's not just the part of
the image, it's the part of the image as identified by the Specific
Resource, and described by the selector and the CSS Style. It's not every
representation of the image, it's the JPG representation, as identified by
the Specific Resource and described via the HTTP Request State.  It's not
any role of the image, it's the role of tagging, as identified by the
Specific Resource, and described by the Motivation.

As a hopefully illuminating historical note, previously Specific Resources
were Constrained Resources, and Specifiers were Constraints [1]. This was
because the selectors (etc) constrain the scope of the resource.  We
changed the name to Specific for two reasons ... the notion of X Specific
Resource vs X Generic Resource in Tim Berners-Lee's 2006 ontology [3], and
that constraint based programming/reasoning is a very different thing.  We
also played with ORE Proxies for the same role [2] (which would have looked
like role assignments) and discarded them for the same reasons as above.

[1] http://www.openannotation.org/spec/beta/#DM_Constraint
[2] http://www.openannotation.org/spec/alpha2/#DM_Segments
[3] http://www.w3.org/2006/gen/ont

So ... with -one consistent change- (allow Motivation to be associated with
SpecificResource) we solve the problem in the desired tree hierarchy, for
both body and target, without introducing new structure (role assignment)
or opening the flood gates for new subproperties.  We solve the tagging
inconsistency at the same time, for free.

*Role attached to SpecificResource: +1*
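
As a sketch of how this could look to a JSON client, assume the role is carried
on a per-annotation SpecificResource that points at the shared video. The key
names ("type", "source", "role"), the helper functions and the example.org
URIs are illustrative assumptions, not agreed serialization.

```python
VIDEO = "http://example.org/video.mp4"


def body_with_role(source, role):
    """Wrap a resource in a per-annotation SpecificResource that carries the role."""
    return {"type": "SpecificResource", "source": source, "role": role}


anno1 = {
    "id": "http://example.org/anno1",
    "body": body_with_role(VIDEO, "commenting"),
    "target": "http://example.org/page",
}
anno2 = {
    "id": "http://example.org/anno2",
    "body": body_with_role(VIDEO, "questioning"),
    "target": "http://example.org/page",
}


def roles_of(resource, annotations):
    """Map each annotation id to the role it gives the resource."""
    return {
        anno["id"]: anno["body"]["role"]
        for anno in annotations
        if anno["body"]["source"] == resource
    }


roles = roles_of(VIDEO, [anno1, anno2])
```

Here the video itself gains no properties: each role lives on the
SpecificResource belonging to one annotation, so the two annotations can be
merged without contradicting each other.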



> A side issue, though: we should align, imho, the Semantic tagging
> structure to whichever we choose.


Agreed. And Tagging. If we can have a single consistent model, that would
be great!

[Leaving out typing and multiplicity, which I think we should discuss
separately from roles]

Rob

-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Sunday, 16 August 2015 19:08:48 UTC