Re: Annotations and the Graph from Benjamin Young on 2015-11-03 (public-annotation@w3.org from November 2015)

From: Benjamin Young <bigbluehat@hypothes.is>
Date: Tue, 3 Nov 2015 11:20:01 -0500
To: Jacob Jett <jgjett@gmail.com>
Cc: Randall Leeds <randall@bleeds.info>, Robert Sanderson <azaroth42@gmail.com>, Web Annotation <public-annotation@w3.org>
Message-ID: <CAE3H5F+5WQKhz+Zh=Z0aXYzG77E+yesTFMFgqGaWdc0t-2zf2A@mail.gmail.com>

Thanks for starting this discussion, Randall! One thought inline below.

On Thu, Oct 29, 2015 at 9:52 AM, Jacob Jett <jgjett@gmail.com> wrote:

> Hi Randall,
>
> Apologies if my responses have seemed a bit curt. I am just frustrated by
> all of the use case-fu that seems to be bogging us down. Remarks in-line.
>
> On Wed, Oct 28, 2015 at 5:38 PM, Randall Leeds
>>
>> It is exactly this interpretation of motivations that I would like to
>> specify so that the interpretation is not idiosyncratic but normative.
>>
>
> I'm pretty certain that's a bad idea. No one is actually going to agree on
> the scope and meaning of 'comment', 'remark', 'bookmark', etc. Part of the
> problem is that while there is obvious overlap between these concepts and the
> concept of an annotation, it isn't clear at all that all comments are
> annotations. Counter-examples exist. For instance, if you and I are
> discussing a baseball game and I comment on the performance of the pitcher,
> it is difficult to assert that I also annotated the pitcher's performance.
>
> It seems to me that annotations are limited to recorded discourse but
> comments have no such limitation.
>
> There is also the issue of contingent facts. That the body of an
> annotation plays the role of a comment is a contingent fact to the whole of
> the annotation. We can imagine a universe where the body plays some other
> role in the context of the annotation yet the particular content of the
> annotation has remained the same, e.g., the body plays the role of
> explanation. How would we decide when to use a comment predicate instead of
> an explanation predicate? Isn't relatedTo a simpler relationship?
>
>
>> I believe that reifying that interpretation results in a dataset that is
>> eminently more consistent with how I would like to query it in practice,
>> e.g. ask a server for <Article, comment, ?> and get back all the objects
>> (the structure of which is unspecified, as per the extensibility we've both
>> pointed to).
>>
>
> This is a very brittle form of retrieval. You're reduced to hunting for
> the string label of a particular predicate. This is not a semantic query
> and is going to both miss pertinent results and retrieve non-relevant
> results. At least with a query like <?, motivatedBy, <commenting>> you
> might have a chance of grabbing annotations whose conceptual relationships
> between body and target are related to the motivation represented by the
> string "commenting".
>
>
>> I don't see that anywhere in the model specification, unless you infer
>> some particular relatedTo predicate from an arbitrary vocabulary from the
>> English language description, "conveys that the body is related to the
>> target".
>>
>
> That is indeed it. The entire model is built around this inference (to the
> point that I'd hardly call it an inference).
>
> Keep in mind that reasoners do not just rely on the model but also
> whatever inferencing rules the programmer sees fit to add in. My
> interpretation of the English in the introduction is that body relatedTo
> target is a valid inference. Within my own domain I can also easily argue
> that inferring motivation subclassOf relatedTo is a valid interpretation (I
> would actually go so far as to use the predicate 'annotates'), especially
> if it serves some search retrieval function. However it's not so easy to
> argue that this should be the normative interpretation because
> myCommunity:Commenting =/= oa:Commenting. It's actually difficult to say
> with authority what the user's intention was unless I'm operating in an
> annotation system that only allows its annotators one or two intentions.
>
>
>> In our current model this is not true, and that's why I raised this
>> thread.
>>
>
> For very practical reasons, anyone storing annotations in a triple store
> will do so as a named graph. It makes retrieval, search, and serialization
> much easier. If one doesn't store it this way (with regard to graph
> databases anyway) then one quickly runs into one of RDF's worst
> decidability problems -- document composition.
>
>
>> The annotation is the root of a contiguous graph fragment. It cannot, for
>> example, convey disjoint subgraphs. It is not a dataset, it is a particular
>> resource.
>>
>
> You're right, it's not a set and set theoretic approaches won't work. It
> is, however, structured data. One of the primary drawbacks to RDF is that
> there are no clear boundaries where a document may begin or end. I believe
> that this is by design -- RDF is not a document language. It's an
> information language, and an open problem that has not been resolved to my
> knowledge is how to compose the information in the graph into a
> human-readable document. This problem can be attenuated (but not fixed) by
> storing particular graph fragments as named graphs.
>
>
>> I still don't understand why those motivations are not themselves, or do
>> not imply normative relations, between the bodies and targets.
>>
>
> Because it's hard to make that leap. The particular role a body plays in
> relation to a target beyond annotating is very difficult to establish. It's
> hard for me to picture an annotation client that successfully captures such
> granular information. You're not likely to get the information from the end
> user, yet you're in the awkward position of trying to capture the intention
> with the annotation. Motivation at least abstracts it away to the point
> that only the motivation might be incorrect. If you collapse the annotation
> node into the wrong predicate then the entire graph fragment becomes false
> (or really that assertion is false and everything related to it moot).
>

This bit, about it being hard to "make that leap" (to get the user to make
or imply a normative relation between bodies and targets), is probably the
core reason why we have this "new thing" called Annotation (vs. "just RDF")
and why that thing doesn't include (yet?) a place to directly relate the
body and target.

The one question now buzzing in my brain is whether we should provide at
least an appendix that points to how one might make an Annotation which
also includes triples that *do* directly relate the body and target.

This is perhaps doable already using something like `@graph` in JSON-LD or
by simply mixing in more statements on either the `body` or `target`.
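
As a rough illustration of that idea (a sketch only, not something the
Data Model defines), the snippet below builds a JSON-LD document whose
`@graph` holds an ordinary annotation plus one extra statement relating
the body directly to the target. The `http://example.org/...` IRIs and the
`ex:commentsOn` predicate are made up for this example; only the `oa:`
terms come from the Open Annotation vocabulary.

```python
import json

# Hypothetical identifiers: every "http://example.org/..." IRI and the
# "ex:commentsOn" predicate are illustrative only, not part of any spec.
ANNO = "http://example.org/anno1"
BODY = "http://example.org/body1"
TARGET = "http://example.org/article"


def annotation_with_direct_relation(anno, body, target, predicate):
    """Return a JSON-LD document holding the annotation plus one extra
    statement that relates the body directly to the target."""
    return {
        "@context": {
            "oa": "http://www.w3.org/ns/oa#",
            "ex": "http://example.org/vocab#",
        },
        "@graph": [
            # The usual, "non-committal" annotation node.
            {
                "@id": anno,
                "@type": "oa:Annotation",
                "oa:motivatedBy": {"@id": "oa:commenting"},
                "oa:hasBody": {"@id": body},
                "oa:hasTarget": {"@id": target},
            },
            # The "mixed in" statement: body --predicate--> target.
            {"@id": body, predicate: {"@id": target}},
        ],
    }


doc = annotation_with_direct_relation(ANNO, BODY, TARGET, "ex:commentsOn")
print(json.dumps(doc, indent=2))
```

A consumer that only understands annotations can ignore the second node,
while one that trusts the publisher's vocabulary can pick up the direct
body-to-target statement.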

I do think we avoid "polluting" the global graph with content that is (very)
likely to be incorrect by *not* simply making annotation === RDF.

RDF is certainly annotation, but I think this thing we're creating here
does serve a purpose which is deliberately "non-committal" in the world of
graphs by sitting somewhat alongside it rather than requiring that body
and target be related at every step.

However, I'd love to see a proposal written up for some of our examples
that shows what you're thinking, done either separately from (or counter to)
the Web Annotation Data Model or by expanding the Data Model such that
direct relationships between body and target could also be encoded and
packaged along with the other bits.

That last option--of expanding the model to include encoding arbitrary
relationships between body and target--seems specifically valuable in
relation to the oa:identifying, oa:describing, and oa:classifying motivations.

It may be this is already possible, and I've just missed it, but if it's
not, I think Randall's on to something useful here, and I'd love to see it
explored by anyone interested enough to write a wiki page and some more
email. ;)

Thanks again for posting this, Randall, and for your input, Jacob! Good
stuff. :)

Cheers!
Benjamin
--
Developer Advocate
http://hypothes.is/


>
>
>> The annotation provides a subject for extended attributes about the
>> activity, I'm still just asking why one of those attributes, which is about
>> the reason for relating the bodies and targets, cannot be interpreted in a
>> consistent way to imply a particular relation.
>>
>
> The key is "consistent". I don't believe it's possible. People do not mean
> the same thing when they use the same word. Even a term like "tagging"
> easily becomes overloaded if we believe it indicates something about the
> content of the body, e.g., it's a simple string mono- or bi-gram as opposed
> to a URL.
>
> Regards,
>
> Jacob
>
>
Received on Tuesday, 3 November 2015 16:20:32 UTC