Re: Annotations and the Graph from Randall Leeds on 2015-10-28 (public-annotation@w3.org from October 2015)

From: Randall Leeds <randall@bleeds.info>
Date: Wed, 28 Oct 2015 18:20:56 +0000
To: Jacob Jett <jgjett@gmail.com>
Cc: Robert Sanderson <azaroth42@gmail.com>, Web Annotation <public-annotation@w3.org>
Message-ID: <CAAL6JQgt7o=zX4v4xgyurjoQSeoQKFLr9xgGHZic=LkSeNGd0g@mail.gmail.com>
On Wed, Oct 28, 2015 at 10:26 AM Jacob Jett <jgjett@gmail.com> wrote:

> More to the point though, since there is no semantic difference (that I
> can see) between rdf:Statement or oa:Annotation (as a sub-class of
> rdf:Statement) it's hard to predict exactly how an OWL-based reasoner is
> going to treat it. As near as I can tell (and I'm not an expert here) one
> of at least (there could be more permutations) three possible scenarios
> might play out in the reasoner.
>
>
This part is extremely helpful for me. I do not have any idea whatsoever
how OWL-based reasoners work.

It's a little disturbing to me that you seem to be suggesting that they
totally fall down and die in the presence of sub-classes and can't make
useful inferences anymore. That's pretty bad.


> RDF only really works when things are instances of classes of things it
> already defines. An rdf-triple is an instance of an rdf:Statement and of no
> other thing. The class of rdf:Statements is an instance of rdfs:Class
> asserted by the rdf:predicate, rdf:type.
>
>
I understand what you're saying, insofar as rdf:Statement is the class of
statements, triples. The first sentence makes no sense to me.


> I think maybe you're trying to treat RDF as though it were UML, which it
> isn't. Likewise, we often discuss everything about this RDF-based model as
> though RDF was a serialization format, which it also isn't. On the whole
> the group frequently asks the wrong questions about the model, i.e., "what
> does this data do?" Data is data, it just kind of is. The correct question
> is "what can I do with this data?"
>
>
I'm not sure what point of difference you're trying to highlight between
RDF and UML.

To your suggestion about appropriate questions, though, that is precisely
how I arrived at this proposal. I asked myself, "Can I infer any statement
of relation between the body and the target from this data?" I've found the
answer to be, "No" and I'm very disappointed by that.

No such statement implied by the annotation can actually be made because
the model doesn't specify that there is *any* relationship between the body
and the target. That seems absurd considering that the purpose of
annotation is so often to relate them.


>> I'm still not sure I'm convinced that an annotation without a body is a
>> useful thing. I know we debated it and decided to include it. I can't
>> remember why. It would seem to me any such annotation could have a stub
>> body. Or, more likely, that there is no annotation, there is only the
>> production of a SpecificResource.
>>
>
> But it's not questionable to others. The question is should the model be
> inclusive of use cases (and communities) or exclusive? If the latter then
> why is it in the W3C standards process? The beauty (and the high price) of
> the Web (and Semantic Web) is that any community can develop data
> vocabularies particular to their community's needs. The W3C standards are
> supposed to be the backbone that supports those activities and makes it
> possible for them to work together at various levels. If we discount certain
> positions as not worthy of support then I'm not certain how we'll ever
> develop a standard that makes that backbone possible.
>

I'm skeptical of this reasoning.

It's not a foregone conclusion that bodiless annotations are unquestionably
needed by anyone. There may be alternative representations that they have
not considered. That's the purpose of discussion. If we don't question one
another we do each other a disservice.

Unfortunately, no one has responded by explaining a use case where either:

- The annotation is bodiless, but what's being created or consumed is not
equally well represented as just a SpecificResource.
- It would be unacceptable to infer any relationship between the body and
target from an annotation resource.


> I get that you have a need for a container with some provenance but, an
> annotation would be a sub-class of *that container*. There are other
> kinds of containers too, like collections. Collections are much more
> generic. There's actually a formalization that defines them quite well:
>
> ∀y(∃x isGatheredInto(x,y)↔ Collection(y))
>
> You can read all about it in the following paper: Wickett, K. M., Renear,
> A. H., & Furner, J. (2011). Are collections sets? *Proceedings of the
> 74th ASIS&T Annual Meeting* (New Orleans, LA, 9-13 October 2011).
>
> The important thing is to avoid disenfranchising entire web-going
> communities of practice simply because their needs are different.
>

I agree. That is exactly why I'm trying to question the model and make its
semantics more actionable.

If my suggestions disenfranchise some community, name that community and
show me its use case(s).



>> Why should we avoid instantiating a Bookmark and instead create an
>> Annotation that refers to some tags, some text, and some link, and
>> complicates the relationship between them?
>>
>
> See above, but more generally simply because we're the web annotation
> group and we're modeling annotations.
>

That's awfully circular.

We're okay with defining a predicate that says that the annotation is
"motivated by bookmarking" but not with defining a class "Bookmark".

That's actually totally fine with me. I don't need to define the structure
of Bookmark, and am happy to leave that to other communities of practice.

What concerns me is that if one creates an annotation with the
"bookmarking" motivation, one hasn't, according to our model, actually
asserted a relationship between the Bookmark (whatever its shape and
properties) and the resource being bookmarked. As a result, I have to query
my bookmarks indirectly by asking for "bodies of annotations that are
motivated by bookmarking" rather than simply asking for the subjects or
objects of a "hasBookmark" or "isBookmarkOf" property or something similar.
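
To make the difference concrete, here is a rough sketch in Python, with triples
as plain tuples. The oa: terms are the ones from our vocabulary; ex:hasBookmark
and all of the ex: resources are made up for illustration and are not part of
any model.

```python
# Triples are (subject, predicate, object) tuples; all ex: IRIs are invented placeholders.
MOTIVATED_BY = "oa:motivatedBy"
HAS_BODY = "oa:hasBody"
HAS_TARGET = "oa:hasTarget"

# What the current model gives us: the bookmark is only reachable via the annotation.
annotation_graph = {
    ("ex:anno1", "rdf:type", "oa:Annotation"),
    ("ex:anno1", MOTIVATED_BY, "oa:bookmarking"),
    ("ex:anno1", HAS_BODY, "ex:bookmark1"),
    ("ex:anno1", HAS_TARGET, "ex:page1"),
    ("ex:anno2", "rdf:type", "oa:Annotation"),
    ("ex:anno2", MOTIVATED_BY, "oa:commenting"),
    ("ex:anno2", HAS_BODY, "ex:comment1"),
    ("ex:anno2", HAS_TARGET, "ex:page1"),
}

# What I would like to be able to assert (hypothetical predicate).
direct_graph = {("ex:page1", "ex:hasBookmark", "ex:bookmark1")}


def objects(graph, subject, predicate):
    """Return every object of the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def bookmarks_indirect(graph, page):
    """Bodies of annotations that are motivated by bookmarking and target the page."""
    found = set()
    for s, p, o in graph:
        is_bookmarking = p == MOTIVATED_BY and o == "oa:bookmarking"
        if is_bookmarking and page in objects(graph, s, HAS_TARGET):
            found |= objects(graph, s, HAS_BODY)
    return found


def bookmarks_direct(graph, page):
    """Objects of the hypothetical ex:hasBookmark property."""
    return objects(graph, page, "ex:hasBookmark")
```

Both functions find the same bookmark; the second just asks for it directly.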

> I think I missed the point where he made the point for you. The problem is
> that your proposal doesn't actually create a sub-class of rdf:Statement
> inasmuch as it simply makes an alias for it. The effects of aliasing
> rdf:Statement are uncertain (but are probably calamitous if you don't alias
> the rest of RDF). It isn't clear if oa:Annotation inherits the properties
> of rdf:Statement or simply replaces it wholecloth vis-a-vis OWL-based
> reasoners. This undermines the firmament upon which the model stands for
> the Semantic Web side of this working group (and the Semantic Web
> community).
>
>
This is helpful. Maybe I'm misunderstanding. I thought that a class was not
only a dumb placeholder identity to be used in describing the domain or
range of a predicate, but could actually require (as we do in some places
in our model) the presence or count of certain properties. To the extent
that we would define _any_ properties not generally included in
rdf:Statement but required on an oa:Annotation, it would be distinct.
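
For what it's worth, this is my reading of how a sub-class behaves under plain
RDFS entailment (rule rdfs9), sketched in Python. It says nothing about what a
particular OWL reasoner does with rdf:Statement, and the subClassOf axiom is
the hypothetical one from my proposal, not something in the model today.

```python
# Plain RDFS rule rdfs9 over (subject, predicate, object) tuples.
# The subclass axiom is hypothetical; ex:anno1 is an invented resource.
graph = {
    ("oa:Annotation", "rdfs:subClassOf", "rdf:Statement"),
    ("ex:anno1", "rdf:type", "oa:Annotation"),
}


def apply_rdfs9(triples):
    """Add (x rdf:type D) whenever (x rdf:type C) and (C rdfs:subClassOf D) hold."""
    closure = set(triples)
    while True:
        subclass_pairs = {(s, o) for s, p, o in closure if p == "rdfs:subClassOf"}
        new = {
            (inst, "rdf:type", sup)
            for inst, p, cls in closure if p == "rdf:type"
            for sub, sup in subclass_pairs if sub == cls
        } - closure
        if not new:
            return closure
        closure |= new


closure = apply_rdfs9(graph)
```

Under that rule ex:anno1 ends up typed as both oa:Annotation and rdf:Statement;
the sub-class adds a type rather than replacing one.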

If my understanding of RDF is off, and there's some reason I still can't
grasp why sub-classing is harmful, then that's totally fine with me.

I don't care about the specifics of the class hierarchy, only that there
can be inference of a triple involving the body and target.
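
A minimal sketch of the kind of inference I mean, in Python. The mapping from
motivations to predicates, and every ex: name in it, is an assumption I'm
making up for illustration; nothing like it is defined by the model.

```python
# Hypothetical mapping from motivations to relation predicates (ex: names are invented).
RELATION_FOR_MOTIVATION = {
    "oa:bookmarking": "ex:isBookmarkOf",
    "oa:commenting": "ex:isCommentOn",
}
GENERIC_RELATION = "ex:isRelatedTo"  # fallback when no specific relation is known


def values(graph, subject, predicate):
    """Return every object of the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def infer_body_target_triples(graph):
    """Derive one (body, relation, target) triple per body/target pair of each annotation."""
    inferred = set()
    for anno in {s for s, p, o in graph if p == "oa:hasBody"}:
        motivations = values(graph, anno, "oa:motivatedBy")
        relations = {
            RELATION_FOR_MOTIVATION.get(m, GENERIC_RELATION) for m in motivations
        } or {GENERIC_RELATION}
        for body in values(graph, anno, "oa:hasBody"):
            for target in values(graph, anno, "oa:hasTarget"):
                for relation in relations:
                    inferred.add((body, relation, target))
    return inferred


graph = {
    ("ex:anno1", "oa:motivatedBy", "oa:bookmarking"),
    ("ex:anno1", "oa:hasBody", "ex:bookmark1"),
    ("ex:anno1", "oa:hasTarget", "ex:page1"),
    ("ex:anno2", "oa:hasBody", "ex:note1"),
    ("ex:anno2", "oa:hasTarget", "ex:page2"),
}
inferred = infer_body_target_triples(graph)
```

An annotation with no motivation still yields the generic "related" triple,
which is the weakest claim I'd want the model to license.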


> I feel like this is a problem for engineers in general and developers in
> particular. You're reinventing the wheel by sub-classing. But we already
> have plenty of wheels. Just pick one and stick with it. The container here
> is the annotation-flavored one, if another container is required then this
> may be the wrong standard for your use cases.
>

It's ironic to be told that I'm re-inventing the wheel when the purpose of
the entire thread is to ask whether Annotation is merely re-inventing the
named graph.


>
>
>> SpecificResource and the selector vocabulary is great and I don't see
>> anything that exists quite like that.
>>
>
> Here I'm in total agreement with you. I've suggested to Rob in the past
> that the Specific Resource / Specifiers generalizes to a broader set of
> Web/Semantic Web use cases. Perhaps what is really needed is a more
> generalized Web/Semantic Web container specification that exploits and
> develops that part of the vocabulary. What is the procedure for proposing
> that we spin that part out to a different, more general working/community
> group?
>

+1


> It's providing a framework where people who don't agree on the nature of
> the relationship between body and target can still agree that body and
> target are related and also provide their interpretation of that
> relationship through motivation/role. After all, what you call a comment, I
> (rather intractably) call a remark. It prevents a combinatorial explosion
> of predicates by dumping that information into an attribute value bucket.
>

I don't understand why people need to agree. The dataset/graph/annotation
provides attribution to someone's claim about the nature of the
relationship, even if that is just the generic assertion that they are
related at all. Others may disagree with the specific relationship, or even
its existence, and therefore decide not to consume the annotation or
redistribute it.


> We're trying to provide a model that gives us agnostic annotation-flavored
> containers. The precise relationships between the bodies and targets are
> left for individual communities of practice to define through the
> motivation/role attribute value. Rather than mandate what kinds of
> annotations there are from on-high, we just give some high level examples
> of how they can express this information themselves.
>

We do specify what kinds of annotations (actually, what kinds of relations)
there are by way of motivations. What we fall short of doing is saying that
these are relations in the sense of being predicates for triples.


> I get that this makes life harder on JSON developers but I'm going to take
> an extremely blunt tack and tell you that it's for your own good. The
> end-users need to be the community that shapes the structure of the data.
> What best captures the idiosyncrasies of their intentions and needs. Not,
> what is most convenient for us to parse and serialize.
>

I'm actually not thinking about JSON developers in any part of this
conversation. Please don't pigeon-hole me into that role because I have
done a lot of JSON annotation implementation.

I think what I'm suggesting might actually in some cases make things harder
for JSON developers.

If anything, what I'm proposing might actually be most helpful to XML
developers, as RDF/XML seems, from my reading, to be the degenerate
serialization that, by virtue of having a single root node, doesn't have a
clear way to describe named graphs or datasets.


> I apologize to everyone for the soapbox sermon but this working group
> sometimes feels like a house divided. If we don't stop playing use case
> trumpery and focus more broadly on the cost and benefits for all of the
> communities involved, it's hard for me to see how this process will result
> in something successful.
>

I think it'd be great to separate the selector work. That seems like it
could be successful.

I don't think I'm trying to play "use case trumpery", whatever that is. I'm
asking whether the existing use cases wouldn't be served by RDF datasets +
selectors + annotation-relation predicates (bookmarks, references,
explains, highlights, edits, etc).
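
Roughly, and only as a sketch, this is the shape I have in mind, written as
quads in Python. The graph names, the ex: relation predicates, and the people
are all invented; dcterms:creator and oa:hasSource are the only real terms.

```python
# A dataset as (subject, predicate, object, graph_name) quads.
# graph_name None stands for the default graph, which here holds provenance.
dataset = {
    ("ex:graph1", "dcterms:creator", "ex:alice", None),
    ("ex:graph2", "dcterms:creator", "ex:bob", None),
    # Alice's claim: a bookmark of a specific part of a page.
    ("ex:bookmark1", "ex:isBookmarkOf", "ex:target1", "ex:graph1"),
    ("ex:target1", "oa:hasSource", "ex:page1", "ex:graph1"),
    # Bob's claim: a comment that explains the page.
    ("ex:comment1", "ex:explains", "ex:page1", "ex:graph2"),
}


def graphs_created_by(quads, creator):
    """Names of graphs whose provenance in the default graph names the creator."""
    return {
        s for s, p, o, g in quads
        if g is None and p == "dcterms:creator" and o == creator
    }


def claims_by(quads, creator):
    """Triples found in any named graph attributed to the creator."""
    names = graphs_created_by(quads, creator)
    return {(s, p, o) for s, p, o, g in quads if g in names}
```

A consumer who only trusts Alice would call claims_by(dataset, "ex:alice") and
never see Bob's triple, which is the attribution-without-agreement I was
describing earlier.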

Received on Wednesday, 28 October 2015 18:21:35 UTC