Re: Data Model Assumptions from Doug Schepers on 2015-08-18 (public-annotation@w3.org from August 2015)

From: Doug Schepers <schepers@w3.org>
Date: Tue, 18 Aug 2015 13:01:32 -0400
To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>, Jacob Jett <jjett2@illinois.edu>
Cc: Raphaël Troncy <raphael.troncy@eurecom.fr>, Ivan Herman <ivan@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <55D364EC.70503@w3.org>
Hi, Stian–

So far as I can tell, having strong typing with properties hasn't been a 
source of contention, and doesn't seem likely to be so.

Having a URL tucked away in the @context for each property, even (or 
especially) implementation-specific properties not defined in the Web 
Annotation Data Model, seems like a good idea, for the reasons you cite 
(establishing scales and units, for example).
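
To make that concrete, here is a minimal sketch in Python. The vendor
URLs and the expand_keys() helper are assumptions made up for
illustration; the helper is a simplified stand-in for JSON-LD
expansion, not the real algorithm, and only handles a flat @context of
name-to-URL mappings.

```python
import json

# Hypothetical vendor vocabulary URLs; neither is defined by the
# Web Annotation Data Model.
VENDOR1_RATING = "http://vendor1.example/ns#rating"
VENDOR2_RATING = "http://vendor2.example/ns#rating"


def expand_keys(doc):
    """Replace short property names with the URLs from the @context.

    Simplified on purpose: a single flat object, and a @context that
    maps plain strings to URLs. Keys without a mapping are kept as-is.
    """
    context = doc.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }


# Two annotations that look identical until the @context is applied.
annotation_a = {"@context": {"rating": VENDOR1_RATING}, "rating": "3"}
annotation_b = {"@context": {"rating": VENDOR2_RATING}, "rating": "3"}

expanded_a = expand_keys(annotation_a)
expanded_b = expand_keys(annotation_b)

print(json.dumps(expanded_a))
print(json.dumps(expanded_b))
```

Once the short name is resolved, a consumer can tell the two "rating"
properties apart, and can follow the URL to find out about scale and
units.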

Similarly, I don't think you'd get much argument about the need to 
structure the location of that rating appropriately. Again using your 
'rating' example, if a system places the 'rating' in the Annotation, the 
Body, or the Target, that's expressing different things (in the 
Annotation, the rating applies to the whole thing, including the Targets 
and Bodies; in the Body, the rating applies only to what that particular 
Body contains; in the Target, the rating applies only to the 
resource indicated by that particular Target, and maybe not the 
annotation at all).
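
A rough sketch of that distinction, in Python. The "body" and "target"
key names, the example values and the rating_scopes() helper are all
assumptions for illustration, not anything taken from the Data Model.

```python
def rating_scopes(annotation):
    """Return (location, rating) pairs for every place a rating appears."""
    found = []
    # A rating on the annotation itself covers the whole thing.
    if "rating" in annotation:
        found.append(("annotation", annotation["rating"]))
    # A rating on a Body or Target covers only that Body or Target.
    for part in ("body", "target"):
        value = annotation.get(part, [])
        items = value if isinstance(value, list) else [value]
        for index, item in enumerate(items):
            if isinstance(item, dict) and "rating" in item:
                found.append((f"{part}[{index}]", item["rating"]))
    return found


example = {
    "rating": "5",
    "body": [
        {"value": "Great explanation", "rating": "4"},
        {"value": "See also chapter 2"},
    ],
    "target": {"id": "http://example.org/page1", "rating": "2"},
}

scopes = rating_scopes(example)
print(scopes)
```

The same property name yields three separate statements, each scoped
by where it sits in the structure.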

This is actually why I raised the issue of putting Motivation directly 
on the Body; it explains what role each Body plays in the whole 
Annotation (though not necessarily what the relationship of each Body is 
to each Target, for example).
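
As a sketch of what I mean (Python; the "motivation" key on each Body
is the proposal under discussion, not something the current model
defines, and bodies_by_motivation() is a made-up helper):

```python
from collections import defaultdict


def bodies_by_motivation(annotation):
    """Group Body values by the role each Body declares for itself."""
    roles = defaultdict(list)
    for body in annotation.get("body", []):
        # Bodies that declare no role are grouped under "unspecified".
        roles[body.get("motivation", "unspecified")].append(body["value"])
    return dict(roles)


annotation = {
    "target": "http://example.org/page1",
    "body": [
        {"value": "This paragraph is unclear.", "motivation": "commenting"},
        {"value": "needs-review", "motivation": "tagging"},
        {"value": "typo", "motivation": "tagging"},
    ],
}

grouped = bodies_by_motivation(annotation)
print(grouped)
```

A client can then render the comment as text and the tags as labels
without guessing. Note that this says nothing about how each Body
relates to each Target.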


These entailments are where Rob and I might differ, BTW; he said [1], 
“The role of the resource in the annotation is not a property of the 
resource, it is a relationship between the Annotation and the Resource.”

I might be misunderstanding him, but if "Resource" means a Body (in this 
case), then putting the role/motivation in the Body seems like the most 
straightforward way to describe the relationship of that Resource to the 
Annotation.


[1] https://lists.w3.org/Archives/Public/public-annotation/2015Aug/0145.html

Regards–
–Doug

On 8/18/15 11:52 AM, Stian Soiland-Reyes wrote:
> Sorry for not being engaged earlier in this important debate.
>
>
> While the pure JSON model of "just get on with it" gives great
> flexibility, it is not so convenient in the world of interoperability,
> because properties can be made out of the blue without consistent
> definitions, which leads to issues such as the directionality
> question.   This is fine in the dictatorship model, where a single
> vendor defines their own protocol and all the clients just adhere to
> it.
>
> But I didn't think OA was made for (primarily) such scenarios - but
> for a neutral, platform-independent way to talk about and exchange
> annotations about web resources.
>
>
> Some examples:
>
> Without use of namespaces you don't know if a vendor-specific property
> like { "rating": "3" }  is from vendor 1 or 2 (which might have
> different scales).
>
> With arbitrarily nested JSON objects you no longer know how they
> relate to actual things in the world, except as a JSON blob in an
> annotation. So you can't give even the most naive rendering except to
> just propagate the actual JSON.
>
>
> Now it might be that attaching arbitrary nested JSON blobs is
> something that is genuinely needed - or it could just be that JSON
> developers find it hard to adapt to a particular structure that doesn't
> match their own. JSON blobs, if needed, should be an annotation body
> of their own - as a separate HTTP resource. This would for me be the
> cleanest approach.
>
> We have used a similar approach for annotation bodies that are
> themselves RDF graphs - the body might be a HTTP resource that happens
> to be JSON-LD - but that doesn't mean it has to be contained in a
> @graph in the same HTTP resource as the annotation in JSON-LD.
>
> There's a big danger of going down the "married to serialization
> format" route here - RDF/XML had a similar way to embed arbitrary XML
> which brought along its own issues.  Rather than embedding everything
> I would try to simply separate out things that are truly separate.
>
>
> JSON-LD adds some constraints here which I had hoped would help shape
> the objects so that we have consistent semantics and namespaces. If
> you want to use say "rating" within your annotation directly, then you
> need to either add it to your own custom @context with a fully
> qualified URL, or you need to use a fully qualified URL for that
> property so we know which one it is. Ideally clicking the URL of
> rating should tell you what it is - even if it's just a wiki page.
>
> If you want to add a deeper structure, but don't want to put it as a
> separate resource, then that structure should itself also be valid
> JSON-LD. That way it would be interoperable with any OA clients and
> servers - even if they don't understand the fine details of your
> statements they can see that you say "something" about A, B and C.
>
>
> As a side note:
> There is nothing wrong with a JSON-based server that accepts/stores and
> returns verbatim non-JSON-LD blobs within a JSON structure that
> otherwise happens to be JSON-LD. To avoid any accidental invalid
> JSON-LD you can simply nest these behind a property that is
> deliberately NOT in the @context, and thus ignored. The returned
> annotation would then still be valid JSON-LD.
>
> It's just that those other JSON bits would disappear in the ether if
> you use an RDF store or alternative serialization (e.g. changed the OA
> storage backend) - so I assume nothing important should be stored in
> those blobs.
>
>
>
> On 18 August 2015 at 15:09, Jacob Jett <jjett2@illinois.edu> wrote:
>> +1 to what Ivan and Raphaël have said.
>>
>>
>> I would add a further caution that flexibility in object-property pairings
>> always comes at a high price in both interchange and interoperability. These
>> kinds of data models are always arbitrary and ad hoc (and I mean arbitrary
>> and ad hoc in the non-pejorative way). They're simply specialized to the
>> specific needs of particular domain-communities.
>>
>> Rather than looking at the model and asking what is the minimum that the
>> Javascript developer community needs, perhaps it would be more constructive
>> to ask what can the Linked Data community live without. Because we're trying
>> to build a broadly general model, that will satisfy the needs of multiple
>> communities, it's necessary to identify the "lowest common denominator," so
>> to speak, among those communities. If we place one community over another
>> then things are likely to fall apart. The rub is, because of the open world
>> assumption, the LD community needs much more structure than the Javascript
>> developers. Linked Data / SemWeb developers are not free to make the same
>> kinds of assumptions about the data and the users that other flavors of
>> developer are.
>>
>> This is essentially why we didn't model roles on the bodies to start with.
>> Linked Data folks have to pay a steep price in verbosity to represent that
>> kind of information in the model.
>>
>> Regards,
>>
>> Jacob
>>
>>
>> _____________________________________________________
>> Jacob Jett
>> Research Assistant
>> Center for Informatics Research in Science and Scholarship
>> The Graduate School of Library and Information Science
>> University of Illinois at Urbana-Champaign
>> 501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
>> (217) 244-2164
>> jjett2@illinois.edu
>>
>> On Tue, Aug 18, 2015 at 8:34 AM, Raphaël Troncy <raphael.troncy@eurecom.fr>
>> wrote:
>>>
>>> Dear all,
>>>
>>> I would give a +1 to the very good set of clarifications from Ivan; I
>>> concur with everything said in this message. I would add the
>>> following clarification note:
>>>
>>>>> * unusual, but apparently optional, "predicate" names (e.g. "hasBody")
>>>>
>>>>
>>>> That is not part of any kind of RDF standard, it is just the habits
>>>> that a particular community has (often inherited from people who defined
>>>> vocabularies, library catalogues, etc, way before even the Web existed).
>>>
>>>
>>> The RDF data model results in a directed, labeled graph. This means
>>> that the predicates are directed (in your image representation, you have a
>>> left side and a right side of the arrow). RDF does not say more than this.
>>>
>>> As Ivan pointed out, in practice, some people felt that this "direction"
>>> of the predicate should be conveyed in the predicate name, thus the pattern
>>> you encounter under the form "hasXXX" or "isXXXBy" and that apparently you
>>> found "unusual" or even perhaps "awkward".
>>>
>>> Now, let's imagine that you encounter the predicate name "broader", and
>>> more precisely, the statement "x broader y" ... do you know if x is broader
>>> than y or if this is the other way around? You may want to ask SKOS, which
>>> has an opinion on this [1]. The truth is that if you ask the developers(*),
>>> you will get 20% of the people who think this is one way, 20% who think this
>>> is the other way and 60% who just think this was a terrible predicate name
>>> since they have to systematically look at the spec [1]! I'm not saying that
>>> the hasXXX pattern should always be used, I'm just trying to explain to you
>>> where it comes from ... resolving the ambiguity that "some" predicate names
>>> naturally have by providing a cue of their direction.
>>>
>>> Best regards.
>>>
>>>    Raphaël
>>>
>>> (*) Very informal but still serious poll among the many debates that took
>>> place when implementing the SKOS specification.
>>>
>>> [1] http://www.w3.org/TR/skos-reference/#broader
>>>
>>> --
>>> Raphaël Troncy
>>> EURECOM, Campus SophiaTech
>>> Multimedia Communications Department
>>> 450 route des Chappes, 06410 Biot, France.
>>> e-mail: raphael.troncy@eurecom.fr & raphael.troncy@gmail.com
>>> Tel: +33 (0)4 - 9300 8242
>>> Fax: +33 (0)4 - 9000 8200
>>> Web: http://www.eurecom.fr/~troncy/
>>>
>>
>
>
>
Received on Tuesday, 18 August 2015 17:01:38 UTC