Re: Data Model Assumptions from Stian Soiland-Reyes on 2015-08-18 (public-annotation@w3.org from August 2015)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Tue, 18 Aug 2015 16:52:13 +0100
To: Jacob Jett <jjett2@illinois.edu>
Cc: Raphaël Troncy <raphael.troncy@eurecom.fr>, Ivan Herman <ivan@w3.org>, Doug Schepers <schepers@w3.org>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CAPRnXtkKrS4KNqPhT07tDxTUpRYOmRJt9OG0h+ECWDKCqg69Og@mail.gmail.com>
Sorry for not being engaged earlier in this important debate.


While the pure JSON model of "just get on with it" gives great
flexibility, it is not so convenient in the world of interoperability,
because properties can be made up out of the blue without consistent
definitions, which leads to issues such as the directionality
question. This is fine in the dictatorship model, where a single
vendor defines their own protocol and all the clients just adhere to
it.

But I didn't think OA was made primarily for such scenarios, but
rather as a neutral, platform-independent way to talk about and
exchange annotations about web resources.


Some examples:

Without use of namespaces you don't know if a vendor-specific property
like { "rating": "3" }  is from vendor 1 or 2 (which might have
different scales).

With arbitrarily nested JSON objects you no longer know how they
relate to actual things in the world, except as a JSON blob in an
annotation. So you can't give even the most naive rendering except to
just propagate the actual JSON.


Now it might be that attaching arbitrary nested JSON blobs is
something that is genuinely needed - or it could just be that JSON
developers find it hard to adapt to a particular structure that doesn't
match their own. JSON blobs, if needed, should be an annotation body
of their own - as a separate HTTP resource. This would for me be the
cleanest approach.

We have used a similar approach for annotation bodies that are
themselves RDF graphs - the body might be an HTTP resource that happens
to be JSON-LD - but that doesn't mean it has to be contained in a
@graph in the same HTTP resource as the annotation in JSON-LD.

There's a big danger of going down the "married to serialization
format" route here - RDF/XML had a similar way to embed arbitrary XML
which brought along its own issues.  Rather than embedding everything
I would try to simply separate out things that are truly separate.


JSON-LD adds some constraints here which I had hoped would help shape
the objects so that we have consistent semantics and namespaces. If
you want to use say "rating" within your annotation directly, then you
need to either add it to your own custom @context with a fully
qualified URL, or you need to use a fully qualified URL for that
property so we know which one it is. Ideally clicking the URL of
rating should tell you what it is - even if it's just a wiki page.
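
As a rough illustration of the point above, here is a minimal Python
sketch of how a @context tells two vendors' "rating" apart. The vendor
URLs and the expand_terms helper are hypothetical, and the helper only
handles a flat @context of term-to-URL strings; it is not the full
JSON-LD expansion algorithm.

```python
import json

# Hypothetical vendor vocabularies; these URLs are assumptions for illustration.
VENDOR1_CONTEXT = {"rating": "http://vendor1.example.com/vocab#rating"}
VENDOR2_CONTEXT = {"rating": "http://vendor2.example.org/terms/rating"}


def expand_terms(doc):
    """Replace short property names with the URLs given in the @context.

    Simplified: only a flat @context of term -> URL strings is supported.
    """
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords such as @context are not properties
        if key in context:
            expanded[context[key]] = value
        elif "://" in key:
            expanded[key] = value  # already a fully qualified URL
        # any other key has no definition and is dropped
    return expanded


a = expand_terms({"@context": VENDOR1_CONTEXT, "rating": "3"})
b = expand_terms({"@context": VENDOR2_CONTEXT, "rating": "3"})
print(json.dumps(a))
print(json.dumps(b))
```

Both inputs carry the same short key "rating", but after expansion the
two properties have different URLs, so a consumer can tell which scale
applies.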

If you want to add a deeper structure, but don't want to put it as a
separate resource, then that structure should itself also be valid
JSON-LD. That way it would be interoperable with any OA clients and
servers - even if they don't understand the fine details of your
statements they can see that you say "something" about A, B and C.
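
A small sketch of what "seeing that you say something about A, B and C"
could look like for a generic client: it walks a nested structure and
collects every @id it finds, without understanding any of the
properties. The mentioned_ids helper, the example.org identifiers and
the vendor property URLs are all hypothetical.

```python
def mentioned_ids(node):
    """Collect every @id found anywhere in a nested JSON-LD-like structure."""
    found = []
    if isinstance(node, dict):
        if "@id" in node:
            found.append(node["@id"])
        for key, value in node.items():
            if key in ("@id", "@context"):
                continue  # term definitions in @context are not resources
            found.extend(mentioned_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(mentioned_ids(item))
    return found


# Hypothetical annotation with a vendor-specific nested structure.
annotation = {
    "@id": "http://example.org/anno/1",
    "http://vendor1.example.com/vocab#review": {
        "@id": "http://example.org/review/7",
        "http://vendor1.example.com/vocab#about": [
            {"@id": "http://example.org/A"},
            {"@id": "http://example.org/B"},
        ],
    },
}

ids = mentioned_ids(annotation)
print(ids)
```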


As a side note:
There is nothing wrong with a JSON-based server that accepts, stores and
returns verbatim non-JSON-LD blobs within a JSON structure that
otherwise happens to be JSON-LD. To avoid any accidental invalid
JSON-LD you can simply nest these behind a property that is
deliberately NOT in the @context, and thus ignored. The returned
annotation would then still be valid JSON-LD.

It's just that those other JSON bits would disappear into the ether if
you use an RDF store or an alternative serialization (e.g. if you
changed the OA storage backend) - so I assume nothing important should
be stored in those blobs.
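
To illustrate the side note, here is a hedged Python sketch that lists
which top-level properties would be lost outside plain JSON. The
lost_outside_json helper and the "x_vendor_blob" property are
hypothetical. The check assumes a flat @context with no @vocab (a @vocab
would map otherwise unknown terms, so they would no longer be ignored)
and treats any key containing ":" as an IRI or compact IRI that is kept.

```python
def lost_outside_json(doc):
    """Return the top-level keys a JSON-LD processor would ignore.

    Simplified assumptions: flat @context, no @vocab, and any key
    containing ":" counts as a compact or absolute IRI and is kept.
    """
    context = doc.get("@context", {})
    lost = []
    for key in doc:
        if key.startswith("@") or key in context or ":" in key:
            continue
        lost.append(key)
    return lost


annotation = {
    "@context": {"body": "http://www.w3.org/ns/oa#hasBody"},
    "@id": "http://example.org/anno/1",
    "body": "http://example.org/body/1",
    # Deliberately not in the @context, so it is ignored as JSON-LD.
    "x_vendor_blob": {"anything": [1, 2, {"goes": "here"}]},
}

lost = lost_outside_json(annotation)
print(lost)
```

A server could run a check like this before migrating to an RDF store,
to see what it is about to drop.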



On 18 August 2015 at 15:09, Jacob Jett <jjett2@illinois.edu> wrote:
> +1 to what Ivan and Raphaël have said.
>
>
> I would add a further caution that flexibility in object-property pairings
> always comes at a high price in both interchange and interoperability. These
> kinds of data models are always arbitrary and ad hoc (and I mean arbitrary
> and ad hoc in the non-pejorative way). They're simply specialized to the
> specific needs of particular domain-communities.
>
> Rather than looking at the model and asking what is the minimum that the
> Javascript developer community needs, perhaps it would be more constructive
> to ask what can the Linked Data community live without. Because we're trying
> to build a broadly general model, that will satisfy the needs of multiple
> communities, it's necessary to identify the "lowest common denominator," so
> to speak, among those communities. If we place one community over another
> then things are likely to fall apart. The rub is, because of the open world
> assumption, the LD community needs much more structure than the Javascript
> developers. Linked Data / SemWeb developers are not free to make the same
> kinds of assumptions about the data and the users that other flavors of
> developer are.
>
> This is essentially why we didn't model roles on the bodies to start with.
> Linked Data folks have to pay a steep price in verbosity to represent that
> kind of information in the model.
>
> Regards,
>
> Jacob
>
>
> _____________________________________________________
> Jacob Jett
> Research Assistant
> Center for Informatics Research in Science and Scholarship
> The Graduate School of Library and Information Science
> University of Illinois at Urbana-Champaign
> 501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
> (217) 244-2164
> jjett2@illinois.edu
>
> On Tue, Aug 18, 2015 at 8:34 AM, Raphaël Troncy <raphael.troncy@eurecom.fr>
> wrote:
>>
>> Dear all,
>>
>> +1 for the very good set of clarifications brought by Ivan; I concur
>> with everything said in his message. I would add the following
>> clarification:
>>
>>>> * unusual, but apparently optional, "predicate" names (e.g. "hasBody")
>>>
>>>
>>> That is not part of any RDF standard, it is just the habits
>>> that a particular community has (often inherited from people who defined
>>> vocabularies, library catalogues, etc, way before even the Web existed).
>>
>>
>> The RDF data model results in a directed, labeled graph. This means
>> that the predicates are directed (in your image representation, you have a
>> left side and a right side of the arrow). RDF does not say more than this.
>>
>> As Ivan pointed out, in practice, some people felt that this "direction"
>> of the predicate should be conveyed in the predicate name, thus the pattern
>> you encounter under the form "hasXXX" or "isXXXBy" and that apparently you
>> found "unusual" or even perhaps "awkward".
>>
>> Now, let's imagine that you encounter the predicate name "broader", and
>> more precisely, the statement "x broader y" ... do you know if x is broader
>> than y or if it is the other way around? You may want to ask SKOS, which has
>> an opinion on this [1]. The truth is that if you ask the developers(*), you
>> will get 20% of the people who think it is one way, 20% who think it
>> is the other way and 60% who just think this was a terrible predicate name
>> since they have to systematically look at the spec [1]! I'm not saying that
>> the hasXXX pattern should always be used, I'm just trying to explain to you
>> where it comes from ... resolving the ambiguity that "some" predicate names
>> naturally have by providing a cue of their direction.
>>
>> Best regards.
>>
>>   Raphaël
>>
>> (*) Very informal but still serious poll among the many debates that took
>> place when implementing the SKOS specification.
>>
>> [1] http://www.w3.org/TR/skos-reference/#broader
>>
>> --
>> Raphaël Troncy
>> EURECOM, Campus SophiaTech
>> Multimedia Communications Department
>> 450 route des Chappes, 06410 Biot, France.
>> e-mail: raphael.troncy@eurecom.fr & raphael.troncy@gmail.com
>> Tel: +33 (0)4 - 9300 8242
>> Fax: +33 (0)4 - 9000 8200
>> Web: http://www.eurecom.fr/~troncy/
>>
>



-- 
Stian Soiland-Reyes, eScience Lab
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/    http://orcid.org/0000-0001-9842-9718
Received on Tuesday, 18 August 2015 15:53:10 UTC