RE: JSON-LD serialization and linked data support from Timothy Cole on 2015-08-13 (public-annotation@w3.org from August 2015)

From: Timothy Cole <t-cole3@illinois.edu>
Date: Thu, 13 Aug 2015 10:43:18 -0500
To: "'James M Snell'" <jasnell@gmail.com>, "'Ivan Herman'" <ivan@w3.org>
CC: "'Frederick Hirsch'" <w3c@fjhirsch.com>, "'W3C Public Annotation List'" <public-annotation@w3.org>, "'Rob Sanderson'" <azaroth@stanford.edu>
Message-ID: <00e201d0d5de$c70b58d0$55220a70$@illinois.edu>
James's experience on this resonates with me, and I'll remain skeptical until I see some examples showing that a custom JSON-OA serialization, assuming it retains the precision, full expressiveness and extensibility of our Data Model and supports all articulated MUST, SHOULD and MAY requirements / options, will look significantly more natural to JSON developers than a JSON-LD serialization with a well-thought-out @context. I really think Greg et al. did an amazing job with the JSON-LD spec. When it comes to customizing JSON-LD to a specific data model, the @context approach is powerful. Even though it may be worth a try, I'm not convinced we're going to do enough better to justify minting our own custom OA RDF serialization in JSON.

But we'll see. Lengthy discussions about balancing the trade-offs between uptake, interoperability and completeness are inherent to this kind of work.

-Tim Cole

-----Original Message-----
From: James M Snell [mailto:jasnell@gmail.com] 
Sent: Thursday, August 13, 2015 10:07 AM
To: Ivan Herman <ivan@w3.org>
Cc: Frederick Hirsch <w3c@fjhirsch.com>; W3C Public Annotation List <public-annotation@w3.org>; Tim Cole <t-cole3@illinois.edu>; Rob Sanderson <azaroth@stanford.edu>
Subject: Re: JSON-LD serialization and linked data support

If I can interject a few thoughts from the sidelines... I faced a similar decision with regard to Activity Streams 2.0 -- only I came at it from the opposite point of view. That is, we had a pure JSON syntax to start and moved to a Vocabulary model with a JSON-LD syntax.
One of the key goals of this move, however, has been to make sure that developers who wish to ignore the JSON-LD processing model can do so if they wish -- albeit at a cost of some features.

The short version of the story is that Activity Streams 2.0 builds on JSON-LD but requires only a subset of what JSON-LD provides. For instance, the data format *requires* JSON-LD compact form serialization, it requires use of a normative JSON-LD @context definition that ensures consistent serialization, it strongly recommends that certain JSON-LD features are avoided, and -- perhaps most importantly -- does not require that developers implement the full RDF world view in order to make sense of the data.

A similar approach can be applied here. By defining a normative JSON-LD @context and requiring compact serialization using that @context, and by limiting the JSON-LD specific features you depend on, you can place practical limits on those various JSON-LD idiosyncrasies that everyone loves to hate.
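
As a rough illustration of the approach described above (a sketch, not taken from the Activity Streams or Open Annotation specs): if the @context is fixed and compact form is required, a consumer that wants to ignore the JSON-LD processing model can check for the expected @context and then read the document as ordinary JSON. The context IRI, the property names and the read_annotation helper are all assumptions made for this example.

```python
import json

# Assumed context IRI for this sketch; the thread does not fix a normative one.
EXPECTED_CONTEXT = "http://www.w3.org/ns/oa.jsonld"

# Hypothetical compact-form annotation; the key names are illustrative only.
DOC = """
{
  "@context": "http://www.w3.org/ns/oa.jsonld",
  "@id": "http://example.org/anno1",
  "@type": "oa:Annotation",
  "hasBody": "http://example.org/body1",
  "hasTarget": "http://example.org/page1"
}
"""


def read_annotation(text):
    """Read a compact-form annotation as plain JSON, without a JSON-LD processor."""
    data = json.loads(text)
    # With a fixed context the keys always carry the same meaning, so a plain
    # JSON consumer only needs to confirm that the expected context is in use.
    if data.get("@context") != EXPECTED_CONTEXT:
        raise ValueError("unexpected @context; full JSON-LD processing would be needed")
    return {"id": data["@id"], "body": data["hasBody"], "target": data["hasTarget"]}


print(read_annotation(DOC))
```

The cost mentioned above shows up in the ValueError branch: a document that uses a different or extended context cannot be handled this way and would need a real JSON-LD processor.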

- James

On Thu, Aug 13, 2015 at 6:16 AM, Ivan Herman <ivan@w3.org> wrote:
> Frederick, I put Tim and Rob into the Cc list just to make it clear that this is not a direct answer to this mail but, rather, the three mails in this thread ([1,2]), and also Rob's separate mail[3].
>
> (Apologies if parts of what I write are obvious to some of the people 
> on the group. It may not be for others…)
>
> The annotation model is *not* in JSON-LD. Nor is it in Turtle, for that matter. It is in RDF. RDF is defined in terms of abstract concepts (IRI-s as identifiers, literals, blank nodes, triples, etc.) defined in the RDF 1.1 Concepts document[4]; that document is *serialization agnostic*. (<digress> it has been one of the biggest mistakes ever in the history of RDF that the concept and a particular serialization in XML, ie, RDF/XML, have been conflated in the story line. This has done more harm to RDF than anything else!</digress>). There are quite a number of serialization syntaxes (Turtle, JSON-LD, RDFa, N-Triples, RDF/XML, there is even a simple JSON serialization, though not as a Rec).
>
> I believe that, at this point, nobody (including Paolo) is considering moving away from the model. It is a model in RDF and, so far, it has served us well. In other words, we are firmly in the domain of Linked Data. We should get this issue off the table.
>
> RDF can be serialized. We already use two of those serializations in our document: Turtle and JSON-LD. Other people may use other serializations for OA: RDFa or (God forbid!) RDF/XML. The model is oblivious to that and we cannot even forbid that from happening.
>
> In my *personal* opinion, Semantic Web people would use Turtle, which 
> is a simple, straightforward representation of the model. But it is an 
> alien syntax to most, so we decided to push JSON to the fore. To 
> achieve that, we are looking at a particular *serialization* of RDF, 
> which is JSON-LD. We are hoping that this works for us, including 
> those among us who do not care about RDF. But JSON-LD has its 
> idiosyncrasies that some may live with, but others do not. It has the 
> advantage of being a generic RDF serialization, but it also has the 
> disadvantage of being a generic RDF serialization:-)
>
> Here comes Paolo's proposal (at least the way I understand it): let us *replace* the JSON-LD serialization with a dedicated JSON serialization of our model. Ie, we drop the -LD *from the syntax* (but that does not mean dropping Linked Data) and we may replace it with -OA to yield something like JSON-OA. What a JSON-LD processor does is to map a generic JSON-LD file to the abstract RDF model; well, we can define a processor that does the same *to a very restricted JSON syntax* that is defined for the annotation model only. There is no real interoperability issue: we drop JSON-LD, and we require JSON-OA to be the interchange format; for Linked Data aware systems there is a processor that maps this to the internal representation of RDF, whereas non-Linked Data aware systems can use that particular JSON dialect only.
>
> In fact, this is not so far off from what Rob proposed in [1]:
>
> [[[
> * Define the model to fully encapsulate all of the requirements without taking into consideration any serialization or convenience.
> * The on-the-wire bits are the JSON-LD serialization of that model. We can discuss later whether we need to require a specific crystallization or whether we can just say JSON-LD.
> * We provide implementations that take that serialization and further compact it into whatever structure is most useful, but those are non-normative. They're code that we can write to make developers' lives easier.
> ]]]
>
> But, I think:
>
> * Per point 1: we have the model, and we should not change it
> * Per point 2: we can, actually, use JSON-OA as the on-the-wire bits 
> as a serialization of that model (yeah, I know, this is a bit touchy 
> with the definition of LDP, let us see whether we can solve that)
> * Per point 3: JSON-OA *may* be the normative serialization and we 
> ditch JSON-LD altogether
>
> This approach may or may not work. Tim may be right that the proper modeling of the problem area would lead us to a certain level of complication anyway, and the whole thing may not lead to a real simplification compared to JSON-LD. In which case we declare this a dead end and we may be stuck with JSON-LD. But let us not pretend that by trying that we create more interoperability problems (we don't, because there is a plethora of RDF serializations out there already) or that we drop the Linked Data approach from our model (we don't, because we touch only a particular serialization of the model).
>
> Ivan
>
> P.S. a different remark: yes, JSON-LD is included in schema.org, ie, 
> Google think it is ready and easy for… webmasters! Not developers in 
> general…
>
>
> [1] http://www.w3.org/mid/CABevsUFyszpujiZq2qGd-wUQVvzzBgHY6K9sAKcatyjdj16PUA@mail.gmail.com
> [2] http://www.w3.org/mid/009201d0d585$696b9810$3c42c830$@illinois.edu
> [3] http://www.w3.org/mid/CABevsUGMeisPtx3xgxv1Dy52nmnUuoaRwWfi2Q10X5QJhr-0JA@mail.gmail.com
> [4] http://www.w3.org/TR/rdf11-concepts/
>
>
>> On 13 Aug 2015, at 00:15 , Frederick Hirsch <w3c@fjhirsch.com> wrote:
>>
>> On today's call the topic of serializations came up and a question 
>> seemed to be raised over whether JSON-LD should be used (perhaps I 
>> heard incorrectly)
>>
>> There are some strong reasons to continue to require JSON-LD as a mandatory serialization, the abstract argument being the value of linked data on the back end.
>>
>> A specific concrete example of the value of linked data in combination with annotations might be "CATCH: Common Annotation, Tagging, and Citation at Harvard"
>>
>> [[
>>
>> It is designed to interoperate with third-party annotation tools to 
>> aggregate and associate contextualized annotation metadata from 
>> various pedagogical and research tools with reference to persistent 
>> digital media in repositories, such as the Harvard Library DRS.
>> - See more at:
>> https://osc.hul.harvard.edu/liblab/projects/catch-common-annotation-tagging-and-citation-harvard#sthash.fr7L4qa3.dpuf
>>
>> ]]
>>
>> Do we have other concrete examples of how the linked data aspect of the Open Annotation model adds value to annotations? Pointers would be welcome.
>>
>> I'm concerned about specifying multiple serializations, as we have to be more careful about interoperability in that case. Specifically, is round-tripping between serializations without information loss a potential issue? More serializations also mean more testing.
>>
>> In a related thought, is directly embedding JSON-LD in HTML ( http://www.w3.org/TR/json-ld/#embedding-json-ld-in-html-documents ) a viable option? What is the status of browser support for this? If it is supported (or is in progress) what is the case for HTML serialization as an alternative? Would it be more productive to focus on generic support for JSON-LD in browsers rather than a specific annotation serialization?
>>
>> The fundamental issue I heard us discuss is that even with all our efforts to simplify the JSON-LD serialization, there will remain some aspects that do not appear 'natural' to JSON developers.  The next question I have is whether these aspects can be managed with suitable libraries etc.
>>
>> Thanks
>>
>> regards, Frederick
>>
>> Frederick Hirsch
>>
>> www.fjhirsch.com
>> @fjhirsch
>>
>>
>
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704
>
>
>
>
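
Ivan's description above of a processor that maps "a very restricted JSON syntax" to the abstract RDF model could be sketched roughly as follows. This is only an illustration under stated assumptions: no JSON-OA syntax is defined in the thread, so the JSON keys ("id", "body", "target"), the key-to-predicate table and the function names are hypothetical, and only IRI-valued properties are handled.

```python
import json

OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical fixed mapping from restricted JSON keys to predicate IRIs.
KEY_TO_PREDICATE = {"body": OA + "hasBody", "target": OA + "hasTarget"}


def json_oa_to_triples(text):
    """Map a restricted, annotation-only JSON object to a set of RDF triples."""
    data = json.loads(text)
    subject = data["id"]
    triples = {(subject, RDF_TYPE, OA + "Annotation")}
    # Only keys listed in the fixed table are mapped; anything else is ignored.
    for key, predicate in KEY_TO_PREDICATE.items():
        if key in data:
            triples.add((subject, predicate, data[key]))
    return triples


def to_ntriples(triples):
    """Write the triples as N-Triples; every term is an IRI in this sketch."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in sorted(triples))


EXAMPLE = json.dumps({
    "id": "http://example.org/anno1",
    "body": "http://example.org/body1",
    "target": "http://example.org/page1",
})

print(to_ntriples(json_oa_to_triples(EXAMPLE)))
```

A Linked Data aware system would work with the resulting triples, while a system that does not care about RDF would simply read the JSON keys directly.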
Received on Thursday, 13 August 2015 15:47:57 UTC