Re: JSON-LD serialization and linked data support from Ivan Herman on 2015-08-13 (public-annotation@w3.org from August 2015)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 13 Aug 2015 18:08:03 +0200
To: Robert Sanderson <azaroth42@gmail.com>
Cc: Frederick Hirsch <w3c@fjhirsch.com>, W3C Public Annotation List <public-annotation@w3.org>, Tim Cole <t-cole3@illinois.edu>
Message-Id: <9CA0FB38-8923-4FAA-BA6F-21F1327056A3@w3.org>
> On 13 Aug 2015, at 17:28 , Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> 
> 
> On Thu, Aug 13, 2015 at 6:16 AM, Ivan Herman <ivan@w3.org> wrote:
> The annotation model is *not* in JSON-LD. Nor is it in Turtle, for that matter. It is in RDF.
> I believe that, at this point, nobody (including Paolo) is considering moving away from the model. It is a model in RDF and, so far, it has served us well. In other words, we are firmly in the domain of Linked Data. We should get this issue off the table.
> 
> +1
> 
> In my *personal* opinion, Semantic Web people would use Turtle, which is a simple, straightforward representation of the model. But it is an alien syntax to most, so we decided to push JSON to the fore. To achieve that, we are looking at a particular *serialization* of RDF, which is JSON-LD. We are hoping that this works for us, including those among us who do not care about RDF. But JSON-LD has its idiosyncrasies that some may live with, but others do not. It has the advantage of being a generic RDF serialization, but it also has the disadvantage of being a generic RDF serialization:-)
> 
> It has the advantage of being an existing, established, implemented standard too.
> 
> 
> Here comes Paolo's proposal (at least the way I understand it): let us *replace* the JSON-LD serialization with a dedicated JSON serialization of our model. I.e., we drop the -LD *from the syntax* (but that does not mean dropping Linked Data) and we may replace it with -OA to yield something like JSON-OA.
> 
> And this is where I disagree.  If there is no way in which a linked data system can get the information in a way that it understands, we are not doing linked data.  There's just a graph based model behind some JSON.

Why? I do not understand this reasoning at all. Annotation clients/servers can very well export their data into Turtle or, for that matter, JSON-LD. The JSON-OA (let us call it that) is for use among annotation clients and servers.

> 
> To annotate the linked data requirements, in this scenario...
> 
> 1. Use URIs as names for things
> 
> -- Minimally. There's a URI name for a blob of JSON, and we would (of course) have to use URIs for the targets, but nothing else would likely get a URI (right?)
> 
> 2. Use HTTP URIs so that people can look up those names.
> 
> -- Equally minimally.
> 
> 3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
> 
> -- Not at all.
> 
> 4. Include links to other URIs, so that they can discover more things.
> 
> -- Equally minimally.
> 
> So I contend that if the fact that we're doing linked data or not is not in question, then we should actually do linked data and get the benefits AND the costs of that.

I do not understand any of these arguments either. It is in our hands to define the JSON-OA. We can define the syntax as we want, including the generation of URIs, etc. In essence, we can define it in such a way that it reflects the RDF model or, more exactly, that it can be mapped into perfectly fine RDF.
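
A minimal sketch of what such a mapping could look like, assuming a restricted syntax that allows only the keys `body`, `target`, `source` and `role`. The function name, the key table, the `http://example.org/ns#role` property and the choice to resolve role values against the OA namespace are all assumptions made for illustration; none of them is defined by the group.

```python
import json
from itertools import count

OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical key-to-predicate table for the restricted syntax.
PREDICATES = {
    "body": OA + "hasBody",
    "target": OA + "hasTarget",
    "source": OA + "hasSource",
    "role": "http://example.org/ns#role",  # placeholder, not a real OA term
}

EXAMPLE = """
{
  "body": {
    "source": "http://youtube.com/v/abc123",
    "role": "commenting"
  },
  "target": "http://cnn.com/"
}
"""


def json_oa_to_ntriples(text, annotation_uri):
    """Map one restricted JSON annotation to a list of N-Triples lines."""
    data = json.loads(text)
    blank_ids = count()
    lines = [f"<{annotation_uri}> <{RDF_TYPE}> <{OA}Annotation> ."]

    def describe(subject, obj):
        for key, value in obj.items():
            if key not in PREDICATES:
                raise ValueError(f"key not allowed in this syntax: {key}")
            if isinstance(value, dict):
                # A nested object becomes a fresh blank node.
                node = f"_:b{next(blank_ids)}"
            elif key == "role":
                # Assumption: role names are terms in the OA namespace.
                node = f"<{OA}{value}>"
            else:
                # Any other string is taken to be a URI.
                node = f"<{value}>"
            lines.append(f"{subject} <{PREDICATES[key]}> {node} .")
            if isinstance(value, dict):
                describe(node, value)

    describe(f"<{annotation_uri}>", data)
    return lines


if __name__ == "__main__":
    print("\n".join(json_oa_to_ntriples(EXAMPLE, "http://example.org/anno1")))
```

Because the set of keys is closed, this needs no context processing, expansion or compaction; the price is that anything outside the table is rejected rather than mapped.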

> 
> 
> What a JSON-LD processor does is to map a generic JSON-LD file to the abstract RDF model; well, we can define a processor that does the same *to a very restricted JSON syntax* that is defined for the annotation model only.
> 
> Is someone willing to do this? For reference, it means designing, writing, implementing and testing something like:  http://www.w3.org/TR/json-ld-api/
> 

That *is* the question indeed, as Paolo and others (me included) said on the call. And maybe it is more work than we can get. The only thing I am saying is: we should not throw away this idea as a principle, although we may be forced to do so out of practicalities.

> 
> There is no real interoperability issue: we drop JSON-LD, and we require JSON-OA to be the interchange format; for Linked Data aware systems there is a processor that maps this to the internal representation of RDF, whereas non-Linked Data aware systems can use that particular JSON dialect only.
> In fact, this is not so far off from what Rob proposed in [1]:
> 
> I think it's the complete opposite, actually. You are proposing to have an abstract model with no practical interoperability with linked data systems,

No I am not.

> the core of which is a new JSON format,

No I am not. The core is the model, with a specific serialization thereof.

> that any linked data system needs to revert back to something it can deal with via special code.

No I am not. Annotation systems are able to export their data into RDF to their heart's content.

> I'm proposing that we instead double down on linked data, and provide developers with a means to map to -whatever structure they want to use- and stick firmly within existing standards.
> 
> [[[
> * Define the model to fully encapsulate all of the requirements without taking into consideration any serialization or convenience.
> * The on-the-wire bits are the JSON-LD serialization of that model. We can discuss later whether we need to require a specific crystallization or whether we can just say JSON-LD.
> * We provide implementations that take that serialization and further compact it into whatever structure is most useful, but those are non-normative. They're code that we can write to make developers' lives easier.
> ]]]
> 
> But, I think:
> * Per point 1: we have the model, and we should not change it
> 
> We have to change it, because it currently doesn't allow for bodies (or targets) to have individual roles.
> And while we're changing it, we should consider how else it can be improved by reusing the new feature.

But that is a different matter. We may have to change it if we go for multiple bodies. But what I am saying is that the JSON-OA does not require us to change the model.

> 
> * Per point 2: we can, actually, use JSON-OA as the on-the-wire bits, as a serialization of that model (yeah, I know, this is a bit touchy with the definition of LDP, let us see whether we can solve that)
> 
> "A bit touchy" seems rather an understatement.  As they're just JSON blobs, all Annotations would be NonRdfSources. So, again, no linked data.
> 

LDP is not Linked Data. Let us not mix things. We are defining a protocol *based* on LDP.

> * Per point 3: JSON-OA *may* be the normative serialization and we ditch JSON-LD altogether
> 
> -1 for all the reasons previously laid out.
> 
> 
> 
> This approach may or may not work. Tim may be right that the proper modeling of the problem area would lead us to a certain level of complication anyway, and the whole thing may not lead to a real simplification compared to JSON-LD.
> 
> I am with Tim on this one.  We would invent something very very similar to JSON-LD and something very very similar to LDP to solve ... what?
> 
> Let's put this in perspective:
> 
> * One person (Doug) has expressed his concern for the proposed models for per-body roles because he feels that the JSON-LD serialization comes out more complicated than if you ignored the RDF model and associated the role directly with the resource.

I do not think it is fair to put everything on Doug. JSON-LD and the RDF model are difficult to swallow for people who are not used to them. Other groups are fighting with the same issue.

> * The response to that one person's concern is to propose that we throw out existing standards and invent our own custom syntax and protocol, essentially resetting all of the past year's work to zero.

I think you are overreacting. The model does not change at all. Actually (still trying to explore the idea) a JSON-OA may not look all that different from the current JSON-LD syntax.

> 
> And again, the initially disliked serialization (that led to offset roles) is:
> 
> {
>   "body": {
>     "source": "http://youtube.com/v/abc123",
>     "role": "commenting"
>   },
>   "target": "http://cnn.com/"
> }
> 
> Compared to the non RDF compatible, but liked (as far as I can tell) serialization of:
> 
> {
>   "body": {
>     "id": "http://youtube.com/v/abc123",
>     "role": "commenting"
>   },
>   "target": "http://cnn.com/"
> }
> 
> Clearly if a developer can do the preferred bottom one, they can do the technically correct top one because the only difference is using something other than id / @id for the URI.
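
A minimal sketch of why the two forms differ once they are read as a graph, assuming that `id` names the body node itself while an object without `id` becomes an anonymous node. The helper names, the `_:body` label and the `http://example.org/ns#role` property are hypothetical and used only for illustration.

```python
OA = "http://www.w3.org/ns/oa#"
BODY = OA + "hasBody"
SOURCE = OA + "hasSource"
ROLE = "http://example.org/ns#role"  # hypothetical role property

ANNOTATION = "http://example.org/anno1"  # hypothetical annotation URI
WITH_SOURCE = {"source": "http://youtube.com/v/abc123", "role": "commenting"}
WITH_ID = {"id": "http://youtube.com/v/abc123", "role": "commenting"}


def body_triples(annotation, body):
    """Return (subject, predicate, object) triples for one body object."""
    # "id" names the node; without it the node is anonymous.
    node = body.get("id", "_:body")
    triples = [(annotation, BODY, node), (node, ROLE, body["role"])]
    if "source" in body:
        # The external resource is one step away from the node with the role.
        triples.append((node, SOURCE, body["source"]))
    return triples


def role_subject(triples):
    """Return the node that the role statement is about."""
    return next(s for s, p, _ in triples if p == ROLE)


if __name__ == "__main__":
    print(role_subject(body_triples(ANNOTATION, WITH_SOURCE)))
    print(role_subject(body_triples(ANNOTATION, WITH_ID)))
```

Under these assumptions the `source` form attaches the role to a node local to the annotation, whereas the `id` form attaches it to the video URI itself, which is the "role directly on the resource" reading mentioned earlier in the thread.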
> 
> In which case we declare this a dead end and we may be stuck with JSON-LD. But let us not pretend that by trying this we create more interoperability problems (we don't, because there is a plethora of RDF serializations out there already) or that we drop the Linked Data approach from our model (we don't, because we touch only a particular serialization of the model).
> 
> Sure, we wouldn't be dropping it from the *model*, we'd be dropping it from *practice*.

I do not believe that would be the case.

To make my position clear: I believe Paolo came up with an idea that deserves attention and should not be dismissed out of hand. I cannot say at this point whether it will be successful and/or doable; I also acknowledge that it is not an optimal solution and that it would be much better if JSON-LD *were* acceptable to all. But the recent stalemate (and some of the experiences in other groups) shows that this is not easy and we may be stuck.

Ivan

> 
> Rob
> 
> --
> Rob Sanderson
> Information Standards Advocate
> Digital Library Systems and Services
> Stanford, CA 94305


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Thursday, 13 August 2015 16:08:25 UTC