Re: JSON-LD serialization and linked data support from Robert Sanderson on 2015-08-13 (public-annotation@w3.org from August 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Thu, 13 Aug 2015 08:28:20 -0700
To: Ivan Herman <ivan@w3.org>
Cc: Frederick Hirsch <w3c@fjhirsch.com>, W3C Public Annotation List <public-annotation@w3.org>, Tim Cole <t-cole3@illinois.edu>
Message-ID: <CABevsUGAef8Vk8wV03Fbwe_jfdifbF0JV1B0MAmH=27M9=7-Jw@mail.gmail.com>
On Thu, Aug 13, 2015 at 6:16 AM, Ivan Herman <ivan@w3.org> wrote:

> The annotation model is *not* in JSON-LD. Nor is it in Turtle, for that
> matter. It is in RDF.
> I believe that, at this point, nobody (including Paolo) is considering
> moving away from the model. It is a model in RDF and, so far, it has served
> us well. In other words, we are firmly in the domain of Linked Data. We
> should get this issue off the table.
>

+1


> In my *personal* opinion, Semantic Web people would use Turtle, which is a
> simple, straightforward representation of the model. But it is an alien
> syntax to most, so we decided to push JSON to the fore. To achieve that, we
> are looking at a particular *serialization* of RDF, which is JSON-LD. We
> are hoping that this works for us, including those among us who do not care
> about RDF. But JSON-LD has its idiosyncrasies that some may live with, but
> others do not. It has the advantage of being a generic RDF serialization,
> but it also has the disadvantage of being a generic RDF serialization:-)
>

It has the advantage of being an existing, established, implemented
standard too.


> Here comes Paolo's proposal (at least the way I understand it): let us
> *replace* the JSON-LD serialization with a dedicated JSON serialization of
> our model. Ie, we drop the -LD *from the syntax* (but that does not mean
> dropping Linked Data) and we may replace it with -OA to yield something
> like JSON-OA.


And this is where I disagree.  If there is no way in which a linked data
system can get the information in a way that it understands, we are not
doing linked data.  There's just a graph based model behind some JSON.

To annotate the linked data requirements, in this scenario...

1. Use URIs as names for things

-- Minimally. There's a URI name for a blob of JSON, and we would (of
course) have to use URIs for the targets, but nothing else would likely get
a URI (right?)

2. Use HTTP URIs so that people can look up those names.

-- Equally minimally.

3. When someone looks up a URI, provide useful information, using the
standards (RDF*, SPARQL)

-- Not at all.

4. Include links to other URIs, so that they can discover more things.

-- Equally minimally.

So I contend that if whether we're doing linked data is not in question,
then we should actually do linked data and get the benefits AND the costs
of that.



> What a JSON-LD processor does is to map a generic JSON-LD file to the
> abstract RDF model; well, we can define a processor that does the same *to
> a very restricted JSON syntax* that is defined for the annotation model
> only.


Is someone willing to do this? For reference, it means designing, writing,
implementing and testing something like:  http://www.w3.org/TR/json-ld-api/



> There is no real interoperability issue: we drop JSON-LD, and we require
> JSON-OA to be the interchange format; for Linked Data aware systems there
> is a processor that maps this to the internal representation of RDF, whereas
> non-Linked Data aware systems can use that particular JSON dialect only.
> In fact, this is not so far off from what Rob proposed in [1]:
>

I think it's the complete opposite, actually. You are proposing to have an
abstract model with no practical interoperability with linked data systems,
the core of which is a new JSON format, that any linked data system needs
to revert back to something it can deal with via special code.   I'm
proposing that we instead double down on linked data, and provide
developers with a means to map to -whatever structure they want to use- and
stick firmly within existing standards.


> [[[
> * Define the model to fully encapsulate all of the requirements without
> taking into consideration any serialization or convenience.
> * The on-the-wire bits are the JSON-LD serialization of that model. We can
> discuss later whether we need to require a specific crystallization or
> whether we can just say JSON-LD.
> * We provide implementations that take that serialization and further
> compact it into whatever structure is most useful, but those are
> non-normative. They're code that we can write to make developers' lives
> easier.
> ]]]
>
> But, I think:
> * Per point 1: we have the model, and we should not change it
>

We have to change it, because it currently doesn't allow for bodies (or
targets) to have individual roles.
And while we're changing it, we should consider how else it can be improved
by reusing the new feature.

> * Per point 2: we can, actually, use JSON-OA as the on-the-wire bits as a
> serialization of that model (yeah, I know, this is a bit touchy with the
> definition of LDP, let us see whether we can solve that)
>

"A bit touchy" seems rather an understatement.  As they're just JSON blobs,
all Annotations would be NonRdfSources. So, again, no linked data.

> * Per point 3: JSON-OA *may* be the normative serialization and we ditch
> JSON-LD altogether
>

-1 for all the reasons previously laid out.




> This approach may or may not work. Tim may be right that the proper
> modeling of the problem area would lead us to a certain level of
> complication anyway, and the whole thing may not lead to a real
> simplification compared to JSON-LD.


I am with Tim on this one.  We would invent something very very similar to
JSON-LD and something very very similar to LDP to solve ... what?

Let's put this in perspective:

* One person (Doug) has expressed his concern for the proposed models for
per-body roles because he feels that the JSON-LD serialization comes out
more complicated than if you ignored the RDF model and associated the role
directly with the resource.
* The response to that one person's concern is to propose that we throw out
existing standards and invent our own custom syntax and protocol,
essentially resetting all of the past year's work to zero.

And again, the initially disliked serialization (that led to offset roles)
is:

{
  "body": {
    "source": "http://youtube.com/v/abc123",
    "role": "commenting"
  },
  "target": "http://cnn.com/"
}

Compared to the non-RDF-compatible, but liked (as far as I can tell)
serialization of:

{
  "body": {
    "id": "http://youtube.com/v/abc123",
    "role": "commenting"
  },
  "target": "http://cnn.com/"
}

Clearly if a developer can do the preferred bottom one, they can do the
technically correct top one because the only difference is using something
other than id / @id for the URI.
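
To make the size of that difference concrete, here is a minimal Python
sketch of converting the bottom form into the top one. It is illustrative
only, not part of any spec or proposed API: the function name
id_form_to_source_form is made up for this example, and it assumes a single
object-valued body as in the two snippets above.

```python
import json


def id_form_to_source_form(annotation: dict) -> dict:
    """Return a copy of the annotation with the body's "id" key renamed
    to "source". Hypothetical helper; all other keys are left untouched."""
    result = dict(annotation)  # shallow copy so the input is not modified
    body = result.get("body")
    if isinstance(body, dict) and "id" in body:
        # Rebuild the body, swapping only the key that names the URI.
        result["body"] = {
            ("source" if key == "id" else key): value
            for key, value in body.items()
        }
    return result


# The "liked" serialization from above.
liked = {
    "body": {"id": "http://youtube.com/v/abc123", "role": "commenting"},
    "target": "http://cnn.com/",
}

print(json.dumps(id_form_to_source_form(liked), indent=2))
```

The whole conversion is a single key rename on the body; nothing else in
the structure changes.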

> In which case we declare this a dead end and we may be stuck with JSON-LD.
> But let us not pretend that by trying to do that we create more
> interoperability problems (we don't, because there is a plethora of RDF
> serializations out there already) or that we drop Linked Data approach from
> our model (we don't because we touch only a particular serialization of the
> model).
>

Sure, we wouldn't be dropping it from the *model*, we'd be dropping it from
*practice*.

Rob

-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Thursday, 13 August 2015 15:28:48 UTC