Re: JSON-LD should be an RDF syntax from Manu Sporny on 2013-04-09 (public-rdf-comments@w3.org from April 2013)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Tue, 09 Apr 2013 09:59:35 -0400
To: David Booth <david@dbooth.org>
CC: public-rdf-comments@w3.org
Message-ID: <51641EC7.8020605@digitalbazaar.com>
Apologies for the delayed response, David. I've been slammed; responses
to your responses below...

On 03/26/2013 04:22 PM, David Booth wrote:
>> In JSON-LD graph names can be IRIs or blank nodes whereas in RDF
>> graph names have to be IRIs.
> 
> Blank nodes already cause more grief than any other RDF feature.  I
> do not think it makes sense to promote or condone the expansion of
> their use.  Where are the compelling use cases for this?

In JSON, there are implicit blank nodes everywhere, both as predicates,
and as subject identifiers. So, the compelling use case is interpreting
JSON as RDF. The other major compelling use case is to not require JSON
developers to change their workflow by forcing them to give everything
an IRI identifier.
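
One way to read the "implicit blank nodes" point, as a rough sketch only:
a plain nested JSON object has no identifiers, so each object without an
"@id" has to be given a blank node label before it can be written as
triples. The function name to_triples and the "_:bN" labels are made up
for this illustration, keys are left as plain strings rather than mapped
to IRIs, and arrays are not handled. This is not the algorithm from the
JSON-LD spec.

```python
from itertools import count


def to_triples(node, labels=None, out=None):
    """Flatten a nested dict into (subject, key, value) triples.

    Objects without an "@id" get a fresh blank node label (hypothetical
    "_:bN" scheme). Returns (subject, triples).
    """
    labels = count() if labels is None else labels
    out = [] if out is None else out
    subject = node.get("@id") or f"_:b{next(labels)}"
    for key, value in node.items():
        if key == "@id":
            continue
        if isinstance(value, dict):
            # Nested object: recurse and link to its (possibly blank) subject.
            value, _ = to_triples(value, labels, out)
        out.append((subject, key, value))
    return subject, out


# Ordinary JSON with no identifiers anywhere.
doc = {"name": "Alice", "knows": {"name": "Bob"}}
subject, triples = to_triples(doc)
for t in triples:
    print(t)
```

Neither object in doc carries an identifier, yet both end up as subjects;
that is the workflow the paragraph above argues JSON developers should
not have to change.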

You may not know this, but I took your exact position when designing
JSON-LD. The first version of JSON-LD did not support blank nodes and I
took a pretty hard-line stance because of the complexity that they
introduce. However, experience showed us that this was the wrong
position to take. There are many cases that we've hit during the
development of JSON-LD where it became pretty obvious that having blank
nodes would simplify the markup for the vast majority of Web developers.

>> In JSON-LD properties can be IRIs or blank nodes whereas in RDF 
>> properties (predicates) have to be IRIs.
> 
> Ditto.  Blank nodes already cause more grief than any other RDF
> feature. I do not think it makes sense to promote or condone the
> expansion of their use.  Where are the compelling use cases for
> this?

See above.

>> In JSON-LD lists are part of the data model whereas in RDF they are
>> part of a vocabulary, namely [RDF-SCHEMA].
> 
> That would make JSON-LD *not* be a superset of RDF.  While I agree
> with the goal of making lists easier to use in RDF -- I think that
> would be great -- I think it is important not to deviate from the RDF
> model.

JSON-LD is a super-set of RDF. You can still express lists as rdf:first /
rdf:rest statements in JSON-LD. When you convert JSON-LD native lists to
RDF, the rdf:first / rdf:rest pattern is used.
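
A minimal sketch of what that conversion produces, assuming list items
are already plain strings. The function name list_to_triples and the
"_:lN" blank node labels are invented for the example; only the
rdf:first, rdf:rest and rdf:nil IRIs come from the RDF vocabulary.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_to_triples(items, prefix="_:l"):
    """Return (head, triples) for an rdf:first / rdf:rest chain.

    An empty list is represented by rdf:nil with no triples.
    """
    if not items:
        return RDF + "nil", []
    # One blank node per list cell (hypothetical labelling scheme).
    nodes = [f"{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


head, triples = list_to_triples(["a", "b"])
for t in triples:
    print(t)
```

A two-item native list becomes four triples: each cell points at its
item with rdf:first and at the next cell (or rdf:nil) with rdf:rest.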

>> The JSON-LD CG felt that these features were compelling enough to
>> keep them in the specification in the hopes that RDF will
>> eventually align with the data model. We tried to do this in a way
>> that was acceptable to the RDF community.
> 
> But failed "due to the colorful variety of opinions on the matter"?
> How can you have it both ways, except by acknowledging that this
> splinters the community?

Both the JSON-LD CG and the RDF WG have agreed on a compromise. If there
is agreement on the path forward, we are not splintering the community.

>> The other reason that we define a data model in JSON-LD is to make
>> it easier for developers to pick up on Linked Data concepts without
>> having to climb the very steep learning curve brought about by
>> having to read the myriad of RDF specifications.
> 
> That sounds fine and useful *provided* that JSON-LD is consistent
> with RDF.  At present it isn't.

That's an all-or-nothing strategy. We have opted for a strategy of
logical compromise and consensus. The consensus has led to a very simple
to understand data model (JSON-LD) that is a gateway into RDF for Web
developers. It solves a long-standing problem in the RDF community.

>> What we did find consensus around was to allow JSON-LD to deviate
>> in very specific ways in an attempt to gain some implementation
>> insight as to whether or not these extensions to RDF were worth
>> pursuing in RDF 2.0.
> 
> Field experience can and should be obtained by *vendor* *extensions*
> -- not by standardizing N competing RDF-like languages (even if N==2)
> and letting those standards fight it out in the marketplace.  I do
> not believe that it would be in the best interest of the RDF
> community or the W3C to fracture the market by standardizing multiple
> competing RDF-like languages.

RDF isn't a language, it's a data model. As for the languages, RDF
already has a variety of syntaxes that have different data models -
Microdata, N3, JSON-LD. That ship has already sailed; the important
thing is to make sure that these extensions are created and defined in a
way that can be folded back into the RDF data model if successful.

For example, the native lists datatype in TURTLE and JSON-LD is now a
strong indicator that the RDF 2.0 data model should probably have a
native lists type.

>> JSON-LD is about a JSON serialization for Linked Data. Linked Data 
>> typically asks that IRIs are dereferenceable so that more
>> information can be gleaned from the identifiers. The spec doesn't,
>> however, require that all IRIs used in JSON-LD are dereferencable.
> 
> Apologies, I was not clear.  The reason I said that a JSON
> serialization of RDF should not require IRIs to be dereferenceable --
> even as a "SHOULD" requirement -- is because I am distinguishing
> between a serialization of RDF *in* *general* -- not specifically for
> Linked Data -- and a serialization of RDF that is intended
> specifically for Linked Data.  As my original comment goes on to say,
> I think it is important to cleanly layer one spec on another, and
> there are *many* non-LD RDF uses that would benefit from a JSON
> serialization of RDF.

I don't see how the JSON-LD specification prevents those uses.

>> We couldn't use JSON-RDF because a variation on the name was
>> already taken:
> 
> Sorry for being unclear.  My point was not so much about the name,
> but about the concept of defining a JSON serialization of RDF *in*
> *general* -- not just for Linked Data -- and then defining an LD
> version on top of that.

We had tried this approach at one point, but the "Linked Data" spec
ended up being so small that we just folded it back into JSON-LD. There
was no reason to have a tiny 10 page spec that just modified the
underlying "JSON-RDF" mechanism by effectively re-writing portions of
the spec. It would confuse Web developers and create much bouncing about
between the JSON-RDF and JSON-LD specs.

>> What prevents these applications from using JSON-LD?
> 
> If I have an RDF application, and I want it to accept a JSON 
> serialization of RDF, what must I tell my customers?  "It accepts 
> JSON-LD *except* that it does not support the following
> non-standard-RDF features, ... blah blah blah ... and furthermore the
> application does *not* expect all of your IRIs to be dereferenceable,
> because this is merely an RDF application -- not a Linked Data
> application -- but the W3C did not define a JSON serialization of
> RDF, so we had to use JSON-LD instead."   Fail.

That is an especially atrocious way to communicate with your customers. :)

You don't have to tell your customers any of this. You just tell them
that you accept JSON-LD and if you see a blank node in the graph or
predicate position, you either use a database that supports that, or you
skolemize if not.
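
As an illustration of the skolemize option, here is a rough sketch that
assumes quads are (subject, predicate, object, graph) tuples of plain
strings, with None for the default graph and blank nodes written as
"_:label". The function name, the example.com base and the tuple layout
are all assumptions for this example; the ".well-known/genid/" path
follows the convention suggested in the RDF 1.1 Concepts drafts. A blank
node found in the predicate or graph position is replaced everywhere it
occurs, so its identity is preserved.

```python
def is_blank(term):
    """True for blank node labels written as '_:label'."""
    return isinstance(term, str) and term.startswith("_:")


def skolemize(quads, base="https://example.com/.well-known/genid/"):
    """Replace blank nodes used as predicates or graph names with IRIs.

    Blank nodes that only appear as subjects or objects are left alone,
    since plain RDF already allows those.
    """
    offending = {t for _, p, _, g in quads for t in (p, g) if is_blank(t)}

    def fix(term):
        return base + term[2:] if term in offending else term

    return [tuple(fix(t) for t in quad) for quad in quads]


quads = [
    ("_:g1", "http://example.com/p", "o", None),
    ("http://example.com/s", "_:b0", "_:x", "_:g1"),
]
for q in skolemize(quads):
    print(q)
```

Here "_:g1" is rewritten in both the subject and the graph position,
"_:b0" is rewritten as a predicate, and "_:x" is untouched.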

> It would be far better if I could simply say: "the app accepts
> JSON-RDF" (where I'm using the term JSON-RDF to mean a JSON
> serialization of RDF, but I don't really care if it is called
> "JSON-RDF").  And it would be so easy for the working group to
> instead simply define a JSON serialization *of* *RDF*, and then
> define a Linked Data serialization on *top* of that.  This would
> provide a clean layering, a clean separation of concerns: plenty of
> upside, and almost no downside.

I disagree, we tried this and it ended up being a terrible approach.

>> We had explored this idea very early in the JSON-LD days and came
>> to the conclusion that JSON developers don't work with their data
>> in this way. That is, for the vast majority of the in-the-wild JSON
>> markup we looked at, JSON developers did not use any sort of
>> triple-based mechanism to work with their data. Rather, they used
>> JSON objects - simple key-value pairs to work with their data. This
>> design paradigm was the one that was used for JSON-LD because it
>> was the one that developers were (and still are) using when they
>> use JSON.
> 
> The fact that developers don't use triples is completely irrelevant.

If you think this, you are missing one of the core insights that
JSON-LD builds upon.

> Developers are free to use any internal data representation they
> want when they use RDF -- including hash tables, objects, whatever.

Sure, but most of them just want something that works with their
language of choice. JSON-LD just works with their language of choice.

A common mistake that many smart developers and designers make is
assuming that, because it's fairly obvious to them what a proper data
representation should look like, it's going to be just as obvious to
most Web developers. The reality is that most Web developers just want
something that works and don't want to think about the solution too
deeply because they have a thousand other things related to their
application that they have to think about.

JSON-LD is about reducing the cognitive load placed on developers that
want to build a Linked Data application (and effectively use RDF).
Saying that developers are "free to use any internal data representation
they want" ignores the fact that it places an unnecessary cognitive
load on them when the choice doesn't need to be made by developers in
the vast majority of cases.

> The LDP working group can perfectly well define JSON-LD **in terms
> of** a JSON serialization of RDF.  The difference between the two:
> The JSON serialization of RDF would be just that -- a serialization
> of RDF, just as Turtle or NTriples are serializations of RDF.

But TURTLE has its own native list type, doesn't it? So, it doesn't
even fit the definition you use earlier in this e-mail. What about N3,
which has Formulae and Literal subjects?

> Whereas JSON-LD would place further restrictions on that
> serialization to specifically support the needs of the Linked Data
> Platform, such as saying that every IRI SHOULD be de-referenceable to
> information about the identified resource.

If an application wants to break that suggestion, it can. That's why
it's a SHOULD and not a MUST.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny, G+: +Manu Sporny)
Founder/CEO - Digital Bazaar, Inc.
blog: Meritora - Web payments commercial launch
http://blog.meritora.com/launch/
Received on Tuesday, 9 April 2013 14:00:00 UTC