Re: Official response to RDF-ISSUE-132: JSON-LD/RDF Alignment

Hi Manu,

As promised, here is a more complete explanation of why the official 
response to my comments is not yet acceptable.  If I can somehow work 
directly with members of the working group to speed progress and craft 
specific changes that would satisfactorily resolve these issues I would 
be happy to do so.  They are important, and I am very willing to put in 
effort to help.

On 05/21/2013 02:19 PM, Manu Sporny wrote:
> Hi David,
>
> This is an official response to RDF-ISSUE-132: JSON-LD/RDF Alignment,
> which is being tracked here:
>
> https://www.w3.org/2011/rdf-wg/track/issues/132
>
> The group apologizes for missing your specific change requests until you
> prompted us to respond to them on May 21st, 2013. It was a fairly long
> thread, and we thought we had addressed all of your issues before you
> prompted us to respond to the specific change requests that you had
> outlined previously.
>
> Your general concerns revolve around the confusion that may result if
> JSON-LD does not more closely align itself with RDF. You had 5 major
> points that you raised. The JSON-LD group spent the majority of the call
> today discussing each of these points in detail:
>
> http://json-ld.org/minutes/2013-05-21/#topic-1
>
> Please read the entire conversation linked to above

Done.

> before going on to
> read the following responses, which are the official position of the group:
>
>> 1. Insert "based on RDF" to the definition of Linked Data, as
>> explained above.
>
> There was a very long series of conversations about whether or not RDF
> should be mentioned in the definition of Linked Data. There were a
> number of people arguing both for and against. These discussions
> happened over the course of 3 years, with the group deciding against
> including references to RDF in the Linked Data definition. Here are just
> a few of the discussions:
>
> http://json-ld.org/minutes/2011-07-04/#topic-3
> http://json-ld.org/minutes/2011-07-26/#topic-2
> http://json-ld.org/minutes/2012-11-20/#resolution-6
> http://json-ld.org/minutes/2012-11-20/#resolution-5
> http://json-ld.org/minutes/2012-12-04/#resolution-3
> http://json-ld.org/minutes/2012-12-11/#resolution-2
>
> Hopefully it is clear that the decision to leave "based on RDF" out of
> the Linked Data definition was thoroughly and carefully considered. In
> the end, the group decided not to tie RDF and Linked Data together
> because it would be conflating a data publishing concept (Linked Data)
> with an abstract data model (RDF).
>
> In the end, the group decided against tightly coupling Linked Data and
> RDF because:
>
> 1. It would conflate two different concepts.

It is extremely misleading to suggest that tightly coupling Linked Data 
and RDF "conflates" two different concepts, when the fact is that Linked 
Data -- in the established sense of the term -- is *based* on RDF.

It is clear from reading the JSON-LD group's discussion log
http://json-ld.org/minutes/2011-07-04/#topic-3
that the group wanted to avoid reference to RDF, and hence -- exceeding 
its authority -- the group invented a new definition for "Linked Data" 
to suit this purpose.  Some individuals even appear to have convinced 
themselves that this new definition is the *real* definition of the 
term!  It is not.

The term "Linked Data" has a well-established meaning within the Semantic
Web community.  The JSON-LD group would be *misleading* the public by
stating or implying that Linked Data is not necessarily based on RDF.

As I pointed out to Kingsley a few weeks ago:
http://lists.w3.org/Archives/Public/public-rdf-comments/2013Apr/0086.html
[[
>   - Of the top 10 hits from a google search for "Linked Data",
> **every one of them stated or implied that Linked Data is based on RDF.**
>
>   - Of the top 10 sites listed in a google search for '"Linked Data"
> is', **every one of them stated or implied that Linked Data is based on
> RDF.**
>
>   - Of the top 10 sites listed in a google search for '"Linked Data"
> definition', **every one of them stated or implied that Linked Data is
> based on RDF.**
>
> How much evidence do you need?  Shall we check the top 100 hits?  Or the
> top 1000 hits?  Shall we try other search engines?   If you search hard
> enough you might find a tiny fraction that supports your claim.  But the
> vast majority of the evidence does not.
>
> The vast majority of the evidence indicates that in established usage,
> the term "Linked Data" implies the use of RDF.  If you wish to propose a
> new definition that is contrary to this established usage, you are
> obviously free to do so.  But please do *not* make the patently false
> claim that your proposed new definition reflects accepted usage.  It
> very clearly does NOT.
]]

Furthermore, the official charter for the W3C Linked Data Platform 
Working Group states explicitly that: "RDF is the basis for Linked Data 
and the Semantic Web".
http://www.w3.org/2012/ldp/charter

If certain members of the JSON-LD group wish to re-architect Linked Data 
and the Semantic Web to be based on JSON instead of RDF, they are free 
to make that *proposal* on their own time, but that is *not* how Linked 
Data and the Semantic Web are currently architected, and that is not 
what the RDF working group was chartered to do.  The working group was 
chartered to "Define and standardize a JSON Syntax for RDF . . . an RDF 
serialization":
http://www.w3.org/2011/01/rdf-wg-charter

Why does the definition of "Linked Data" matter so much?  Messaging
matters!  It can have a huge real-life impact.  (Colossal recent example
in politics: the messaging that President Bush used to justify starting
the Iraq war, which has ended up costing trillions of dollars and the
lives of over 100,000 civilians!)

The coining of the term "Linked Data" by TimBL was the single most 
important advance in messaging in the entire history of the Semantic 
Web.  One of the biggest problems the Semantic Web had was the term 
"Semantic Web" itself, because: (a) it is intimidating and confusing; 
and (b) it is misleading, because people wrongly associate it with the 
semantics of natural language processing.  It has been difficult over 
the years to get the messaging simple and clear -- and the ugliness of 
RDF/XML certainly didn't help -- and the term "Linked Data" helps 
substantially.

If the JSON-LD spec were to adopt a definition of "Linked Data" that 
differs in such a critical way from the established meaning of this 
term, it would be misleading the public and would create confusion in 
the community.

To be clear, the current resolution of this point is NOT satisfactory.

A simple and neutral way to resolve this problem would be to just quote 
TimBL's original definition of the term.  This is what other documents 
have done, and would not require endless wordsmithing debates.  I 
suggest doing that and linking to TimBL's original Linked Data document.
(Credit: thanks to Arnaud Le Hors for making this suggestion while we
were talking at SemTech.)

> 2. It is the group's experience that Web developers have an aversion to
> RDF as a complex technology due to RDF/XML and other technologies that
> do not represent the current RDF world. It doesn't matter if these
> aversions are based on reality - the aversion exists, so we try to
> downplay RDF as much as possible in the JSON-LD spec.

I agree with the goal of keeping it simple for Web developers, but I 
think the downplaying has gone to the point of hiding it, and that is 
harmful.  If developers' view of RDF is going to change, they need to 
know that it *is* RDF that they are using when they use JSON-LD.  If 
they see how easy it is to use JSON-LD, it will stand on its own merits, 
even if it does say "RDF inside".  To my mind, the goal should not be to
*hide* the fact that JSON-LD is RDF, but to make JSON-LD 100% usable by
those who do not wish to learn anything *else* about RDF -- i.e.,
anything beyond what they learn in the JSON-LD spec.

> 3. There is no technical problem that is solved by referencing RDF in
> the definition of Linked Data.

No, but as explained above, it is a very important messaging issue.

> 4. If we were to add RDF to the definition of Linked Data, there would
> just be another set of objections to the inclusion of RDF in the
> definition of Linked Data.

Then those objections should be addressed head-on anyway, because the 
term "Linked Data" has an important and well-established meaning in the 
community, and that includes the fact that Linked Data is RDF. Otherwise 
those who wish to divorce Linked Data from RDF will be misleading the 
public when they talk about "Linked Data" and mean something else, or 
they talk about "conflating" Linked Data with RDF, when in fact Linked 
Data *is* RDF.

>
>> 2. Define a *normative* bi-directional mapping of a JSON profile to
>> and from the RDF abstract syntax, so that the JSON profile *is* a
>> serialization of RDF, and is fully grounded in the RDF data model and
>> semantics.
>
> We already do this here:
>
> http://www.w3.org/TR/json-ld/#transformation-from-json-ld-to-rdf

No, it doesn't.  That section explicitly says: "This section is 
non-normative".

In discussing this at SemTech with Gregg Kellogg -- BTW, thanks for 
getting together Gregg! -- he told me that it is the working group's 
intent that JSON-LD be a *normative* serialization of RDF.  So if that's 
the case, it sounds like this may be an editorial issue: the document 
needs to be much clearer about how the relationship is normative.  At 
present, it is not at all clear that it is normative.  Maybe what's 
needed would be something as simple as "This section is non-normative. 
However, JSON-LD is a normative serialization of RDF: the normative 
relationship between the JSON-LD syntax and the RDF model is defined in 
section @@@@."

>
> and here:
>
> http://www.w3.org/TR/json-ld/#relationship-to-rdf
>
> There have been arguments in the past to specify an additional subset of
> JSON-LD that is a direct mapping to the RDF Abstract Syntax, but no one
> has provided a compelling technical reason to do so.
>
> Additionally, creating two profiles of JSON-LD could have worse
> consequences than the ones you outline in your e-mails. For example,
> some implementers may only implement the subset and not the full version
> of JSON-LD, which would create a really bad interoperability problem.

The issue here was about alignment: JSON-LD says that URIs "SHOULD"
(RFC2119) be dereferenceable, while RDF makes no such requirement.
However, in discussing this at SemTech with Gregg Kellogg and Arnaud,
Arnaud suggested that instead of defining a profile of JSON-LD that
drops the "SHOULD", it would be better to encourage RDF to *include*
such a "SHOULD".  I think that's a great idea.

To be clear, I withdraw my suggestion that a separate profile of JSON-LD 
be defined.

>
> The extra features in JSON-LD, such as blank nodes as graph names, are a
> requirement for the Web Payments work as well as the RDF digital
> signatures work. So, we can't remove them without causing damage to
> those initiatives.
>
> If an author wants to use a version of JSON-LD that is fully grounded in
> the RDF data model, they should not use the JSON-LD features listed in
> those bullet points, or they should convert their non-RDF data to
> something that RDF can understand (more on this below).

It sounds like my suggestion to use skolemized URIs to avoid that 
problem was not understood, so I'll try to clarify.  The point is to 
ensure that JSON-LD is fully grounded in the RDF model, so that JSON-LD 
truly *is* an RDF serialization.  To achieve that in cases where a naive 
mapping from the JSON-LD syntax to the RDF model would produce a blank 
node in a position that RDF does not allow, I was suggesting that the 
JSON-LD spec *normatively* state that skolemized URIs MUST be used in 
the RDF model in those places.  I'll explain more below about how those 
skolem URIs are chosen.
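
To make the suggestion concrete, here is a rough sketch in Python.  It is
illustrative only, not proposed spec text: the quad representation
(4-tuples of strings, blank nodes labelled with "_:"), the function
name, and the example.org authority are all my own assumptions.  It
finds the blank nodes that occur as a predicate or graph name, and
replaces every occurrence of those blank nodes with a skolem URI, so
that references to the same node stay connected.

```python
import uuid

# Hypothetical authority; any domain the converter controls would do.
GENID_BASE = "http://example.org/.well-known/genid/"


def is_blank(term):
    """True if the term is a blank node label such as '_:b0'."""
    return isinstance(term, str) and term.startswith("_:")


def skolemize_illegal_positions(quads):
    """Replace blank nodes used as predicates or graph names.

    Each quad is a (subject, predicate, object, graph) tuple of strings;
    graph is None for the default graph.
    """
    quads = list(quads)
    # Blank nodes that appear in predicate or graph-name position.
    offending = {t for (_, p, _, g) in quads for t in (p, g) if is_blank(t)}
    # One skolem URI per offending blank node, reused for every occurrence.
    minted = {label: GENID_BASE + uuid.uuid4().hex for label in offending}
    return [tuple(minted.get(t, t) for t in quad) for quad in quads]
```

Blank nodes that appear only as subjects or objects are left alone,
since RDF already allows them there.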

>
>> 3. Use skolemized URIs in the normative mapping to prevent mapping
>> JSON syntax to illegal RDF.
>
> This is already stated as an option in a normative section:
>
> http://www.w3.org/TR/json-ld/#relationship-to-rdf
>
> We do not make this mandatory because there are several other legitimate
> ways to convert blank nodes to something that RDF can interpret. For
> example: 1) normalizing and getting a hash identifier for the subgraph
> attached to the blank node property or blank graph, 2) creating a
> counter-based solution for blank node naming, 3) minting a new global
> IRI for the blank node, 4) transforming to a data model that allows
> blank node properties and blank graphs, etc. There is no single correct
> approach.

That doesn't matter, as my next comment below will explain.

>
> Additionally, skolemization will not work unless all systems exchanging
> the skolem IRIs do so in a standard way, and there is currently no
> standard way of skolemizing.

*Some* standardization is needed, but it does not need to specify all
details about the skolem URI.  All that's really important is that: (a)
a skolem URI somehow be created; and (b) such skolem URIs can be
reliably *recognized* as skolem URIs.  Beyond that, it doesn't matter if
some implementations use counters, some use hashing techniques and
some use other techniques.

The RDF 1.1 spec does specify how skolem URIs can be created so that
they can be reliably recognized -- by use of the .well-known
convention -- so the JSON-LD spec could reference this technique in
specifying how blank nodes are avoided in places where the RDF model
does not allow them.
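
As a rough illustration of points (a) and (b), here is a Python sketch
in which two implementations mint skolem URIs differently (one with a
counter, one with a hash) and a single check recognizes both.  The
function names and the example.org authority are assumptions of mine;
the only part taken from the RDF 1.1 draft is the "/.well-known/genid/"
path convention.

```python
import hashlib
import itertools
from urllib.parse import urlsplit

GENID_PATH = "/.well-known/genid/"
_counter = itertools.count(1)


def mint_with_counter(authority):
    """Counter-based minting: unique within this process only."""
    return f"http://{authority}{GENID_PATH}{next(_counter)}"


def mint_with_hash(authority, canonical_form):
    """Hash-based minting from some canonical form of the node's subgraph."""
    digest = hashlib.sha256(canonical_form.encode("utf-8")).hexdigest()
    return f"http://{authority}{GENID_PATH}{digest}"


def is_skolem_uri(uri):
    """Recognize a skolem URI by its path, however it was minted."""
    parts = urlsplit(uri)
    return (
        parts.scheme in ("http", "https")
        and parts.path.startswith(GENID_PATH)
        and len(parts.path) > len(GENID_PATH)
    )
```

A consumer only needs is_skolem_uri(); it does not need to know which
minting strategy the producer used.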

To be clear, the current resolution of this point is NOT satisfactory. 
Please further consider the suggestion of requiring skolem URIs in those 
circumstances.

> We do have an RDF Graph normalization
> specification, and it does seem to work:
>
> http://json-ld.org/spec/latest/rdf-graph-normalization/
>
> However, that spec isn't a REC and it would take quite a bit of work to
> get it there.
>
> So, the group believes that the spec does as much as it can to implement
> what you suggest above, while understanding that there are caveats to
> the approach you propose above.
>
>> 4. Make editorial changes to avoid implying that JSON-LD is not RDF.
>>   For example, change "Convert to RDF" to "Convert to Turtle" or
>> perhaps "Convert to RDF Abstract Syntax".
>
> The group agrees with changing the title of the section to "Convert to
> RDF Abstract Syntax".

Thank you.  But there are also several other places where the wording
implies that JSON-LD is not RDF.  Appendix C is rife with them.  I
started to list them, but immediately ran into the problem that this
section -- particularly the part before C.1 -- needs to be rewritten
once JSON-LD is actually a normative serialization of RDF, and is fully
grounded in the RDF model.

The whole discussion of the JSON-LD data model as distinct from the RDF 
data model also suggests that JSON-LD is not RDF.  It is also confusing 
to define a JSON-LD data model in addition to a JSON-LD document's RDF 
data model.  This confusion will be eliminated by making JSON-LD a 
normative serialization of RDF, fully grounded in the RDF model.

>
>> 5. Define normative names for, and clearly differentiate between, the
>> JSON serialization of RDF and JSON-LD, such that JSON-LD *is* a JSON
>> serialization of RDF, with additional constraints for Linked Data
>> (such as URIs use "http:" prefix, etc.). They do not necessarily have
>> to be defined in two separate documents. They could be defined in a
>> single document called "JSON-RDF and JSON-LD", for example.
>
> The group had discussed this a long time ago and re-hashed the
> discussion today during the call. We do not believe that having two
> different serializations of JSON-LD will help interoperability. Rather,
> the group believes that this is not the correct approach because:
>
> 1) Having two profiles will confuse developers, especially since the
> delta between the two profiles boils down to four bullet points. Why
> have two profiles where one of the profiles differs by around 4-5 sentences?
> 2) Having two profiles may make implementers only want to implement one
> profile or the other, which would open the door to interoperability
> problems.
> 3) There is no technical issue that is resolved by having two different
> serializations named JSON-RDF and JSON-LD.

As explained above, I now agree with defining a single JSON-LD language 
that is a serialization of RDF, so I withdraw my suggestion #5.

>
>> 6. Some small editorial fixes:
>>
>> "Since JSON-LD is 100% compatible with JSON" would be better phrased
>> as "Since JSON-LD is a restricted form of JSON", because saying that
>> JSON-LD is compatible with JSON wrongly suggests that JSON-LD is
>> *not* JSON, when in fact it is.
>
> We agree with your assessment, but would like to avoid the use of
> "restricted form of JSON". What we would like to convey is something to
> this effect:
>
> JSON-LD enables you to express portions of your JSON documents as Linked
> Data.
>
> We'll leave it up to the editors to select the proper language, but do
> agree with your general point.

Okay.

>
>> s/secrete agents/secret agents/
>
> Fixed:
>
> https://github.com/json-ld/json-ld.org/commit/0e5d5d122e702fa217c734cbe4a82a5b03401f5b
>
> The group understands that several of these responses are probably not
> what you would prefer. However, I hope that we've made it clear that we
> did discuss your concerns in depth, have previously discussed most of
> the issues that you raised, and our position has not changed due to more
> effective counter-points to the arguments that you made.
>
> Please respond to this e-mail and let us know if this response is
> acceptable to you. If the responses are not acceptable and you want to
> pursue the issues further, please make specific suggestions about the
> types of changes that you would like to see made to the specs.
>
> -- manu

Thanks,
David Booth

Received on Thursday, 6 June 2013 23:55:39 UTC