
Re: Official response to RDF-ISSUE-132: JSON-LD/RDF Alignment

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 08 Jun 2013 12:19:38 -0400
Message-ID: <51B3599A.1040108@openlinksw.com>
To: public-rdf-comments@w3.org
On 6/8/13 8:28 AM, Markus Lanthaler wrote:
> On Friday, June 07, 2013 1:55 AM, David Booth wrote:
>> On 05/21/2013 02:19 PM, Manu Sporny wrote:
>>> Hopefully it is clear that the decision to leave "based on RDF" out of
>>> the Linked Data definition was thoroughly and carefully considered. In
>>> the end, the group decided not to tie RDF and Linked Data together
>>> because it would be conflating a data publishing concept (Linked Data)
>>> with an abstract data model (RDF).
>>> In the end, the group decided against tightly coupling Linked Data and
>>> RDF because:
>>> 1. It would conflate two different concepts.
>> It is extremely misleading to suggest that tightly coupling Linked Data
>> and RDF "conflates" two different concepts, when the fact is that Linked
>> Data -- in the established sense of the term -- is *based* on RDF.
> IMHO, RDF != Linked Data. Nothing in RDF requires IRIs to be dereferenceable
> - but of course you can use RDF to express Linked Data if you somehow
> communicate out-of-band that those *identifiers* in there are also locators.
>> It is clear from reading the JSON-LD group's discussion log
>> http://json-ld.org/minutes/2011-07-04/#topic-3
>> that the group wanted to avoid reference to RDF, and hence -- exceeding
>> its authority -- the group invented a new definition for "Linked Data"
>> to suit this purpose.  Some individuals even appear to have convinced
>> themselves that this new definition is the *real* definition of the
>> term!  It is not.
> I think we are talking about the text in the (non-normative) introduction.
> Let me quote it:
>     Linked Data is a technique for creating a network of inter-connected
>     data across different documents and Web sites. In general, Linked Data
>     has four properties:
>       1) it uses IRIs to name things;
>       2) it uses HTTP IRIs for those names;
>       3) the name IRIs, when dereferenced, provide more information about
>          the thing; and
>       4) the data expresses links to data on other Web sites.
>     These properties allow data published on the
>     Web to work much like Web pages do today. One can start at one piece
>     of Linked Data, and follow the links to other pieces of data that are
>     hosted on different sites across the Web.
> Here are TBL's (current, 2009) Linked Data principles:
>    1) Use URIs as names for things
>    2) Use HTTP URIs so that people can look up those names.
>    3) When someone looks up a URI, provide useful information,
>       using the standards (RDF*, SPARQL)
>    4) Include links to other URIs. so that they can discover more things
> So I think all we are arguing about here is the "(RDF*, SPARQL)" in (3),
> right?
> Now let's look at the original 2006 version of the Linked Data
> principles as Kingsley proposed:
>    1) Use URIs as names for things
>    2) Use HTTP URIs so that people can look up those names.
>    3) When someone looks up a URI, provide useful information.
>    4) Include links to other URIs. so that they can discover more things.
> http://web.archive.org/web/20061201121454/http://www.w3.org/DesignIssues/LinkedData.html
> Surprisingly exactly that "(RDF*, SPARQL)" remark was missing when the term
> was coined. We can continue forever to argue about whether it is needed or
> not. We can also argue whether it is possible to "provide useful
> information" by using an abstract data model, i.e., RDF. When you
> dereference a URI, you'll get back a representation which is in a concrete
> syntax. So, it would be more correct to say
>    3) When someone looks up a URI, provide useful information,
>       using a standard format which can be interpreted as RDF
> Would that add any value given that you can interpret (convert) every format
> to RDF? I doubt it. This group (myself included) is convinced that doing so
> would scare off a large portion of the target group, i.e., average web
> developers.
>> The term "Linked Data" has a well-established meaning within semantic
>> web community.  The JSON-LD group would be *misleading* the public by
>> stating or implying that Linked Data is not necessarily based on RDF.
> RDF is an abstract data model whereas Linked Data is a concept. Everything
> can be expressed in RDF. In that paragraph we are describing the concept for
> people not familiar with it. Clearly, the "semantic web community" is not
> the intended target group of that paragraph. Not with the best will in the
> world can I see how this is "misleading the public".
>> If certain members of the JSON-LD group wish to re-architect Linked Data
>> and the Semantic Web to be based on JSON instead of RDF, they are free
>> to make that *proposal* on their own time, but that is *not* how Linked
>> Data and the Semantic Web are currently architected, and that is not
>> what the RDF working group was chartered to do.  The working group was
>> chartered to "Define and standardize a JSON Syntax for RDF . . . an RDF
>> serialization":
>> http://www.w3.org/2011/01/rdf-wg-charter
> We are definitely not trying to re-architect Linked Data. What we are trying
> to do is to bring it to the masses. I think we agree that the semantic web
> community you are talking about has a miserable track record for doing so.
> Mentioning RDF in the first paragraph of the spec would certainly not help
> us in that regard. Unfortunately, a lot of people simply stop listening when
> they hear the three magic letters R D F.
> We just try to explain the underlying principles to them in simple terms to get
> them interested and motivated enough to read the rest. The end of the spec
> makes JSON-LD's relationship to RDF crystal clear (IMO at least) and
> contains a whole lot of examples for people from the semantic web community
> already familiar with e.g. Turtle or RDFa. Those people don't need to read
> the introduction, they know the basics already.
>> Why does the definition of "Linked Data" matter so much?  Messaging
>> matters!  It can have a huge real-life impact.  (Colossal recent example
>> in politics: The messaging that President Bush used to justify starting
>> the Iraq war, which has ended up costing trillions of dollars and over
>> 100,000 civilians killed!)
> I'll just ignore this remark.
>> The coining of the term "Linked Data" by TimBL was the single most
>> important advance in messaging in the entire history of the Semantic
>> Web.  One of the biggest problems the Semantic Web had was the term
>> "Semantic Web" itself, because: (a) it is intimidating and confusing;
>> and (b) it is misleading, because people wrongly associate it with the
>> semantics of natural language processing.  It has been difficult over
>> the years to get the messaging simple and clear -- and the ugliness of
>> RDF/XML certainly didn't help -- and the term "Linked Data" helps
>> substantially.
> Exactly, it is a marketing term. Dan wrote an excellent piece on that so I
> won't rehash it here:
> http://lists.w3.org/Archives/Public/www-archive/2012Oct/0119.html
> The truth is that people strongly associate RDF with RDF/XML. In fact, it is
> difficult to have conversations without conflating RDF the data model and
> its serialization formats.
>> If the JSON-LD spec were to adopt a definition of "Linked Data" that
>> differs in such a critical way from the established meaning of this
>> term, it would be misleading the public and would create confusion in
>> the community.
> Sorry, but I just can't see how it is doing that.
>> To be clear, the current resolution of this point is NOT satisfactory.
>> A simple and neutral way to resolve this problem would be to just quote
>> TimBL's original definition of the term.  This is what other documents
>> have done, and would not require endless wordsmithing debates.  I
>> suggest doing that and linking to TimBL's original Linked Data document.
>> (Credit: thanks to Arnaud Le Hors for making this suggestion while we
>> were talking at SemTech.)
> I suppose by "TimBL's original definition" you don't really mean the
> original 2006 version, right?
>>> 2. It is the group's experience that Web developers have an aversion to
>>> RDF as a complex technology due to RDF/XML and other technologies that
>>> do not represent the current RDF world. It doesn't matter if these
>>> aversions are based on reality - the aversion exists, so we try to
>>> downplay RDF as much as possible in the JSON-LD spec.
>> I agree with the goal of keeping it simple for Web developers, but I
>> think the downplaying has gone to the point of hiding it, and that is
>> harmful.  If developers' view of RDF is going to change, they need to
>> know that it *is* RDF that they are using when they use JSON-LD. If
> And you think that developers won't understand that from the last paragraph
> in the introduction
>     Developers that require any of the facilities listed above or
>     need to serialize an RDF graph or dataset [RDF11-CONCEPTS] in a
>     JSON-based syntax will find JSON-LD of interest.
> or any of the sections specifically discussing the relationship of JSON-LD
> and RDF?
>> they see how easy it is to use JSON-LD, it will stand on its own merits,
>> even if it does say "RDF inside".  To my mind, the goal should not be to
>> *hide* the fact that JSON-LD is RDF, but to make JSON-LD 100%
>> usable by those who do not wish to learn anything *else* about RDF --
>> i.e., anything beyond what they learn in the JSON-LD spec.
> That's exactly what we try to do. By "hiding RDF" we try to increase the
> chances that they "see how easy it is to use JSON-LD" instead of ceasing to
> read after the first paragraph because of an aversion to RDF.
>>> 3. There is no technical problem that is solved by referencing RDF in
>>> the definition of Linked Data.
>> No, but as explained above, it is a very important messaging issue.
>>> 4. If we were to add RDF to the definition of Linked Data, there would
>>> just be another set of objections to the inclusion of RDF in the
>>> definition of Linked Data.
>> Then those objections should be addressed head-on anyway, because the
>> term "Linked Data" has an important and well-established meaning in the
>> community, and that includes the fact that Linked Data is RDF. Otherwise
>> those who wish to divorce Linked Data from RDF will be misleading the
>> public when they talk about "Linked Data" and mean something else, or
>> they talk about "conflating" Linked Data with RDF, when in fact Linked
>> Data *is* RDF.
> Linked Data != RDF. RDF without a single dereferenceable IRI is still valid
> RDF but it certainly isn't Linked Data by any means.
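The distinction drawn above can be illustrated with a minimal Python sketch. It assumes triples are plain 3-tuples of IRI strings (no literals or blank nodes), and the helper names `is_http_iri` and `all_names_are_http` are hypothetical. It only checks the second Linked Data principle (HTTP IRIs as names); whether those IRIs actually dereference to useful information cannot be decided from the graph alone.

```python
from urllib.parse import urlsplit


def is_http_iri(term: str) -> bool:
    """Return True if the term is an IRI with an http or https scheme."""
    return urlsplit(term).scheme in ("http", "https")


def all_names_are_http(triples) -> bool:
    """Necessary (not sufficient) condition for Linked Data:
    every name in the graph is an HTTP IRI."""
    return all(is_http_iri(term) for triple in triples for term in triple)


# Both graphs are structurally fine as RDF; only the second uses names
# that could, in principle, be looked up over HTTP.
urn_graph = [
    ("urn:isbn:0451450523", "urn:example:author", "urn:example:person:1"),
]
http_graph = [
    ("http://example.com/book/1",
     "http://example.com/vocab#author",
     "http://example.com/person/1"),
]
```

Here `all_names_are_http(urn_graph)` is false while `all_names_are_http(http_graph)` is true, even though both are equally valid as RDF.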
>>>> 2. Define a *normative* bi-directional mapping of a JSON profile to
>>>> and from the RDF abstract syntax, so that the JSON profile *is* a
>>>> serialization of RDF, and is fully grounded in the RDF data model and
>>>> semantics.
>>> We already do this here:
>>> http://www.w3.org/TR/json-ld/#transformation-from-json-ld-to-rdf
>> No, it doesn't.  That section explicitly says: "This section is
>> non-normative".
> JSON-LD consists of two specs, the syntax spec and the algorithms and API
> spec. The normative transformation to RDF can be found here:
> http://www.w3.org/TR/json-ld-api/#convert-to-rdf-algorithm
>>> There have been arguments in the past to specify an additional subset of
>>> JSON-LD that is a direct mapping to the RDF Abstract Syntax, but no one
>>> has provided a compelling technical reason to do so.
>>> Additionally, creating two profiles of JSON-LD could have worse
>>> consequences than the ones you outline in your e-mails. For example,
>>> some implementers may only implement the subset and not the full version
>>> of JSON-LD, which would create a really bad interoperability problem.
>> The issue here was about alignment: JSON-LD saying that URIs "SHOULD"
>> (RFC2119) be dereferenceable, while RDF makes no such requirement.
> Exactly, and still you argue that Linked Data === RDF.
>> However, in discussing this at SemTech with Greg Kellogg and Arnaud,
>> Arnaud suggested that instead of defining a profile of JSON-LD that
>> drops the "SHOULD", it would be better to encourage RDF to *include*
>> such a "SHOULD".  I think that's a great idea.
> I tried that already, see: https://www.w3.org/2011/rdf-wg/track/issues/103
>> To be clear, I withdraw my suggestion that a separate profile of JSON-LD
>> be defined.
> Great.
>>> The extra features in JSON-LD, such as blank nodes as graph names, are a
>>> requirement for the Web Payments work as well as the RDF digital
>>> signatures work. So, we can't remove them without causing damage to
>>> those initiatives.
>>> If an author wants to use a version of JSON-LD that is fully grounded in
>>> the RDF data model, they should not use the JSON-LD features listed in
>>> those bullet points, or they should convert their non-RDF data to
>>> something that RDF can understand (more on this below).
>> It sounds like my suggestion to use skolemized URIs to avoid that
>> problem was not understood, so I'll try to clarify.  The point is to
>> ensure that JSON-LD is fully grounded in the RDF model, so that JSON-LD
>> truly *is* an RDF serialization.  To achieve that in cases where a naive
>> mapping from the JSON-LD syntax to the RDF model would produce a blank
>> node in a position that RDF does not allow, I was suggesting that the
>> JSON-LD spec *normatively* state that skolemized URIs MUST be used in
>> the RDF model in those places.  I'll explain more below about how those
>> skolem URIs are chosen.
>>>> 3. Use skolemized URIs in the normative mapping to prevent mapping
>>>> JSON syntax to illegal RDF.
>>> This is already stated as an option in a normative section:
>>> http://www.w3.org/TR/json-ld/#relationship-to-rdf
>>> We do not make this mandatory because there are several other legitimate
>>> ways to convert blank nodes to something that RDF can interpret. For
>>> example: 1) normalizing and getting a hash identifier for the subgraph
>>> attached to the blank node property or blank graph, 2) creating a
>>> counter-based solution for blank node naming, 3) minting a new global
>>> IRI for the blank node, 4) transforming to a data model that allows
>>> blank node properties and blank graphs, etc. There is no single correct
>>> approach.
>> That doesn't matter, as my next comment below will explain.
>>> Additionally, skolemization will not work unless all systems exchanging
>>> the skolem IRIs do so in a standard way, and there is currently no
>>> standard way of skolemizing.
>> *Some* standardization is needed, but it does not need to specify all
>> details about the skolem URI.  All that's really important is that: (a)
>> a skolem URI somehow be created; and (b) such skolem URIs can be
>> reliably *recognized* as skolem URIs.  Beyond that, it doesn't matter if
>> some implementations use counters, some use hashing techniques and
>> some use other techniques.
> Another important aspect that's missing in your list above is that skolem
> IRIs have to be unique and that's exactly what makes it so difficult to
> create them in a distributed system.
>> The RDF 1.1 spec does specify how skolem URIs can be created so that
>> they can be reliably recognized -- by use of the .well-known
>> convention -- so the JSON-LD spec could reference this technique in
>> specifying how blank nodes are avoided in places where the RDF model
>> does not allow them.
> The general problem I have with this approach is that a skolem IRI just
> allows you to work around a limitation in a serialization format or a
> concrete implementation. In RDF, the data model, it is still a blank node
> (that's why it is important to be able to reliably recognize them). And
> again we conflate the data model and the serialization formats.
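For readers unfamiliar with the convention under discussion, here is a rough Python sketch of minting and recognizing Skolem IRIs using the `/.well-known/genid/` path pattern described in RDF 1.1 Concepts. The function names, the `example.com` authority, the use of random UUIDs for uniqueness, and the representation of blank nodes as strings starting with `_:` are all assumptions made for illustration; neither spec mandates any of them.

```python
import uuid
from urllib.parse import urlsplit

# Path prefix of the registered well-known URI suffix "genid".
GENID_PATH = "/.well-known/genid/"


def mint_skolem_iri(authority: str) -> str:
    """Mint a fresh Skolem IRI under the given authority.

    A random UUID is one way to get uniqueness without coordination;
    counters or hashes would serve equally well.
    """
    return f"https://{authority}{GENID_PATH}{uuid.uuid4().hex}"


def is_skolem_iri(iri: str) -> bool:
    """Recognize IRIs that follow the .well-known/genid pattern."""
    parts = urlsplit(iri)
    return (
        parts.scheme in ("http", "https")
        and parts.path.startswith(GENID_PATH)
        and len(parts.path) > len(GENID_PATH)
    )


def skolemize(triples, authority: str):
    """Replace each blank node label ("_:x") with one Skolem IRI,
    reusing the same IRI for repeated occurrences of a label."""
    mapping = {}

    def replace(term: str) -> str:
        if term.startswith("_:"):
            if term not in mapping:
                mapping[term] = mint_skolem_iri(authority)
            return mapping[term]
        return term

    return [tuple(replace(t) for t in triple) for triple in triples]
```

Because the pattern is recognizable, a consumer can map such IRIs back to blank nodes, which is the property both sides of this exchange rely on.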
>> To be clear, the current resolution of this point is NOT satisfactory.
>> Please further consider the suggestion of requiring skolem URIs in those
>> circumstances.
> Are skolem IRIs blank nodes or not according to you? If so, how does it help
> to require them?
>>>> 4. Make editorial changes to avoid implying that JSON-LD is not RDF.
>>>>    For example, change "Convert to RDF" to "Convert to Turtle" or
>>>> perhaps "Convert to RDF Abstract Syntax".
>>> The group agrees with changing the title of the section to "Convert to
>>> RDF Abstract Syntax".
>> Thank you.  But there are several other places also where the wording
>> implies that JSON-LD is not RDF.  Appendix C is rife with them. I
>> started to list them, but immediately ran into the problem that this
>> section -- particularly the part before C.1 -- needs to be rewritten
>> once JSON-LD is actually a normative serialization of RDF, and is fully
>> grounded in the RDF model.
> JSON-LD is not RDF. Neither is Turtle. Both are serialization formats with a
> mapping to RDF, an abstract data model.
>> The whole discussion of the JSON-LD data model as distinct from the RDF
>> data model also suggests that JSON-LD is not RDF.  It is also confusing
>> to define a JSON-LD data model in addition to a JSON-LD document's RDF
>> data model.  This confusion will be eliminated by making JSON-LD a
>> normative serialization of RDF, fully grounded in the RDF model.
> We added the Data Model section since the RDF WG asked us to do so. I don't
> see compelling reasons to revisit that decision.
> Cheers,
> Markus
> --
> Markus Lanthaler
> @markuslanthaler


Thank You !!!!!



Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Saturday, 8 June 2013 16:20:01 UTC
