Re: JSON-LD terminology from Gregg Kellogg on 2012-08-27 (public-rdf-wg@w3.org from August 2012)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Mon, 27 Aug 2012 13:25:46 -0400
To: Richard Cyganiak <richard@cyganiak.de>
CC: Andy Seaborne <andy.seaborne@epimorphics.com>, "public-rdf-wg@w3.org" <public-rdf-wg@w3.org>
Message-ID: <9385A959-E0A1-4608-86BE-A73CCB1F4F0B@greggkellogg.net>
On Aug 27, 2012, at 4:52 AM, Richard Cyganiak <richard@cyganiak.de> wrote:

> On 27 Aug 2012, at 11:42, Andy Seaborne wrote:
>> I share your concern about the dangers of (what is perceived to be) a different data model.  I'm not so sure that it is as bad as you paint it though.
>> 
>> Looking at:
>> 
>> http://www.w3.org/TR/json-ld-syntax/#linking-data
>> 
>> how would you change or tighten up the language in that section?
> 
> As I said in my email, I can see three reasonable approaches:
> 
> 1. Conceptualise JSON-LD as an extension of JSON (that is, it allows identification and linking of JSON objects, and extends the JSON data model with some richer constructs. Vanilla JSON is a tree of JSON objects and JSON arrays. JSON-LD is a graph of JSON objects.)

I believe this is the general approach we've taken with JSON-LD so far. The problems (IMO) have come when we need to cross over and talk about Linked Data issues.

> 2. Pick terms from RDF Concepts (that is, speak of “nodes” instead of “subjects and objects”, “literals” instead of “objects that are labelled with text”, “predicates” instead of “properties”)

Yes, I agree. Talking about subjects and objects becomes pretty confusing. In the issue, I've suggested changing _subject definition_ and _subject reference_ to _node definition_ and _node reference_, where a node is defined using a JSON object.

Getting into the difference between nodes, subjects, resources, properties and predicates is best left to RDF Concepts.

> 3. Describe JSON-LD in terms of the claims it makes about the universe: Subject and objects are “resources”/“entities”/“things”. JSON-LD expresses “relationships” between them; “properties” are types of relationships. Strings, numbers and so on are “values”.
> 
> Given the goals of JSON-LD, the first seems most reasonable as it explains the benefits of using JSON-LD over vanilla JSON in terms that the target audience should already be familiar with and relate to. Personally, I think that the second is cleanest, but I suppose it would go a bit against the JSON-LD design goal of pretending that it has nothing to do with RDF.
> 
>> The doc does a reasonable job of saying "JSON Object" when it means the concept from JSON - maybe there are some places it does not get the naming quite right (editorial).
> 
> Yes, the doc attempts to distinguish "JSON Object" and "object", but the target audience will be familiar with the term "object" in the vanilla JSON sense, so why go against the grain by re-defining "object" to mean something else? What's wrong with "node"?

Agreed, but I think we'll probably stick with _JSON object_ to keep it distinct from the concept of _object_ in the triple-positional sense.

>> EricP - You are noted in "Issue 2" as suggesting "that the definitions of subject and object, while being practical, are at odds with [RDF-CONCEPTS] use in their roles within a triple."  Care to say more?
> 
> I guess EricP's concern is that in RDF Concepts, subjects and objects are positions in a triple. In the JSON-LD data model, subjects and objects are what RDF Concepts calls nodes. RDF Concepts doesn't distinguish between “subject nodes” and “object nodes”; there's just nodes. RDF Concepts does distinguish between IRIs, blank nodes, and literals, unlike JSON-LD.

JSON-LD does distinguish between IRIs, blank nodes and values (literals).

>> Personally, I have always found that the data model of RDF is not too complicated.  The main issue is the total amount of technology a web developer has to juggle rather than any specific technology.
>> 
>> When faced with the task of learning RDF, the Turtle-as-records clicks.  URIs and prefix names can be an early confusion - JSON-LD does not change for the better or worse.
> 
> Uh, no. JSON-LD *does* change it for the worse. In Turtle, we have absolute URIs, relative URIs, and prefixed names. In JSON-LD, we have these three, plus term expansion. So there's four different ways of writing a URI.

RDFa also has terms, and terms are really necessary for people to work with JSON-LD as JSON.
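
To make the four forms concrete, here is a rough Python sketch (not taken from the spec) of how a single value might be resolved to an absolute IRI. The function name, the flat dict used as a context, and the example IRIs are all assumptions for illustration; it also deliberately ignores where in the document the value appears, which the real processing rules do take into account.

```python
from urllib.parse import urljoin

def expand_iri(value, context, base):
    """Resolve a term, compact IRI, absolute IRI, or relative IRI.

    `context` is a simplified, hypothetical context: a flat dict that
    maps terms and prefixes to absolute IRIs.
    """
    # 1. Term: the whole value is a key in the context.
    if value in context:
        return context[value]
    if ":" in value:
        prefix, suffix = value.split(":", 1)
        # 2. Compact IRI: the prefix is defined in the context.
        if prefix in context and not suffix.startswith("//"):
            return context[prefix] + suffix
        # 3. Otherwise treat it as an absolute IRI (it has a scheme).
        return value
    # 4. Relative IRI: resolve against the document base.
    return urljoin(base, value)

# Hypothetical context and base, for illustration only.
ctx = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
base = "http://example.org/people/"
forms = ["name", "foaf:knows", "http://example.org/x", "alice"]
expanded = [expand_iri(v, ctx, base) for v in forms]
```

Only the first branch is specific to terms; the other three have direct counterparts in Turtle.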

>> The other confusion is "what is a graph?" A graph is a set of nodes and a set of edges.  It can be drawn as a picture.  An RDF graph does not include an explicit set of nodes - it's just the edge set.  A node label can be an edge label.   This is again the same in JSON-LD and RDF Concepts.
> 
> Unlike RDF, JSON-LD allows literals as subjects and predicates, allows multiple nodes with the same IRI, doesn't allow string literals that happen to contain an IRI, and allows multiple identical edges (i.e. is a multiset of triples).

Actually, JSON-LD does not allow literals as subjects. A subject is always identified by the @id key of a JSON object (or is an unnamed BNode if that key is absent). The value of an @id key is always interpreted as an IRI or BNode. I think there was some miscommunication about this earlier.

Other RDF representations also allow the same IRI to be used in multiple contexts as a subject. In fact, when JSON-LD is turned into RDF, these are merged together, much as they would be in RDFa or Turtle. Part of the framing algorithm (not in a REC-track document just yet) does include a _flattening_ step, which reconciles multiple JSON objects containing the same @id into a single JSON object. We should probably move _flattening_, or something like it, into the API document.
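
As a minimal sketch of what such a reconciliation step could look like (this is not the framing algorithm's actual flattening step, just an illustration under simplifying assumptions): every input object is assumed to be top-level and to carry an @id, property values are collected into lists, and duplicate values are dropped, as they would be in a set of triples. The function name `merge_nodes` is hypothetical.

```python
def merge_nodes(nodes):
    """Merge JSON objects that share an @id into one object per @id."""
    merged = {}
    for node in nodes:
        target = merged.setdefault(node["@id"], {"@id": node["@id"]})
        for key, value in node.items():
            if key == "@id":
                continue
            # Normalise single values to a list so they can be combined.
            values = value if isinstance(value, list) else [value]
            bucket = target.setdefault(key, [])
            for v in values:
                if v not in bucket:  # drop duplicate values
                    bucket.append(v)
    return list(merged.values())

# Hypothetical input: the same IRI appears as a subject three times.
nodes = [
    {"@id": "http://example.org/a", "name": "Alice"},
    {"@id": "http://example.org/a", "knows": {"@id": "http://example.org/b"}},
    {"@id": "http://example.org/a", "name": "Alice"},
]
flat = merge_nodes(nodes)
```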

> JSON-LD also doesn't specify whether IRIs are absolute or relative, doesn't say what exactly an internationalized string is, doesn't say what range of datatypes are supported, doesn't say whether graphs can be merged or how, and doesn't say when two graphs are identical; so in all of these regards its data model might also differ from RDF.

I don't follow you on this. JSON-LD's use of IRIs is the same as Turtle or RDFa, for all practical purposes. Relative IRIs, not used as a property or type, are resolved relative to the document base.

As with JSON, strings come from UTF-8, with language identified using the @language key, either as part of an expanded value, or defined within a context. This is similar to the @lang definition within RDFa or RDF/XML. If there's some normative text you believe should be added to clarify this, I would be fine with that.
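
The two places a language can come from might be sketched like this in Python. The function name and the flat dict standing in for a context are assumptions, and treating an expanded value without @language as having no language at all is my simplification, not normative text.

```python
def effective_language(value, context):
    """Return the language tag that applies to a value, or None."""
    # Expanded value, e.g. {"@value": "chat", "@language": "fr"}:
    # it carries its own tag (assumed: no tag means no language).
    if isinstance(value, dict):
        return value.get("@language")
    # Plain string: fall back to a default language in the context, if any.
    if isinstance(value, str):
        return context.get("@language")
    # Numbers, booleans and so on have no language.
    return None

# Hypothetical values and contexts, for illustration only.
tagged = effective_language({"@value": "chat", "@language": "fr"}, {"@language": "en"})
defaulted = effective_language("cat", {"@language": "en"})
```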

WRT graphs, JSON-LD provides a syntax for specifying them, much as they can be specified in TriG. Notions of graph equivalence fall to the data model, not the serialization format, don't they? Given that there is a normative transformation from JSON-LD to RDF, graph equivalence and other semantics leverage RDF.

> JSON-LD also contradicts itself regarding the possibility of unlabelled edges (not possible according to the definition, but possible according to a NOTE a bit later).

Yes, we could add something that prohibits a key in a JSON-LD object from having the form of a BNode; I don't think this would really lose anything. The grammar in A.1 should say:

[[[
Keys are IRIs, compact IRIs, terms defined within the active context which MUST evaluate to absolute IRIs, or one of the following keywords
]]]

or words to that effect.

I would support removing that note, but I think the original came as a result of lobbying by Kingsley:

[[[
A property SHOULD be labeled with an IRI.
]]]

(from requirements: [1])

This was to be inclusive of notions of Linked Data that aren't RDF, but I think it's probably appropriate now to close the loop on this and settle that a property MUST be labeled with an (absolute) IRI.
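
A rough Python sketch of what enforcing that MUST could look like, purely as an illustration: it assumes the keys have already been expanded, treats any key starting with "@" as a keyword, and uses a simple scheme pattern as a stand-in for a real absolute-IRI check. The function name is hypothetical.

```python
import re

# Stand-in for "is an absolute IRI": starts with a scheme such as "http:".
# A blank node identifier like "_:p" does not match, since "_" is not a
# valid first character of a scheme.
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")

def invalid_property_keys(expanded_obj):
    """Return the keys that are neither keywords nor absolute IRIs."""
    return [
        key for key in expanded_obj
        if not key.startswith("@") and not SCHEME.match(key)
    ]

# Hypothetical object with one BNode-shaped key and one unexpanded term.
obj = {
    "@id": "_:b0",
    "_:p": 1,
    "name": 2,
    "http://xmlns.com/foaf/0.1/name": 3,
}
bad = invalid_property_keys(obj)
```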

> So I don't think it's the same data model. It's not just a difference in choice of words here and there.
> 
> (JSON-LD also contains a narrow definition of "Linked Data" that contradicts five years of existing W3C specifications, but that's another rant for another day.)
> 
>> RDF is not an RDF format!

B.1, where this is asserted, should be stricken, or moved someplace else. We need to be clear that the concepts outlined in JSON-LD Syntax are based on, and fully consistent with RDF Concepts.

> Right, and neither are B.4 Microformats nor B.5 Microdata.

As a syntax, Microformats can be turned into RDF, although not normatively. For this audience, both Microformats and Microdata are appropriate when discussing how they relate to JSON-LD.

Microdata is just as much of an RDF format as RDFa (like it or not) [2]. However, as with JSON-LD, microdata does not necessarily need to be transformed to RDF to be useful.

Gregg

[1] http://json-ld.org/requirements/latest/
[2] http://www.w3.org/TR/microdata-rdf/

> Best,
> Richard
Received on Monday, 27 August 2012 17:26:30 UTC