Re: comments on the json-ld document from Gregg Kellogg on 2011-08-30 (public-linked-json@w3.org from August 2011)

From: Gregg Kellogg <gregg@kellogg-assoc.com>
Date: Tue, 30 Aug 2011 14:07:58 -0400
To: Ivan Herman <ivan@w3.org>
CC: Manu Sporny <msporny@digitalbazaar.com>, Gregg Kellogg <gregg@kellogg-assoc.com>, "public-linked-json@w3.org" <public-linked-json@w3.org>
Message-ID: <462B79DF-DB16-4803-B189-061F9C74139C@greggkellogg.net>

On Aug 30, 2011, at 10:02 AM, Ivan Herman wrote:

> Manu, Gregg,
> 
> I finally sat down to read the JSON-LD document. Unfortunately, I did not find the time to go to the very end. I got as far as (but not including) section 5. Here are my comments...
> 
> --------
> 
> "It is designed to be able to express key-value pairs, RDF data, RDFa [RDFA-CORE] data, Microformats [MICROFORMATS] data, and Microdata [MICRODATA]"
> 
> But... there is no such thing as RDFa data. RDFa is simply a serialization of RDF data... So I do not understand what is meant here for RDFa.
> 
> The same comment holds for the item on targeted audience ("Software developers that want to encode Microformats, RDFa, or Microdata in a way that is cross-language compatible via JSON.")
> 
> --------

Agreed, this should just be RDF and not RDFa.

> "Linked Data is a set of documents, each containing a representation of a linked data graph"
> 
> Is the word 'document' really good here? Most people, when seeing this term, would consider a web page, or a PDF page... I would use 'resources' here.

This might be getting a bit technical for the intended audience. Perhaps we could add a parenthetical that describes a document as "(resources such as web pages, PDF files, or other JSON serializations)".

> --------
> 
> So what is the relationship of the semantic web and linked data? Section 2.4 says "The semantic web, just like the document-based web, uses IRIs for unambiguous identification." which then comes a little bit out of the blue…

Good point, we should be consistent in our terminology. I'd suggest we try to remove explicit use of "semantic web", except in some section that might describe the relationship between linked data and the semantic web. I think the main difference is that SemWeb has some imposed representational restrictions and does not necessarily require follow-your-nose retrieval of referenced IRIs.

> --------
> 
> It is the first time I see the term "Web Vocabulary" (later in 2.4). Why not simply "Vocabulary"?
> 
> --------
> 
> In 2.4.1: I presume it is correct that a context document is not exactly the same as a JSON-LD document, though it looks the same…

It is a subset: basically a JSON-LD document with a different starting production. It was suggested that we create an EBNF description of JSON-LD, and I've considered attempting that myself. This might make it more explicit.

> -------
> 
> Third example in 3.1 seems to be misleading in the text. This is not a prefix...
> 
> -------

Yes, should be "foaf:name".

> Beginning of 3.1, it says "In the example above, the key http://xmlns.com/foaf/0.1/name is interpreted as an IRI, as opposed to being interpreted as a string." What are the rules to turn this into an IRI instead of a string? Is it based on the regexp for URIs? But, if those are used, then why is a coercion rule needed for the similar functionality in the case of objects?
> 
> ------

Keys are turned into IRIs if they are not keywords (i.e., @iri, @type, …) and result in an absolute IRI after term/prefix expansion. This should be made more explicit.
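
A rough illustration of that rule follows. It is only a sketch, not the processing algorithm from the document: the function name expand_key, the KEYWORDS set, and the crude "scheme followed by a colon" test for an absolute IRI are all assumptions made for the example.

```python
import re

# Keywords are never treated as IRIs (illustrative subset taken from this thread).
KEYWORDS = {"@context", "@iri", "@type", "@subject", "@coerce"}

# Crude absolute-IRI test: a scheme followed by ":".
# This is an assumption for the sketch, not the full IRI grammar, so an
# undefined prefix such as "foo:bar" would also pass it.
ABSOLUTE_IRI = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def expand_key(key, context):
    """Return the absolute IRI for a key, or None if the key is not an IRI."""
    if key in KEYWORDS:
        return None
    expanded = key
    if key in context:
        # Term expansion: the whole key is defined in the context.
        expanded = context[key]
    elif ":" in key:
        # Prefix expansion: "foaf:name" -> context["foaf"] + "name".
        prefix, suffix = key.split(":", 1)
        if prefix in context:
            expanded = context[prefix] + suffix
    return expanded if ABSOLUTE_IRI.match(expanded) else None
```

With a context that defines "name" as a term and "foaf" as a prefix, the keys "name", "foaf:name", and the full foaf name IRI would all yield the same IRI, while "@type" and an undefined plain term would yield None.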

> I was looking for some precise rules on where the definition of a @context would apply. I would expect something like "the rules in @context are valid in the enclosing object and the other descendants of the enclosing object". For example
> 
> {
>  "@context": { "a" : "http://..."}
>  ...
>  "a" : {
>           "a" : "http://...."
>        }
> }
> 
> means that "a" is expanded in both the top and the enclosed object.
> 
> However, you have an example saying
> 
> {
>  "@context": {
>      "name": "http://xmlns.com/foaf/0.1/name",
>      "homepage": "http://xmlns.com/foaf/0.1/homepage",
>      "@coerce": 
>      {
>         "@iri": "homepage"
>      }
>  }
>  ...
> }
> 
> i.e., the expansion of "homepage" is also valid in @coerce, which is not explained by the rule above.
> 
> This needs a precise specification. Or, alternatively, the @coerce is out of the validity range... (though I think a more precise spec is the better option)
> 
> ------
> 
> Reading 3.11, I do not understand it. What is the role of @subject in the outer object in the first example? 

This is certainly awkward, as has been noted by others. A key is necessary to begin the expansion of its values. Note that in Manu's proposed change, the array of objects could be treated as multiple subjects to apply to the (nonexistent) key/values at the same level as the @subject definition. Note that we've also considered collapsing @subject and @iri (into just @iri, I would hope), which wouldn't help in this particular case, but it is consistent with the use of @iri to describe a single object.

> In general, I do not understand what framing tries to achieve. There is no explanation in the text; more precisely, I do not understand the introductory text. I am asking the question that I am also asking below re other features: do we need it for the majority of users? 
> 
> ------
> 
> Reading other emails, I must say I am beginning to be in favour of dropping CURIEs for the time being. It makes reading the document complicated for a non-expert (talking as somebody who uses prefixes all the time:-).
> 
> ---- 
> 
> Section 4.6: I am fairly opposed to the idea of overriding keywords. That is really a rope to hang ourselves in terms of readability of a dataset. Also, what are the consequences if the context file is read from the web as an external file? Doesn't it radically change the interpretation of a specific JSON-LD file with or without access to that context?

This is actually a bit wrong: we're not overriding the meaning of @subject or @type, but allowing aliases for those keys that affect the processing algorithm. I think that it should be an error to override @subject, @type, @iri, etc.
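
To make the alias/override distinction concrete, here is a minimal sketch. It assumes a context is a plain mapping and that an alias is a term whose value is a keyword; the function names keyword_aliases and resolve_key, and the choice to raise an error on redefinition, are hypothetical and only reflect what I'm suggesting above, not anything the document currently specifies.

```python
# Keywords whose meaning must not change (illustrative subset from this thread).
KEYWORDS = {"@subject", "@type", "@iri"}


def keyword_aliases(context):
    """Collect terms that alias a keyword; reject attempts to redefine a keyword."""
    aliases = {}
    for key, value in context.items():
        if key in KEYWORDS:
            # Overriding: the keyword itself is given a new definition.
            raise ValueError(f"keyword {key} may not be overridden")
        if isinstance(value, str) and value in KEYWORDS:
            # Aliasing: another term stands in for the keyword.
            aliases[key] = value
    return aliases


def resolve_key(key, aliases):
    """Map an aliased key back to its keyword; other keys pass through unchanged."""
    return aliases.get(key, key)
```

A processor written this way treats an aliased key exactly like the keyword it stands for, so the meaning of @subject or @type never changes regardless of which context is loaded.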

> ----
> 
> Section 4.7: I of course understand what normalization is used for, but I seriously question whether this is something we should put into this document. The goal is to reach javascript/web developer people. I think only a very small percentage will make use of this normalization; for the rest this is complicated noise. I would propose to remove normalization altogether and, if there is a clear view on it, then put it into a separate document.
> 
> So, in summary: I believe the spec _is_ a bit too complicated for what we want to achieve. There are features that are probably useful for a minority of users but, I am afraid, if this document is put forward to the non-SW, non-LD, non-RDF community, then it will be rejected as too complex. Normalization, possibly framing should be put into separate documents for niche users…

I tend to agree that normalization (and probably framing) is too much complexity. They are both important, so I'd support placing them in an "Advanced JSON-LD Profiles" document. The same could be said for collections/lists, and graph literals.

> I am sorry if I sound negative:-(

Thanks for the constructive feedback!

Gregg

> Ivan
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Tuesday, 30 August 2011 18:08:55 UTC