Re: comments on the json-ld document from Ivan Herman on 2011-08-31 (public-linked-json@w3.org from August 2011)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 31 Aug 2011 10:16:24 +0200
To: Gregg Kellogg <gregg@kellogg-assoc.com>
Cc: Manu Sporny <msporny@digitalbazaar.com>, "public-linked-json@w3.org" <public-linked-json@w3.org>
Message-Id: <9D296C6C-3E16-4162-A126-F0EE874A63AE@w3.org>
Gregg (also seeing the answers of Markus and Dave, and having refreshed to read the latest version),

just some remarks on your remarks

On Aug 30, 2011, at 20:07 , Gregg Kellogg wrote:

> On Aug 30, 2011, at 10:02 AM, Ivan Herman wrote:
> 

[snip]

> 
>> "Linked Data is a set of documents, each containing a representation of a linked data graph"
>> 
>> Is the word 'document' really good here? Most people, when seeing this term, would consider a web page, or a PDF page... I would use 'resources' here.
> 
> This might be getting a bit technical for the intended audience. Perhaps a parenthetical that describes document as " (resources such as web pages, PDF files, or other JSON serializations)".
> 

That would work for me


>> --------
>> 
>> So what is the relationship of the semantic web and linked data? Section 2.4 says "The semantic web, just like the document-based web, uses IRIs for unambiguous identification." which then comes a little bit out of the blue…
> 
> Good point, we should be consistent in our terminology. I'd suggest we try to remove explicit use of "semantic web", except in some section that might describe the relationship between linked data and the semantic web. I think the main difference is that SemWeb has some imposed representational restrictions and does not necessarily require follow-your-nose retrieval of referenced IRIs.

Whether I like it or not, this issue has become some sort of a religious issue. I think if there is a clearer relationship described in the document between JSON-LD and RDF, that is perfectly enough, and we keep away from that endless discussion. At the moment, you guys did all you could to ban the term RDF from the document except for the very end, ie, section 6.13. I must say that, even there, I miss some sort of a more explicit mapping from the terminology used in JSON-LD to RDF and vice versa.

> 
>> --------
>> 
>> It is the first time I see the term "Web Vocabulary" (later in 2.4). Why not simply "Vocabulary"?
>> 

I saw Markus agreed with me on that...


>> --------
>> 
>> In 2.4.1: I presume it is correct that a context document is not exactly the same as a JSON-LD document, though it looks the same…
> 
> It is a sub-set, basically a JSON-LD document with a different starting production. It was suggested that we create an EBNF description of JSON-LD, and I've considered attempting that myself. This might make it more explicit.

Well, that is fine, syntactically. But it puts us on thin ice, in the sense that the interpretation of @context is different from that of the rest of the JSON-LD content (e.g., for RDF conversion). I _think_ it is fine, but I have not gone into all the details of the document...

[snip]

> 
>> Beginning of 3.1, it says "In the example above, the key http://xmlns.com/foaf/0.1/name is interpreted as an IRI, as opposed to being interpreted as a string." What are the rules to turn this into an IRI instead of a string? Is it based on the regexp for URIs? But, if those are used, then why do we need a coercion rule for similar functionality in the case of objects?
>> 
>> ------
> 
> Keys are turned into IRIs if they are not keywords (i.e., @iri, @type, …) and result in an absolute IRI after term/prefix expansion. This should be made more explicit.

Hm. The way I read this is that if I have

{
  "bla" : "bar"
}

a JSON-LD processor _must_ turn "bla" into an IRI: there is no context, i.e., there is no mapping from "bla" to an IRI, so it must be an IRI in its own right. I am not sure what this means for non-RDF JSON-LD processing (you know, guys, I am an RDF person...) but it somehow smells strange to me.

It also raises another issue to me. A few days ago I had a discussion with Manu. The current JSON-LD document allows 

{
   "@context" : "http://bla.bla.bla" ,
   "bla" : "bar"
}

which reminds me (it is essentially the same) of the @profile feature of RDFa 1.1, which has lately been removed because of the difficulties JavaScript programmers would have in dereferencing the vocabulary (a.k.a. context) file. One of the reasons Manu gave me for keeping this is that there is a difference: in RDFa 1.1, if the @profile was not reachable, then RDFa processing had to stop on the subtree whereas, says Manu :-), in JSON-LD a processor can still work with the file. I can certainly understand that if, in the above example, "bla" remains a string as a key, and most processors would not care. But if it is considered to be an IRI... then a processor, in a follow-your-nose action, might want to dereference it, which would really send the application down the wrong path... Isn't this a problem?

My approach would be that if a key cannot be expanded to an IRI via @context, then it is turned into an IRI only if the string conforms to the regular expression that defines an absolute IRI in the corresponding RFC (or maybe even a subset thereof, to avoid issues with spaces). It is not perfect, I believe, but it would cover 95% of the cases...
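
To make that proposal a bit more concrete, here is a rough sketch in Python. The function name key_to_iri and the regular expression are assumptions made for illustration only: the pattern is a simplified "scheme, colon, no whitespace" check, not the full absolute-IRI production of the RFC, and the context is modelled as a plain dictionary of term-to-IRI mappings.

```python
import re

# Simplified stand-in for the absolute-IRI grammar: a scheme, a colon,
# then at least one character, none of which is whitespace.
# This is an assumption for illustration, not the full RFC production.
_ABSOLUTE_IRI = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:[^\s]+$")


def key_to_iri(key, context):
    """Return the IRI for a key, or None if the key stays a plain string."""
    if key.startswith("@"):
        return None  # keywords are never turned into IRIs
    if key in context:
        return context[key]  # an explicit mapping from @context wins
    if _ABSOLUTE_IRI.match(key):
        return key  # the key already looks like an absolute IRI
    return None  # e.g. "bla": leave it alone as a string


context = {"name": "http://xmlns.com/foaf/0.1/name"}
print(key_to_iri("name", context))
print(key_to_iri("http://xmlns.com/foaf/0.1/homepage", context))
print(key_to_iri("bla", context))
```

With this check, "bla" stays a plain string when no context maps it. Note that a key of the form a:b with an unknown prefix would still pass the simplified pattern, which is one of the cases where such a heuristic remains imperfect.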


> 
>> I was looking for some precise rules on where the definition of a @context would apply. I would expect something like "the rules in @context are valid in the enclosing object and the other descendants of the enclosing object". For example
>> 
>> {
>> @context { "a" : "http://..."}
>> ...
>> "a" : {
>>          "a" : "http://...."
>>       }
>> }
>> 
>> means that "a" is expanded in both the top and the enclosed object.
>> 
>> However, you have an example saying
>> 
>> {
>> @context {
>>     "name": "http://xmlns.com/foaf/0.1/name",
>>     "homepage": "http://xmlns.com/foaf/0.1/homepage",
>>     "@coerce": 
>>     {
>>        "@iri": "homepage"
>>     }
>> }
>> ...
>> }
>> 
>> ie, the expansion of "homepage" is also valid in @coerce, which is not explained by the rule above.
>> 
>> This needs precise spec. Or, alternatively, the @coerce is out of the validity range... (though I think a better spec is better)
>> 

I tried to look into section 6.3 on this. 

- a @context is handled as a JSON object; its local context is set to empty.
- it says, in 3.4, that a key-value pair is added to the local context
- it also says, in 3.3, that the @coerce mapping is performed using an IRI expansion

So that is _almost_ o.k. but, if I implement this, there is a hidden two-pass algorithm here that is not properly documented: namely, I have to go through the object performing 3.4 on _all_ keys and _then_ perform the expansion of 3.3; otherwise the IRI expansion may not work. Or is the definition of the context such that the @coerce MUST follow the previous keys? I would not think so.
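
A minimal sketch of the two-pass reading, in Python. The function name process_context is hypothetical, and the mapping of "pass 1" and "pass 2" onto steps 3.4 and 3.3 is only one interpretation of the draft, not the specified algorithm. The @coerce shape follows the example quoted above (a type keyword mapped to a term name or a list of term names).

```python
def process_context(ctx):
    """Two-pass handling of one @context object (draft-era @coerce syntax)."""
    terms = {}
    # Pass 1: collect every term -> IRI mapping, whatever the key order.
    for key, value in ctx.items():
        if not key.startswith("@"):
            terms[key] = value

    coerce = {}
    # Pass 2: only now expand the term names listed inside @coerce.
    for target_type, names in ctx.get("@coerce", {}).items():
        if isinstance(names, str):
            names = [names]
        for name in names:
            # Unknown names are kept as-is rather than dropped.
            coerce[terms.get(name, name)] = target_type
    return terms, coerce


ctx = {
    "@coerce": {"@iri": "homepage"},  # deliberately listed before the terms
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}
terms, coerce = process_context(ctx)
print(coerce)  # {'http://xmlns.com/foaf/0.1/homepage': '@iri'}
```

Because all terms are collected first, the result does not depend on whether @coerce appears before or after the keys it refers to; a single pass in document order would leave "homepage" unexpanded in this example.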

Also, what does this mean if I have

{
  "@context" : {
      "foo" : "http://www.datatype.org/d/"
  },
  "bla" : {
             "@context" : {
                 "@coerce" : {
                     "@iri" : "foo"
                 }
             },
             "foo" : "http://so.what.now"
          }
}

will the inner "http://so.what.now" be turned into an @iri? Indeed, to handle the @context, a new local context is set up, which starts out empty, but it is not clear whether the active context is valid for the IRI expansion within @context (I guess it should be, but it is not explicit...)
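
The two possible readings can be contrasted in a small Python sketch. Everything here is an assumption for illustration: coerced_iris is a hypothetical helper, contexts are plain dictionaries, and the inherit flag simply switches between "the active context is visible while expanding an inner @coerce" and "only the inner context's own terms are visible".

```python
def coerced_iris(local_terms, coerce, active_terms, inherit):
    """Expand the term names in an inner @coerce to IRIs.

    inherit=True: fall back to the enclosing (active) context.
    inherit=False: only the inner context's own terms are visible.
    """
    visible = dict(active_terms) if inherit else {}
    visible.update(local_terms)  # inner definitions shadow outer ones
    result = {}
    for target_type, names in coerce.items():
        for name in ([names] if isinstance(names, str) else names):
            # Names with no visible mapping are kept unexpanded.
            result[visible.get(name, name)] = target_type
    return result


outer = {"foo": "http://www.datatype.org/d/"}  # terms of the outer @context
inner_coerce = {"@iri": "foo"}                 # the inner @coerce

print(coerced_iris({}, inner_coerce, outer, inherit=True))
print(coerced_iris({}, inner_coerce, outer, inherit=False))
```

Under the first reading the coercion is registered for http://www.datatype.org/d/, so the inner value would be treated as an IRI. Under the second, the entry stays the bare string "foo" and would presumably not match the expanded key, which is why the text needs to be explicit about it.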

As an aside, there is a small problem in 6.3.2: in the example @context->@coerce


>> ------
>> 
>> Reading 3.11, I do not understand it. What is the role of @subject in the outer object in the first example? 
> 
> This is certainly awkward, as has been noted by others. A key is necessary to begin the expansion of its values. Note that in Manu's proposed change, the array of objects could be treated as multiple subjects to apply to the (nonexistent) key/values at the same level as the @subject definition. Note that we've also considered collapsing @subject and @iri (into just @iri, I would hope), which wouldn't help in this particular case, but it is consistent with the use of @iri to describe a single object.
> 
>> In general, I do not understand what framing tries to achieve. There is no explanation in the text; more precisely, I do not understand the introductory text. I am asking the question that I am also asking below re other features: do we need it for the majority of users? 
>> 

I saw there was a separate discussion and an IRC dump on framing & co. This deserves a separate thread...

>> ------
>> 
>> Reading other emails, I must say I am beginning to be in favour of dropping CURIEs for the time being. It makes reading the document complicated for a non-expert (talking as somebody who uses prefixes all the time:-).

It may well be that the issue is the way it is described. While reading it (and I tried to read it as if I did not know what was to be achieved), it sounded very complex: the typical way of alienating readers, I am afraid. Maybe we could downplay this, not referring to the RDFa Core definition of CURIEs, etc., but simply saying something like "we already have a key-to-IRI expansion; as an extra microsyntax, a key:bla is expanded as a, well, IRI with that "bla" added to the end of it". String concatenation and that is it. Maybe using key+bla or key.bla as an alternative microsyntax is also fine. My point is: let us not make a big deal out of it!
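
To show how little is involved, here is a minimal Python sketch of that "string concatenation and that is it" view. The function name expand_key is hypothetical and the context is again modelled as a plain dictionary; this illustrates the simplified description, not the RDFa Core CURIE rules.

```python
def expand_key(key, context):
    """Expand a key via the context; 'prefix:suffix' is plain concatenation."""
    if key in context:
        return context[key]  # ordinary key-to-IRI expansion
    prefix, sep, suffix = key.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix  # string concatenation, nothing more
    return key  # no mapping known: leave the key untouched


context = {"foaf": "http://xmlns.com/foaf/0.1/"}
print(expand_key("foaf:name", context))  # http://xmlns.com/foaf/0.1/name
```

Swapping ":" for "+" or "." in the partition call would give the alternative microsyntaxes with no other change.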

>> 
>> ---- 
>> 
>> Section 4.6 I am fairly opposed to the idea of overriding keywords. That is really a rope to hang ourselves in terms of readability of a dataset. Also, what are the consequences if the context file is read from the web as an external file? Doesn't it radically change the interpretation of a specific JSON-LD file with or without an access to that context?
> 
> This is actually a bit wrong, we're not actually overriding the meaning of @subject or @type, but allowing aliases for those keys that affect the processing algorithm. I think that it should be an error to override @subject, @type, @iri, etc.

I see the new formulation, but I still have to be convinced about the necessity of that stuff. "This feature allows more legacy JSON content to be supported by JSON-LD": this is cryptic to me, what widely deployed JSON content are we talking about?


> 
>> ----
>> 
>> Section 4.7 I of course understand what normalization is used for, but I seriously question whether this is something we should put into this document. The goal is to reach javascript/web developer people. I think only a very small percentage will make use of this normalization, for the rest this is a complicated noise. I would propose to remove normalization altogether and, if there is a clear view on it, then put it into a separate document.
>> 

See the issue on a separate thread...

Cheers

Ivan

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Wednesday, 31 August 2011 08:16:55 UTC