RE: second review of json-ld from Markus Lanthaler on 2013-04-01 (public-rdf-wg@w3.org from April 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Mon, 1 Apr 2013 23:31:34 +0200
To: "'W3C RDF WG'" <public-rdf-wg@w3.org>
Cc: "'Sandro Hawke'" <sandro@w3.org>
Message-ID: <01b401ce2f20$4b1920c0$e14b6240$@lanthaler@gmx.net>
On Friday, March 29, 2013 4:25 PM, Sandro Hawke wrote:

> This is a partial follow-up review of json-ld.
>
> Summary:  more of the same - mostly editorial - a few issues that will 
> hopefully be simple to review.   I'm not quite done, but may have to 
> stop for a day or two, so I'm sending this along now.

Thanks Sandro. I've tried to address most of them in
cbcd28960b2014dc45e4e98fb192278c99cd47ff [1]


> Details:
> 
> > Simply speaking, a context is used to map terms, to IRIs.
> 
> s/terms,/terms/

Fixed


> > and types that do not match a term or are neither a compact IRI nor
> 
> s/or are neither/and are neither/

Fixed


> > If multiple embedded JSON-LD documents are extracted as RDF, the
> > result is the RDF merge of the extracted datasets.
> 
> Alas, there is no defined way to merge RDF datasets.
> 
> The problem is that sometimes it's obvious that the merge of
>  
>     <g> { <a> <b> 1 }
> 
> and
> 
>      <g> { <a> <b> 2 }
> 
> is
>  
>     <g> { <a> <b> 1,2 }
> 
> and sometimes it's obvious the two can't be merged because they 
> contradict each other.
> 
> See: http://www.w3.org/2011/rdf-wg/track/issues/17
> 
> > RESOLVED: close issue-17 -- there is no general purpose way to
> > merge datasets; it can only be done with external knowledge.
> 
> Proposed solution is to define it here, something like:  If multiple 
> embedded JSON-LD documents are extracted as RDF, the result is a dataset 
> formed by merging all the graphs that have the same name (and thus 
> making a single named graph per graph name) and all the default graphs 
> (to make one resulting default graph).

I decided to just remove that sentence. I think it confuses more than it
helps.
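
As an illustration only (not spec text), the merge proposed in the quoted
paragraph could be sketched roughly as below. The sketch assumes a dataset
is modelled as a Python dict mapping a graph name (None for the default
graph) to a set of triples, and it skips the blank node renaming a real RDF
merge would need. The name merge_datasets is hypothetical.

```python
from collections import defaultdict


def merge_datasets(*datasets):
    """Union all graphs that share a name; None stands for the default graph."""
    merged = defaultdict(set)
    for dataset in datasets:
        for graph_name, triples in dataset.items():
            # Graphs with the same name collapse into one named graph.
            merged[graph_name] |= set(triples)
    return dict(merged)


d1 = {"g": {("a", "b", 1)}}
d2 = {"g": {("a", "b", 2)}, None: {("x", "y", "z")}}
result = merge_datasets(d1, d2)
```

Whether such a union is meaningful is exactly the "external knowledge"
question raised in ISSUE-17; the sketch only shows the mechanics.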


> > Figure 1: An illustration of JSON-LD's data model.
>
> Broken image link.

Fixed


> More importantly, the diagram is both misleading and wrong.   It's 
> misleading in that each of the nodes is shown as being in exactly one 
> graph; nodes are actually allowed to be in multiple graphs, and nearly 
> always are.   It's wrong in that it shows two arcs that aren't in any 
> graph, when actually every arc has to be in one or more graphs.

Good spot. I removed the cross-graph arcs.


> I haven't managed to produce a good drawing of this.   Sometimes I think 
> of it as color-coding arcs, like this:
> 
> http://www.w3.org/Consortium/Offices/Presentations/RDFTutorial/figures/AnimMerge8.png
> 
> and sometimes I think of it as layers:
> 
> http://www.flickr.com/photos/danbri/3472944745/
> http://farm4.static.flickr.com/3613/3384528143_8304792836_b.jpg
> 
> although I imagine the layers closer together, like transparent sheets of 
> plastic, each with writing on them.

I didn't introduce layers to show that nodes might be in multiple graphs. I
think that would go beyond the scope of this simple, informative
illustration.


> > Whenever possible, the graph name /SHOULD/ be an IRI
> 
> s/possible/practical/      (I think)

Fixed


> > At Risk
> 
> I'm a little lost in the AT RISK features.   Can we do it like this: 
> http://www.w3.org/TR/2009/CR-owl2-syntax-20090611/#atRisk1  ?   So each 
> at-risk feature is identified separately from where it occurs in the 
> specs, on a wiki page (rdf-wg/wiki/JSON-LD_Features_at_Risk or 
> something).   And each time it comes up in the specs, that is 
> referenced, along with a clear explanation for people who've never heard 
> of this little feature of the W3C process.

Good idea. I will update the spec to this style tomorrow.


> > Within the JSON-LD syntax these edge labels are called properties.
> 
> Actually, you use the term somewhat inconsistently -- sometimes you call 
> those labels "property names" and sometimes you call them "property 
> labels".    I'm not sure this is worth fixing -- I'm probably being 
> overly pedantic to mention it -- but in RDF they'd be considered 
> property names.  The property itself is the thing denoted by the IRI.  I 
> think in general it's fine to call these things "properties" (and skip 
> over the detail that they are property names), but maybe in the formal 
> model it's better to be precise.

The only two occurrences where we used "property names" were when we talked
about "empty JSON keys". I fixed this as well.


> > Issue 217
> >
> > In contrast to the RDF data model as defined in [RDF11-CONCEPTS],
> > JSON-LD allows blank nodes as property labels and graph names. Thus,
> > some data that is valid JSON-LD cannot be converted to RDF. This
> > feature may be removed in the future.
> 
> This notion appears a few other times.  As I mention in my review of 
> json-ld-api, I think we should say it *can* be converted, it just 
> requires Skolemizing.

Added that info already when I updated json-ld-api.
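
As an illustration only, Skolemizing a blank node used as a property label
or graph name could look roughly like this. The sketch assumes blank node
identifiers arrive as strings of the form "_:label"; the example.org
authority and the name skolemize are placeholders. The /.well-known/genid/
path follows the suggestion in RDF 1.1 Concepts. A real converter would
also have to make each label globally unique, which is not shown here.

```python
def skolemize(identifier, authority="https://example.org"):
    """Replace a blank node identifier with a Skolem IRI; pass IRIs through."""
    if identifier.startswith("_:"):
        # Strip the "_:" prefix and mint an IRI under the genid path.
        return f"{authority}/.well-known/genid/{identifier[2:]}"
    return identifier
```

With this, a property label such as "_:b0" becomes an IRI and the triple
is expressible in RDF, at the cost of losing the blank node's local scope.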


> Also, the At Risk phrasing should be more clear about what the change 
> might be.   Something like:  "Based on implementor feedback, the Working 
> Group may decide to prohibit the use of blank nodes as property labels 
> and graph names."

Will do tomorrow.


> > A JSON-LD document /MUST/ be a single node object or a
> > JSON array containing a set of one or more node objects
> > at the top level.
> 
> How about:   ... or a JSON array whose elements are each node objects.

Fixed


> >  B.1 Terms
> >  A term is a short-hand string that expands to an IRI
> >  or a blank node identifier.
> >  A term /MUST NOT/ equal any of the JSON-LD keywords.
> >  To avoid forward-compatibility issues, a term
> >  /SHOULD NOT/ start with an @ character as future versions of
> >  JSON-LD may introduce additional keywords.
> >  Furthermore, the term /MUST NOT/ be an empty string ("")
> >  as not all programming languages are able to handle empty
> >  property names.
> 
> This whole section concerns me.   Can a term contain a colon? Can it be 
> a plain colon?   Can it be an apostrophe?   Can it be a string of 2^32 
> ASCII NUL characters?   I rather doubt every implementation will allow 
> all of these, but some might, so there could be interoperability 
> problems.    And there should be tests in the test suite of all the 
> weird ones (but maybe there already are).

A term can be any valid JSON string except the empty string. So yes, it can
contain a colon, it can also be a plain colon. Any control character needs
to be escaped.
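
As an illustration only, the constraints quoted from B.1 could be checked
roughly as follows. The keyword set is taken from the JSON-LD 1.0
Recommendation and may differ from the draft under review; the name
check_term and its three return values are hypothetical.

```python
KEYWORDS = frozenset({
    "@context", "@id", "@value", "@language", "@type", "@container",
    "@list", "@set", "@reverse", "@index", "@base", "@vocab", "@graph",
})


def check_term(term):
    """Classify a candidate term as 'invalid', 'discouraged' or 'ok'."""
    if not isinstance(term, str) or term == "":
        return "invalid"      # MUST NOT be an empty string
    if term in KEYWORDS:
        return "invalid"      # MUST NOT equal a keyword
    if term.startswith("@"):
        return "discouraged"  # SHOULD NOT start with "@"
    return "ok"               # any other JSON string, including ":"
```

Note that check_term(":") returns "ok", which matches the answer above: a
plain colon is a legal term.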


> > A JSON object is a node object
> > if it exists outside of a JSON-LD context and:
> >   * it does not contain the @value, @list, or @set keywords, and
> >   * it is not the top-most JSON object
> >     in the JSON-LD document consisting of no other members than
> >     @graph and @context.
> 
> Ah, I've seen this text before.  :-)    Maybe you've replied on that 
> already.    Short version: it'd help to give a name to those things 
> mentioned in that last bullet point, at least.  Maybe call them "binder 
> objects" or "envelope objects" or something like that.     Actually, I 
> think they should have their own section in the Advanced Topics.   (And 
> I've already said I don't think they should use the `@graph` keyword, but 
> I gather you decided against me on that.    I'll go check old emails 
> later, I hope.)

Yes, replied to this already. Let's discuss it in the thread.
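
As an illustration only, the node object definition quoted above could be
read as the following check. It assumes the caller has already established
that the object sits outside a JSON-LD context, and it reads "consisting of
no other members than @graph and @context" as requiring @graph to be
present; both points are assumptions, as is the name is_node_object.

```python
VALUE_KEYWORDS = {"@value", "@list", "@set"}


def is_node_object(obj, is_top_most=False):
    """Rough test for a node object that lies outside any @context."""
    if not isinstance(obj, dict):
        return False
    if obj.keys() & VALUE_KEYWORDS:
        return False  # value objects, lists and sets are not node objects
    if is_top_most and "@graph" in obj and obj.keys() <= {"@graph", "@context"}:
        return False  # the unnamed top-level wrapper around @graph
    return True
```

The last branch is the case the review asks to name ("binder" or
"envelope" objects); the sketch makes visible how special that case is.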


> > the keys of the different node objects
> > are merged to create the properties of the resulting node.
>
> maybe s/are merged/need to be merged/ ?

Fixed


> > Keys in a node object that are not keywords
> > /MAY/ expand to an absolute IRI using the active context.
> 
> That use of "MAY" technically means that implementations have the option 
> of expanding them or not, right?  Maybe something more like: "Each key 
> can be classified as one of: (1) a keyword, (2) a keyword alias, (3) an 
> absolute IRI, (4) a relative IRI, convertible to an absolute IRI using 
> the active base, (5) a term which expands to an absolute IRI according 
> to the active context, or (6) a term which does not expand to an 
> absolute IRI, (7) a string which does not conform to the term syntax.   
> Keys of type (6) and (7) are ignored."

Does it? This spec isn't talking about implementations, it's talking about
JSON-LD the format. I think in that context it is OK to say that keys MAY
expand to an absolute IRI. Please note that a key cannot be a relative IRI.


> Actually, writing that makes clear my concern about terms above. How can 
> you tell a term from a relative IRI?   Isn't "foo" both? I'd suggest 
> that in json-ld relative IRIs be required to contain a "/" character 
> and terms be limited to c-identifier syntax.

Keys are never relative IRIs. They are terms, absolute IRIs, or compact
IRIs (@vocab may be used to set an "implicit" prefix for all keys that are
none of these).


> Also, class (6) keys might well be due to a typo -- is it okay to issue 
> warnings on class (6) and class (7) keys, instead of just ignoring them?

Of course, every implementation is free to issue warnings. However, a
JSON-LD processor won't raise an error and stop processing. It will ignore
such keys and continue processing.
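
As an illustration only, the key handling described above (term, compact
IRI, absolute IRI, @vocab fallback, otherwise ignore) could be sketched as
follows. The sketch assumes a simplified context that maps terms and
prefixes directly to IRI strings, treats any key containing a colon as an
absolute IRI when its prefix is unknown, and leaves keywords out of scope.
The names expand_key and expand_keys are hypothetical.

```python
import warnings


def expand_key(key, context):
    """Return the absolute IRI for a non-keyword key, or None to ignore it."""
    if key.startswith("@"):
        raise ValueError("keywords are outside the scope of this sketch")
    if key in context:
        return context[key]                  # term
    if ":" in key:
        prefix, suffix = key.split(":", 1)
        if prefix in context:
            return context[prefix] + suffix  # compact IRI
        return key                           # taken as an absolute IRI
    vocab = context.get("@vocab")
    if vocab is not None:
        return vocab + key                   # "implicit" prefix
    return None                              # unmapped: ignored


def expand_keys(node, context):
    """Expand the keys of a node object, warning about ignored keys."""
    expanded = {}
    for key, value in node.items():
        iri = expand_key(key, context)
        if iri is None:
            warnings.warn(f"ignoring unmapped key {key!r}")
            continue
        expanded[iri] = value
    return expanded
```

A key is never resolved against the document base in this sketch, which
reflects the point that keys are never relative IRIs. The warning is
optional; processing continues either way.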


> The value associated with the `@type` key /MUST/ be a term, a compact IRI,
> an absolute IRI, a relative IRI, or null.
>
> What does it mean for a `@type` to be null?   I don't see anything in the
> spec about this case.

Just like every other key that is set to null, it is ignored. It is the
same as if the key were not there at all.
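
As an illustration only, "ignored" for null-valued keys in a node object
could be pictured as follows. The sketch covers only this node object case;
null has further meaning elsewhere in JSON-LD (for example inside a
context), which is not modelled. The name drop_null_members is hypothetical.

```python
def drop_null_members(node):
    """Return a copy of a JSON object without members whose value is null."""
    return {key: value for key, value in node.items() if value is not None}
```

So {"@type": null, "name": "x"} is processed exactly like {"name": "x"}.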


> > This section is non-normative.
> 
> It seems like there are too many of these....   I think.  How can most 
> of the document be non-normative?   For example, how am I supposed to 
> know what to do with `@index`?   If I'm writing a generic JSON-LD display 
> tool, do I have to convert it to RDF first?    If not, I'm going to have 
> to know what I'm supposed to do with `@index`.

Depends on what your tool is supposed to do. I personally wouldn't mind
making both Basic Concepts and Advanced Concepts normative.


> > Summarized, these differences mean that JSON-LD is capable of
> > serializing any RDF graph or dataset and most, but not all, JSON-LD
> > documents can be transformed to RDF.
> 
> Yeah, I guess every RDF graph can be converted to JSON-LD with explicit 
> use of the rdf:first and rdf:rest properties.   Ugly, but technically 
> correct.

Right.


> And (again), I'd suggest that every JSON-LD document can be transformed 
> to RDF, but with a few losses in the process -- you may need to 
> Skolemize, you lose `@index` information, and any other "ignored" bits.

Could you please provide some concrete text (given that you weren't
completely satisfied with my change in json-ld-api)? Thanks.


Cheers,
Markus


[1] https://github.com/json-ld/json-ld.org/commit/a81cacb84da3633c028f77d6045e7b1dd038cb11



--
Markus Lanthaler
@markuslanthaler
Received on Monday, 1 April 2013 21:32:06 UTC