second review of json-ld from Sandro Hawke on 2013-03-29 (public-rdf-wg@w3.org from March 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 29 Mar 2013 11:24:39 -0400
To: W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <5155B237.8000407@w3.org>

This is a partial follow-up review of json-ld. Here I'm reviewing:

JSON-LD 1.0
A JSON-based Serialization for Linked Data
[prepared as] W3C Working Draft 04 April 2013
https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html

Summary: more of the same - mostly editorial - a few issues that will
hopefully be simple to review. I'm not quite done, but may have to
stop for a day or two, so I'm sending this along now.

Details:

Simply speaking, a context is used to map terms, to IRIs.

s/terms,/terms/

and types that do not match a term or are neither a compact IRI nor

s/or are neither/and are neither/

If multiple embedded JSON-LD documents are extracted as RDF, the
result is the RDF merge of the extracted datasets.

Alas, there is no defined way to merge RDF datasets.

The problem is that sometimes it's obvious that the merge of
<g> { <a> <b> 1 }
and
<g> { <a> <b> 2 }
is
<g> { <a> <b> 1,2 }
and sometimes it's obvious the two can't be merged because they
contradict each other.

See: http://www.w3.org/2011/rdf-wg/track/issues/17
RESOLVED: close issue-17 <http://www.w3.org/2011/rdf-wg/track/issues/17>
-- there is no general purpose way to merge datasets; it can only be
done with external knowledge.

Proposed solution is to define it here, something like: If multiple
embedded JSON-LD documents are extracted as RDF, the result is a dataset
formed by merging all the graphs that have the same name (and thus
making a single named graph per graph name) and all the default graphs
(to make one resulting default graph).
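A minimal sketch of the merge proposed above, assuming each dataset is modelled as a plain dict from graph name to a set of triples, with None standing in for the default graph. The names merge_datasets and DEFAULT_GRAPH are hypothetical, and the sketch does not standardize blank nodes apart, which a real RDF merge of graphs would have to do.

```python
from collections import defaultdict

DEFAULT_GRAPH = None  # hypothetical marker for the default graph


def merge_datasets(datasets):
    """Union all graphs that share a name; union all default graphs.

    Each dataset is assumed to be a dict: graph name -> set of triples.
    Blank node renaming is deliberately left out of this sketch.
    """
    merged = defaultdict(set)
    for dataset in datasets:
        for name, triples in dataset.items():
            merged[name] |= set(triples)
    return dict(merged)


# The <g> example from above, plus one default-graph triple.
d1 = {"g": {("a", "b", 1)}}
d2 = {"g": {("a", "b", 2)}, DEFAULT_GRAPH: {("x", "y", "z")}}
result = merge_datasets([d1, d2])
```

Under this definition the two <g> graphs always combine into one graph holding both triples; whether the combined graph is contradictory is left to whoever consumes it.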

Figure 1: An illustration of JSON-LD's data model.

Broken image link.

More importantly, the diagram is both misleading and wrong. It's
misleading in that each of the nodes is shown as being in exactly one
graph; nodes are actually allowed to be in multiple graphs, and nearly
always are. It's wrong in that it shows two arcs that aren't in any
graph, when actually every arc has to be in one or more graphs.

I haven't managed to produce a good drawing of this. Sometimes I think
of it as color-coding arcs, like this:

http://www.w3.org/Consortium/Offices/Presentations/RDFTutorial/figures/AnimMerge8.png

and sometimes I think of it as layers:

http://www.flickr.com/photos/danbri/3472944745/
http://farm4.static.flickr.com/3613/3384528143_8304792836_b.jpg

although I imagine the layers closer together, like transparent sheets of
plastic, each with writing on them.

Whenever possible, the graph name
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-graph-name>
/SHOULD/ be an IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-iri>.

s/possible/practical/ (I think)

At Risk

I'm a little lost in the AT RISK features. Can we do it like this:
http://www.w3.org/TR/2009/CR-owl2-syntax-20090611/#atRisk1 ? So each
at-risk feature is identified separately from where it occurs in the
specs, on a wiki page (rdf-wg/wiki/JSON-LD_Features_at_Risk or
something). And each time it comes up in the specs, that is
referenced, along with a clear explanation for people who've never heard
of this little feature of the W3C process.

Within the JSON-LD syntax these edge labels are called properties.

Actually, you use the term somewhat inconsistently -- sometimes you call
those labels "property names" and sometimes you call them "property
labels". I'm not sure this is worth fixing -- I'm probably being
overly pedantic to mention it -- but in RDF they'd be considered
property names. The property itself is the thing denoted by the IRI. I
think in general it's fine to call these things "properties" (and skip
over the detail that they are property names), but maybe in the formal
model it's better to be precise.

Issue 217 <https://github.com/json-ld/json-ld.org/issues/217>

In contrast to the RDF data model as defined in [RDF11-CONCEPTS
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#bib-RDF11-CONCEPTS>],
JSON-LD allows blank nodes as property labels and graph names. Thus,
some data that is valid JSON-LD cannot be converted to RDF. This
feature may be removed in the future.

This notion appears a few other times. As I mention in my review of
json-ld-api, I think we should say it *can* be converted, it just
requires Skolemizing.
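A rough sketch of what that Skolemization could look like, assuming quads are (subject, property, object, graph name) tuples and blank nodes are strings starting with "_:". The base IRI is an assumed authority using the .well-known/genid path suggested in RDF 1.1 Concepts; skolemize_quads is a hypothetical helper, and string literals that happen to start with "_:" are not handled.

```python
SKOLEM_BASE = "http://example.org/.well-known/genid/"  # assumed authority


def is_bnode(term):
    """Treat any string starting with '_:' as a blank node identifier."""
    return isinstance(term, str) and term.startswith("_:")


def skolemize_quads(quads, base=SKOLEM_BASE):
    """Replace blank nodes used as property or graph name with Skolem IRIs.

    A blank node that appears in one of those positions is replaced in
    every position, so the quads keep referring to the same node.
    """
    quads = list(quads)
    offending = {t for (_, p, _, g) in quads for t in (p, g) if is_bnode(t)}

    def fix(term):
        if is_bnode(term) and term in offending:
            return base + term[2:]
        return term

    return [tuple(fix(t) for t in quad) for quad in quads]


quads = [
    ("_:a", "_:p", "_:a", None),
    ("_:p", "http://example.org/q", 1, "_:g"),
]
skolemized = skolemize_quads(quads)
```

Blank nodes that only occur as subjects or objects, like _:a here, are left alone since RDF already allows them there.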

Also, the At Risk phrasing should be more clear about what the change
might be. Something like: "Based on implementor feedback, the Working
Group may decide to prohibit the use of blank nodes as property labels
and graph names."

A JSON-LD document
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-json-ld-document>
/MUST/ be a single node object
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>
or a JSON array
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-array>
containing a set of one or more node objects
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>
at the top level.

How about: ... or a JSON array
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-array>
whose elements are all node objects
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>.

B.1 Terms

A term is a short-hand string
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-string>
that expands to an IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-iri>
or a blank node identifier
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-blank-node-identifier>.

A term
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-term>
/MUST NOT/ equal any of the JSON-LD keywords
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-keyword>.

To avoid forward-compatibility issues, a term
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-term>
/SHOULD NOT/ start with an |@| character as future versions of
JSON-LD may introduce additional keywords
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-keyword>.
Furthermore, the term /MUST NOT/ be an empty string
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-string>
(|""|) as not all programming languages are able to handle empty
property names.

This whole section concerns me. Can a term contain a colon? Can it be
a plain colon? Can it be an apostrophe? Can it be a string of 2^32
ASCII NUL characters? I rather doubt every implementation will allow
all of these, but some might, so there could be interoperability
problems. And there should be tests in the test suite of all the
weird ones (but maybe there already are).
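As an illustration of the kind of tests meant here, a small sketch that builds JSON-LD documents defining and using some of the odd terms asked about above. EDGE_CASE_TERMS and make_test_document are hypothetical names, the vocabulary IRI is made up, and the NUL run is shortened to four characters. Nothing here says what a processor should do with these documents; that is the open question.

```python
import json

# Hypothetical edge-case terms taken from the questions above.
# The NUL run is cut to 4 characters; 2^32 of them is impractical here.
EDGE_CASE_TERMS = ["a:b", ":", "'", "\u0000" * 4]


def make_test_document(term, iri="http://example.org/vocab#p"):
    """Build a JSON-LD document that defines `term` and uses it as a key."""
    return {"@context": {term: iri}, term: "value"}


documents = [make_test_document(t) for t in EDGE_CASE_TERMS]
# Serialize so the documents could be written out as test inputs.
serialized = [json.dumps(d) for d in documents]
```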

A JSON object
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-json-object>
is a node object
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>
if it exists outside of a JSON-LD context
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-context>
and:

* it does not contain the |@value|, |@list|, or |@set| keywords, and
* it is not the top-most JSON object
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-json-object>
in the JSON-LD document consisting of no other members than
|@graph| and |@context|.

Ah, I've seen this text before. :-) Maybe you've replied on that
already. Short version: it'd help to give a name to those things
mentioned in that last bullet point, at least. Maybe call them "binder
objects" or "envelope objects" or something like that. Actually, I
think they should have their own section in the Advanced Topics. (And
I've already said I don't think they should use the @graph keyword, but
I gather you decided against me on that. I'll go check old emails
later, I hope.)

the keys of the different node objects
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>
are merged to create the properties of the resulting node
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node>.

maybe s/are merged/need to be merged/ ?

Keys in a node object
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-node-object>
that are not keywords
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-keyword>
/MAY/ expand to an absolute IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-absolute-iri>
using the active context
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-active-context>.

That use of "MAY" technically means that implementations have the option
of expanding them or not, right? Maybe something more like: "Each key
can be classified as one of: (1) a keyword, (2) a keyword alias, (3) an
absolute IRI, (4) a relative IRI, convertible to an absolute IRI using
the active base, (5) a term which expands to an absolute IRI according
to the active context, (6) a term which does not expand to an
absolute IRI, or (7) a string which does not conform to the term syntax.
Keys of type (6) and (7) are ignored."

Actually, writing that makes clear my concern about terms above. How can
you tell a term from a relative IRI? Isn't "foo" both? I'd suggest
that in json-ld relative IRIs be required to contain a "/" character
and terms be limited to c-identifier syntax.

Also, class (6) keys might well be due to a typo -- is it okay to issue
warnings on class (6) and class (7) keys, instead of just ignoring them?
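A sketch of the seven-way classification under the restriction suggested above (relative IRIs contain a "/", terms follow C identifier syntax), with warnings for classes (6) and (7). Several things are assumptions: the context is simplified to a flat dict of string values, the keyword set is a partial list, anything with a scheme-like prefix counts as an absolute IRI (so compact IRIs, which the list above does not mention, land in class 3), and classify_key and usable_keys are hypothetical names.

```python
import re
import warnings

# Partial keyword list, for illustration only.
KEYWORDS = {"@context", "@graph", "@id", "@type",
            "@value", "@list", "@set", "@index"}
ABSOLUTE_IRI = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")  # scheme prefix
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_key(key, context):
    """Return the class number (1-7) of a node object key."""
    target = context.get(key)
    if key in KEYWORDS:
        return 1
    if isinstance(target, str) and target in KEYWORDS:
        return 2  # keyword alias
    if ABSOLUTE_IRI.match(key):
        return 3
    if "/" in key:
        return 4  # relative IRI under the proposed rule
    if C_IDENTIFIER.fullmatch(key):
        if isinstance(target, str) and ABSOLUTE_IRI.match(target):
            return 5
        return 6  # well-formed term with no usable mapping
    return 7  # not valid term syntax


def usable_keys(node, context):
    """Keep keys of classes 1-5; warn about classes 6 and 7."""
    kept = []
    for key in node:
        kind = classify_key(key, context)
        if kind in (6, 7):
            warnings.warn(f"ignoring key {key!r} (class {kind})")
        else:
            kept.append(key)
    return kept


ctx = {"name": "http://xmlns.com/foaf/0.1/name", "id": "@id"}
```

With this context, a misspelled key such as "nmae" falls into class (6) and triggers a warning rather than disappearing silently.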

The value associated with the |@type| key /MUST/ be a term
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-term>,
a compact IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-compact-iri>,
an absolute IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-absolute-iri>,
a relative IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-relative-iri>,
or null
<https://dvcs.w3.org/hg/json-ld/raw-file/a1bc3776ed3a/spec/WD/json-ld-syntax/20130404/index.html#dfn-null>.

What does it mean for a @type to be null? I don't see anything in the
spec about this case.

/This section is non-normative./

It seems like there are too many of these.... I think. How can most
of the document be non-normative? For example, how am I supposed to
know what to do with @index? If I'm writing a generic JSON-LD display
tool, do I have to convert it to RDF first? If not, I'm going to have
to know what I'm supposed to do with @index.

Summarized these differences mean that JSON-LD is capable of
serializing any RDF graph or dataset and most, but not all, JSON-LD
documents can be transformed to RDF.

Yeah, I guess every RDF graph can be converted to JSON-LD with explicit
use of the rdf:first and rdf:rest properties. Ugly, but technically
correct.
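For illustration, a sketch of that explicit encoding: a Python list turned into nested JSON-LD node objects using full rdf:first and rdf:rest IRIs, ending in rdf:nil. The helper name rdf_list is hypothetical, and the items are assumed to be plain literal values.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_list(items):
    """Encode items as an explicit rdf:first / rdf:rest chain.

    Built from the tail forward, so the innermost node is rdf:nil.
    Items are assumed to be plain literals (numbers, strings, booleans).
    """
    node = {"@id": RDF + "nil"}
    for item in reversed(items):
        node = {RDF + "first": {"@value": item}, RDF + "rest": node}
    return node


encoded = rdf_list([1, 2])
```

Each list cell becomes its own blank node object, which is what makes the result verbose compared with @list.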

And (again), I'd suggest that every JSON-LD document can be transformed
to RDF, but with a few losses in the process -- you may need to
Skolemize, you lose @index information, and any other "ignored" bits.

-- Sandro

Received on Friday, 29 March 2013 15:25:00 UTC