More review of JSON-LD syntax from Charles Greer on 2013-03-13 (public-rdf-wg@w3.org from March 2013)

From: Charles Greer <cgreer@marklogic.com>
Date: Wed, 13 Mar 2013 11:38:21 -0700
To: W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <5140C79D.9000803@marklogic.com>

Hi all,

This email is a review of JSON-LD-SYNTAX as of 3/13/2013:

https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#data-indexing

Overall:

The document presents the syntax in a reasonably clear way. The one
exception to this is the intersection of terms, absolute IRIs, compact
IRIs and relative IRIs. In particular, one might wish to rethink the use
of relative IRIs here at all; they seem to be confusing or
problematic every time they come up, and don't seem to add to JSON-LD in
any significant way. I've noted these places below.

Until I came to flattening, I thought that JSON-LD was subject to a lot
of the same problems as RDF/XML. My concern had to do with manipulating
structures as JSON - if there are a lot of ways to represent something,
then one gets into a lot of issues with finding data within the
structure. Flattening seems to get rid of most of those concerns - it
should probably be foregrounded as a good canonical representation if
you can go that far.
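
To illustrate the point about multiple representations, here is a minimal
Python sketch. It is not the flattening algorithm from the JSON-LD API
document; it assumes input with no @context and absolute IRIs as @id values,
and the function name flatten and the example.org IRIs are made up for
illustration. Two differently nested documents describing the same data
collapse to the same flat list of nodes:

```python
def flatten(doc):
    """Collect every node object into one flat list, sorted by @id.

    Simplified illustration only: @context, @list and repeated properties
    on the same node are not handled the way a real processor would.
    """
    nodes = {}
    counter = [0]  # used to label nodes that have no @id

    def visit(value):
        if isinstance(value, list):
            return [visit(v) for v in value]
        if isinstance(value, dict) and "@value" not in value:
            node_id = value.get("@id")
            if node_id is None:
                node_id = "_:b%d" % counter[0]
                counter[0] += 1
            node = nodes.setdefault(node_id, {"@id": node_id})
            for key, child in value.items():
                if key != "@id":
                    node[key] = visit(child)
            # The nested node is replaced by a reference to it.
            return {"@id": node_id}
        return value

    visit(doc)
    return [nodes[key] for key in sorted(nodes)]


# The same data, nested in two different ways.
nested = {
    "@id": "http://example.org/alice",
    "knows": {"@id": "http://example.org/bob", "name": "Bob"},
}
listed = [
    {"@id": "http://example.org/bob", "name": "Bob"},
    {"@id": "http://example.org/alice", "knows": {"@id": "http://example.org/bob"}},
]
```

With the flat form, finding a node is a single scan over a list rather
than a search through arbitrary nesting.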

Otherwise this review is mainly editorial nits:

The Nits:

"a way to disambiguate the keys used between multiple JSON documents by
mapping them to IRIs
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-iri>
via a context
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-context>,"

"keys used between" sounds awkward to me (it conflates identity with
reference). How about "shared among"?

1.1 I think this characterization of JSON-LD is incorrect:
"a serialization of Linked Data in JSON."
From what I'm reading, JSON-LD is a method for encoding linked data
within JSON documents and generating RDF from them. While it's possible
to create JSON-LD documents that are serializations of linked data, the
focus of this document presents JSON-LD as a superset of RDF. Many
things about JSON-LD rely on document scope, and a JSON-LD document can contain
much more than just the RDF within. You've probably gone over this point
many times before, but JSON-LD seems to be much more about authoring or
incrementally creating Linked-Data-ready JSON than it is about writing
out Linked Data as JSON.

2. Design Goals
Expressiveness: Repetitive use of 'to be able to express.' You'll want
to reword one of those. My sense is that syntax expresses a graph, but
graphs don't express a data model.

Zero-edits: You have a missing reference "(see )."

5. Basic concepts
A note on 'serialization' above -- dereferencing contexts makes JSON-LD
really different from other serializations of RDF. Perhaps that's why
you've shied away from the term "RDF." Maybe only documents that are
fully expanded/dereferenced actually conform to RDF. It means that
without the ability to dereference a context, the JSON-LD document has
different data in it than it would were the context fully realized.

5.2 I find the introduction of relative IRIs disorienting here. It's
taken up later in the document, but not completely; this paragraph has
the only mention of "base IRI" in the document, and the reference to
'directory path' seems to just muddy the issue further. In general the
interaction between relative IRIs and other terms seems to be a
difficult part of this document to understand. As an example, it would
seem that using @vocab would rid a document of relative IRIs -- you
might want to state that explicitly as a #5 at the end of this section:
"unmatched terms are relative IRIs".

6 Advanced Concepts

6.1 "vocabulariess" typo.

On Compact IRIs, it surprises me that this is part of the normative
section. I can see why it is, but nonetheless it might be useful to
point out why a separate syntax is part of this document, as opposed to
an updated version of CURIE. (Please disregard this comment if I'm
being silly).

If a prefix:suffix pattern is not matched in the context, is it a
relative IRI? (in 6.3 this is prohibited - we have a hole)
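
To make the question concrete, here is a small Python sketch of
prefix:suffix expansion. The function name expand_compact_iri and the "ex"
prefix are hypothetical. The unmatched branch is exactly the open question:
the sketch assumes, as one possible reading, that the string is left
unchanged and treated as if it were already an absolute IRI.

```python
def expand_compact_iri(value, context):
    """Expand prefix:suffix against a dict of prefix -> IRI mappings."""
    prefix, sep, suffix = value.partition(":")
    if not sep:
        # No colon at all, so this is not a prefix:suffix pattern.
        return None
    if prefix in context:
        return context[prefix] + suffix
    # Assumption: an unmatched prefix means the value is returned as-is
    # rather than being resolved as a relative IRI.
    return value


context = {"foaf": "http://xmlns.com/foaf/0.1/"}
matched = expand_compact_iri("foaf:name", context)
unmatched = expand_compact_iri("ex:thing", context)
```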

6.2 "native JSON type such as number
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-number>,
true
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-true>,
or false
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-false>."
Shouldn't this read "number or boolean"? True and false aren't types but
values.

"A value type specifies the unit of measurement" This wording seems
wrong. A date isn't a unit of measurement but it's still a range. I
can't think of a better way of putting this though. Also, I've never
thought of 'meters' as a value type. I'd use a decimal-typed number to
represent meters. Something is wrong with this notion.

6.3 You mention correctly that the homepage property is ordered in
example 21. It reads strangely because there's no mention yet in the
doc about how to order items. Just parenthetically mentioning @list
would help:

" property which explicitly represents an ordered list (with the
@container key)"

6.4
"last-defined-wins mechanism." This looks more like a "most recently
defined" mechanism, because of nested scopes. I could be
misinterpreting "last-defined-wins" though.

6.5 application/ld+json is introduced in a slightly jarring way.
Moreover, there's a MUST stipulation attached to its usage, but later in
the document its usage is MAY identify a node. I'm just confused by
this paragraph.

Does use of @language in the context mean that it will be applied to ALL
strings in the document? It looks like yes. I'd put a big warning on
this; it's risky to assume.
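
As I read it, the effect would be something like the following Python
sketch. This is my interpretation, not the spec's algorithm: the function
names are made up, and it ignores term definitions in the context that
would coerce a string to an IRI or a typed value.

```python
def tag_strings(node, language):
    """Apply a default language to every plain string property value."""
    out = {}
    for key, value in node.items():
        if key.startswith("@"):
            out[key] = value  # keywords such as @id and @type are left alone
        else:
            out[key] = _tag(value, language)
    return out


def _tag(value, language):
    if isinstance(value, str):
        return {"@value": value, "@language": language}
    if isinstance(value, list):
        return [_tag(v, language) for v in value]
    if isinstance(value, dict) and "@value" not in value:
        return tag_strings(value, language)  # nested node object
    return value  # numbers, booleans and explicit value objects are unchanged


doc = {
    "@id": "http://example.org/person",
    "name": "Hanako",
    "age": 30,
    "born": {"@value": "2010-01-01", "@type": "xsd:date"},
}
tagged = tag_strings(doc, "ja")
```

Every plain string ends up language-tagged, including ones the author may
not have thought of as natural-language text.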

6.6
Example 29 provides a method for identifying languages within key
names. I see why this works, but you might consider removing it to
encourage more uniform language-tagging practice. In other words, I'd
prefer to see just "occupation" as a key with the @container method.
I'm uncomfortable with so many ways to handle language tags, even though
what you've got is internally consistent.

Note -- "Language associations can only be applied to plain literal
strings. Typed values or values that are subject to 6.3 Type Coercion
cannot be language tagged." Does this mean that these invalid language
keys are ignored or raise an error?

6.14 Expanded Document Form and 6.15 compact form: in the API doc these are
non-normative. Perhaps you don't mean that the API doc defines them,
just refers to them?

Appendix A
I don't think a JSON-LD document serializes a collection of graphs.
Maybe you can define a subset of JSON-LD that does, however.
Restrictions on JSON-LD that make it serialized RDF might also help with
document identity/signing (no references to external contexts, no blank
node identifiers as graph names).

Just for my own edification, why MUST NOT? "A JSON-LD graph must not
contain unconnected nodes, i.e., nodes which are not connected by an
edge to any other node."

"A blank node is a node"... neither, nor, or. There's some unclear
parallelism among these conjunctions.

In Issue 217 box, please remove 'controversial' in favor of a less
controversial word.

"JSON-LD documents
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-json-ld-document>
/may/ contain data that cannot be represented by the data model
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-json-ld-data-model>
defined above. Unless otherwise specified, such data is ignored when a
JSON-LD document
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-json-ld-document>
is being processed. This means, e.g., that properties which are not
mapped to an IRI
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-iri>
or blank node
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-blank-node>
will be ignored." This statement seems to allow for nodes without
edges, but I guess the point is you won't know they're nodes in that case?

Appendix B

"All keys which are not IRIs
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-iri>,
compact IRIs
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-compact-iri>,
terms
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-term>
valid in the active context
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-active-context>,
or one of the following keywords
<https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#dfn-keyword>
/must/ be ignored when processed:" This points to some problem with the
concept of a relative IRI again.

I don't understand B.4. Like Sandro I feel that there's something amiss
with data indexing. It looks suspiciously like @rdf:resource.

I really appreciate the effort put into 'flattened view' and think it
should be foregrounded in the main body of the document. It's even more
important than compaction I think.

B6 - must a list + set contain objects of all the same type? You might
want to be explicit about an error if so.

I appreciate all of the examples in Appendix D a lot.

That wraps up what I've to say overall. It was a pleasure to review
this document.

Charles

--
Charles Greer
Senior Engineer
MarkLogic Corporation
charles.greer@marklogic.com
Phone: +1 707 408 3277
www.marklogic.com

Received on Wednesday, 13 March 2013 18:38:47 UTC