RE: More review of JSON-LD syntax

Thanks a lot Charles!

I’ve created ISSUE-230 [1] to keep track of this. I will make sure that all
discussion is also mirrored to the RDF WG mailing list.

[1] https://github.com/json-ld/json-ld.org/issues/230


--
Markus Lanthaler
@markuslanthaler



------ Original message -----
From: Charles Greer [mailto:cgreer@marklogic.com] 
Sent: Wednesday, March 13, 2013 7:38 PM
To: W3C RDF WG
Subject: More review of JSON-LD syntax

Hi all,

This email is a review of JSON-LD-SYNTAX as of 3/13/2013

https://dvcs.w3.org/hg/json-ld/raw-file/default/spec/latest/json-ld-syntax/index.html#data-indexing

Overall:

The document presents the syntax in a reasonably clear way.  The one
exception to this is the intersection of terms, absolute IRIs, compact IRIs
and relative IRIs.  In particular, one might wish to rethink the use of
relative IRIs in here at all; they seem to be confusing or problematic every
time they come up, and don't seem to add to JSON-LD in any significant way.
I've noted these places below.

Until I came to flattening, I thought that JSON-LD was subject to a lot of
the same problems as RDF/XML.  My concern had to do with manipulating
structures as JSON - if there are a lot of ways to represent something, then
one gets into a lot of issues with finding data within the structure. 
Flattening seems to get rid of most of those concerns - it should probably
be foregrounded as a good canonical representation if you can go that far.
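
To make the point about flattening concrete, here is a minimal sketch in
Python. It is not the flattening algorithm from the JSON-LD API document; it
assumes the input is already expanded (no @context), ignores @graph, and uses
hypothetical names (flatten_sketch, _collect). It only shows why a flat list of
nodes keyed by @id is easier to search than arbitrarily nested JSON.

```python
import itertools

# Keys that mark a dict as a value/list/set wrapper rather than a node object.
NON_NODE_KEYS = ("@value", "@list", "@set")


def _collect(node, out, counter):
    """Move `node` and all nested node objects into `out`; return the node's @id."""
    # Nodes without an @id get a generated blank node label (an assumption here).
    node_id = node.get("@id") or "_:b%d" % next(counter)
    flat = out.setdefault(node_id, {"@id": node_id})
    for key, value in node.items():
        if key == "@id":
            continue
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, dict) and not any(k in item for k in NON_NODE_KEYS):
                # Nested node object: hoist it to the top level, keep a reference.
                item = {"@id": _collect(item, out, counter)}
            flat.setdefault(key, []).append(item)
    return node_id


def flatten_sketch(doc):
    """Return every node object in `doc` as a flat list sorted by @id."""
    out = {}
    _collect(doc, out, itertools.count())
    return sorted(out.values(), key=lambda n: n["@id"])
```

With this shape, finding a node is a lookup by @id at one level, however deeply
it was nested in the original document.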

Otherwise this review is mainly editorial nits:

The Nits:

"a way to disambiguate the keys used between multiple JSON documents by
mapping them to IRIs via a context,"

"keys used between" sounds awkward to me (conflates identity with reference);
how about "shared among"?

1.1  I think this characterization of JSON-LD is incorrect:
"a serialization of Linked Data in JSON."  
From what I'm reading, JSON-LD is a method for encoding linked data within
JSON documents and generating RDF from them.  While it's possible to create
JSON-LD documents that are serializations of linked data, the focus of this
document presents JSON-LD as a superset of RDF.  Many things about JSON-LD
rely on document scope, and a JSON-LD document can contain much more than just the
RDF within.  You've probably gone over this point many times before, but
JSON-LD seems to be much more about authoring or incrementally creating
Linked-Data-ready JSON than it is about writing out Linked Data as JSON.

2. Design Goals
Expressiveness:  Repetitive use of 'to be able to express.'  You'll want to
reword one of those.  My sense is that syntax expresses a graph, but graphs
don't express a data model.

Zero-edits:  You have a missing reference "(see )."

5. Basic concepts
A note on 'serialization' above -- dereferencing contexts makes JSON-LD
really different from other serializations of RDF.  Perhaps that's why
you've shied away from the term "RDF."  Maybe only documents that are fully
expanded/dereferenced actually conform to RDF.  It means that without the
ability to dereference a context, the JSON-LD document has different data in
it than it would were the context fully realized.

5.2 I find the introduction of relative IRIs disorienting here.  It's taken
up later in the document, but not completely; this paragraph has the only
mention of "base IRI" in the document, and the reference to 'directory path'
seems to just muddy the issue further.  In general the interaction between
relative IRIs and other terms seems to be a difficult part of this document
to understand.  As an example, it would seem that using @vocab would rid a
document of relative IRIs -- you might want to state that explicitly as a #5
at the end of this section: "unmatched terms are relative IRIs".
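
The interaction between terms, compact IRIs, absolute IRIs, @vocab and relative
IRIs can be sketched as a single decision function. This is only an
illustration of the reading above, not the normative algorithm: the rule order,
the label strings, and the name classify_key are assumptions, and context
values are assumed to be plain strings.

```python
def classify_key(key, context):
    """Return (kind, expanded) for a JSON key under a simplified context."""
    if key.startswith("@"):
        return ("keyword", key)
    if key in context:
        # A term defined directly in the context.
        return ("term", context[key])
    if ":" in key:
        prefix, suffix = key.split(":", 1)
        if suffix.startswith("//"):
            # Looks like scheme://..., so never treat it as prefix:suffix.
            return ("absolute IRI", key)
        if prefix in context:
            return ("compact IRI", context[prefix] + suffix)
        # Assumption: an unmatched prefix is read as an IRI scheme.
        return ("absolute IRI", key)
    if "@vocab" in context:
        # With @vocab, every otherwise unmatched term gets a full IRI.
        return ("vocab-relative", context["@vocab"] + key)
    # The proposed rule #5: unmatched terms are relative IRIs.
    return ("relative IRI", key)
```

Under these assumptions, once @vocab is present the last branch can never be
reached, which is the behaviour the section could state explicitly.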

6 Advanced Concepts

6.1 "vocabulariess" typo.  

On Compact IRIs, it surprises me that this is part of the normative
section.  I can see why it is, but nonetheless it might be useful to point
out why a separate syntax is part of this document, as opposed to an updated
version of CURIE.  (Please disregard this comment if I'm being silly).

If a prefix:suffix pattern is not matched in the context, is it a relative
IRI? (in 6.3 this is prohibited - we have a hole)

6.2 "native JSON type such as number, true, or false." Shouldn't this read
"number or boolean"?  true and false aren't types but values.

"A value type specifies the unit of measurement"  This wording seems wrong. 
A date isn't a unit of measurement but it's still a range.  I can't think of
a better way of putting this though.  Also, I've never thought of 'meters'
as a value type.  I'd use a decimal-typed number to represent meters. 
Something is wrong with this notion.

6.3 You mention correctly that the homepage property is ordered in example
21.  It reads strangely because there's no mention yet in the doc about how
to order items.  Just parenthetically mentioning @list would help:

" property which explicitly represents an ordered list (with the @container
key)"

6.4
"last-defined-wins mechanism."  This looks more like a "most recently
defined" mechanism, because of nested scopes.  I could be misinterpreting
"last-defined-wins" though.
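
A small sketch of the scoping reading described above, in Python. It assumes
contexts are plain term-to-IRI maps and that a nested @context applies only
inside the object that carries it; expand_keys is a hypothetical helper, not
part of any JSON-LD library.

```python
def expand_keys(node, inherited=None):
    """Replace terms with IRIs using the context in scope at each nesting level."""
    # Start from the enclosing scope, then let the local @context override it.
    active = dict(inherited or {})
    active.update(node.get("@context", {}))
    out = {}
    for key, value in node.items():
        if key == "@context":
            continue
        iri = active.get(key, key)
        if isinstance(value, dict):
            # The nested object sees the merged context; siblings do not.
            value = expand_keys(value, active)
        out[iri] = value
    return out
```

The override is visible only inside the nested object; the outer definition is
back in force once that object ends, which is why "most recently defined" may
describe the behaviour better.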

6.5  application/ld+json is introduced in a slightly jarring way.  Moreover,
there's a MUST stipulation attached to its usage, but later in the document
the wording is only that it MAY identify a node.  I'm just confused by this
paragraph.

Does use of @language in the context mean that it will be applied to ALL
strings in the document?  It looks like yes.  I'd put a big warning on this;
it's risky to assume.
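
To illustrate the concern, here is a hedged Python sketch of what a default
language would do if it is applied to every plain string. It assumes a flat
document, ignores type coercion and term-level overrides, and uses the
hypothetical name tag_plain_strings; real processors handle more cases.

```python
def tag_plain_strings(doc):
    """Attach the context's default @language to every plain string value."""
    default_language = doc.get("@context", {}).get("@language")
    out = {}
    for key, value in doc.items():
        if key.startswith("@"):
            # Keywords such as @context and @id are not language-tagged.
            continue
        if isinstance(value, str) and default_language is not None:
            value = {"@value": value, "@language": default_language}
        # Numbers, booleans and already-expanded values pass through unchanged.
        out[key] = value
    return out
```

Every untyped string picks up the tag, including ones the author may not have
thought of as natural-language text.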

6.6
Example 29 provides a method for identifying languages within key names.  I
see why this works, but you might consider removing it to encourage more
uniform language-tagging practice.  In other words, I'd prefer to see just
"occupation" as a key with the @container method.  I'm uncomfortable with so
many ways to handle language tags, even though what you've got is internally
consistent.

Note -- "Language associations can only be applied to plain literal strings.
Typed values or values that are subject to 6.3 Type Coercion cannot be
language tagged."  Does this mean that these invalid language keys are
ignored or raise an error?

6.14 Expanded Document Form and 6.15 Compact Form:  in the API doc these are
non-normative.  Perhaps you don't mean that the API doc defines them, just
refers to them?

Appendix A
I don't think a JSON-LD document serializes a collection of graphs.  Maybe
you can define a subset of JSON-LD that does, however.  Restrictions on
JSON-LD that make it serialized RDF might also help with document
identity/signing (no references to external contexts, no blank node
identifiers as graph names)

Just for my own edification, why MUST NOT? "A JSON-LD graph must not contain
unconnected nodes, i.e., nodes which are not connected by an edge to any
other node."

"A blank node is a node"... neither, nor, or.  There's some unclear
parallelism among these conjunctions.

In Issue 217 box, please remove 'controversial' in favor of a less
controversial word.

"JSON-LD documents may contain data that cannot be represented by the data
model defined above. Unless otherwise specified, such data is ignored when a
JSON-LD document is being processed. This means, e.g., that properties which
are not mapped to an IRI or blank node will be ignored."  This statement
seems to allow for nodes without edges, but I guess the point is you won't
know they're nodes in that case?

Appendix B

"All keys which are not IRIs, compact IRIs, terms valid in the active
context, or one of the following keywords must be ignored when processed:"
This points to some problem with the concept of a relative IRI again.

I don't understand B.4.  Like Sandro I feel that there's something amiss
with data indexing.  It looks suspiciously like @rdf:resource.

I really appreciate the effort put into 'flattened view' and think it should
be foregrounded in the main body of the document.  It's even more important
than compaction I think.

B6 - must a list + set contain objects of all the same type?  You might want
to be explicit about an error if so.

I appreciate all of the examples in Appendix D a lot.

That wraps up what I've to say overall.  It was a pleasure to review this
document.

Charles

-- 
Charles Greer
Senior Engineer
MarkLogic Corporation
charles.greer@marklogic.com
Phone: +1 707 408 3277
www.marklogic.com

Received on Wednesday, 13 March 2013 19:09:57 UTC