Re: Review: JSON-LD Syntax from Gregg Kellogg on 2012-06-25 (public-rdf-wg@w3.org from June 2012)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Sun, 24 Jun 2012 22:00:32 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: RDF-WG <public-rdf-wg@w3.org>
Message-ID: <24BB513B-41B1-452B-B992-45142F4788C5@greggkellogg.net>

Andy, thanks for your feedback! I've addressed all these issues in the working copy of the syntax spec [1]. This was marked as issue #135, and the details can be found at [2].

[1] http://json-ld.org/spec/latest/json-ld-syntax/
[2] https://github.com/json-ld/json-ld.org/issues/135#issuecomment-6538396

The text of the response is repeated here:

I've addressed these issues in the noted commits, see below.

These are official RDF Syntax review comments by Andy Seaborne (@afs) via the RDF WG:

Major:

1/ Definitions

I agree with the intention of making it accessible to the typical JSON application developer, but a narrative without clearly identified definitions means that it is difficult to look into the document to check specific details. It is also easily inconsistent as it is not clear when differentiating text is being descriptive or definitional. Example below.

I suggest keeping the syntax doc as-is and a separate formal-only document (or a separate top level section) for the times when arguing over details matters. Maybe this is a proper appendix A but I think this is more EBNF; it would not be an appendix.
Problems will inevitably come when the definitions differ. We do have an issue (#114) regarding expressing JSON-LD in EBNF, which should probably go in appendix A, which already contains an informal description of JSON-LD.

Example: the text in 4.5 and A.2 about @id are different.

4.5 The value of the @id key must be either a term, a compact IRI, or an absolute IRI.

A.2: "The value of @id must be null, a term, a compact IRI, or an IRI."
A.2 is wrong, I've updated to remove null as an acceptable value.

I actually see that one of the tests (compact-17) uses this form, which IMO is incorrect. To remove a property definition within a context, the property value should be null, not an object having an @id key which is null:

{
  "@context": [
    {
      "comment": { "@id": "http://www.w3.org/2000/01/rdf-schema#comment", "@language": "en" }
    },
    {
      "comment": { "@id": null },
      "comment_en": { "@id": "http://www.w3.org/2000/01/rdf-schema#comment", "@language": "en" }
    }
  ]
}

should be

{
  "@context": [
    {
      "comment": { "@id": "http://www.w3.org/2000/01/rdf-schema#comment", "@language": "en" }
    },
    {
      "comment": null,
      "comment_en": { "@id": "http://www.w3.org/2000/01/rdf-schema#comment", "@language": "en" }
    }
  ]
}

I've fixed this test as well. Fixed in commit 7c2b3e6.
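
The intended behavior can be sketched roughly as follows. This is a minimal illustration in Python, not the JSON-LD processing algorithm: `merge_contexts` is a hypothetical helper, and it assumes local contexts are plain dictionaries applied in order, with a null (`None`) value removing any earlier definition of that term.

```python
def merge_contexts(contexts):
    """Apply a list of local contexts in order (hypothetical helper)."""
    active = {}
    for ctx in contexts:
        for term, definition in ctx.items():
            if definition is None:
                # A null value removes any earlier definition of the term.
                active.pop(term, None)
            else:
                active[term] = definition
    return active


RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
contexts = [
    {"comment": {"@id": RDFS_COMMENT, "@language": "en"}},
    {"comment": None, "comment_en": {"@id": RDFS_COMMENT, "@language": "en"}},
]
active = merge_contexts(contexts)
```

With the corrected form, `active` keeps only the `comment_en` definition; `comment` is no longer defined.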

Example: Is this a legal JSON-LD doc:

{ "@id" : "http://example/thing" }
It is valid according to the processing rules, but does not express a triple or quad.

where do I look?
The EBNF should make this clear, something like the following:

SubjectDefinition ::= '{' OptContext DefPropertyObjectList '}'
DefPropertyObjectList ::= (PropertyObjectList ',')* '"@id"' ':' string (',' PropertyObjectList)*
PropertyObjectList ::= property ':' object (',' PropertyObjectList)*

Although, that's not LL(1).

I've updated the informal Authoring Guidelines to use the more accurate subject definition rather than JSON object to clarify this.

Fixed in commit d86ea1f.

As the document stands, sorting this out is, for me, a block on LC - too much risk of having to make a substantive change and having to restart the LC cycle.

2/ The split between basic concepts and advanced concepts did not work for me.

2a/ Integers as an advanced concept but sets and lists as basic.
Agreed, moved this to advanced concepts.

Fixed in commit dbd00cc.

2b/ Using HTTP header Link header seems very important.
This is actually a secondary usage, for taking a normal JSON document and having it interpreted as JSON-LD. The primary use does not involve the use of a describedby link header.

Other comments

Apologies that the comments are not in document order nor in priority order. In checking them I found myself having to jump about the doc to try to find definitions (see major comment). As different, and seemingly identical, pieces of text were different in the details, it got messy.

I'm sure I've got some of these comments wrong because of the difficulty in being able to find reference material and so running out of time.

3/ Is the test suite also transferring? It covers both material that is to be migrated and material that is not.

compact: 20
expand: 29
frame: 23
from-RDF: 8
to-RDF: 31
normalization: 57

168 tests; 50% (~80) of which are framing and normalization.
Given the state of the spec, the fact that we have any tests in a test suite is a pretty good thing. AFAIK, the Turtle test suite hasn't changed substantively for this version.

We need to surface the individual tests better on json-ld.org so that they can serve as examples.

At this point, only Compact, Expand, fromRDF, and toRDF suites will come over.
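
As a quick check of the figures above, the counts can be tallied as follows. This is only arithmetic on the numbers listed in the review; it assumes the per-suite counts are as quoted and that the frame and normalization suites are the ones staying behind.

```python
# Test counts per suite, as listed in the review comment.
suites = {
    "compact": 20,
    "expand": 29,
    "frame": 23,
    "from-RDF": 8,
    "to-RDF": 31,
    "normalization": 57,
}

total = sum(suites.values())
staying = suites["frame"] + suites["normalization"]  # not transferring
moving = total - staying                             # Compact, Expand, fromRDF, toRDF
share_staying = round(100 * staying / total, 1)

print(total, staying, moving, share_staying)  # prints: 168 80 88 47.6
```

So roughly 88 of the 168 tests would come over, and the framing and normalization share is a little under half.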

4/ Status of bNodes.

Where are BNode labels allowed? BNode labels don't get discussed much (fine), but some of the text that lists possible syntax forms at a given point doesn't include them.
For the purposes of this spec, we will consider an unlabeled node (or blank node) identifier to be an Absolute IRI, making a BNode legal anywhere an absolute IRI is expected.

Note that the document says this in section 4.1 on Compact IRIs:

If the prefix is an underscore (_), the IRI remains unchanged. This effectively means that every term containing a colon will be interpreted by a JSON-LD processor as an IRI

4.5 The value of the @id key must be either a term, a compact IRI, or an absolute IRI.
does not include a bNode label (unless "_:a" is an absolute IRI, which it isn't).
As noted, it's treated as an IRI within the spec.

and

A subject definition that does not contain an @id property is called an unlabeled node.
is confusing as there is another way to be an unlabeled node.
Yes, a subject definition without an @id is called an unlabeled node, and the _:a form is called an unlabeled node identifier.

It's not correct to say that a subject definition without an @id is an unlabeled node, as the subject definition merely defines node properties, where the node may be identified using @id.

I changed this to say "A subject definition that does not contain an @id property defines properties of an unlabeled node."

This is fixed in commit 9fe6282.

5/ Sec 3.1 Linking Data

We as a group need to review this section

e.g. ""A property should be labeled with an IRI.""

Are there any examples of a Linked Data document that are not RDF or which can't be viewed as RDF?
Anything which uses an unlabeled node as a type, property or datatype, but I don't think it's worth calling that out.

6/ sec 3.1.1

The Web uses IRIs for unambiguous identification. The idea is that these terms mean something that may be of use to other developers and that it is useful to give them an unambiguous identifier. That is, it is useful for terms to expand to IRIs so that developers don't accidentally step on each other's vocabulary terms.
"vocabulary term" is confusing - I read that as properties and classes, not all things. Unambiguity of things matters.
I changed this to vocabulary term or other resource.

7/ "Linked Data document" isn't a defined term

An IRI that is a label in a linked data graph should be dereferencable to a Linked Data document describing the labeled subject, object or property.
Section 3.1 on linking data starts off by saying:

Linked Data is a set of documents, each containing a representation of a linked data graph.
From this, I think a reasonable interpretation of "Linked Data document" comes from this set. As it's only used a couple of times, I'm not sure it warrants its own definition.

and datatype?
Changed to type in 5daa9c7.

8/ It's big.

The Syntax doc is as big as RDF/XML by page count currently. I know this has been said before but the concept of "JSON API" leads me to expect something shorter.

The change in ReSpec means it prints badly. It grew 6 pages for me just on the ReSpec change. I know that isn't in the CG control but it does not help the impression that it's a big spec. It is very bad in the JSON-API doc - the method descriptions are forced into cols of about 10 chars.
This has been discussed elsewhere, and is something that should be further discussed before LC. No specific action in the document at this point.

9/ Compact IRIs

Terms are interpreted as compact IRIs if they contain at least one colon and the first colon is not followed by two slashes (//, as in http://example.com).
Why are http URIs handled differently to URNs?

"urn:isbn:978-0-521-87625-4"
"urn:uuid:7962241c-2a01-11b2-8057-b443860cde7a"
"og:video:type"
"_:a"
This comes from work in RDFa, where it was found that the unintentional definition of a prefix which was the same as an IRI scheme could cause unexpected behavior. This mechanism was adopted to resolve that issue, so we're consistent with RDFa.

But BNode labels are IRIs: In "Compact IRIs" it says:

If the prefix is an underscore (_), the IRI remains unchanged.
Again, pretty much the same as how RDFa handles CURIEs.
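
The rule as quoted can be sketched roughly like this. It is a minimal illustration, not the spec's expansion algorithm: `expand_iri` is a hypothetical helper, the context is assumed to be a plain mapping from prefix to IRI string, and falling back to the unchanged value when the prefix is not defined is an assumption of this sketch.

```python
def expand_iri(value, context):
    """Expand a compact IRI against a prefix map (hypothetical helper)."""
    if ":" not in value:
        return value  # no colon, so not a compact IRI
    prefix, suffix = value.split(":", 1)  # split at the first colon only
    if suffix.startswith("//"):
        return value  # looks like scheme://..., never treated as a prefix
    if prefix == "_":
        return value  # blank node identifier, left unchanged
    if prefix in context:
        return context[prefix] + suffix
    return value  # undefined prefix: keep as-is (assumption)


context = {"og": "http://ogp.me/ns#", "http": "http://unintended.example/"}

print(expand_iri("og:video:type", context))
print(expand_iri("http://example.com", context))
print(expand_iri("urn:isbn:978-0-521-87625-4", context))
print(expand_iri("_:a", context))
```

Here only `og:video:type` is rewritten. The accidental `http` prefix in the illustrative context does not affect `http://example.com`, which is the case the two-slashes check is meant to protect; a URN would only be rewritten if someone defined `urn` as a prefix.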

10/ Sec 3.3: example:

This looks exactly like the situation in the previous section around "homepage".
A complete example would be better.
Expanded based on previous example in c4a65a7.

12/

The value of a @graph property must be null, an IRI, or a JSON object.
Was a compact IRI also intended? I assume so but it does not say that. Another way to put it, when is the spec language about syntax and when is it about concepts?
This should be subject definition or array of zero or more subject definitions. We removed the option to use an IRI earlier.

Fixed in 2066a2c.

Ditto @context - can a @context take a compact IRI (layering of @contexts)? Maybe it's an odd case but why make it asymmetric - an implementation wants a "convert this" function, not "convert1", "convert2", etc.
"string expanding to an IRI". Note that it can be relative.

Fixed in b1b253b.

13/ Sec 4.9: Named Graphs

The definition is for "graph" not "named graph".
The first example isn't a named graph.
This is use of @graph to describe resources in the default graph.

I made this more explicit in 7d323da.

14/ The longer example in 4.9:

What is the subject of the asOf?
(If it's the graph URI, we have the problem with naming of g-snaps and g-boxes).
asOf is a property having the value of @id as a subject in the default graph.

The value of @id is also the name of the named graph.

These examples could be expanded using TriG, but we've avoided doing that so far. I added an issue marker in 54844da.

15/ I found the use of "@type" for datatypes confusing. I prefer @dtype.
The spec previously used @datatype, but the group decided to unify these to a single @type. The discussion and resolution are described here: http://json-ld.org/minutes/2012-04-24/#resolution-3.

16/ Appendix B: To and From JSON-LD

From and To what?

s/proof/evidence/
Changed to "Relationship to other RDF Formats" in 95c1f104bf991093790f7f189a9d1bc4af4f2483.

17/ Correction:

s/@subject/@id/
Thanks! That's been around a while!

Syntax error:

second example of 4.3 has several missing or misplaced commas:

{
"@context":
[
"http://json-ld.org/contexts/person.jsonld",
{
"foaf": "http://xmlns.com/foaf/0.1/"
},
"http://json-ld.org/contexts/event.jsonld" ,
Remove:^^^
] ,
Add^^^
"name": "Manu Sporny",
"homepage": "http://manu.sporny.org/",
"depiction": "http://twitter.com/account/profile_image/manusporny" ,
Add^^^
"celebrates":
{
"@type": "Event",
"description": "International Talk Like a Pirate Day",
"date": "R/2011-09-19"
}
}

and it does not use "foaf:" which is a bit confusing.
Fixed in 72223bd8803f9c39aa75cc54b3e71cabf00d06ec.

We need to add a pass which evaluates the examples to validate that they're legal. This would be simpler if we could use the script tag, but we add formatting to the examples, which makes this more difficult.
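
Such a pass might look roughly like the following. This is a minimal sketch that only checks JSON syntax, not JSON-LD validity: `find_invalid_examples` is a hypothetical helper, and it assumes the examples have already been extracted from the spec markup as plain strings with the added formatting stripped.

```python
import json


def find_invalid_examples(examples):
    """Return (index, error message) for each example that is not valid JSON."""
    problems = []
    for index, text in enumerate(examples):
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            problems.append((index, err.msg))
    return problems


# Illustrative inputs: one well-formed example, one with a stray comma
# before "]", and one with a missing comma between members.
examples = [
    '{"@context": ["http://json-ld.org/contexts/person.jsonld"], "name": "Manu Sporny"}',
    '{"@context": ["http://json-ld.org/contexts/person.jsonld", ], "name": "Manu Sporny"}',
    '{"name": "Manu Sporny" "homepage": "http://manu.sporny.org/"}',
]

problems = find_invalid_examples(examples)
for index, message in problems:
    print(f"example {index}: {message}")
```

Both kinds of comma error reported against the 4.3 example would be caught by a check of this sort.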

Received on Monday, 25 June 2012 02:01:20 UTC