RE: sandro's review of json-ld-api from Markus Lanthaler on 2013-03-29 (public-rdf-wg@w3.org from March 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Fri, 29 Mar 2013 12:43:38 +0100
To: "'Sandro Hawke'" <sandro@w3.org>, "'W3C RDF WG'" <public-rdf-wg@w3.org>
Message-ID: <00b301ce2c72$a8ea62f0$fabf28d0$@lanthaler@gmx.net>
On Friday, March 29, 2013 4:58 AM, Sandro Hawke wrote:

I reviewed:
JSON-LD 1.0 Processing Algorithms and API

Thanks a lot for the review, Sandro. I've created ISSUE-234 [1] for it. Sorry
for the HTML formatting but this was the easiest way to reuse my GitHub
comment.

Non-editorial points:

1) I'm concerned about the restriction on lists of lists. I don't like the
idea that some RDF graphs can't be serialized in JSON-LD. I could see how
compacting them could be hard (nested type information...?) but why not at
least allow them in expanded form? 

This restriction exists only if you want to use arrays to represent the list
(@list). It does not exist if you represent the list as a linked list (I
assume that's what you mean by expanded form), i.e., a set of blank nodes
with first/rest properties. We have been discussing (ISSUE-75 [2]) whether
we allow identifiers for @list but decided to not do that. So currently the
only way around that restriction is really to represent the list as a set of
interlinked node objects.
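
To illustrate that workaround, here is a minimal sketch (not taken from the
spec) that encodes a nested Python list as interlinked node objects using
rdf:first / rdf:rest instead of @list. The helper name to_linked_list and the
blank node labels (_:b0, _:b1, ...) are made up for this illustration.

```python
import itertools

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def to_linked_list(items, counter=None, nodes=None):
    """Encode a (possibly nested) Python list as rdf:first/rdf:rest cells.

    Returns (head, nodes): `head` is a node reference to the first cell
    (or rdf:nil for an empty list) and `nodes` holds every generated cell.
    Hypothetical helper, not part of any JSON-LD API.
    """
    counter = counter if counter is not None else itertools.count()
    nodes = nodes if nodes is not None else []
    head = {"@id": RDF + "nil"}
    # Build from the tail so each cell can point at the already-built rest.
    for item in reversed(items):
        if isinstance(item, list):
            # A nested list becomes a reference to its own chain of cells.
            first, _ = to_linked_list(item, counter, nodes)
        else:
            first = {"@value": item}
        cell_id = "_:b%d" % next(counter)
        nodes.append({
            "@id": cell_id,
            RDF + "first": [first],
            RDF + "rest": [head],
        })
        head = {"@id": cell_id}
    return head, nodes


# A list of lists, which the array-based @list form cannot express.
head, nodes = to_linked_list([[1, 2], [3]])
```

No @list keyword appears in the result, so the lists-of-lists restriction
does not apply to it; the price is that the data is far less pleasant to
work with than a plain array.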

Suggested fix: let's at least make this restriction At Risk, add some test
cases, and see how implementers fare with it. We don't even need to modify
the algorithms in the spec; we can just say "In the interest of space and
simplicity, the steps necessary for handling lists of lists have been
omitted. Such lists and their elements must, recursively, be handled like
other lists. NOTE this is an AT RISK feature. The Working Group might either
require handling of lists-of-lists or forbid them in JSON-LD. Implementers
please send reports of whether you are able to implement handling for
lists-of-lists or would instead request such structures be disallowed."

Sounds like a reasonable thing to do. 

2) The conformance classes don't seem quite right. Every "JSON-LD
Implementation" has to implement conversion to and from RDF? I don't really
see a need to force them to do that (and I don't think they will). Every
"JSON-LD Processor" has to be written in JavaScript (or some other language
for which a WebIDL binding currently exists)? That seems like a rather
counter-intuitive use of the word "processor"....

Suggested fix:
A JSON-LD Processor is a system which can perform the Expansion, Compaction,
and Flattening operations. JSON-LD Processors providing interfaces to
languages for which W3C Recommended WebIDL bindings exist ?MUST?SHOULD? use
the API defined in this specification [etc].

A JSON-LD Processor With RDF Conversion is a JSON-LD Processor that can also
perform Conversion to RDF and Conversion from RDF.

+1 very good point. Would like to hear the opinion of more people before I
make the change. I think we nevertheless would like to keep the two products
(Implementations and Processors). Naming is always difficult and we
discussed this extensively. I would be ok with changing Implementations to
Processors and Processors to something else. The problem is that I can't
think of a good name for a "Processor exposing the specified JSON-LD API". 

(Note that WebIDL is still in CR; I've just asked what we're supposed to do
about that.)

OK. There are heaps of W3C specs using WebIDL, so that shouldn't be a
problem I think.

3) In Conformance it says:

This specification does not define how JSON-LD Implementations or Processors
handle non-conforming input documents. This implies that JSON-LD
Implementations or Processors MUST NOT attempt to correct malformed IRIs or
language tags; however, they MAY issue validation warnings.

But, um, no, I don't think it does imply that. If you don't say how systems
are to handle non-conforming input documents, then they are free to handle
it however they want, including by "repairing" them in various ways. If
you're forbidding repairing IRIs or language tags, then you're very much
saying how systems have to handle non-conforming input documents. Which is
it?

Good point. What we tried to say here was that IRIs and language tags
are not checked, not even in "well-formed" input documents. I think we could
simply drop the first part of this sentence; the algorithms validate the
input and throw errors if the input is non-conforming (except IRIs and
language tags).  <https://github.com/dlongley> @dlongley do you agree? 

Editorial points:

title: JSON-LD 1.0 Processing Algorithms and API 

Having read it now, I think I would title it "JSON-LD Operations" and have
the shortname be "json-ld-ops". The given algorithms are one way of
specifying the operations, but the key thing is the operations themselves,
not the particular algorithms used. I wouldn't mention the API in the title,
because it's kind of a natural thing to include with the operations, so it
doesn't need to be in the title. I don't really expect you to take this
advice, given how much is invested in the current framing, but I thought I
should share it.

What should I say? Maybe quoting Phil Karlton is the best I can do here :-) 

"There are only two hard things in Computer Science: cache invalidation and
naming things."

This document outlines an Application Programming Interface and 
s/outlines/specifies/

s/an/a WebIDL/ 

a set of algorithms for programmatically transforming JSON-LD documents to
make them easier to work with in programming environments like those that
use JavaScript, Python, and Ruby.
I couldn't understand the end of this sentence until I came back to it
later. How about just:

"a set of operations for transforming JSON-LD documents into forms suitable
for different uses. "

I've rewritten the abstract recently. Could you please have a look at
<http://json-ld.org/spec/FCGS/json-ld-api/20130328/> and tell me if it is any
better? Thanks

How about another sentence like, "This document is a companion to [JSON-LD]
which should be read first."

It would be OK to me to add it, however, we already have the following
sentence in the introduction: "You must also understand the JSON-LD Syntax
[JSON-LD]"

The way JSON-LD allows Linked Data to be expressed in a way that is
specifically tailored to a particular person or application is by providing
context.

Awkward sentence. How about: JSON-LD uses "contexts" to allow Linked Data
to be expressed in a way that is specifically tailored to a particular
person or application.

Definitely clearer. Fixed in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

Similarly, another algorithm can be specified to subsequently apply any
context.

This is a very confusing sentence. I wonder if it wouldn't be helpful to
introduce a term like context-free. I dunno....

Would like to hear more opinions on this. 

localizing all information

This was utterly baffling until after I'd finished reading this section. I
suggest just dropping this phrase

Done in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

above mapped http://xmlns.com/foaf/0.1/nam to name

missing an "e"

Fixed in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

Please note that the flattened and compacted result always explicitly
designates the default graph by the @graph member in the top-level JSON
object.

Difficult sentence. Took me about four tries to parse it. How about:
Please note that the result of flattening and compaction is always a JSON
object which contains an @graph key whose value is the default graph.

Fixed in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

While order is preserved in regular JSON arrays, it is not in regular
JSON-LD arrays unless specific markup is provided (see ).

s/markup/guidance/ (I don't think json data is "markup")

the "see" link is missing. I expect you mean:
http://www.w3.org/TR/json-ld-syntax/#sets-and-lists

Changed to "While order is preserved in regular JSON arrays, it is not in
regular JSON-LD arrays unless specifically defined (see Sets and Lists in
the JSON-LD specification [JSON-LD])" in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388. 

A a set of rules for interpreting a JSON-LD document as specified in The
Context of the

s/A a/

s/The Context/the Context/ (maybe?)

Fixed in
<https://github.com/json-ld/json-ld.org/commit/e2c6783845e945d2e5fedd11a56a1c4b50acb901> e2c6783 and
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

blank node
A node in a JSON-LD graph that does not contain a de-referenceable
identifier

s/de-referenceable/global-scope/ or something like that. Consider the case
of tag: or urn:uuid: URIs, which are not de-referenceable but also would
make a node be non-blank.

Changed to "A node in a JSON-LD graph that is neither an IRI, nor a JSON-LD
value, nor a list." in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

in the JSON-LD Syntax specification [JSON-LD]
in The Context of the [JSON-LD] specification.

of the JSON-LD syntax specification [JSON-LD].

Fixed in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

to the syntax defined in [RFC3987].
language tag as defined by [BCP47]

It'd be nice to use a consistent style. Officially, W3C specs are supposed
to use this style:
... in JSON-LD 1.0 [JSON-LD] ...

http://www.w3.org/2001/06/manual/#citation

but it's not enforced and personally I think it's okay to just say "as
defined by [BCP47]" instead of "as defined by > Tags for Identifying
Languages [BCP 47]".

So I'd just say "in [JSON-LD]", I think.

I will try to make this consistent in both specs. 

A JSON object is a node object if it exists outside of the JSON-LD context
and: 
. it does not contain the @value, @list, or @set keywords, or
. it is not the top-most JSON object in the JSON-LD document consisting of
no other members than @graph and @context.

Wow. That's a serious IQ-test sentence.

I'm not sure this needs to be defined, but if it does, how about breaking it
down, like:

"Every JSON object in JSON-LD is classified as exactly one of: a node
object, a value object, a list object, a value object, a graph object, a
context, or ... [whatever else there might be]."

It's written this way because there's the "default graph object". I will
leave it as is for the time being. Please protest if you think this is
important to fix.

General Solution (many times in the document)

This term really threw me off, and doesn't seem right. I think you mean
"Algorithm Overview" or "Algorithm Summary" or "Algorithm Sketch". Since
it's always in an Algorithm section it could just be "Sketch" or "Informal
Summary".

Would be fine with changing it to something else. I like "Algorithm
Overview", "Informal Summary" or also just "Overview". Would like to hear
the opinion of more people before I make a change. 

Issue 217
RDF does not currently allow a blank node identifier to be used as a graph
name.

This shouldn't be an issue any more, should it? How about making it a NOTE, and
adding another line about how JSON-LD Processors can convert such blank nodes
to IRIs as per http://www.w3.org/TR/rdf11-concepts/#section-skolemization
if they need to produce valid RDF.

Would be fine for me. The reason there's still an issue marker in there is
to avoid a potential second last call. Thoughts? 
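
For readers unfamiliar with skolemization, a rough sketch of what such a
conversion could look like follows. It assumes quads are plain
(subject, predicate, object, graph) tuples with None for the default graph;
the function name and the example.org authority are placeholders chosen for
illustration. The /.well-known/genid/ path is the one RDF 1.1 Concepts
suggests for Skolem IRIs.

```python
import uuid


def skolemize_graph_names(quads, authority="http://example.org"):
    """Replace blank node graph names with freshly minted Skolem IRIs.

    Hypothetical helper: `quads` is an iterable of (s, p, o, g) tuples,
    where g is None for the default graph.
    """
    mapping = {}  # blank node label -> Skolem IRI, so reuse stays consistent
    result = []
    for s, p, o, g in quads:
        if g is not None and g.startswith("_:"):
            if g not in mapping:
                mapping[g] = "%s/.well-known/genid/%s" % (
                    authority, uuid.uuid4().hex)
            g = mapping[g]
        result.append((s, p, o, g))
    return result


quads = [
    ("http://example.org/a", "http://example.org/p", "http://example.org/b", "_:g1"),
    ("http://example.org/b", "http://example.org/p", "http://example.org/c", "_:g1"),
    ("http://example.org/c", "http://example.org/p", "http://example.org/d", None),
]
skolemized = skolemize_graph_names(quads)
```

A complete version would also have to rewrite the same blank node wherever it
appears as a subject or object, which this sketch leaves out.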

(Personal aside: this restriction in RDF is in my top-10 list of mistakes
made by Working Groups I've been a part of. I do my best to put them out of
my mind, but when I'm reminded of them, .... grrrr. Oh well.)

+1

In some cases, data exists natively in the form of triples or triples

I can't quite figure out what's meant. Maybe quads?

Neither do I.  <https://github.com/gkellogg> @gkellogg ?? In the meantime I
dropped the sentence in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

10.6 Data Round Tripping

This whole section was very confusing. Maybe add a paragraph at the start
saying what you're talking about. I could never figure out if you meant
round tripping (1) from RDF to JSON-LD and back to RDF or (2) from JSON-LD
to RDF and back to JSON-LD.

Not sure I understand the difference!?

There was also a lot of duplication of XSD -- where you're spelling out the
canonical forms -- but it's not clear whether you are just rephrasing the
other spec or mean to be changing something about it. I suggest in general
it's best to not try to rephrase what other specs say.

We are just rephrasing it. Since this spec is addressing JSON developers we
wanted to spare them from having to read the XML spec. What do others think
about this? 

The bits of javascript are nice, but are they really examples? Hm.

I would say so. It's just an example of how this could be done in one specific
programming language.

Trying to make sense of this..... The point of this section seems to be to
say in going JSON->RDF you need to use the canonical form. Why would that
matter? I guess it would matter if when going from RDF->JSON you only
convert to native types when the lexical representation is in canonical
form. If that rule were in place, then I think datatypes would roundtrip
perfectly. I think. I'm not seeing that rule, though, in either this section
or the algorithm.

It is there to ensure that the result is deterministic and testing is
simplified (you can verify the result using simple string comparison).

Considering that, do you think we need to change something? 
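
As an illustration of why a canonical form makes string comparison
sufficient, here is a small sketch that turns a native number into the
canonical lexical form of xsd:double (single non-zero digit before the
decimal point where possible, no trailing zeros in the mantissa, no plus sign
or leading zeros in the exponent). The function name is made up, and infinity
and NaN are deliberately not handled.

```python
import re


def canonical_double(value):
    """Return the canonical xsd:double lexical form of a finite float.

    Illustrative helper only; INF, -INF and NaN are out of scope.
    """
    # 15 fractional digits, e.g. 1.0 -> "1.000000000000000E+00"
    mantissa, exponent = ("%1.15E" % value).split("E")
    # Drop trailing zeros but keep at least one digit after the point.
    mantissa = re.sub(r"0+$", "", mantissa)
    if mantissa.endswith("."):
        mantissa += "0"
    # int() strips the plus sign and leading zeros from the exponent.
    return "%sE%d" % (mantissa, int(exponent))


# Different native spellings of the same number yield one lexical form,
# so expected and actual output can be compared as plain strings.
same = canonical_double(5.30) == canonical_double(53e-1) == "5.3E0"
```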

When data such as decimals need to be normalized, JSON-LD authors should not
use values that are going to undergo automatic conversion. This is due to
the lossy nature of xsd:double values.

I can't quite make sense of this.

Is the word "normalized" confusing you? That's probably a leftover from the
normalization algorithm. What we are trying to say here is: if you have
decimal values (e.g. money) you shouldn't use a JSON number or an xsd:double
but a string. Maybe we can just drop this sentence!? 

When JSON-native numbers, are type coerced, lossless data round-tripping can
not be guaranteed as rounding errors might occur.

You mean in going RDF-JSON-RDF, if you have a literal like
"1.99999999999999999999999999999999E0"^^xs:double that it's likely to get
messed up while in JSON double form? That's true. But what are you saying to
do about it? How about saying RDF->JSON converters MUST leave things like
that in expanded form? Then we'd have round-tripping RDF-JSON-RDF. However,
it would break JSON-RDF-JSON round tripping, if the JSON in question had a
number like 1.999999999999999999999999999999E0 in it. (of course, many JSON
parsers would mess that up right away; that's not really our fault that we
can't round trip that.)

Yes, we mean exactly that. You should use strings instead. In most cases
this won't matter and consequently I don't think the MUST you propose makes
much sense. JSON developers want numbers and not strings. Just out of
curiosity, isn't the same true in Turtle for instance?
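
The effect under discussion is easy to reproduce. The sketch below uses
Python's standard json and decimal modules; the value objects are written out
by hand as an illustration and are not the output of any JSON-LD processor.

```python
import json
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# The literal from the example above, taken as its lexical form.
literal = "1.99999999999999999999999999999999E0"

# Parsed as a JSON-native number it is rounded to the nearest IEEE 754
# double, so the original lexical form cannot be recovered afterwards.
as_number = json.loads(literal)

# Kept as a string in a typed value object, the lexical form survives.
as_value_object = {"@value": literal, "@type": XSD + "double"}

# Decimal amounts such as money: keep them as strings and do the
# arithmetic with a decimal type rather than binary floating point.
price = {"@value": "19.99", "@type": XSD + "decimal"}
total = Decimal(price["@value"]) * 3
```

Here as_number comes out as 2.0, while the string in as_value_object is
untouched and total is exactly 59.97.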

1.	The Application Programming Interface

This API provides a clean mechanism that enables developers to convert
JSON-LD data into a a variety of output formats that are often easier to
work with

That sentence is a bit odd. How about:

This section defines an Application Programming Interface (API) using
WebIDL, so that software modules in languages for which WebIDL bindings
exist have a standard way to access a provided JSON-LD Processor. Processors
providing APIs for other languages SHOULD use an API similar to this one.

Would like to discuss this with others. I'm a bit concerned about the second
part. We had something like this in there before but dropped it based on a
feedback from Robin Berjon, see ISSUE-200 [3]. 

Pat Hayes, Sandro Hawke, and Richard Cyganiak or their input on the
specification.
s/or their/for their/

Wouldn't have expected someone actually reads that :-P
Fixed in
<https://github.com/json-ld/json-ld.org/commit/09c4388fe0edd6ae8a42170cf0223636fe5c3f29> 09c4388

That's it. I hope these comments are helpful. I'll try to check out json-ld
next, and to stay attentive if you want to talk about any of my points, so
maybe this can still be published on the 4th.

Definitely, thanks a lot Sandro

 

Cheers,

Markus

[1] https://github.com/json-ld/json-ld.org/issues/234

[2] https://github.com/json-ld/json-ld.org/issues/75

[3] https://github.com/json-ld/json-ld.org/issues/200

--

Markus Lanthaler

@markuslanthaler
Received on Friday, 29 March 2013 11:44:16 UTC