RE: regarding action 241 [review JSON-LD 1.0 Processing Algorithms and API] from Markus Lanthaler on 2013-04-01 (public-rdf-wg@w3.org from April 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Mon, 1 Apr 2013 15:40:55 +0200
To: <public-rdf-wg@w3.org>
Cc: "'Zhe Wu'" <alan.wu@oracle.com>
Message-ID: <00d401ce2ede$8a720a30$9f561e90$@lanthaler@gmx.net>
On Monday, April 01, 2013 6:22 AM, Zhe Wu wrote:

> Hi,
> 
> I have scanned through the following document a few times.
> 
> http://json-ld.org/spec/latest/json-ld-api/index.html
> 
> I feel that it is difficult for me to conduct an effective review of
> the algorithms explained
> in this document. It's quite obvious that the authors and editors have
> put in an impressive
> amount of work in this document and they have produced quality
> contents.
> One problem is, however, that the algorithm outlines are a bit flat and
> a bit too long in
> structure. Take the Expansion algorithm for example, the print out (even
> with a very small
> font) spans across three pages. There are many, many lines of bullets in
> this algorithm alone.
> Given such a structure, I don't think I have an effective way to
> validate the soundness or completeness of this algorithm.

First of all, thanks for your efforts. I've created ISSUE-236:

https://github.com/json-ld/json-ld.org/issues/236

Are the algorithms really that difficult to understand? Did you try to read
them? We provide a short, high-level description of what's supposed to
happen for each algorithm. After reading that, it should be clear what's
supposed to happen. I agree that the algorithms themselves are long, but that's
because we wanted to be very precise.


> It is probably a good idea, in my opinion, to re-organize the
> algorithms in a more modularized fashion. Each module may take
> half a page or 3/4 of a page (or less than 30~40 lines long).
> Such a re-organization will likely make reviews easier. More
> importantly, it will be easier and less error-prone for
> end-users and vendors to understand and implement the algorithms.
> A couple of high-level architecture diagrams will also help.

That might be true, but at the same time you will have to jump between
different algorithms. You will have to remember what data was (possibly)
passed and what data might come back from a (sub)algorithm. It's of course
always a trade-off. In a lot of places we preferred to structure the
algorithm using if-blocks with indented sub-steps. In a complete
implementation you might wanna put those statements in a separate function,
but when reading it I generally find it easier if everything is in one
place.
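
To make the trade-off concrete, here is a toy sketch. It is not one of the
spec's algorithms; the classifier and all function names are made up purely
for illustration. Both variants compute the same thing, one with nested
if-blocks in place and one with the branch moved into a sub-function.

```python
def classify_inline(value):
    """Single function: every branch is visible in one place."""
    if isinstance(value, dict):
        if "@value" in value:
            return "value object"
        if "@list" in value:
            return "list object"
        return "node object"
    return "scalar"


def _classify_dict(value):
    # Sub-algorithm: the reader has to track what is passed in and what
    # comes back while jumping between the two functions.
    if "@value" in value:
        return "value object"
    if "@list" in value:
        return "list object"
    return "node object"


def classify_modular(value):
    """Same logic, with the dict branch factored out."""
    if isinstance(value, dict):
        return _classify_dict(value)
    return "scalar"


samples = [{"@value": "x"}, {"@list": []}, {"@id": "http://example.com/a"}, 42]
inline_results = [classify_inline(s) for s in samples]
modular_results = [classify_modular(s) for s in samples]
```

The modular variant is what an implementation would probably look like; the
inline variant is closer to how the spec text reads.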


> If the authors and editors are willing to re-organize the algorithm
> outlines, I will be happy to review those algorithms in depth.

I doubt that such a reorganization will improve things, but I am open to
being convinced to the contrary.


> I noticed a few issues along the way. Most of them are editorial in
> nature.
> 
> - In Abstract,
>    "make them easier to work with"  ==> "make them easy to work with"

The abstract has already been changed based on Sandro's feedback.


> 
> - In Section 2,
>    "outlines a syntax that may be used to express Linked Data in JSON"
> 
>    This sounds a bit too soft. The following might be more precise.
>    "defines a syntax to express Linked Data in JSON"

Fixed in 5fbf4af [1]


> - In 2.1
>    "express a property and array to ..."==>
>    "express a property and an array to ..."
> 
>    "against the examples provided above results in" ==>
>    "against the above examples results in" ==>

Fixed in 5fbf4af [1]


> - Section 3 Conformance states that
>    "This specification does not define how JSON-LD Implementations ...
> handle non-conforming input documents.
>    This implies that JSON-LD Implementations ... MUST NOT attempt to
> correct ..."
> 
>    It is normal that a SPEC does not define behaviors for out-of-scope
> or illegal input.
>    I fail to see it implies that a specific implementation must not do
> something extra to handle illegal input.

We will likely drop the first sentence because the algorithms do handle all
corner cases. However, we'll keep the second sentence (MUST NOT attempt).
This ensures that the outcome of every implementation is the same. Specific
implementations are of course free to handle illegal input (malformed IRIs
and language-tags) but that has to be done at a different layer.
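
As a rough sketch of what "a different layer" could look like: the
processor below is only a stand-in that passes its input through untouched,
and the repair function, the regular expression (a simplification, not full
BCP 47), and all names are assumptions made for this example.

```python
import re

# Simplified well-formedness check for language tags (assumption: this is
# NOT a complete BCP 47 validator, just enough for the illustration).
_LANG_TAG = re.compile(r"^[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*$")


def conforming_process(doc):
    """Stand-in for a conforming processor: it never repairs its input."""
    return dict(doc)


def repair_layer(doc):
    """Separate, application-specific layer that may fix up illegal input."""
    fixed = dict(doc)
    lang = fixed.get("@language")
    if isinstance(lang, str) and not _LANG_TAG.match(lang):
        # Example repair: replace underscores, e.g. "en_GB" becomes "en-GB".
        fixed["@language"] = lang.replace("_", "-")
    return fixed


raw = {"@value": "colour", "@language": "en_GB"}
strict = conforming_process(raw)                  # input left as it was
lenient = conforming_process(repair_layer(raw))   # repaired before processing
```

Because the repair happens before the processor is invoked, two conforming
processors given the same (repaired or unrepaired) input still agree on the
output.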


> - In Section 4, I got the feeling that an edge (property) can be
> labeled with a bNode identifier.
>    Is this a good idea? If so, for what applications?

This has been discussed several times now. JSON-LD is more liberal than RDF
in this regard. I don't wanna reopen this discussion here again. If you want
to do so, please create a separate thread. Thanks for your understanding.


> - "A compact IRI is has the form" =>
> "A compact IRI has the form"

Fixed in 5fbf4af [1]


> - In Section 6.1,
>    "Then we normalize the form the passed local context to an array."
> =>
>    "Then we normalize the form of the passed local context to an
> array."

Fixed in 5fbf4af [1]


> - In Section 7.2
>    "whose values is the result of ..." =>
>    "whose values are the result of ..."

Fixed to "whose value is" in 5fbf4af [1]


> - Section 9.1 describes Flattening Algorithm states "This resulting
> uniform shape
>    of the document, may drastically simplify the code required to
> process JSON-LD data..."
> 
>    This makes me wonder, out of personal preference, isn't N-TRIPLE the
> ultimate uniform
>    (and consistent) shape?

It is similarly uniform (and consistent) but can't, obviously, be parsed as
JSON.
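
For illustration, a hand-written document in the shape flattening is meant
to produce (it was not generated by a processor, and the IRIs are examples)
can be consumed with nothing more than a JSON parser and one uniform loop:

```python
import json

# Hand-written example resembling flattened output: a top-level array of
# node objects whose property values are always arrays.
flattened = """
[
  {"@id": "http://example.com/alice",
   "http://xmlns.com/foaf/0.1/name": [{"@value": "Alice"}],
   "http://xmlns.com/foaf/0.1/knows": [{"@id": "http://example.com/bob"}]},
  {"@id": "http://example.com/bob",
   "http://xmlns.com/foaf/0.1/name": [{"@value": "Bob"}]}
]
"""

nodes = json.loads(flattened)  # a plain JSON parser is sufficient

names = {}
for node in nodes:
    # No special cases: every node is at the top level and every property
    # value is a list, so the same loop works for all of them.
    for value in node.get("http://xmlns.com/foaf/0.1/name", []):
        names[node["@id"]] = value["@value"]
```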


> - Section 10.4,
>    "... in the form of triples or triples" =>
>    "... in the form of triples" =>

Has already been fixed based on Sandro's feedback.


> - Section 11 talks about API. It may be a good idea to move API before
> all those detailed algorithms.
>    I suspect there will be more users who are interested in APIs than
> in those algorithms.

I completely agree and that's what we did initially. When we brought the
spec into the RDF WG, concerns were raised that the RDF WG shouldn't
standardize APIs and the algorithms should be the main aspect of the spec.
We tried to find a compromise because we (the JSON-LD CG) believe that an
API is crucial for adoption, and the result is that the API has been moved to
the end of the document.


> - 11.1,
>   "The JSON-LD Processor interface is the high-level ..." =>
>   "The JSON-LD Processor interface is a high-level ..." =>

No change. In this context (the JSON-LD API) I believe the "the" is more
appropriate. Here's the complete sentence:

"The JSON-LD Processor interface is the high-level programming structure
that developers use to access the JSON-LD transformation methods."


> - Section 11 uses crosses and checks for Nullable and Optional
> settings.
> It is a bit more formal to
>    use TRUE/FALSE, YES/NO, or just Y/N.

This is generated completely by ReSpec and is, as such, consistent with what
a lot of other W3C specifications do. I don't think we'll need to change
this.


> - 11.2, is it possible to combine multiple callbacks (to save round
> trips)?

No. I'm not sure I understand what you mean by round-trips; perhaps you mean
invocations. The JsonLdCallback is called exactly once per operation; the
LoadContextCallback is called once per remote context (it would technically
be possible to pass it a number of remote contexts to resolve at once, but
that would just complicate things); and the ContextLoadedCallback is called
once, as soon as a remote context has been loaded or an error was detected
while loading it.
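
The invocation counts can be sketched with a small simulation. This is not
the API itself: the Python function names, their argument order, and the
in-memory stand-in for the network are all assumptions made for this
example, and the "operation" does nothing except resolve contexts.

```python
# Hypothetical Python stand-ins for the callbacks discussed above.
# An in-memory dict replaces the network so the sketch runs offline.
REMOTE_CONTEXTS = {"http://example.com/a": '{"@context": {}}'}

calls = {"load_context": 0, "context_loaded": 0, "json_ld": 0}


def load_context(url, context_loaded):
    """Stand-in for LoadContextCallback: invoked once per remote context."""
    calls["load_context"] += 1
    if url in REMOTE_CONTEXTS:
        context_loaded(None, url, REMOTE_CONTEXTS[url])
    else:
        context_loaded("loading failed", url, None)


def run_operation(remote_context_urls, load_context_cb, json_ld_cb):
    """Toy operation that only resolves contexts and then reports back."""
    loaded, errors = {}, []

    def context_loaded(error, url, context):
        # Stand-in for ContextLoadedCallback: invoked once per load attempt,
        # whether it succeeded or failed.
        calls["context_loaded"] += 1
        if error is None:
            loaded[url] = context
        else:
            errors.append((url, error))

    for url in remote_context_urls:
        load_context_cb(url, context_loaded)

    # Stand-in for JsonLdCallback: invoked exactly once per operation.
    if errors:
        json_ld_cb(errors[0], None)
    else:
        json_ld_cb(None, loaded)


results = []


def on_done(error, document):
    calls["json_ld"] += 1
    results.append((error, document))


run_operation(
    ["http://example.com/a", "http://example.com/missing"], load_context, on_done
)
print(calls)
```

With two remote contexts, the two context-related callbacks each fire twice
and the final callback fires once, regardless of whether a load failed.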

Does this address your question?


> - 11.3,
>    "the value of input if it is a IRI"=>
>    "the value of input if it is an IRI"

Fixed in 5fbf4af [1]


> - For JsonLDErrorCode, it may be a good idea to assign numeric error
> codes to be used together with
>    descriptive error codes. A developer or a user can then say "hey, I
> got a 500!"

That might be true. In the end, the result will be the same. Robin Berjon
reviewed the API spec [2] some time ago and advocated the use of descriptive
strings (instead of constant-like strings such as INVALID_VALUE). In the end,
it's a personal preference, I would say.
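
If an application really wants numbers, it can layer them on top of the
descriptive strings itself. A minimal sketch, in which the two strings and
the numeric values are invented for illustration and are not taken from the
spec:

```python
from enum import Enum


class ErrorCode(str, Enum):
    # Descriptive strings (illustrative only, not the spec's actual list).
    INVALID_VALUE = "invalid value"
    LOADING_FAILED = "loading remote context failed"


# Application-level numeric aliases (hypothetical numbering).
NUMERIC = {ErrorCode.INVALID_VALUE: 1, ErrorCode.LOADING_FAILED: 2}


def describe(code):
    """Combine the numeric alias with the descriptive string."""
    return f"{NUMERIC[code]}: {code.value}"


message = describe(ErrorCode.INVALID_VALUE)
```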


Thanks again for your efforts. Let me know if my answers resolved your
concerns (apart from the algorithm re-organization).


Cheers,
Markus

[1] https://github.com/json-ld/json-ld.org/commit/5fbf4af2d480e8ae6e4a7c59869f0696c5405cb3
[2] https://github.com/json-ld/json-ld.org/issues/200



--
Markus Lanthaler
@markuslanthaler
Received on Monday, 1 April 2013 13:41:29 UTC