Re: regarding action 241 [review JSON-LD 1.0 Processing Algorithms and API]

Hi Markus,

Please see my response inline.

On 4/1/2013 6:40 AM, Markus Lanthaler wrote:
> On Monday, April 01, 2013 6:22 AM, Zhe Wu wrote:
>
>> Hi,
>>
>> I have scanned through the following document a few times.
>>
>> http://json-ld.org/spec/latest/json-ld-api/index.html
>>
>> I feel that it is difficult for me to conduct an effective review of
>> the algorithms explained
>> in this document. It's quite obvious that the authors and editors have
>> put in an impressive
>> amount of work in this document and they have produced quality
>> contents.
>> One problem is, however, that the algorithm outlines are a bit flat and
>> a bit too long in
>> structure. Take the Expansion algorithm for example, the print out (even
>> with a very small
>> font) spans across three pages. There are many, many lines of bullets in
>> this algorithm alone.
>> Given such a structure, I don't think I have an effective way to
>> validate the soundness or completeness of this algorithm.
> First of all, thanks for your efforts. I've created ISSUE-236:

You are quite welcome!

>
> https://github.com/json-ld/json-ld.org/issues/236
>
> Are the algorithms really that difficult to understand? Did you try to read
> them? We provide a short, high-level description of what's supposed to
> happen for each algorithm. After reading that, it should be clear what's
> supposed to happen. I agree that the algorithms itself are long, but that's
> because we wanted to be very precise.

Each bullet in the algorithms was written clearly. However, once there are so many bullets, it becomes
difficult to see if there is anything missing or if there is anything redundant.


>> It is probably a good idea, in my opinion, to re-organize the
>> algorithms in a more modularized fashion. Each module may take
>> half a page or 3/4 of a page (or less than 30~40 lines long).
>> Such a re-organization will likely make reviews easier. More
>> importantly, it will be easier and less error-prone for
>> end-users and vendors to understand and implement the algorithms.
>> A couple of high-level architecture diagrams will also help.
> That might be true, but at the same time you will have to jump between
> different algorithms. You will have to remember what data was (possibly)
> passed and what data might come back from a (sub)algorithm. It's of course
> always a trade-off. In a lot of places we preferred to structure the
> algorithm using if-blocks with indented sub-steps. In a complete
> implementation you might wanna put those statements in a separate function
> but when reading it I generally find it easier if everything is in one
> place.
>

This is becoming a bit subjective I guess. For someone who has designed such an
algorithm and possibly even implemented it, a long list of details may work the best.
For others, I suspect a break-down into modules may be a bit more intuitive.

Speaking from my own experience, in enterprise software development, I rarely see an
algorithm description or a design document structured in a similar fashion.

>> If the authors and editors are willing to re-organize the algorithm
>> outlines, I will be happy to review those algorithms in depth.
> I doubt that such a reorganization will improve things but am open to being
> convinced to the contrary.

If such a reorganization incurs a big headache (for which you and other authors/editors are the judge),
then we probably can skip it :)  Looks like Sandro has reviewed the same document so I am willing
to put my trust in him ;)

Thanks,

Zhe

>
>> I noticed a few issues along the way. Most of them are editorial in
>> nature.
>>
>> - In Abstract,
>>     "make them easier to work with"  ==> "make them easy to work with"
> The abstract has already been changed based on Sandro's feedback.
>
>
>> - In Section 2,
>>     "outlines a syntax that may be used to express Linked Data in JSON"
>>
>>     This sounds a bit too soft. The following might be more precise.
>>     "defines a syntax to express Linked Data in JSON"
> Fixed in 5fbf4af [1]
>
>
>> - In 2.1
>>     "express a property and array to ..." ==>
>>     "express a property and an array to ..."
>>
>>     "against the examples provided above results in" ==>
>>     "against the above examples results in"
> Fixed in 5fbf4af [1]
>
>
>> - Section 3 Conformance states that
>>     "This specification does not define how JSON-LD Implementations ...
>> handle non-conforming input documents.
>>     This implies that JSON-LD Implementations ... MUST NOT attempt to
>> correct ..."
>>
>>     It is normal that a SPEC does not define behaviors for out-of-scope
>> or illegal input.
>>     I fail to see it implies that a specific implementation must not do
>> something extra to handle illegal input.
> We will likely drop the first sentence because the algorithms do handle all
> corner cases. However, we'll keep the second sentence (MUST NOT attempt).
> This ensures that the outcome of every implementation is the same. Specific
> implementations are of course free to handle illegal input (malformed IRIs
> and language-tags) but that has to be done at a different layer.
>
>
>> - In Section 4, I got the feeling that an edge (property) can be
>> labeled with a bNode identifier.
>>     Is this a good idea? If so, for what applications?
> This has been discussed several times now. JSON-LD is more liberal than RDF
> in this regard. I don't wanna reopen this discussion here again. If you want
> to do so, please create a separate thread. Thanks for your understanding.
>
>
>> - "A compact IRI is has the form" =>
>> "A compact IRI has the form"
> Fixed in 5fbf4af [1]
>
>
>> - In Section 6.1,
>>     "Then we normalize the form the passed local context to an array."
>> =>
>>     "Then we normalize the form of the passed local context to an
>> array."
> Fixed in 5fbf4af [1]
>
>
>> - In Section 7.2
>>     "whose values is the result of ..." =>
>>     "whose values are the result of ..."
> Fixed to "whose value is" in 5fbf4af [1]
>
>
>> - Section 9.1, which describes the Flattening Algorithm, states "This resulting
>> uniform shape
>>     of the document, may drastically simplify the code required to
>> process JSON-LD data..."
>>
>>     This makes me wonder, out of personal preference, isn't N-TRIPLE the
>> ultimate uniform
>>     (and consistent) shape?
> It is similarly uniform (and consistent) but can't, obviously, be parsed as
> JSON.
>
>
>> - Section 10.4,
>>     "... in the form of triples or triples" =>
>>     "... in the form of triples"
> Has already been fixed based on Sandro's feedback.
>
>
>> - Section 11 talks about API. It may be a good idea to move API before
>> all those detailed algorithms.
>>     I suspect there will be more users who are interested in APIs than
>> in those algorithms.
> I completely agree and that's what we did initially. When we brought the
> spec into the RDF WG concerns were raised that the RDF WG shouldn't
> standardize APIs and the algorithms should be the main aspect of the spec.
> We tried to find a compromise because we (the JSON-LD CG) believe that an
> API is crucial for adoption and the result is that the API has been moved to
> the end of the document.
>
>
>> - 11.1,
>>    "The JSON-LD Processor interface is the high-level ..." =>
>>    "The JSON-LD Processor interface is a high-level ..."
> No change. In this context (the JSON-LD API) I believe the "the" is more
> appropriate. Here's the complete sentence:
>
> "The JSON-LD Processor interface is the high-level programming structure
> that developers use to access the JSON-LD transformation methods."
>
>
>> - Section 11 uses crosses and checks for Nullable and Optional
>> settings.
>> It is a bit more formal to
>>     use TRUE/FALSE, YES/NO, or just Y/N.
> This is generated completely by ReSpec and is, as such, consistent with what
> a lot of other W3C specifications do. I don't think we'll need to change
> this.
>
>
>> - 11.2, is it possible to combine multiple callbacks (to save round
>> trips)?
> No. I'm not sure I understand what you mean by round-trips.. perhaps
> invocations. The JsonLdCallback is called exactly once per operation; the
> LoadContextCallback is called once per remote context (it would technically
> be possible to pass it a number of remote contexts to resolve at once, but
> that would just complicate things); and the ContextLoadedCallback is called
> once, as soon as a remote context has been loaded or an error has been
> detected while loading it.
>
> Does this address your question?
>
>
>> - 11.3,
>>     "the value of input if it is a IRI"=>
>>     "the value of input if it is an IRI"
> Fixed in 5fbf4af [1]
>
>
>> - For JsonLDErrorCode, it may be a good idea to assign numeric error
>> codes to be used together with
>>     descriptive error codes. A developer or a user can then say "hey, I
>> got a 500!"
> That might be true. In the end, the result will be the same. Robin Berjon
> reviewed the API spec [2] some time ago and advocated the use of descriptive
> strings (instead of constant-like strings such as INVALID_VALUE). In the end,
> it's a personal preference I would say.
>
>
> Thanks again for your efforts. Let me know if my answers resolved your
> concerns (apart from the algorithm re-organization).
>
>
> Cheers,
> Markus
>
> [1] https://github.com/json-ld/json-ld.org/commit/5fbf4af2d480e8ae6e4a7c59869f0696c5405cb3
> [2] https://github.com/json-ld/json-ld.org/issues/200
>
>
>
> --
> Markus Lanthaler
> @markuslanthaler
>
>

Received on Tuesday, 2 April 2013 18:34:08 UTC