Re: Request for FPWD via RDF WG of JSON-LD Syntax and API specs

On 05/24/2012 04:04 PM, Pat Hayes wrote:
> Sorry if this is rather late, but I have only now read through these
>  documents (not being a JSON maven, I had thought to leave this task
>  to others.) Unfortunately there are some serious issues with the use
>  of standard terminology, which I would urge you to consider
> carefully before publication, as they could have a seriously
> deleterious affect on the understanding of some readers and will
> likely  generate a great deal of needless confusion and
> misunderstanding.

Hi Pat,

Thanks for the extensive review - much appreciated. :)

Let me start off by saying that we do think the terminology was crafted
for a particular audience that was not the RDF WG (at the time), and as a
result there is confusion that we intend to clean up as we work through
feedback from this WG.

There are three communities in mind here:

1) Web developers that need to use a form of Linked Data that does not
    require Turtle, SPARQL, or a graph store.
2) The Linked Data folks who see RDF as too complicated and desire
    something simpler.
3) RDF folks that would like a JSON serialization, but don't have
    any of the issues with bringing themselves up to speed that
    group #1 has currently.

Quite obviously, we're not going to be able to make each of these three
communities happy with every decision we have made. Additionally, we
shouldn't try to make everyone happy, but we also shouldn't create
confusion in the process.

Also note that the JSON-LD API document has not undergone extensive
review or editing for grammar and flow. We had not intended to submit the
document to this WG yet, but after the last week of discussion, it seems
that doing so would be the most rational way to proceed. So, you
shouldn't take anything in that document as a concerted effort to
undermine the RDF Concepts document. It is most likely an authoring
error due to a bleary-eyed late-night hacking session.

> The most serious is a repeated confusion of the syntactic
> distinctions such as subject, object and property with semantic
> notions such as resource.

We should certainly fix occurrences of that.

> For example, in [API] 3.2, we read
>
> "blank node a blank node is a resource which is neither an IRI nor a
>  literal. Blank nodes may be named or unnamed and often take on the
> role of a variable that may represent either an IRI or a literal."
>
> This is all quite wrong. A blank node is not itself a resource. It is
> a NODE which is neither IRI nor literal, and it definitely is NOT
> itself named.  It *refers to* a resource which may be named or
> unnamed, but that is quite different from saying it IS one. Again, it
> may refer to the same thing as an IRI or literal refers to, but it
> itself does not *represent* an IRI or a literal.

Noted... Richard Cyganiak raised an issue about terminology not matching
RDF Concepts, and I've added all of your concerns there as well. We will
need to make an editorial pass to address both sets of concerns:

https://github.com/json-ld/json-ld.org/issues/131

> ------
>
> [LD] does not have such egregious misuses of terminology, but it does
> have some confusing remarks. The Introduction says:

We'll try to wordsmith this to be more accurate. The concern here was
that we didn't want to lose the general message to the reader (who is a
web developer with no idea about RDF) with specifics about RDF or
HTTPRange-14 issues or other important, but distracting details.

> Section 3.1, line 11: "A value is an object with a label that is not
>  an IRI"
>
> This is not technically wrong, but I would urge you in the strongest
>  possible terms to reconsider using the term "value" for a graph node
>  of any kind.

This terminology came out of a very long series of discussions where
folks repeatedly stated that the RDF terminology was confusing and
overly pedantic. That said, we can wordsmith this to try to find
common ground...

For example - the word "predicate" results in confused looks from Web
developers where the word "property", while less accurate, does not.
Similarly, "literal" is harder for beginner developers to understand
than "value".

Read this e-mail to understand the general tone of those discussions
about RDF vs. Linked Data:

http://lists.w3.org/Archives/Public/public-linked-json/2011Jul/0010.html

There were a series of discussions about terminology (minutes and audio
available):

http://json-ld.org/minutes/2011-07-04/
http://json-ld.org/minutes/2011-07-26/
http://json-ld.org/minutes/2011-08-08/

... that resulted in this requirements document where we define exactly
what Linked Data, Structured Data, and JSON-LD mean:

http://json-ld.org/requirements/latest/

I'm not making a particular point - just stating that the word choice
came from a desire to simplify RDF so that it was accessible to Web
developers.

> Finally, a pet peeve of mine regarding blank nodes.
>
> [LD]  3.1 says
>
> " Unlabeled nodes are not considered Linked Data."
>
> Says who? I didnt know there was a Ministry (Church?) of Linked
> Data.

Based on the definition of Linked Data that the group found consensus on:

http://json-ld.org/requirements/latest/#linked-data

Specifically:

"An IRI that is a label in a linked data graph should be dereferencable
to a Linked Data document describing the labeled subject, object or
property."

A blank node doesn't allow you to create links between documents and
since you can't create links, it doesn't fit the definition of Linked
Data that we have consensus on. However, we did find that it does
constitute Structured Data (also defined in the Requirements document).
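
To make the distinction concrete, here is a minimal sketch in Python. The
helper name is_labeled and the example.org IRI are made up for
illustration and are not part of any spec or API; it only assumes that a
node is labeled when it has an @id that is not a "_:" blank node
identifier:

```python
# Hypothetical helper for illustration only; not from the JSON-LD spec or API.
def is_labeled(node):
    """Return True if the node carries an IRI label via @id."""
    node_id = node.get("@id")
    # Identifiers starting with "_:" name blank nodes, not IRIs,
    # so other documents cannot link to them.
    return isinstance(node_id, str) and not node_id.startswith("_:")

# A labeled node: other documents can refer to it by its IRI.
labeled = {"@id": "http://example.org/people/alice", "name": "Alice"}

# An unlabeled (blank) node: structured data, but nothing to link to.
unlabeled = {"name": "Bob"}

print(is_labeled(labeled))
print(is_labeled(unlabeled))
```

Under the group's definitions, both objects are Structured Data, but only
the first would count as Linked Data.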

We needed to define these terms and what constituted a Linked Data
document and a Structured Data document because the group was spinning
its wheels on definitions for almost two months...

> But 3.1.2 says: "The example above does not use the @id keyword to
> set the subject of the node being described above. This type of node
>  is called an unlabeled node and is considered to be a weaker form of
>  Linked Data."

Yes, we should fix that.

> So, which is right? Is LD with blank nodes just weakly linked, or is
>  it totally persona non grata, and excluded from consideration by
> definitional fiat?

Much of that depends on what this group thinks as well. Here is where we
got to over the past year in the JSON-LD group:

1. Blank nodes are necessary.
2. Blank nodes are not Linked Data.
3. Blank nodes are Structured Data.
4. Structured Data is a super-set of Linked Data.

> As you can probably tell, I find this aversion to the (useful and
> harmless) notion of blank nodes rather silly, as well as simply
> false: a great deal of actual linked data does have blank nodes in
> it, and this is likely to continue and even increase as time goes on.
> It is telling that even you, in this short document, weren't able to
> conveniently avoid them.

I think you're reading too much into text that may contain editorial
issues. There are a number of people in the JSON-LD community, myself
included, that think that blank nodes are vital.

There are also a number of people in the JSON-LD community who, while
accepting that blank nodes are useful at times, do not think they
constitute Linked Data in the way the group has defined Linked Data.

> But whatever your beliefs on this topic, my point is that the
> document ought to be consistent about it.

Agreed.

I think you have made a number of excellent points - we need to fix the
issues that are fairly obvious spec bugs. We should try to stay in line
with the RDF Concepts without confusing our primary audience for these
specs, which is Web developers... not people that find themselves in
groups like the RDF WG. We'll do a round of edits and try to at least
address all of your issues as far as the JSON-LD CG is concerned and
then see if those changes are acceptable to the RDF WG.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: PaySwarm Website for Developers Launched
http://digitalbazaar.com/2012/02/22/new-payswarm-alpha/

Received on Friday, 25 May 2012 02:48:02 UTC