Re: Branding? from Dave Longley on 2011-07-28 (public-linked-json@w3.org from July 2011)

From: Dave Longley <dlongley@digitalbazaar.com>
Date: Wed, 27 Jul 2011 20:27:55 -0400
To: public-linked-json@w3.org
Message-ID: <4E30AD0B.1070102@digitalbazaar.com>

On 07/27/2011 01:44 PM, Kingsley Idehen wrote:
>> In order to contribute to this branding discussion and 
>> JSON-SD/unlabeled nodes debate I'll start with what the whole point 
>> of JSON-LD is from my point of view -- and then offer my position on 
>> the various arguments.
>>
>> My view of JSON-LD is that it was a technology that was created to 
>> represent graphs in JSON in the same way that other technologies like 
>> RDF express graphs. 
>
> Yes, but RDF is but an option for creating Linked Data. It isn't the 
> only syntax for achieving the goal. The fact that the syntax is based 
> on a graph model doesn't mean it owns graph creation. The fact that it 
> allows expression of semantics doesn't give it ownership of semantic 
> expression. Sadly, the aforementioned fallacies are being pushed as 
> facts in very careless ways.

I'm unaware of who is pushing those viewpoints, but I'm certainly not. 
I apologize for giving you that impression; it was not my intention.

> Linked Data is a specific kind of structure with specific 
> characteristics. If those characteristics aren't met it's simply unwise 
> to tag the end product as being Linked Data since the net effect is 
> yet another broken narrative via yet another syntax. A new JSON syntax 
> won't fix conceptual flaws (confining semantically rich expression to 
> a specific syntax) or broken narratives.

So far, here's what I've seen on the mailing list:

1. Some objection to using the name "Linked Data" if we support 
unlabeled nodes.
2. Agreement that whatever markup we come up with ought to somehow 
include support for unlabeled nodes because they have a natural place in 
JSON data and are useful in Linked Data.
3. Concern that particular syntaxes seem to "claim ownership" over 
concepts like "Linked Data" instead of just being limited to defining a 
particular way to express Linked Data.

When I read that list, the exact name I think we ought to attach to our 
spec is "JSON with Linked Data", or "JSON-LD". The leading adjective 
"JSON" makes it clear that we're talking explicitly about how Linked 
Data is expressed using JSON syntax; we are not trying to claim 
ownership over the term itself. Since any practical JSON markup that can 
express Linked Data will also include support for unlabeled nodes, it 
means that there's no other useful "thing" that we ought to call "JSON 
with Linked Data". To me, that makes "JSON-LD" an ideal name for the 
spec that defines how you ought to work with Linked Data if you're using 
JSON. If we wanted to claim ownership over "Linked Data", we would 
simply call this spec "Linked Data", which I agree, would be a mistake.

So, let's add the JSON adjective to Linked Data and understand how that 
ties together what is natural to do in JSON with the Linked Data world: 
JSON-LD. To think of it another way, JSON-LD is the combination of JSON 
and Linked Data. I think that understanding of the name should be 
agreeable. Another suggestion that would work, but looks a bit uglier 
IMO, would be JSON+LD.

>
>> More than this, it was also intended to make it possible to 
>> contextualize JSON data (including existing JSON) such that it could 
>> be made sense of as a subgraph of the greater graph of data on the 
>> web. This is where Linked Data comes in.
>
> Yes, but Linked Data is a form of directed graph based Structured Data 
> (SD). Thus, if you want broad coverage don't start with LD, use 
> something else. Hence, the SD suggestion. JSON-SD or JSON-CD (as 
> project names at worst) gives you natural flow (from the generic to 
> the specific) that ultimately provides foundation for coherent 
> narratives and ultimately audience comprehension.

I disagree. I think those names ultimately lead to audience confusion, 
not comprehension. JSON is inherently "structured". Calling something 
"JSON Structured Data" is redundant and I would not expect it to 
engender the meaning you're looking for in the minds of most people. 
Perhaps we also shouldn't be trying to claim ownership over the term 
"structured data". The above approach would also, IMO, introduce an 
unnecessary and increased cognitive load for understanding the Venn 
Diagram behind various layers of JSON-* vernacular. Since there is 
agreement that any practical markup to support Linked Data in JSON will 
include support for unlabeled nodes, then it simply follows that this is 
exactly what "JSON with Linked Data" would entail.

> The suggestion is a Name or Moniker that implies a larger definition 
> space for constructing graphs using JSON. The goal is not to add to 
> the expensive conflation bandwagon that continues to add complexity to 
> a pretty simple concept, once all the confusion is set aside.

I think the concept of naturally working with Linked Data in JSON is a 
pretty simple concept already. Adding extra layers of names and subsets 
to conceptually model what is going on at a theoretical level doesn't 
ease the understanding of how to use Linked Data in JSON. If someone wants 
to turn their JSON data into Linked Data, they should follow a spec 
called "JSON-LD". That that spec might allow for them to do other things 
that are natural in JSON and, at the very least, compatible with Linked 
Data, is not relevant to its name. That people sometimes want to use 
unlabeled nodes, because to do otherwise would betray the purity of 
their data by adding superficial, essentially meaningless information, 
has little to do with the fact that their data, when expressed in 
JSON-LD, will be largely Linked Data. That is the main point of the 
spec. Just because we make it a little more powerful and natural than 
some otherwise restrictive and less practical method shouldn't require 
us to adopt an entirely new name.

>>
>> First, I understand that some have argued that supporting unlabeled 
>> nodes (or nodes with "blank node identifiers") would result in 
>> supporting the markup of graphs that are not considered Linked Data. 
>> The main argument being that the name "JSON-LD" (JSON Linked Data) 
>> would betray the conceptual purity of "Linked Data". The reasoning 
>> behind this, as I understand it is as follows:
>>
>> In order for a graph to be Linked Data, it has been argued that its 
>> subjects (nodes that are not literal values) and edges must be 
>> resolvable to representations of their referents. This simply means 
>> that there must be a Resolver that can resolve a subject or edge to a 
>> representation of the graph that refers to it. The contrapositive for 
>> this is that if no such Resolver exists, then a graph is not Linked 
>> Data.
>
> If Names do not Resolve to Representation of their Referents it isn't 
> Linked Data.

This is not a counter-point to my argument.

> It isn't Linked Data today re. Web context and wasn't Linked Data in 
> the past when working on your local computer using any system level 
> programming language that offered you de-reference (indirection) and 
> address-of operations via language specific operators. Programmers 
> have been creating and exploiting Linked Data structures since the 
> advent of computing.
>
>>
>> The argument that a graph containing a blank node is not Linked Data 
>> has not been very explicit in my view.
>
> It's hybrid Linked Data via skolemization, bottom line. The real issue 
> boils down to the "deceptively simple" doctrine where you provide the 
> simplest entry point into a nuanced realm of subjective complexity.
> Do we want skolemization at the front door? I don't think so.

If we're talking about the simplest entry point in a realm of subjective 
complexity, then the answer is that the naming of anything in Linked 
Data is "skolemization". It is perfectly simple to say that all Linked 
Data is skolemized to some degree. It isn't "simpler" to draw the 
"Linked Data line" after the manual naming of subjects but before the 
automatic naming of subjects, but such an approach is subjective. But 
picking the simplest entry point might not be the best idea.

So it seems like we're picking and choosing where the lines are drawn. 
In that case, whether or not we want skolemization "at the front 
door" actually depends on the target syntax (and what you 
mean by "front door"). If you mean "clearly supported in the spec" then, 
in the case of JSON, using unlabeled nodes is perfectly natural and 
sometimes more desirable than forcing an IRI where it adds no meaning. 
But this just reinforces my point that JSON-LD is not an attempt to 
"own" Linked Data, but rather an attempt to define how you can express 
Linked Data naturally in JSON. If by "front door" you meant "in your 
face" to a Web developer, then I agree with you, we don't want it there.

> Blank nodes (as Richard explains nicely) ultimately degrade Linked 
> Data meshes.
>
>> To try and make it more explicit, here is what it appears to me to 
>> be: An HTTP Resolver can resolve HTTP subject URIs, but if a subject 
>> uses a blank node identifier, then an HTTP Resolver can't resolve it. 
>> Therefore, a graph with a blank node is not Linked Data.
>
> You have a local Name with local Resolution (via skolemization), 
> whereas the Web aspect of Linked Data is about a Global Data Space 
> i.e., a Web of Linked Data or Web of Data.

And, in my view, the whole point of Linked Data is to make data play 
nice on the web. That includes an effort to reduce meaningless 
information. If there exists some data that must have meaningless 
information added to it in order for it to be welcomed into the Linked 
Data realm, then that is in contradiction with the goals of Linked Data. 
I consider it a pollution of the web to add "Global Data Space IRIs" to 
subjects that have no business being accessed at a global level. 
Instead, there is absolutely nothing wrong with a model where you begin 
at a global level, but may drill down at a local level in the few cases 
where this is appropriate. This concept has worked in computer science 
for decades.

>
>>
>> It is possible that I've misunderstood the argument, but that's what 
>> I've been hearing. Of course, when written out, the conclusion 
>> doesn't follow from the definition offered for Linked Data. In fact, 
>> a much simpler Resolver than an HTTP Resolver could be devised for 
>> blank node identifiers: All it must do is return the local graph that 
>> it is a part of -- where you found the blank node in the first place.
>>
>> Side note: It is worth mentioning that a blank node identifier can be 
>> a URI, even if the scheme is not HTTP. Furthermore, if an algorithm 
>> can be devised to automatically and canonically label unlabeled 
>> subjects, then that algorithm need only be added to the 
>> aforementioned Resolver in order for it to comply with the given 
>> Linked Data definition. In several current JSON-LD implementations, 
>> such an algorithm has been implemented and it is part of JSON-LD 
>> "normalization".
>>
>> At this point it could be counter-argued that the above definition of 
>> Linked Data simply needs some work in order to exclude unlabeled 
>> nodes properly. (In fact, perhaps the definition is so lacking that 
>> one could rationalize that all data is Linked Data).
>
> You can't rationalize that all Data is Linked Data.

You can if the definition is lacking. That is the only point I was 
making. The arguments against considering unlabeled nodes as some kind 
of Linked Data have fallen short because the given definitions simply do 
not preclude them from being considered such. This was my argument.

We could try to be more explicit in an effort to exclude unlabeled 
nodes, but my point is: Why are we trying to do that when we have 
already acknowledged their usefulness in JSON data -- including within 
Linked Data in JSON? Since we agree that a practical markup for 
expressing Linked Data in JSON would also include the ability to express 
unlabeled nodes, why are we having this argument? Why not just say: 
"This is the markup you should use to express Linked Data in JSON, and, 
by the way, if you have unlabeled nodes the markup won't force you to 
change your data. Since the main purpose of the markup is for expressing 
Linked Data in JSON, it is called JSON-LD". That's it. Seems simple 
enough to me.

> Trouble is making it clear can lead to confusion since blank nodes and 
> skolemization != good items for the front door of a lightweight 
> mechanism for creating graphs (or specifically Linked Data graphs) in 
> JSON.
>
> The issues really have more to do with the following aimed at Web 
> Developers, I believe:
>
> 1. What is JSON-LD?
> 2. Why is it important?
> 3. How do I use it?
>> It seems to me like this is a better approach than trying to figure 
>> out how to define Linked Data so that they have no place and so we 
>> need to rename our technology. 
> You mean spec :-)
>
>> Furthermore, this approach speaks to what seems to be another 
>> somewhat latent argument here against unlabeled nodes:
>>
>> There seems to be some concern that if people are able to use blank 
>> nodes, then they will abuse them.
>
> No, they'll be confused if skolemization algorithms hit them at the 
> front door.

Ok, now I think I understand what you meant by "front door" before. I'm 
not suggesting that a Web developer has to see a skolemization algorithm 
at the front door, far from it. I'm suggesting that anyone who wants to 
check graph equality, diff two graphs, or digitally sign a graph to 
ensure non-repudiation will have the ability to do so. I'm also 
suggesting that anyone who wants to put an unlabeled node in their JSON 
data because it simply makes sense to do so -- can do it. Furthermore, 
they might *never* have to skolemize it. That's where framing comes in. 
This pushes all of the skolemization into the back corner where only an 
expert might have to know something about it. But it's there to cover 
the cases where it's needed. It also means that the average Web developer 
can create an appropriate unlabeled node and not have to worry about 
their JSON-LD processor rejecting their data.
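
To illustrate why graph equality needs some kind of relabeling step when 
unlabeled nodes are present, here is a minimal sketch. It is not the 
JSON-LD normalization algorithm; it is a brute-force check that only 
suits tiny graphs. The triple representation, the function names, and the 
example data are all assumptions made for illustration, and any term 
starting with "_:" is assumed to be a blank node identifier.

```python
from itertools import permutations

def blank_labels(triples):
    """Collect the blank node identifiers used as subject or object."""
    return sorted({term for s, _, o in triples
                   for term in (s, o) if term.startswith("_:")})

def relabel(triples, mapping):
    """Rename blank nodes according to mapping; other terms are untouched."""
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in triples}

def graphs_equal(a, b):
    """True if some one-to-one renaming of a's blank nodes turns a into b."""
    a, b = set(a), set(b)
    labels_a, labels_b = blank_labels(a), blank_labels(b)
    if len(a) != len(b) or len(labels_a) != len(labels_b):
        return False
    # Brute force: try every pairing of blank node labels (factorial cost).
    return any(relabel(a, dict(zip(labels_a, perm))) == b
               for perm in permutations(labels_b))

# The same data, written twice with different automatically assigned labels.
g1 = {("http://example.com/people/dave", "address", "_:a"),
      ("_:a", "city", "Blacksburg")}
g2 = {("http://example.com/people/dave", "address", "_:z"),
      ("_:z", "city", "Blacksburg")}

naive_equal = (g1 == g2)             # plain set comparison sees different labels
relabeled_equal = graphs_equal(g1, g2)
```

A canonical labeling algorithm avoids the factorial search by giving both 
graphs the same labels up front, after which a plain comparison, diff, or 
signature works. The author of the data never has to see any of this.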

> What is the graph to you? Where are its boundaries?
>
> Linked Data is about a WWW of Linked Data. The Web's Global Data Space 
> dimension. Every Name (irrespective of URI scheme) has to resolve 
> to a Representation of its Referent that is accessible from an Address.

Like I argued, I can devise a simple Resolver that resolves a blank node 
URI to a Representation of its Referent. However, it would not be at the 
so-called "Global Data Space" dimension, but I addressed this issue above.
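
As a rough sketch of what I mean by a simple Resolver (hypothetical names 
throughout, not taken from any JSON-LD implementation): given a blank node 
identifier, it returns the local graph in which that identifier was found. 
The graph is assumed to be a list of (subject, predicate, object) triples, 
and blank node identifiers are assumed to start with "_:".

```python
BLANK_PREFIX = "_:"  # assumed convention for blank node identifiers

class LocalResolver:
    """Resolves a blank node identifier to the local graph that contains it."""

    def __init__(self, triples):
        self.triples = list(triples)

    def resolve(self, identifier):
        if not identifier.startswith(BLANK_PREFIX):
            # Anything else belongs to some other Resolver (e.g. an HTTP one).
            raise ValueError("not a blank node identifier: " + identifier)
        known = {term for s, _, o in self.triples for term in (s, o)}
        if identifier not in known:
            raise KeyError(identifier)
        # The representation is simply the graph where the node was found.
        return self.triples

# Illustrative data only.
graph = [
    ("http://example.com/people/dave", "address", "_:b0"),
    ("_:b0", "city", "Blacksburg"),
]
resolver = LocalResolver(graph)
representation = resolver.resolve("_:b0")
```

The resolution is deliberately local: the identifier only means something 
inside the graph it came from, which is the local drill-down case 
described above.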

>
>> That may be covered by the recent adoption of the "SHOULD" text when 
>> talking about labeling nodes, but perhaps it could be clearer.
>>
>> I expect users of JSON-LD to encounter situations where they think 
>> they should be using unlabeled nodes. They shouldn't get the 
>> impression that they must abandon JSON-LD altogether if this 
>> happens -- or that there's no solution to their use case in the 
>> specification. I also don't think that a JSON-LD processor that is 
>> generating triples or normalized JSON-LD should fail someone who is 
>> contextualizing all of their JSON data and some of it simply needs to 
>> use unlabeled nodes.
>>
>> All of this being said, if we still feel the need to adopt a new name 
>> I can live with that. I just want to see that we don't cut support 
>> for unlabeled nodes and would prefer that their use not be 
>> discouraged, but rather put in its appropriate place.
>
> Yes, re. putting skolemization in its appropriate place i.e., not the 
> front door of a lightweight spec for construction of graph based data 
> representation using JSON :-)

I agree that skolemization algorithms shouldn't be at the front door. 
But I don't think that I'm arguing for that and I don't think that 
calling the spec "JSON-LD" would require this.

>
>
> Kingsley
>
>>
>> On 07/25/2011 11:08 PM, Manu Sporny wrote:
>>> JSON-SD doesn't really roll off of the tongue... neither did JSON-LD 
>>> or RDFa. HTML is only used because it's been around forever... but 
>>> it's a pretty crappy brand name. Any thoughts on what this 
>>> technology should be called as we ready it for public consumption?
>>>
>>> I was thinking: Structure
>>>
>>> "Structure allows you to express Linked Data in JSON"
>>>
>>> Yes, I realize that isn't entirely accurate, but tag-lines rarely 
>>> are accurate. Thoughts on branding the technology so that it's easy 
>>> to drop into a conversation without scaring Web developers away or 
>>> making people feel as if the conversation is going to take a scary 
>>> turn toward geek-speak?
>>>
>>> -- manu
>>>
>>
>>
>
>


-- 
Dave Longley
CTO
Digital Bazaar, Inc.
Received on Thursday, 28 July 2011 00:28:23 UTC