Re: Branding? from Kingsley Idehen on 2011-07-28 (public-linked-json@w3.org from July 2011)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 28 Jul 2011 00:20:49 -0400
To: public-linked-json@w3.org
Message-ID: <4E30E3A1.40309@openlinksw.com>
> On 07/27/2011 01:44 PM, Kingsley Idehen wrote:
>>> In order to contribute to this branding discussion and 
>>> JSON-SD/unlabeled nodes debate I'll start with what the whole point 
>>> of JSON-LD is from my point of view -- and then offer my position on 
>>> the various arguments.
>>>
>>> My view of JSON-LD is that it was a technology that was created to 
>>> represent graphs in JSON in the same way that other technologies 
>>> like RDF express graphs. 
>>
>> Yes, but RDF is but an option for creating Linked Data. It isn't the 
>> only syntax for achieving the goal. The fact that the syntax is based 
>> on a graph model doesn't mean it owns graph creation. The fact that 
>> it allows expression of semantics doesn't give it ownership of 
>> semantic expression. Sadly, the aforementioned fallacies are being 
>> pushed as facts in very careless ways.
>
> I'm unaware of who is pushing those view points, but I'm certainly 
> not. I apologize for giving you that impression, it was not my intention.
>
>> Linked Data is a specific kind of structure with specific 
>> characteristics. If those characteristics aren't met it's simply 
>> unwise to tag the end product as being Linked Data since the net 
>> effect is yet another broken narrative via yet another syntax. A new 
>> JSON syntax won't fix conceptual flaws (confining semantically rich 
>> expression to a specific syntax) or broken narratives.
>
> So far, here's what I've seen on the mailing list:
>
> 1. Some objection to using the name "Linked Data" if we support 
> unlabeled nodes.
> 2. Agreement that whatever markup we come up with ought to somehow 
> include support for unlabeled nodes because they have a natural place 
> in JSON data and are useful in Linked Data.
> 3. Concern that particular syntaxes seem to "claim ownership" over 
> concepts like "Linked Data" instead of just being limited to defining 
> a particular way to express Linked Data.
>
> When I read that list, the exact name I think we ought to attach to 
> our spec is "JSON with Linked Data", or "JSON-LD". The leading 
> adjective "JSON" makes it clear that we're talking explicitly about 
> how Linked Data is expressed using JSON syntax; we are not trying to 
> claim ownership over the term itself. 

I am saying that when you give something a Name the idea is that it 
conveys meaning, ideally with clarity. The trouble with JSON-LD is that 
it implies a spec for constructing Linked Data using JSON.

> Since any practical JSON markup that can express Linked Data will also 
> include support for unlabeled nodes, it means that there's no other 
> useful "thing" that we ought to call "JSON with Linked Data". To me, 
> that makes "JSON-LD" an ideal name for the spec that defines how you 
> ought to work with Linked Data if you're using JSON. If we wanted to 
> claim ownership over "Linked Data", we would simply call this spec 
> "Linked Data", which I agree, would be a mistake.

"Linked Data" is the end product of a specific kind of directed graph 
based structure. This structure can be used as a powerful vehicle for 
whole data representation.
>
> So, let's add the JSON adjective to Linked Data and understand how 
> that ties together what is natural to do in JSON with the Linked Data 
> world: JSON-LD. To think of it another way, JSON-LD is the combination 
> of JSON and Linked Data. I think that understanding of the name should 
> be agreeable. Another suggestion that would work, but looks a bit 
> uglier IMO, would be JSON+LD.

You are treating Linked Data and graphs as the same thing.

>
>>
>>> More than this, it was also intended to make it possible to 
>>> contextualize JSON data (including existing JSON) such that it could 
>>> be made sense of as a subgraph of the greater graph of data on the 
>>> web. This is where Linked Data comes in.
>>
>> Yes, but Linked Data is a form of directed graph based Structured 
>> Data (SD). Thus, if you want broad coverage don't start with LD, use 
>> something else. Hence, the SD suggestion. JSON-SD or JSON-CD (as 
>> project names at worst) gives you natural flow (from the generic to 
>> the specific) that ultimately provides foundation for coherent 
>> narratives and ultimately audience comprehension.
>
> I disagree. I think those names ultimately lead to audience confusion, 
> not comprehension. JSON is inherently "structured". 

Of course JSON is inherently structured; so are XML, HTML, ASN.1, and 
many others. That doesn't mean adding "-LD" to the end of each delivers 
Linked Data. Remember, they too are all mechanisms for graph construction. 
And don't do the Tree vs Graph thing, since a Tree is a rooted graph.

> Calling something "JSON Structured Data" is redundant and I would not 
> expect it to engender the meaning you're looking for in the minds of 
> most people. 

Yes, there is tautology in play there for sure. But I thought we had 
moved on to JSON-CD or even JSON-*D?

Why is "LD" so vital beyond the fact (as I've claimed repeatedly) that 
this is an awkward attempt to ride the Linked Data wave?

> Perhaps we also shouldn't be trying to claim ownership over the term 
> "structured data". 

I don't think a spec claims ownership.

When I criticize RDF overreach, for instance, I am not criticizing the 
spec. I am criticizing the confusion that RDF (an implementation option) 
brings to the much simpler concept of Linked Data.

When I criticize JSON-LD it's for the very same reasons outlined above 
i.e., killing confusion that arises from conflation.


> The above approach would also, IMO, introduce an unnecessary and 
> increased cognitive load for understanding the Venn Diagram behind 
> various layers of JSON-* vernacular. 

I don't think so at all. We need more Venn Diagrams as collateral for 
helping people understand concepts and boundaries.

> Since there is agreement that any practical markup to support Linked 
> Data in JSON will include support for unlabeled nodes, then it simply 
> follows that this is exactly what "JSON with Linked Data" would entail.

It is supposed to be about JSON as a vehicle for constructing and 
serializing Linked Data. I don't quite get the "practical markup to 
support Linked Data in JSON" statement. Are other markups (including 
those that are JSON based) for creating Linked Data impractical then?

>
>> The suggestion is a Name or Moniker that implies a larger definition 
>> space for constructing graphs using JSON. The goal is not to add to 
>> the expensive conflation bandwagon that continues to add complexity 
>> to a pretty simple concept, once all the confusion is set aside.
>
> I think the concept of naturally working with Linked Data in JSON is a 
> pretty simple concept already. 

See my comment above. The spec was supposed to be about constructing 
Linked Data using JSON based markup.

> Adding extra layers of names and subsets to conceptually model what is 
> going on at a theoretical level doesn't ease the understanding how to 
> use Linked Data in JSON. 

Why on earth do you deem any view contrary to yours (or Manu's for 
that matter) as "theoretical" or "impractical"? Dare I ask about your 
ownership of "pragmatism" and all things goal oriented and achievable 
i.e., practical?

> If someone wants to turn their JSON data into Linked Data, they should 
> follow a spec called "JSON-LD". 

No problem there. But the "LD" should mean "Linked Data" rather than 
inject confusion.

> That that spec might allow for them to do other things that are 
> natural in JSON and, at the very least, compatible with Linked Data, 
> is not relevant to its name. 

In my world view Names matter a lot. I prefer unambiguity over ambiguity 
via Naming conventions, especially when the goal is to be "deceptively 
simple".

> That people sometimes want to use unlabeled nodes, because to do 
> otherwise would betray the purity of their data by adding superficial, 
> essentially meaningless information, has little to do with the fact 
> that their data, when expressed in JSON-LD, will be largely Linked Data. 

Put differently, the issue is about how many levels of indirection from 
Name to the Representation of its Referent. You have <Name> --> 
<Address> --> [Actual Data that Represents the Referent of the Name]. 
Everything has a Name. Every Name Resolves. That's what Linked Data is 
about. That doesn't mean you can't construct other kinds of data 
representations using directed graphs. Linked Data is very specific and 
you are refusing to accept its specificity.
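
As a rough illustration of that chain (a sketch only: the lookup tables, 
the example.org URLs, and the resolve function are hypothetical stand-ins 
for real Web dereferencing, not taken from any spec):

```python
# Hypothetical in-memory stand-ins for the Web. A Name maps to an Address,
# and an Address maps to a document describing one or more Names.
NAME_TO_ADDRESS = {
    "http://example.org/people/alice#this": "http://example.org/people/alice",
}
ADDRESS_TO_DATA = {
    "http://example.org/people/alice": {
        "http://example.org/people/alice#this": {"name": "Alice"},
    },
}


def resolve(name):
    """Follow <Name> --> <Address> --> data that represents the referent."""
    address = NAME_TO_ADDRESS.get(name)
    if address is None:
        return None  # the Name does not resolve anywhere
    document = ADDRESS_TO_DATA.get(address, {})
    return document.get(name)


print(resolve("http://example.org/people/alice#this"))  # a description
print(resolve("_:b0"))  # a blank node label has no Address in this table
```

In this toy setup the blank node label fails only because nothing outside 
its own document knows about it, which is the kind of gap being argued 
over here.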

> That is the main point of the spec. Just because we make it a little 
> more powerful and natural than some otherwise restrictive and less 
> practical method shouldn't require us to adopt an entirely new name.

Clearly I don't agree with the comments above, as already explained.

>
>>>
>>> First, I understand that some have argued that supporting unlabeled 
>>> nodes (or nodes with "blank node identifiers") would result in 
>>> supporting the markup of graphs that are not considered Linked Data. 
>>> The main argument being that the name "JSON-LD" (JSON Linked Data) 
>>> would betray the conceptual purity of "Linked Data". The reasoning 
>>> behind this, as I understand it is as follows:
>>>
>>> In order for a graph to be Linked Data, it has been argued that its 
>>> subjects (nodes that are not literal values) and edges must be 
>>> resolvable to representations of their referents. This simply means 
>>> that there must be a Resolver that can resolve a subject or edge to 
>>> a representation of the graph that refers to it. The contrapositive 
>>> for this is that if no such Resolver exists, then a graph is not 
>>> Linked Data.
>>
>> If Names do not Resolve to Representation of their Referents it isn't 
>> Linked Data.
>
> This is not a counter-point to my argument.

If no such Resolver exists then you don't have Linked Data delivered by 
the graph type you are proposing.

Again, Linked Data is specific. That doesn't mean that Linked Data is 
the only option for data representation using directed graphs. It just 
means that when you use the phrase "Linked Data" we have very specific 
meaning and expectations.

>
>> It isn't Linked Data today re. Web context and wasn't Linked Data in 
>> the past when working on your local computer using any system level 
>> programming language that offered you de-reference (indirection) and 
>> address-of operations via language specific operators. Programmers 
>> have been creating and exploiting Linked Data structures since the 
>> advent of computing.
>>
>>>
>>> The argument that a graph containing a blank node is not Linked Data 
>>> has not been very explicit in my view.
>>
>> It's hybrid Linked Data via skolemization, bottom line. The real issue 
>> boils down to the "deceptively simple" doctrine where you provide the 
>> simplest entry point into a nuanced realm of subjective complexity.
>> Do we want skolemization at the front door? I don't think so.
>
> If we're talking about the simplest entry point in a realm of 
> subjective complexity, then the answer is that the naming of anything 
> in Linked Data is "skolemization". It is perfectly simple to say that 
> all Linked Data is skolemized to some degree.

Yes, but without blank nodes in the conversation we don't even introduce 
the term: skolemization.

> It isn't "simpler" to draw the "Linked Data line" after the manual 
> naming of subjects but before the automatic naming of subjects, but 
> such an approach is subjective. But picking the simplest entry point 
> might not be the best idea.

I beg to differ. "Deceptively Simple" is all about picking the simplest 
entry point into a realm that is a little more complex than presumed. If 
you need a real world example then revisit the WWW and its exploitation 
of AWWW. Also look at the HTTP protocol, it too is "Deceptively Simple".

"Deceptively Simple" works cos it scales. "Simply Simple" doesn't work 
cos it doesn't scale.

What kind of "simple" do you and Manu seek re. JSON-LD?

>
> So it seems like we're picking and choosing where the lines are drawn. 

Yes, we both are :-)

> In that case, I think whether or not we want skolemization "at the 
> front door", I believe, actually depends on the target syntax (and 
> what you mean by "front door").

Front door is the apex of a value pyramid. A conventional pyramid depicts 
"deceptively simple". An inverted pyramid depicts "simply simple" i.e., 
sneeze and it will keel over.

> If you mean "clearly supported in the spec" then, in the case of JSON, 
> using unlabeled nodes is perfectly natural and sometimes more 
> desirable than forcing an IRI where it adds no meaning. 

I don't understand how an IRI that resolves to the representation of its 
referent can in any way be construed as being devoid of meaning. "I am 
who I am" dates back to biblical times :-)

> But this just reinforces my point that JSON-LD is not an attempt to 
> "own" Linked Data, but rather an attempt to define how you can express 
> Linked Data naturally in JSON.

It isn't an attempt to "own". It's an attempt to "associate" with Linked 
Data in a way that doesn't actually add value in the form of clarity 
about the core concept. Basically, it muddies the waters when that's the 
very last thing it needs.

> If by "front door" you meant "in your face" to a Web developer, then I 
> agree with you, we don't want it there.

Yes, that's what I mean. Web developers are human and they prefer cures 
over prescriptions.

>
>> Blank nodes (as Richard explains nicely) ultimately degrade Linked 
>> Data meshes.
>>
>>> To try and make it more explicit, here is what it appears to me to 
>>> be: An HTTP Resolver can resolve HTTP subject URIs, but if a subject 
>>> uses a blank node identifier, then an HTTP Resolver can't resolve 
>>> it. Therefore, a graph with a blank node is not Linked Data.

Problematic to Linked Data (a specific kind of directed graph), but 
acceptable under the right circumstances re. data representation via 
directed graphs in general.

The key point still boils down to Linked Data being a specific kind 
of directed graph based structure. Its expectations are very clear.

>>
>> You have a local Name with local Resolution (via skolemization), 
>> whereas the Web aspect of Linked Data is about a Global Data Space 
>> i.e., a Web of Linked Data or Web of Data.
>
> And, in my view, the whole point of Linked Data is to make data play 
> nice on the web. 

Linked Data as espoused in TimBL's meme is about how you add a Data 
Space dimension to the Web, without disruptions that arise when you 
compromise AWWW. The information space dimension of the WWW does have 
Names (in the form of Addresses/URLs) that resolve. Thus, the same has 
to hold true re. the data space dimension of the WWW albeit with finer 
granularity where Names and Addresses are distinct.

> That includes an effort to reduce meaningless information. If there 
> exists some data that must have meaningless information added to it in 
> order for it to be welcomed into the Linked Data realm, then that is 
> in contradiction with the goals of Linked Data. 

Yes, and I don't understand how an IRI that resolves to the 
Representation of its Referent is meaningless. And that applies to any 
WWW dimension btw.

> I consider it a pollution of the web to add "Global Data Space IRIs" 
> to subjects that have no business being accessed at a global level. 

If something shouldn't be accessed at any level you protect it with 
ACLs. That's what WebID facilitates.

> Instead, there is absolutely nothing wrong with a model where you 
> begin at a global level, but may drill down at a local level in the 
> few cases where this is appropriate. 

Yes, and you can do that with ACLs. Assuming "local" is about stuff that 
doesn't need to be globally accessible.

> This concept has worked in computer science for decades.

Yes, not disputing that.

>
>>
>>>
>>> It is possible that I've misunderstood the argument, but that's what 
>>> I've been hearing. Of course, when written out, the conclusion 
>>> doesn't follow from the definition offered for Linked Data. In fact, 
>>> a much simpler Resolver than an HTTP Resolver could be devised for 
>>> blank node identifiers: All it must do is return the local graph 
>>> that it is a part of -- where you found the blank node in the first 
>>> place.
>>>
>>> Side note: It is worth mentioning that a blank node identifier can 
>>> be a URI, even if the scheme is not HTTP. Furthermore, if an 
>>> algorithm can be devised to automatically and canonically label 
>>> unlabeled subjects, then that algorithm need only be added to the 
>>> aforementioned Resolver in order for it to comply with the given 
>>> Linked Data definition. In several current JSON-LD implementations, 
>>> such an algorithm has been implemented and it is part of JSON-LD 
>>> "normalization".
>>>
>>> At this point it could be counter-argued that the above definition 
>>> of Linked Data simply needs some work in order to exclude unlabeled 
>>> nodes properly. (In fact, perhaps the definition is so lacking that 
>>> one could rationalize that all data is Linked Data).
>>
>> You can't rationalize that all Data is Linked Data.

Please read my comments. I've never held that position. I think that's 
back to front. I am saying: Linked Data isn't the only kind of Data. 
Also saying Linked Data is just a kind of directed graph used for whole 
data representation. In no case am I implying that it's the sole option. 
I don't do "sole" anything. I just want clarity over unclarity re., what 
constitutes Linked Data.

>
> You can if the definition is lacking. That is the only point I was 
> making. The arguments against considering unlabeled nodes as some kind 
> of Linked Data have fallen short because the given definitions simply 
> do not preclude them from being considered such. This was my argument.

Resolvability via one level of indirection, at WWW scale, is vital to 
Linked Data as espoused in TimBL's meme. Drop WWW from the concept then 
we still have the issue with levels of indirection from Name to actual 
Data.

>
> We could try to be more explicit in an effort to exclude unlabeled 
> nodes, but my point is: Why are we trying to do that when we have 
> already acknowledged their usefulness in JSON data -- including within 
> Linked Data in JSON? Since we agree that a practical markup for 
> expressing Linked Data in JSON would also include the ability to 
> express unlabeled nodes, why are we having this argument? Why not just 
> say: "This is the markup you should use to express Linked Data in 
> JSON, and, by the way, if you have unlabeled nodes the markup won't 
> force you to change your data. Since the main purpose of the markup is 
> for expressing Linked Data in JSON, it is called JSON-LD". That's it. 
> Seems simple enough to me.

Yes-ish, but we have this front door matter re. skolemization. Blank 
nodes pull in skolemization.

Here is another way of looking at this matter: clickable data. When a 
user clicks on a URI they get something back. Doing that consistently at 
InterWeb scale with blank nodes in the mix is increasingly problematic 
as data moves across data spaces. This brings us back to complex 
skolemization algorithms (for cross data space consistency) that simply 
detract from the main goal i.e., follow-your-nose through a graph that 
represents the description of a subject.
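
A minimal sketch of that cross data space problem, under loud assumptions: 
graphs are modelled as plain sets of (subject, predicate, object) tuples, 
and the skolemize function, the "/genid/" path, and the a.example and 
b.example bases are all made up for illustration. Real algorithms have to 
handle far more than this.

```python
def skolemize(graph, base):
    """Swap document-local blank node labels for IRIs minted under `base`.

    `graph` is a set of (subject, predicate, object) tuples; the "/genid/"
    path is a hypothetical naming convention chosen for this sketch.
    """
    def rename(term):
        if isinstance(term, str) and term.startswith("_:"):
            return base + "/genid/" + term[2:]
        return term

    return {(rename(s), p, rename(o)) for (s, p, o) in graph}


# Two data spaces that each happen to use the same local label.
space_a = {("_:b0", "name", "Alice")}
space_b = {("_:b0", "name", "Bob")}

# A naive union treats both "_:b0" labels as one node with two names.
naive = space_a | space_b

# Relabelling per data space keeps the two nodes apart after the merge.
merged = skolemize(space_a, "http://a.example") | skolemize(
    space_b, "http://b.example"
)

print(len({s for (s, _, _) in naive}))   # distinct subjects, naive union
print(len({s for (s, _, _) in merged}))  # distinct subjects, relabelled
```

Even this toy version has to decide who mints the names and under which 
base, which is exactly the sort of machinery that does not belong at the 
front door.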

>
>> Trouble is making it clear can lead to confusion since blank nodes 
>> and skolemization != good items for the front door of a lightweight 
>> mechanism for creating graphs (or specifically Linked Data graphs) in 
>> JSON.
>>
>> The issues really have more to do with the following aimed at Web 
>> Developers, I believe:
>>
>> 1. What is JSON-LD?
>> 2. Why is it important?
>> 3. How do I use it?
>>> It seems to me like this is a better approach than trying to figure 
>>> out how to define Linked Data so that they have no place and so we 
>>> need to rename our technology. 
>> You mean spec :-)
>>
>>> Furthermore, this approach speaks to what seems to be another 
>>> somewhat latent argument here against unlabeled nodes:
>>>
>>> There seems to be some concern that if people are able to use blank 
>>> nodes, then they will abuse them.
>>
>> No, they'll be confused if skolemization algorithms hit them at the 
>> front door.
>
> Ok, now I think I understand what you meant by "front door" before. 
> I'm not suggesting that a Web developer has to see a skolemization 
> algorithm at the front door, far from it.

Yes, we both agree here.

> I'm suggesting that anyone who wants to check graph equality, diff two 
> graphs, or digitally sign a graph to ensure non-repudiation will have 
> the ability to do so.

Yes, but this doesn't have to be in a Linked Data spec., since it leads 
to problems at both the comprehension and implementation levels re., 
bootstrap.


> I'm also suggesting that anyone who wants to put an unlabeled node in 
> their JSON data because it simply makes sense to do so -- can do it.

Sure, but the spec itself doesn't have to cover that. Spec utilization 
examples could cover such matters. The context would clearly state 
"advanced use" for instance.

> Furthermore, they might *never* have to skolemize it. That's where 
> framing comes in. This pushes all of the skolemization into the back 
> corner where only an expert might have to know something about it. But 
> it's there to cover the cases where it's needed. 

Fine, as examples and tucked away in the advanced box :-)

> It also means that the average Web developer can create an appropriate 
> unlabeled node and not have to worry about their JSON-LD processor 
> rejecting their data.

It won't reject their data. But it could mess up their expectations. 
Messed up expectations ultimately lead to adoption inertia.

>
>> What is the graph to you? Where are its boundaries?
>>
>> Linked Data is about a WWW of Linked Data. The Web's Global Data 
>> Space dimension. Every Name (irrespective of URI scheme) has to 
>> resolve to a Representation of its Referent that is accessible from an 
>> Address.
>
> Like I argued, I can devise a simple Resolver that resolves a blank 
> node URI to a Representation of its Referent. However, it would not be 
> at the so-called "Global Data Space" dimension, but I addressed this 
> issue above.

Yes, but the Global Data Space dimension (the GGG basically) is what 
TimBL's Linked Data meme is aimed at.

>
>>
>>> That may be covered by the recent adoption of the "SHOULD" text when 
>>> talking about labeling nodes, but perhaps it could be clearer.
>>>
>>> I expect users of JSON-LD to encounter situations where they think 
>>> they should be using unlabeled nodes. They shouldn't get the 
>>> impression that they must abandon JSON-LD altogether if this 
>>> happens -- or that there's no solution to their use case in the 
>>> specification. I also don't think that a JSON-LD processor that is 
>>> generating triples or normalized JSON-LD should fail someone who is 
>>> contextualizing all of their JSON data and some of it simply needs 
>>> to use unlabeled nodes.
>>>
>>> All of this being said, if we still feel the need to adopt a new 
>>> name I can live with that. I just want to see that we don't cut 
>>> support for unlabeled nodes and would prefer that their use not be 
>>> discouraged, but rather put in its appropriate place.
>>
>> Yes, re. putting skolemization in its appropriate place i.e., not the 
>> front door of a lightweight spec for construction of graph based data 
>> representation using JSON :-)
>
> I agree that skolemization algorithms shouldn't be at the front door. 
> But I don't think that I'm arguing for that and I don't think that 
> calling the spec "JSON-LD" would require this.

It depends on what constitutes the spec. re. use of Linked Data.

BTW - I really thought we had closed this matter re. LD. Gregg and 
Bradley converted a lot of what I've been pushing for into a 
requirements doc [1] that I don't have issues with. I believe this 
document cleverly leaves room for blank nodes without any distracting 
Ads at the front door :-)


Links:

1. http://json-ld.org/requirements/latest/ -- JSON-LD requirements.

Kingsley
>
>>
>>
>> Kingsley
>>
>>>
>>> On 07/25/2011 11:08 PM, Manu Sporny wrote:
>>>> JSON-SD doesn't really roll off of the tongue... neither did 
>>>> JSON-LD or RDFa. HTML is only used because it's been around 
>>>> forever... but it's a pretty crappy brand name. Any thoughts on 
>>>> what this technology should be called as we ready it for public 
>>>> consumption?
>>>>
>>>> I was thinking: Structure
>>>>
>>>> "Structure allows you to express Linked Data in JSON"
>>>>
>>>> Yes, I realize that isn't entirely accurate, but tag-lines rarely 
>>>> are accurate. Thoughts on branding the technology so that it's easy 
>>>> to drop into a conversation without scaring Web developers away or 
>>>> making people feel as if the conversation is going to take a scary 
>>>> turn toward geek-speak?
>>>>
>>>> -- manu
>>>>
>>>
>>>
>>
>>
>
>


-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Thursday, 28 July 2011 04:21:12 UTC