Re: Branding? from Dave Longley on 2011-07-28 (public-linked-json@w3.org from July 2011)

From: Dave Longley <dlongley@digitalbazaar.com>
Date: Thu, 28 Jul 2011 02:58:56 -0400
To: public-linked-json@w3.org
Message-ID: <4E3108B0.5020605@digitalbazaar.com>
On 07/28/2011 12:20 AM, Kingsley Idehen wrote:
> The trouble with JSON-LD is that it implies a spec for constructing 
> Linked Data using JSON.

This is confusing to me. My view is that the spec we're talking about is 
for constructing Linked Data using JSON. That it may also offer a few 
more features that handle what you don't strictly consider Linked Data 
doesn't seem to me to be a reason to drop the name entirely. I'm just 
not convinced by your argument that naming the spec JSON-LD would create 
a narrative that would mislead people and wreak havoc.

> "Linked Data" is the end product of a specific kind of directed graph 
> based structure. This structure can be used as a powerful vehicle for 
> whole data representation.

Ok, I agree.

> You are treating Linked Data and graphs as the same thing.

That's not my intention. My intention is to see Linked Data as a 
directed graph that is primarily populated with URIs that resolve to 
representations of their referents at web scale, but may also contain 
URIs that resolve to representations of their referents at local scale 
for the cases where creating a web-scale URI would be a detriment to the 
data or the Web developer. I'm interested in keeping forced 
skolemization out of the way of Web developers by allowing them to use 
unlabeled nodes without them having to care about it. And, if they do 
have a need for skolemization, I'd like there to be an algorithm that is 
part of the spec that solves the problem for them automatically and in 
one place. I think that's, more or less, what we have so far and what is 
reflected in the latest implementations of JSON-LD.
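
To illustrate the "automatically and in one place" idea, here is a minimal sketch in Python. It is not the algorithm from the spec or from any JSON-LD implementation. It assumes that every JSON object is a node, that a node's identifier lives under a key called "@id" (an assumption made for this example), and that simple document-order labels such as "_:b0" are good enough. A real naming algorithm for graph comparison would need to be deterministic regardless of input order, which this sketch is not.

```python
from itertools import count


def label_unlabeled_nodes(node, counter=None, id_key="@id"):
    """Return a copy of `node` in which every dict lacking an id gets a local one.

    Hypothetical helper for illustration only: labels are handed out in
    document order ("_:b0", "_:b1", ...), parents before children.
    """
    if counter is None:
        counter = count()
    if isinstance(node, list):
        return [label_unlabeled_nodes(item, counter, id_key) for item in node]
    if not isinstance(node, dict):
        # Plain literals (strings, numbers, ...) are returned unchanged.
        return node
    labeled = {}
    if id_key not in node:
        # Unlabeled node: assign a local (blank node style) identifier.
        labeled[id_key] = f"_:b{next(counter)}"
    for key, value in node.items():
        labeled[key] = label_unlabeled_nodes(value, counter, id_key)
    return labeled
```

The developer writes their JSON with unlabeled nodes as usual and only calls something like this when names are actually needed; the input is left untouched.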

> Why is "LD" so vital beyond the fact (as I've claimed repeatedly) that 
> this is an awkward attempt to ride the Linked Data wave.

I've said that it isn't vital -- in my first message. My position is 
quite the opposite of yours, though, in that it seems very awkward to me 
that it is vital to you that we change the name. Please keep in mind 
that the spec was already named JSON-LD and now you're trying to make a 
case that we should rename it; not the other way around.

From my perspective, we've written a specification that is primarily 
for people to start contextualizing their JSON so that it can be used at 
web scale. A lot of people think of that as "Linked Data". Therefore, 
it's just easier if we include that in the name. If there's some small 
part of the spec that pollutes what constitutes a more perfect Linked 
Data definition to you, then I honestly don't see that as a serious 
reason to rename the spec. Meaning, if I felt the same way, I don't 
think I'd consider it vital to rename the spec. Furthermore, I've argued 
that the loose definitions you previously offered don't preclude the use 
of unlabeled nodes in Linked Data. This has since changed a little once 
you alluded to a requirement that URIs be web-scale rather than local, 
but prior to that, your argument was not valid.

Since you've added the web-scale requirement, I've argued that I don't 
see a reason to create a concept (or try to pin down an amorphous one 
that has been floating about) that doesn't allow for people to express 
data as it already naturally occurs. That is, some data is local and 
some is global. Some global data references local data. Some currently 
local data should become global, but it doesn't make sense for *all* 
local data to be considered global. Therefore, an effort to get people 
to publish the local data that *should* be global ought not force or 
coerce them to make their rightly local data global as well. That 
would produce the same original mistake: the creation of improperly 
scoped data.

All of this means that whatever concept we come up with to describe this 
emerging world of web-scale information ought to accurately include all 
of these types of data. If that isn't Linked Data and you don't see any 
way that it could be, then you're right that Linked Data is the wrong 
name. It also means that Linked Data is just the same trap that our data 
has been caught in previously, only in reverse.

> I don't think so at all. We need more Venn Diagrams as collateral for 
> helping people understand concepts and boundaries.

I don't think we have to make this that complicated. I'm not against 
someone drawing up as many Venn Diagrams as they want to, but I don't 
think that they should be necessary to get moving with JSON-LD.

> It is supposed to be about JSON as a vehicle for constructing and 
> serializing Linked Data. I don't quite get the "practical markup to 
> support Linked Data in JSON" statement. Are other markups (including 
> those that are JSON based) for creating Linked Data impractical then?

Here's what is impractical: Telling a JSON Web developer that some of 
their data won't work with a JSON-LD processor and some of it will -- 
when the only parts that won't work are unlabeled nodes. They exist 
naturally in JSON and could be handled by the processor appropriately 
(as the current implementations do). It's impractical to coerce a Web 
developer to write a custom skolemization algorithm to create 
web-accessible URIs for types of data where it simply makes no sense to 
do so. A Web developer is going to try and make the bulk of their data 
"Linked Data" and then run into corner-cases where it just doesn't make 
any sense -- and it's going to be impractical for them to have to handle 
it. This will happen because *not all data is global, but a lot of 
global data links to non-global data*. We shouldn't say that this data 
has to be discarded or changed right out of the gate. It is impractical 
to force developers into a box when they just want to use Linked Data 
for most of their needs and get going.

> Why on earth do you deem anything view contrary to yours (or Manu's 
> for that matter) as "theoretical" or "impractical" ? Dare I ask about 
> your ownership of "pragmatism" and all things goal oriented and 
> achievable i.e., practical?

This is hyperbole. We're having a specific discussion about your 
perceived need to rename a technical specification that is presently 
called "JSON-LD". I'm simply not buying your argument is all.

> In my world view Names matter a lot. I prefer unambiguity over 
> ambiguity via Naming conventions, especially when the goal is to be 
> "deceptively simple" .

There is nothing "deceptively simple" about requiring every Web 
developer with an unlabeled subject to come up with their own customized 
skolemization algorithm for assigning it an arbitrary name and 
web-accessible URI that will likely never be dereferenced. In my view, 
this just generates superfluous management overhead for Web developers. 
It is better to pollute the concept of Linked Data than it is to force 
Web developers to pollute the Web with arbitrary URIs. If unambiguity is 
needed for unlabeled nodes in order to compare graphs, then let's 
indicate what the Naming convention is for Web developers and give them 
an API call to do it so the problem only has to be solved once.

> If no such Resolver exists then you don't have Linked Data delivered 
> by the graph type you are proposing.

Right, but I said that the Resolver exists. The Resolver for a blank 
node identifier just returns the graph of which it is a part. That is 
the only graph that references it. This Resolver is simpler than an HTTP 
Resolver -- which must use the HTTP protocol to go out and fetch a 
referring graph. But, I grant you, this is not a web-scale Resolver.
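
A rough sketch of the two kinds of Resolver, in Python. The function name `resolve`, the "_:" prefix test, and the `fetch` callable are assumptions made for this example; `fetch` stands in for whatever HTTP client would retrieve a web-scale graph.

```python
def resolve(identifier, local_graph, fetch=None):
    """Resolve an identifier to the graph that describes it.

    Hypothetical sketch: blank node identifiers are assumed to start
    with "_:"; anything else is treated as a web-scale URI.
    """
    if identifier.startswith("_:"):
        # Local scale: the only graph that references a blank node is
        # the one it is part of, so there is nothing to go and fetch.
        return local_graph
    if fetch is None:
        raise LookupError(f"no web-scale resolver available for {identifier}")
    # Web scale: delegate to a caller-supplied function (for example an
    # HTTP GET) that retrieves the referring graph.
    return fetch(identifier)
```

The local branch never leaves the process, which is why it is simpler than the HTTP case and also why it is not a web-scale Resolver.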

> Yes, but without blank nodes in the conversation we don't even 
> introduce the term: skolemization.

Just because we decide not to talk about unlabeled nodes doesn't mean 
that a Web developer won't run into them. What it does mean is that they 
will have no direction as to what to do in that situation. They'll just 
have to come up with their own customized skolemization algorithm. Let's 
solve the problem once -- and, we won't even have to mention 
skolemization to a Web developer just like we'd both prefer. This is 
because they'll be able to use their data, including unlabeled nodes, 
without ever having to think about assigning a name or using an 
automatically assigned name. The JSON-LD tools will just "work" -- which 
is what I consider "deceptively simple".

> I don't understand how an IRI that resolves to the representation of 
> its referent can in any way be construed as being devoid of meaning. 
> "I am who I am" dates back to biblical times :-)

When I say that there are examples of local data that don't have meaning 
outside of that which they are a part, I don't mean entirely devoid of 
meaning. I mean that if you don't know what the data is for then you 
can't use the data in the way it was intended.

Consider the URI: 'http://foo.com/z123'

When you resolve this URI, you get a graph containing the subject 
'http://foo.com/z123', a digital signature predicate (a URI), and a 
base64-encoded literal value as the object. Of what use is this 
information? Clearly it's a digital signature for something, but for 
what? Why would I ever refer to this subject without including the thing 
that it is a signature for? How is giving it a name so that I can 
refer to it in this way a useful exercise? Isn't there a further 
requirement to force local data to be considered useful Linked Data? It 
seems to me that I must not only provide arbitrary names and 
web-accessible URIs, but I also have to create reverse references. And 
that means I'll need to create more predicates (and thus, more URIs). 
Sounds like a headache to me, when I could have just said: No one is 
ever going to make sense of this digital signature outside of the thing 
it is a signature for; therefore, only the signed thing needs a 
web-accessible URI. And, if I could do that, everything would "just 
work", with no extra effort needed from me.
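
The two ways of modeling the signature can be compared with a small Python sketch. Apart from 'http://foo.com/z123', every name here is made up for illustration: the signed thing's URI 'http://foo.com/doc1', the property names, the "@id" key, and the placeholder base64 value.

```python
def web_scale_ids(node, id_key="@id"):
    """Collect the http(s) identifiers used as node ids in a JSON structure."""
    found = set()
    if isinstance(node, list):
        for item in node:
            found |= web_scale_ids(item, id_key)
    elif isinstance(node, dict):
        node_id = node.get(id_key)
        if isinstance(node_id, str) and node_id.startswith(("http://", "https://")):
            found.add(node_id)
        for key, value in node.items():
            if key != id_key:
                found |= web_scale_ids(value, id_key)
    return found


# Option 1: the signature gets its own web-accessible URI, plus a reverse
# reference (a hypothetical "signatureFor" predicate) back to the signed thing.
separate = [
    {"@id": "http://foo.com/doc1", "title": "Signed thing"},
    {
        "@id": "http://foo.com/z123",
        "signatureValue": "c2lnbmF0dXJl",  # placeholder, not a real signature
        "signatureFor": {"@id": "http://foo.com/doc1"},
    },
]

# Option 2: the signature is an unlabeled node embedded in the signed thing.
embedded = {
    "@id": "http://foo.com/doc1",
    "title": "Signed thing",
    "signature": {"signatureValue": "c2lnbmF0dXJl"},
}

print(sorted(web_scale_ids(separate)))
print(sorted(web_scale_ids(embedded)))
```

In the second form only the signed thing carries a web-accessible URI, and no reverse-reference predicate has to be invented.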

Another example of where this would be a pain would be the modeling of 
any ordered list. Especially when the order might be subject to change. 
I'm sure there are plenty of other examples where some data that should 
be at web-scale is linked to data that has no place being there. 
Directly referring to that kind of data at web-scale simply doesn't make 
any sense.

If, for some reason, I need to refer to data at web-scale, then I should 
be using a URI. But when I don't, please don't make me jump through 
hoops just because it is purer to make *everything* have a URI. Seems 
like we're using a hammer to drive in a screw. JSON-LD should let me use 
an unlabeled node and not complain about it. And, if I need that node 
named in order to compare graphs, I'd like JSON-LD to do it for me. Keep 
the skolemization algorithm out of my way by giving me an API call to 
handle it. I believe that's what we've spec'd out and implemented so 
far, and I think it's the right approach.

>
>> I consider it a pollution of the web to add "Global Data Space IRIs" 
>> to subjects that have no business being accessed at a global level. 
>
> If something shouldn't be accessed at any level you protect it with 
> ACLs. That's what WebID facilitates.

This is a misinterpretation of what I meant by "accessible"; my apologies. 
I'm not talking about security. I meant what I mentioned above; it makes 
no sense to directly refer to certain subjects at web-scale. They are 
local to other subjects that do make sense when referred to at web-scale.

> Here is another way of looking at this matter: clickable data. When a 
> user clicks on a URI they get something back. Doing that consistently 
> at InterWeb scale with blank nodes in the mix is increasingly 
> problematic as data moves across data spaces.

I think that I would personally be ok with restricting blank nodes so 
that they don't "move across data spaces". I see their primary use case 
as being strictly local; there's nothing to click on, and nowhere to go 
with a blank node because you're already there.

>
>> I'm also suggesting that anyone who wants to put an unlabeled node in 
>> their JSON data because it simply makes sense to do so -- can do it.
>
> Sure, but the spec itself doesn't have to cover that. Spec utilization 
> examples could cover such matters. The context would clearly state 
> "advanced use" for instance.

I think the spec should be clear that it is permitted, but I have no 
problem with it being under "advanced use".

>
>> Furthermore, they might *never* have to skolemize it. That's where 
>> framing comes in. This pushes all of the skolemization into the back 
>> corner where only an expert might have to know something about it. 
>> But it's there to cover the cases where it's needed. 
>
> Fine, as examples and tucked away in the advanced box :-)

I can agree with this. Like I said, I'm not looking to front page the 
skolemization algorithm, but it should be in the spec under an advanced 
section.

> It depends on when constitutes the spec. re. use of Linked Data.
>
> BTW - I really thought we had closed this matter re. LD. Gregg and 
> Bradley converted a lot of what I've been pushing for into a 
> requirements doc [1] that I don't have issues with. I believe this 
> document cleverly leaves room for blank nodes without any distracting 
> Ads at the front door :-)

Well, my original message had to do with questioning the reasoning 
behind changing the spec name -- which seemed to have a lot to do with 
the use of unlabeled nodes.

-- 
Dave Longley
CTO
Digital Bazaar, Inc.
Received on Thursday, 28 July 2011 06:59:24 UTC