Re: a simplified Turtle-like profile for JSON-LD from Manu Sporny on 2024-03-31 (public-linked-json@w3.org from March 2024)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Sun, 31 Mar 2024 16:51:59 -0400
To: Michael Thornburgh <zenomt@zenomt.com>
Cc: public-linked-json@w3.org
Message-ID: <CAMBN2CR=fFmMe4FVQDNZNcVy-SHyr==f1yXxrMrKCrvwWZfctw@mail.gmail.com>

On Sun, Mar 31, 2024 at 3:31 PM Michael Thornburgh <zenomt@zenomt.com> wrote:
> > On Mar 31, 2024, at 8:24 AM, Manu Sporny <msporny@digitalbazaar.com> wrote:
> i was recently inspired by Tao Xin's [VanJS](https://vanjs.org). though
> not quite to his extreme — i still want my code to be pretty. :)

Yeah, stuff like that takes us in a good direction, IMHO. Making things
easy for vanilla JS folks, w/o a great deal of training, is what we
were initially shooting for with JSON-LD.

> i think that so long as JSON-LD wants to be "you can have your ad hoc
> JSON format and eat your RDF cake too" (or the other way around), most
> of the features will need to remain.

Perhaps, but we've already done that as JSON-LD v1.1... and it's not
clear to me how much of JSON-LD's usage is "convert your ad-hoc JSON
format to RDF" vs. "I want to simply express linked data using JSON".
I'd say it's more about the latter than the former.

JSON-LD v1.1 isn't going away, and enables the "convert your ad-hoc
JSON format to RDF" use case. Done and shipped.

Maybe we declare victory and make JSON-LD v2.0 more about making it
easier for vanilla JS devs to express linked data using simpler rules
(a profile)?

More thoughts on this below (again, treating this as a thought
exercise more than a call to arms)...

> as Ivan suggests in his next message, i think considering this profile
> as a replacement for Turtle (that also happens to be valid JSON-LD)
> might be more fruitful than trying to convince the JSON-LD community
> to give up most of their features.

Hmm, ok... right, so the group of people that would need to be
convinced would be the people using Turtle today? Do we know how many
Turtle developers there are... or how disgruntled they are? Most
everyone I know that uses Turtle/TRiG likes using the language. It
would be useful to have something like the TIOBE index for RDF
languages. :)

> triggering an externally-visible action (like loading an essential external
> context that's marked "do not cache") reveals that i'm parsing some document
> right now, even if getting that document didn't involve a network fetch or i
> got it some time ago. in other words, external contexts could be used as
> activity-tracking beacons.

Hmm, don't think I buy that argument.

Simply caching the document, or using something like Oblivious HTTP,
mitigates this concern. If a server is insisting "do not cache" on a
context file, you don't have to listen to them, especially because
JSON-LD Context files shouldn't be changing that often. schema.org is
an example of one that does, but is also an example of one that you
can aggressively cache.

We have discussed this at length in groups using JSON-LD and most
every group comes to the same conclusion: suggest that context files
be permanently or aggressively cached (search for "cach" in the link
below).

https://w3c.github.io/vc-data-model/#base-context
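
As a rough sketch of what "aggressively cached" could look like in
code: wrap whatever fetches a context so that each URL is retrieved at
most once per process, whatever the server's cache headers say. All
names here (ContextFetcher, makeCachingContextLoader, fakeFetch) and
the example.org URL are assumptions for illustration, not the API of
any JSON-LD library.

```typescript
// Hypothetical names throughout; not taken from any JSON-LD library.
type ContextFetcher = (url: string) => Promise<unknown>;

// Wraps a fetcher so each context URL is fetched at most once per process,
// ignoring any "do not cache" hint from the server.
function makeCachingContextLoader(fetchContext: ContextFetcher): ContextFetcher {
  const cache = new Map<string, Promise<unknown>>();
  return (url: string): Promise<unknown> => {
    const cached = cache.get(url);
    if (cached !== undefined) return cached;
    const pending = fetchContext(url);
    cache.set(url, pending);
    // Forget failed fetches so a later call can retry.
    pending.catch(() => {
      if (cache.get(url) === pending) cache.delete(url);
    });
    return pending;
  };
}

// Stand-in fetcher that counts "network" hits instead of using the network.
let networkHits = 0;
const fakeFetch: ContextFetcher = (_url) => {
  networkHits += 1;
  return Promise.resolve({ "@context": { name: "https://schema.org/name" } });
};

const loadContext = makeCachingContextLoader(fakeFetch);
void loadContext("https://example.org/context.jsonld");
void loadContext("https://example.org/context.jsonld");
// networkHits is 1 here: the second call was served from the cache.
```

A persistent (on-disk or bundled) cache would follow the same shape;
the in-memory Map just keeps the sketch short.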

> the main point of this claim is that, if external contexts are allowed, then
> processing any document is at least a _potentially_ asynchronous operation
> instead of a synchronous and purely local manipulation of data on hand. and
> that potential asynchronous operation entails the intrinsic complexity of the
> web and the Internet (IP, DNS, TCP and/or QUIC, TLS, HTTP), vs "these bytes here".

Sure, but it seems that you're analyzing that in a way that ignores
the benefits. There are benefits to using externally referenced
files... one of them being readability, no? Programming languages have
#include / import for this very reason.

Perhaps a post-processing step that "localizes" external references
would achieve what you want? This is effectively what "expansion"
does, though that's not a friendly syntax to use.

You might be searching for "JSON-LD expanded form that is terse and
intuitively readable by most programmers"?
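
One lighter-weight take on "localizing" (inlining rather than full
expansion) might look like the sketch below. It assumes the caller
already holds local copies of the referenced contexts, stored as the
value of each context document's "@context" key, and it only handles
the top-level "@context". The names localizeContext and known, and
the example.org URL, are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Replaces each string (URL) entry in the top-level "@context" with the
// locally held definition. Scoped/nested contexts are not handled here.
// The result always uses the array form of "@context".
function localizeContext(doc: JsonObject, known: Map<string, JsonObject>): JsonObject {
  const ctx = doc["@context"];
  if (ctx === undefined) return doc;
  const entries: Json[] = Array.isArray(ctx) ? ctx : [ctx];
  const inlined: Json[] = entries.map((entry): Json => {
    if (typeof entry !== "string") return entry;
    const local = known.get(entry);
    if (local === undefined) {
      throw new Error("no local copy of context " + entry);
    }
    return local;
  });
  return { ...doc, "@context": inlined };
}

// Assumed local copy of a made-up context.
const known = new Map<string, JsonObject>([
  ["https://example.org/person.jsonld", { name: "https://schema.org/name" }],
]);

const localized = localizeContext(
  { "@context": "https://example.org/person.jsonld", name: "Alice" },
  known,
);
// localized["@context"] is now [{ name: "https://schema.org/name" }],
// so the document can be processed without any network access.
```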

> caching solves the problem as long as you've had a chance to (know to) retrieve
> them already. i find the notion of "permanent caching (for well-known contexts)"
> a little distasteful just on principle (since it divides contexts into
> "well-known" and "hoi polloi" classes) and implies some contexts as de facto
> parts of the language.

I wouldn't go that far. There are application domains that will share
common terminology, as that's one of the reasons to use linked data
(sharing common vocabularies/contexts). In fact, the more we move
towards one-off context files, the higher the chances that we've
failed to take advantage of linked data and vocabularies. Putting
everything into a local file increases the chances of using custom
vocabulary, which reduces interop... the opposite direction from the
one we want to head in.

There are two arguments in here that feel off:

1. That systems are offline when they process these things.
2. That we should be optimizing for LOTS of unknown JSON-LD Contexts.

I find both arguments tenuous. I can't remember the last time any
development or production system we had was truly offline and needed
to process an uncacheable JSON-LD document. Optimizing for
non-well-known JSON-LD Context files seems to march away from
increasing interoperability. Isn't the ideal outcome here a library of
well-known vocabularies and JSON-LD Context files?

> but at least not being able to retrieve a context (and
> therefore not being able to parse a document) is no worse than not being able
> to retrieve a CSS for your HTML document in practice (while your HTML _should_
> still be usable without the CSS, that's not actually the case for modern web
> pages anymore).

Well, it's a bit different, isn't it? If you can't fetch a JSON-LD
Context, you can't process the document. You just stop. That said, how
often is that a problem? Isn't that the question that needs to be
asked?

> the part of this issue that concerns me more is the "HTML and CSS are out of
> sync" problem, which to be sure is an edge case. it's just an edge case that
> doesn't need to exist in the first place. :)

Ah, but that argument exists in a vacuum, doesn't it? It's presuming
that there isn't a strong argument for that decoupling, which there is
-- reusability. If everyone has to become a JSON-LD Context author,
we've failed to make this stuff easy to use, right?

I know the argument is: "Well, let's simplify the JSON-LD Context so
anyone can write one in embedded form!", but that presumes a level of
sophistication that I'd argue is still out of reach for most
developers.

It's an equivalent argument to getting rid of #include / import in
programming languages, isn't it?

> i'm hoping my implementation's approach of "Plain Ol' JavaScript Objects that
> point directly to each other" could be both palatable and a convenient way to
> work with a graph in JS. i know this doesn't conform to the standard APIs.

Oh, I believe we very much still want something like that. We've tried
(and failed) from 2010 to today to achieve that holy grail. :)

It would be really neat to have something like VanJS for RDF that
lets you build/express a graph and then possibly "compile it down" to
JSON-LD (which might use an external context, or it might embed a
simplified context).
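
To make that idea concrete, here is a minimal sketch of plain objects
that point directly at each other, plus a function that "compiles"
them into a flat JSON-LD document with an embedded context. GraphNode,
CompiledDocument, and toJsonLd are hypothetical names, the example.org
identifiers are made up, and using schema.org as "@vocab" is only an
assumption for the example.

```typescript
// A node is a plain object: an "@id" plus properties whose values are
// literals or direct references to other nodes (cycles are allowed).
interface GraphNode {
  "@id": string;
  [property: string]: string | number | boolean | GraphNode;
}

interface CompiledDocument {
  "@context": Record<string, string>;
  "@graph": Record<string, unknown>[];
}

// Walks everything reachable from root and emits one flat entry per node,
// replacing object references with { "@id": ... } so the result is
// cycle-free and safe to pass to JSON.stringify.
function toJsonLd(root: GraphNode, context: Record<string, string>): CompiledDocument {
  const seen = new Set<string>();
  const graph: Record<string, unknown>[] = [];
  const visit = (node: GraphNode): void => {
    if (seen.has(node["@id"])) return;
    seen.add(node["@id"]);
    const out: Record<string, unknown> = {};
    graph.push(out);
    for (const key of Object.keys(node)) {
      const value = node[key];
      if (value === undefined) continue;
      if (typeof value === "object") {
        out[key] = { "@id": value["@id"] };
        visit(value);
      } else {
        out[key] = value;
      }
    }
  };
  visit(root);
  return { "@context": context, "@graph": graph };
}

const alice: GraphNode = { "@id": "https://example.org/alice", name: "Alice" };
const bob: GraphNode = { "@id": "https://example.org/bob", name: "Bob" };
alice["knows"] = bob; // direct object references, including a cycle
bob["knows"] = alice;

const compiled = toJsonLd(alice, { "@vocab": "https://schema.org/" });
const serialized = JSON.stringify(compiled);
```

Swapping the embedded context for a URL string would give the
external-context variant of the same output.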

> i hope so, and i'm glad i'm not the only one thinking about this problem.

I think many of us still think there is plenty of room to improve,
perhaps even go at this from a wildly different direction. Clarifying
who the target audience for this new approach/language is will be
important to do early on so it's clear who the stakeholders are.

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
https://www.digitalbazaar.com/
Received on Sunday, 31 March 2024 20:52:40 UTC