Re: [JSON] Some general serialization "things" from Nathan on 2011-03-23 (public-rdf-wg@w3.org from March 2011)

From: Nathan <nathan@webr3.org>
Date: Wed, 23 Mar 2011 14:44:03 +0000
To: Sandro Hawke <sandro@w3.org>
CC: RDF WG <public-rdf-wg@w3.org>
Message-ID: <4D8A0733.90006@webr3.org>

Hi Sandro,

Sandro Hawke wrote:
> On Wed, 2011-03-23 at 13:44 +0000, Nathan wrote:
>> Just a note to say that there are many weird and wonderful options...
> 
> Indeed, and it reminded me of the big divide between Group C (who use
> libraries) and Groups A and B (who do not).   
> 
> Group C doesn't care as much about what the json looks like as Groups A
> and B, but I'm sure they do care.  They want it to be (1) small, (2)
> fast, and (3) easy to read by humans, for those rare times when they
> need to think about what it actually looks like, eg in debugging [which
> is rare, of course :-) ].    Your examples seem to suggest we can find
> nice ways to optimize among these trade-offs.

Yes exactly: if it's a machine-optimized, round-trippable RDF-in-JSON 
serialization, then there's a whole host of optimizations that could be 
made.

> In contrast, while Groups A and B do care about the above, they care
> much more about how easy it is to write code to deal with this data.

Exactly again, that's the distinction I keep trying to make when I 
suggest we need two serializations :) - perhaps better termed as one RDF 
Serialization, and one way of viewing simple "obvious json" as RDF.

> Of course, you know this, Nathan -- you once went through and wrote
> javascript snippets of how to do things with RDF in various styles,
> which alas I can't find -- but I did want to point it out for the group.

I can't remember either! I write too much lol.

> In looking for that post of yours, I came across Jeni's great post [1]
> where she quotes you as saying:
> 
>         You can’t shoehorn RDF into JSON, no matter how hard you try -
>         well, you can, but you lose all the benefits of JSON in the
>         first place, because the data is RDF, triples and not objects,
>         rdf nodes and not simple values
>         
> and then paraphrases it herself, as:
> 
>         In other words, using JSON as the basis for an RDF syntax
>         doesn’t actually win you anything in terms of the ease of
>         processing of that RDF. In fact, I’ll go further and say it has
>         exactly the same bad qualities as RDF/XML.
>         
> ... which several people in this WG have pointed out.  I wonder if we as
> a group have consensus on this view, or there are other angles.

Ahh yes, I followed up on that later [1] to clarify:

[[
However I have to point out that I probably didn't make the context 
clear, I was referring specifically to making a human friendly/optimized 
JSON serialization of RDF, you can however, very easily, create a 
machine optimized JSON-compatible serialization of RDF, without the 
drawbacks of XML (you just don't pin it as being human friendly JSON as 
outlined above) because unlike XML which requires a full XML stack, and 
RDF/XML which can be serialized in any one of a billion ways, we have a 
chance here to make a clean lightweight unambiguous serialization of 
RDF, one which is based on a lightweight data interchange format and not 
on a heavy extensible document interchange format.
]]

[1] 
http://webr3.org/blog/semantic-web/rdf-api-json-serialization-and-standardization/
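
To make "machine optimized" and "unambiguous" a little more concrete, 
here is a minimal sketch of a lossless round trip through JSON. The term 
encoding (tagged tuples, with "" for an absent datatype or language) and 
the function names are assumptions made up for this illustration, not 
something proposed in this thread or taken from any spec.

```python
import json

# Hypothetical term encoding, for illustration only:
#   ("uri", iri) | ("bnode", label) | ("literal", lexical, datatype, lang)
# An absent datatype or language tag is written as "".

def dump_triples(triples):
    """Serialize triples to compact JSON, sorted so output is deterministic."""
    return json.dumps(sorted(triples), separators=(",", ":"))

def load_triples(text):
    """Parse the JSON back into a set of triples (tuples of term tuples)."""
    return {tuple(tuple(term) for term in triple) for triple in json.loads(text)}

ALICE = "http://example.org/people/alice"
NAME = "http://xmlns.com/foaf/0.1/name"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

triples = {
    (("uri", ALICE), ("uri", NAME), ("literal", "Alice", "", "en")),
    (("uri", ALICE), ("uri", KNOWS), ("bnode", "b0")),
}

wire = dump_triples(triples)
# Nothing is lost on the way through JSON: every node keeps its kind.
assert load_triples(wire) == triples
```

The point is only that the output is valid JSON which no one would call 
"obvious json": every value is an RDF node, not a simple value.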

> I think Manu disagrees, focusing on the greenbox and the trick of using
> external mapping information.  The job of the shoehorn is much easier
> when there are extra secret storage compartments.   (or maybe: stretchy
> shoes.)
> 
> My vague sense is that we'll get the most benefit focusing on giving
> json folks SPARQL results instead of RDF per se.  I think that addresses
> most of the use cases more simply.   (And that may be out of scope for
> this WG, but let's come back to that after we've figured what technology
> standards would actually really help folks here.)

Jury is still out for me to be honest, I don't think I fully understand 
the benefits of each approach yet, and haven't studied your latest posts 
in detail yet. I will say though, that where previously I saw two 
distinct groups, I think I see three now.

1: Publisher uses RDF/SemWeb stack behind the scenes, wants to publish 
RDF triples in a JSON serialization (for rdf on the wire data interchange)

2: Publisher uses RDF/SemWeb stack behind the scenes, wants to give 
access to that data using some API via sparql, or by giving direct 
access to sparql endpoint, wants to return JSON (providing data from RDF 
in a simpler form to developers, perhaps making it round trip capable 
back to rdf)

3: Publisher uses *non* RDF/SemWeb stack behind the scenes, and wants to 
let developers see their data as "simple rdf" (typically simple JSON 
objects w/ linked data values -> URIs as IDs, shared schemas/properties etc)
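
A rough sketch of what a developer might receive in each of the three 
cases above, for the same two facts. Everything here is illustrative: 
the example.org URIs, the "id" key and the key-to-property mapping are 
assumptions, case 1 uses a subject/predicate/object-list layout in the 
style of RDF/JSON, and case 2 follows the SPARQL SELECT results JSON 
layout (head.vars, results.bindings).

```python
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Case 1: RDF triples on the wire (subject -> predicate -> list of objects).
triples_json = {
    ALICE: {
        FOAF_NAME: [{"type": "literal", "value": "Alice"}],
        FOAF_KNOWS: [{"type": "uri", "value": BOB}],
    }
}

# Case 2: the result of a SPARQL SELECT over the same data.
select_results = {
    "head": {"vars": ["name", "friend"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Alice"},
         "friend": {"type": "uri", "value": BOB}},
    ]},
}

# Case 3: a plain JSON object; URIs appear only as ids and link values.
plain_object = {"id": ALICE, "name": "Alice", "knows": [BOB]}

def triples_from_case1(doc):
    """Flatten case 1 into (subject, predicate, node kind, value) tuples."""
    return {(s, p, o["type"], o["value"])
            for s, preds in doc.items()
            for p, objs in preds.items()
            for o in objs}

def rows_from_case2(doc):
    """Reduce case 2 to simple rows, dropping the node-kind information."""
    return [{var: b[var]["value"] for var in doc["head"]["vars"] if var in b}
            for b in doc["results"]["bindings"]]

def triples_from_case3(obj, key_to_property):
    """View case 3 as triples; needs an external key -> property mapping.

    The node kind (URI vs literal) is not recoverable without more mapping
    information, so only (subject, property, raw value) is returned.
    """
    out = set()
    for key, prop in key_to_property.items():
        values = obj.get(key, [])
        for value in values if isinstance(values, list) else [values]:
            out.add((obj["id"], prop, value))
    return out

mapping = {"name": FOAF_NAME, "knows": FOAF_KNOWS}
print(triples_from_case3(plain_object, mapping))
```

Case 1 keeps every node typed, case 2 hands back rows rather than a 
graph, and case 3 only becomes RDF once a mapping is supplied from 
outside the data.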

Seeing (2) is new for me, and I can't tell if we're talking SPARQL 
results or linked-data-api or what, and am slightly worried that it was 
easy to see a distinction between 1 & 3, and that 2 could potentially 
either not cover the use cases for 1 & 3 at all, or (imho worse) try to 
cover the requirements of 1 & 3 and become a complete mess.

I shouldn't speak about this atm until I understand better. I feel we're 
at a point where it's critical to focus on understanding the problem 
space(s) and potential solutions/trade-offs/options/approaches, but 
we're being pushed into making decisions, casting opinions, and 
backing/rejecting options without fully understanding the big picture.

Best,

Nathan
Received on Wednesday, 23 March 2011 14:45:03 UTC