Re: [JSON] Initial comments from Thomas Steiner on 2011-02-24 (public-rdf-wg@w3.org from February 2011)

From: Thomas Steiner <tomac@google.com>
Date: Thu, 24 Feb 2011 10:07:29 +0100
To: nathan@webr3.org
Cc: RDF WG <public-rdf-wg@w3.org>
Message-ID: <AANLkTimOocw5v6EgJ8=Z3qeg+J9+OmnudUV5r85kD1-T@mail.gmail.com>
Nathan, all,

> One is to create a JSON serialization of RDF, capable of serializing all the
> RDF concepts (anything Turtle will be able to), an optimized machine to
> machine RDF transportation format.

> The second need is much more complex, to create a JSON format which allows
> people to publish and work with linked/web data easily.

Assuming I got Nathan mostly right, honestly I'm not entirely sure
whether I fully agree with his ideas. He basically writes, and correct
me if I'm wrong, that there is a need for two media types, let's call
them:

application/rdf-humans+json
application/rdf-robots+json

I'm quite new to the whole Semantic Web scene (so I beg your pardon
should I write something stupid, and thanks for correcting me in either
case); however, from the beginning I was under the impression that this
whole scene is not the easiest to enter. Like most beginners, I was
first confronted with RDF/XML. I learnt to prefer Turtle very quickly,
though. OK, so far, so good. Now, let's consider what this WG (among
other things of course) tries to enable: to make John Doe Web
developer grasp the principles of RDF. A Web developer who might be a
jQuery god (or in the worst case a W3Schools alumnus), but has never
heard of the Semantic Web. Basically someone who loves JSON, because
it enables him to easily mash up several APIs into a cool Web
application. Long story short, imagine for a minute our community
confronting this guy with the following made-up introduction:

===
OK, so, the Semantic Web. You know, triples and such. Subject,
predicate, object. It's totally easy, look at Open Graph from Facebook
[sample here]. Open Graph is just the beginning. Look, there's more
[try to "wow" John Doe Web developer and show him DBpedia,
MusicBrainz, explain Linked Data, etc.]. Keen to use it? Great! OK, so
there's RDFa for marking up your own content. Look, it's totally
simple [try to make it look simple]. OK, now, so, you want to use the
X API [cool Web 3.0 read/write-enabled API]. It's totally easy with
JavaScript. It's all JSON. It gives you application/rdf-robots+json.
You can even write to the service using PUT or POST [many real-world
APIs will even use GET, but let's ignore this for a moment]. You can
use application/rdf-humans+json. Or you can use
application/rdf-robots+json, but you don't really want to. Look,
there's this application/rdf-humans+json2application/rdf-robots+json
service. Yeah, it even has a SOAP API for legacy reasons. Oh, and
for the same reasons, it's also able to return RDF/XML. RDF/XML? Oh,
you know, that's just another serialization. We used to use it in the
old days, before the Semantic Web was cool. OK, there're still some
services around that just speak RDF/XML, but you know, actually, they
should just move to application/rdf-humans+json. Or
application/rdf-robots+json. Or both. Hell, simply don't use them.
===

OK, most of this is heavily exaggerated, but I guess I made my point
clear. If we want people to use RDF in JSON, heck, it better be easy.
Ideally it would fit on a napkin or a beer coaster.

> I'm personally convinced that if we try to mix the two we'll be here for
> years and tbh, we'll simply fail. So, my first request would for people to
> either agree or disagree with what I've said above, put it to a vote and
> move on with doing the two distinct things.
I'm pretty sure Nathan has very good technical reasons for bringing up
the idea of two formats. I'm setting the perception-based reasons above
against his technical ones, and fair enough, Nathan's certainly took
much more brain work, and all respect for that. Simply put, the user
should come first and foremost. Make it fit on a napkin, make
it fit on a beer coaster, or see it fail with the John Doe Web
developers of this world. That's my point of view.

> JSON is so popular because it's focussed on simple key/value objects, with
> limited value types, essentially a JSON object is as simple as:
>
>  { "name": "nathan", "gender": "unknown" }
>
> and people can work with that data by doing:
>
>  print( obj.name );
>  obj.gender = "male";
All agreed.

> I'd place that as a constraint, that if we make it any more complicated than
> that, people simply won't use it.
All agreed. But where do you put subjects?
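
To make the question concrete, here is a minimal sketch of turning
the quoted object into triples. This is only my illustration, not a
proposed format: the helper name toTriples, the blank node label
"_:b0" and the example.org IRI are made up, and the "@id" key is
merely borrowed from the @id idea Nathan mentions further down.

```javascript
// Hypothetical helper: turn a flat key/value object into
// [subject, predicate, object] triples.
function toTriples(obj) {
  // "@id" is an assumed convention for naming the subject.
  // Without it, all we can do is fall back to a blank node.
  const subject = obj["@id"] !== undefined ? obj["@id"] : "_:b0";
  return Object.keys(obj)
    .filter((key) => key !== "@id")
    .map((key) => [subject, key, obj[key]]);
}

// The plain object from the quoted mail: no place for a subject.
const anonymous = toTriples({ name: "nathan", gender: "unknown" });

// Same idea with an assumed "@id" key carrying the subject IRI.
const named = toTriples({
  "@id": "http://example.org/nathan#me",
  name: "nathan",
});

console.log(anonymous);
console.log(named);
```

In this sketch the plain key/value object can only ever describe an
anonymous thing; a subject needs some extra key or wrapper.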

> From a linked/web data angle, the most critical parts are to give things
> IRIs as identifiers, and to have a shared understanding of properties,
> perhaps better said as use IRIs to name properties.
>
> At the bare minimum that's all we need to get by, thus I'd also place these
> constraints on what we do, that those two needs must be fulfilled.
Agreed.

> To me, the above simply points to needing a way to specify @id's and some
> kind of data transformation map for these objects, essentially a simple map
> from property name to property URI
>
>  "name" -> "http://xmlns.com/foaf/0.1/name"
>  "gender" -> "http://xmlns.com/foaf/0.1/gender"
This sounds straightforward; however, as soon as you start mixing
ex:name and foaf:name (and this will happen), you need the concept of
namespaces (call it CURIEs if you want). I have seen foaf.name or
foaf_name in some of the proposed JSON formats (was it JRON?), and I
fear there's no way around sticking to the idea of namespaces.
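
A rough sketch of what I mean, under stated assumptions: the foaf
IRI is the one from Nathan's mail, while the "ex" namespace IRI, the
"prefix:local" key syntax and the helper name expand are all made up
for illustration and not taken from any of the proposed formats.

```javascript
// Assumed prefix table; only the foaf IRI comes from the quoted mail.
const prefixes = {
  foaf: "http://xmlns.com/foaf/0.1/",
  ex: "http://example.org/vocab#",
};

// Hypothetical helper: expand a "prefix:local" key to a full IRI.
function expand(key) {
  const idx = key.indexOf(":");
  if (idx === -1) {
    // A bare "name" no longer says which vocabulary is meant.
    throw new Error("no prefix, ambiguous key: " + key);
  }
  const prefix = key.slice(0, idx);
  const local = key.slice(idx + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    throw new Error("unknown prefix: " + prefix);
  }
  return prefixes[prefix] + local;
}

// Two different "name" properties living in one object.
const person = { "foaf:name": "nathan", "ex:name": "Nathan (display)" };

const expanded = {};
for (const key of Object.keys(person)) {
  expanded[expand(key)] = person[key];
}

console.log(expanded);
```

Note the cost on the developer side: with keys like these,
obj.name turns into obj["foaf:name"].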

> The next question is whether that map needs to be in with the JSON, or
> outwith it (in an external document) - again this seems like an easy design
> trade-off to make, just as with CSS for HTML an external document makes a
> lot of sense, especially when you consider the common JSON use-cases (like
> twitter api), and it also allows bootstrapping on to existing data sources
> via a Link header or suchlike.
Not sure I get this, but I think you're suggesting introducing
something like a @profile document as in RDFa, or at least a
well-known URI where such a document would be available and would
not be touched unless the modification was backwards-compatible. I'm
not against this idea, thinking of, e.g., well-known Link relations
(e.g., @rel="license").

> Additionally, there is often a need to provide custom datatypes and to place
> restrictions on values for validation and the like, as with xsd and owl,
> which imho points to the need for something like JSON-Schema.
And this is where most Web developers (in my humble opinion) will
start to hate us. JSON is so successful because it limited itself to
the things that are available in almost every programming language,
arrays, integers, floats, booleans, objects (did I forget something?).
Introducing long ints, short ints, this kind of thing, just makes
things brittle, rigid, SOAPish. The Web is just not like
that.
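
For reference, a quick sketch of what a JSON parser actually hands
back in JavaScript. The sample document and the variable names are
made up. As far as I can tell, JSON itself only knows one number
type (no separate integers and floats), plus strings and null.

```javascript
// A made-up document using every kind of JSON value once.
const doc = JSON.parse(
  '{"s": "text", "i": 42, "f": 1.5, "b": true, "n": null, "a": [1, 2], "o": {}}'
);

// Record the kind of each value; typeof alone reports "object"
// for both null and arrays, so those are checked explicitly.
const kinds = {};
for (const key of Object.keys(doc)) {
  const value = doc[key];
  kinds[key] =
    value === null ? "null" : Array.isArray(value) ? "array" : typeof value;
}

console.log(kinds);
```

Both 42 and 1.5 come back as plain "number"; anything finer-grained
(long, short, and so on) would have to be layered on top.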

> That's about it I'm afraid, something to discuss.
Very happy to get it started :-)

Best,
Tom

-- 
Thomas Steiner, Research Scientist, Google Inc.
http://blog.tomayac.com, http://twitter.com/tomayac
Received on Thursday, 24 February 2011 09:08:22 UTC