[JSON] object-based JSON vs. triple-based JSON

Hi all,

Just wanted to get some thoughts down while they were fresh in my mind
concerning the two camps issue related to RDF in JSON. Instead of
defining these two camps as "human-friendly" and "machine-friendly",
perhaps it would be better to define the two camps as "object-based" and
"triple-based":

object-based
------------

The object-based design approach for RDF in JSON prioritizes the way
developers use JSON in their applications over how the data is modeled
in RDF (as triples). This camp would like to see a serialization that
looks something like JSON-LD or JSN3. The following example is JSON-LD:

{
  "#": {"foaf": "http://xmlns.com/foaf/0.1/"},
  "@": "http://example.org/people#john",
  "a": "foaf:Person",
  "foaf:name" : "John Lennon"
}

triple-based
------------

The triple-based design approach for RDF in JSON celebrates the way that
the data is modeled in RDF (as triples) by exposing developers to the
raw data. This camp would like to see a serialization that looks
something like RDF/JSON or JTriples. The following example is JTriples:

[
 {
  "s": "<http://example.org/people#john>",
  "p": "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
  "o": "<http://xmlns.com/foaf/0.1/Person>"
 },
 {
  "s": "<http://example.org/people#john>",
  "p": "<http://xmlns.com/foaf/0.1/name>",
  "o": "John Lennon"
 }
]
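
To show that the two examples carry the same information, here is a
rough Python sketch that turns the object-based example into the
triple-based one. The function names (expand, object_to_triples) are
made up for illustration, and the rules ("#" holds prefixes, "@" is the
subject, "a" means rdf:type, any string value with a known prefix is
treated as an IRI) are assumptions inferred only from the two examples
above, not taken from either spec:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(term, prefixes):
    """Expand 'prefix:local' using the prefix map; None if no known prefix."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return None


def object_to_triples(obj):
    """Flatten one object-based document into a list of s/p/o dicts."""
    prefixes = obj.get("#", {})           # assumed: "#" is the prefix map
    subject = "<" + obj["@"] + ">"        # assumed: "@" is the subject IRI
    triples = []
    for key, value in obj.items():
        if key in ("#", "@"):
            continue
        # assumed: "a" is shorthand for rdf:type, as in Turtle
        predicate = RDF_TYPE if key == "a" else expand(key, prefixes)
        if predicate is None:
            continue  # plain key with no semantics attached; skip it
        # Heuristic (assumption): a string with a known prefix is an IRI,
        # anything else is kept as a literal.
        iri = expand(value, prefixes) if isinstance(value, str) else None
        triples.append({
            "s": subject,
            "p": "<" + predicate + ">",
            "o": "<" + iri + ">" if iri else value,
        })
    return triples


john = {
    "#": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "@": "http://example.org/people#john",
    "a": "foaf:Person",
    "foaf:name": "John Lennon",
}

john_triples = object_to_triples(john)
```

The point is only that the object form can be mechanically reduced to
triples when a consumer wants them; a real mapping would also need to
handle nested objects, arrays, datatypes and languages.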

What is success?
----------------

I consider plain old JSON successful because it is the de facto standard
format used to exchange data between Web Applications and back-end REST
Web Services. Almost every web application developer I know uses JSON
as the serialization format for their data. We're talking millions of
developers and billions of documents.

Now, let's define what success is for the RDF in JSON format. As far as
I see it, success means RDF in JSON becoming the format used to
exchange semantic data between Web Applications and back-end REST Web
Services. This means that almost every web application developer ends
up using RDF in JSON as the serialization format for their semantic
data. We're talking millions of developers and billions of documents.

Mistakes of the past
--------------------

I say this with the utmost respect for the folks that were involved with
the initial creation of RDF/XML. They are a very talented, smart group
of people. You do what you think is best at the time and only history
can tell if you made the right decision. Hindsight is 20/20, etc.

At some point during the telecon I made the point that RDF/XML is a
failed format. Steve Harris said "RDF/XML is was very widely used too,
it's just not liked" and Sandro followed up with "I don't agree that
RDFa was more successful than RDF/XML."

I don't claim to know which definition of success either Steve or Sandro
was using - you can always argue that a particular technology is
successful. However, based on my definition of success above, I think
RDF/XML is a failed format and RDFa is on its way to becoming a
moderately successful format (within the next 10 years). I think it's
important that everyone understand why RDFa is succeeding where RDF/XML
did not.

I've never seen data to back up whether or not RDF/XML is successful -
if someone has something, please share it. Here's the data to back up
that RDFa is on its way to being successful:

http://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/

Every major search company now indexes RDFa; I don't think you can say
the same for RDF/XML. 3.6% of all web pages (36B+ documents) now contain
some RDFa in them. How many RDF/XML documents are being published today
on the Web? I don't know, but I don't think it approaches 3.6% of all
documents on the web. How many search companies index RDF/XML? None that
I know of.

I don't think any of us want to repeat the story of RDF/XML with RDF in
JSON. Perhaps by focusing on what made HTML+RDFa successful, we can hope
to see more than 3.6% of web services using RDF in JSON in the coming 5
years.

So, why did HTML+RDFa "succeed" where RDF/XML failed? If I had to point
to the main thing, I'd say this:

RDF/XML mandated that developers would have to start publishing their
semantic data (RDF/XML) in a wholly different way than they were
publishing their data before (HTML).

HTML+RDFa did not mandate that the core data should be published
differently. We very deliberately took the approach that meaning could
be layered on top of the HTML data that was already being published.
That is why RDFa does not contain any new elements, only attributes. You
take the data that people are already publishing and figure out the
least disruptive changes to their workflow that result in useful Linked
Data. This is the object-based JSON approach that JSON-LD takes - add
semantic meaning to data that is already being published.

Publishing data as JSON
-----------------------

JSON is the de facto web services data format - that is how people are
publishing their data today. JSON data is almost always object-based:

Twitter API: http://dev.twitter.com/doc/get/users/show#example-request
CouchDB API: http://wiki.apache.org/couchdb/HTTP_Document_API#PUT
FourSquare API:
http://groups.google.com/group/foursquare-api/web/api-documentation
Google Maps API:
http://code.google.com/apis/maps/documentation/localsearch/jsondevguide.html#using_json

That is, the way people are publishing their data today is not
triple-based, and asking them to start publishing their data as triples
repeats the same failed data publishing strategy that RDF/XML used.
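
To make the workflow difference concrete, here is a small Python sketch
of what "give me John's name" looks like for a developer under each
approach, reusing the JSON-LD and JTriples examples from earlier in this
mail. The helper names (name_from_object, name_from_triples) are
hypothetical and exist only for this comparison:

```python
FOAF_NAME = "<http://xmlns.com/foaf/0.1/name>"

# Object-based: the document is already shaped like an application object.
person = {
    "#": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "@": "http://example.org/people#john",
    "a": "foaf:Person",
    "foaf:name": "John Lennon",
}

# Triple-based: the document is a flat list of statements.
triples = [
    {
        "s": "<http://example.org/people#john>",
        "p": "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
        "o": "<http://xmlns.com/foaf/0.1/Person>",
    },
    {
        "s": "<http://example.org/people#john>",
        "p": FOAF_NAME,
        "o": "John Lennon",
    },
]


def name_from_object(doc):
    """Object-based access: a single key lookup."""
    return doc["foaf:name"]


def name_from_triples(statements, subject):
    """Triple-based access: scan for a matching subject and predicate."""
    for t in statements:
        if t["s"] == subject and t["p"] == FOAF_NAME:
            return t["o"]
    return None  # no such statement


object_name = name_from_object(person)
triple_name = name_from_triples(triples, "<http://example.org/people#john>")
```

Both return the same value, but the first is what people already write
against the APIs listed above, while the second asks them to think in
statements before they can get at a single field.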

The path to success
-------------------

One of the ingredients for success for RDF in JSON is ensuring that we
don't repeat the RDF/XML data format mandate mistake. If we take the
triples-based approach, we are making that same mistake.

I am almost certain that if we were to adopt the triples-based approach,
we would be "successful" in creating another data format for the
thousands of people that already use RDF/XML and Turtle. That cannot be
our focus for the RDF in JSON format - it will barely affect RDF's
adoption rate among general web developers. The RDF/XML and Turtle
community doesn't need yet another serialization format for RDF. It is
a very low bar to reach, and clearing it will not expand mindshare for
RDF.

We have a large group of very smart and capable people here. We should
shoot higher. We should set our sights on where JSON is right now -
billions of documents and millions of developers. What we need to do is
bring RDF to the millions of developers that interact with Web Services
and we should do it on their terms. Not terms (triple-based) that we
impose on them, but rather terms (object-based) with which they are
already familiar. In order to do that successfully, we can't drastically
disrupt their workflow - which includes object-based JSON.

At this point, I'd like to see a strong counter-argument against the
object-based approach described above.

Apologies if I came off a bit strong on the call. I have some very
strong opinions (backed with data and implementation experience) on what
will lead to success for the serialization. This is also the reason I'm
reluctantly speaking for the Task Force. My reluctance is that I have to
balance my strong opinions against fairly representing everyone else's
in the discussion, and I'm concerned about accidentally crossing that
line and not realizing it. I hope that someone out there will let all of
us know if/when this happens.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Payment Standards and Competition
http://digitalbazaar.com/2011/02/28/payment-standards/

Received on Wednesday, 9 March 2011 19:45:47 UTC