RE: Philosophy, was framing from Markus Lanthaler on 2013-08-02 (public-openannotation@w3.org from August 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Fri, 2 Aug 2013 10:13:23 +0200
To: "'Robert Sanderson'" <azaroth42@gmail.com>
Cc: "'Linked JSON'" <public-linked-json@w3.org>, "'public-openannotation'" <public-openannotation@w3.org>
Message-ID: <00fc01ce8f58$29536bb0$7bfa4310$@lanthaler@gmx.net>
On Thursday, August 01, 2013 8:36 PM, Robert Sanderson wrote:
> Sorry about the HTML... must have been from cut/pasting
> out of the spec.

Probably not, this email was HTML as well :-P


> As the framing issue is solved, thanks!, I changed the subject.
> 
> I have to disagree philosophically with you here.  I think that the
> JSON-ness (is "jsonic" a word?) of JSON-LD is a huge strength. Perhaps
> the fundamental strength of JSON-LD over any other RDF serialization.
> As Manu implies in his blog post on nuclear rdf, the fact that RDF/XML
> is unable to be usefully processed by XML tools or understood by
> people familiar with XML is a massive failing that has negatively
> impacted the adoption of RDF in general for many years.

I probably should have been a bit clearer in my last email. I completely
agree that it is the JSON-ness that makes JSON-LD so powerful and graspable
for average web developers.


> And to quote the post:   "RDF is plumbing... and developers don't need
> to know about it to use JSON-LD"

Right


> If you want to understand why your tools are adding this stupid
> "@graph" and "@id": "_:b1" crud all over your nice JSON, the answer
> is... RDF.

Not entirely. The main reason is that the data JSON-LD is serializing is a
graph and not a tree. There are multiple ways to serialize exactly the same
graph. The simplest form is to flatten everything and to connect the
different nodes with links (edges). In JSON that means that you end up
having an array of objects. Since we want to use short terms instead of full
IRIs, we need a context. Now we could add a @context property to each single
object in that array. That bloats the document up considerably if you use
embedded contexts:

  [
    { "@context": ..., -- other properties -- },
    { "@context": ..., -- other properties -- },
    { "@context": ..., -- other properties -- }
  ]


The alternative is to use an object at the document's top level and move that
array into a member of the object instead. This means that we have to add
the context just to that top-level object:

  {
    "@context": ...,
    "data": [
      { -- other properties -- },
      { -- other properties -- },
      { -- other properties -- }
    ]
  }

Now we could discuss how to name that "data" property (and trust me, we
have). We decided that the most sensible thing to do is to call it @graph
because the value represents a graph and it allows us to use the same
mechanism to create named graphs. That's how we ended up with

  {
    "@context": ...,
    "@graph": [
      { -- other properties -- },
      { -- other properties -- },
      { -- other properties -- }
    ]
  }
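
To make the size difference between the two layouts concrete, here is a
minimal Python sketch. The context, the example.org IRIs and the helper names
are made up for illustration; only the two document shapes come from the
discussion above.

```python
import json

# Hypothetical context and nodes, used only for illustration.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "knows": {"@id": "http://xmlns.com/foaf/0.1/knows", "@type": "@id"},
}

NODES = [
    {"@id": "http://example.org/alice", "name": "Alice",
     "knows": "http://example.org/bob"},
    {"@id": "http://example.org/bob", "name": "Bob",
     "knows": "http://example.org/carol"},
    {"@id": "http://example.org/carol", "name": "Carol"},
]


def with_embedded_contexts(nodes, context):
    """Top-level array: every node object repeats the context."""
    return [{"@context": context, **node} for node in nodes]


def with_shared_context(nodes, context):
    """Top-level object: the context appears once, the nodes sit under @graph."""
    return {"@context": context, "@graph": list(nodes)}


embedded = with_embedded_contexts(NODES, CONTEXT)
shared = with_shared_context(NODES, CONTEXT)

# Compare the serialized sizes of the two layouts.
embedded_size = len(json.dumps(embedded))
shared_size = len(json.dumps(shared))
print(embedded_size, shared_size)
```

Both documents describe the same three nodes; the second one just pays for the
context once instead of once per node.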

But typically you don't even want such a structure, because there's usually
a single node which could be thought of as the root of a tree
because it contains links (direct or indirect) to all other nodes. We could
try to write a complex algorithm to find that node automatically but I think
that would be too much magic. Publishers know which node it is in most cases
so it would be unnecessary anyway. Consumers may be interested in other
parts of the document or desire a different shape because it simplifies
their processing algorithms. Here's where flattening, (re)compaction and
framing come into play.


> But not even just RDF, it's a choice in the algorithms to
> include them as there's nothing in the spec that says they have to be
> there when not necessary.

Right, but as soon as you flatten, those bnode ids become necessary because
otherwise you couldn't connect the different nodes anymore.
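
A rough Python sketch of why that is. This is not the JSON-LD flattening
algorithm: it ignores @context, @list and value objects, assumes every nested
object is a node object, and does not merge nodes that share an @id. The
function name and the sample annotation are hypothetical.

```python
from itertools import count


def flatten(tree):
    """Hoist nested node objects into a flat list, linking them by @id."""
    nodes = []
    labels = count()

    def visit(node):
        # A node without an @id gets a generated blank node label,
        # otherwise nothing could point at it once it is hoisted.
        node_id = node.get("@id") or f"_:b{next(labels)}"
        flat = {"@id": node_id}
        nodes.append(flat)
        for key, value in node.items():
            if key == "@id":
                continue
            values = value if isinstance(value, list) else [value]
            # Nested objects are replaced by a reference to the hoisted node.
            out = [{"@id": visit(v)} if isinstance(v, dict) else v
                   for v in values]
            flat[key] = out if isinstance(value, list) else out[0]
        return node_id

    visit(tree)
    return nodes


# Hypothetical annotation with two nested nodes that carry no @id.
tree = {
    "@id": "http://example.org/anno1",
    "hasBody": {"chars": "A comment"},
    "hasTarget": {"hasSource": "http://example.org/page1"},
}

flat = flatten(tree)
for node in flat:
    print(node)
```

In the nested form the body and target need no identifier because their
position in the tree says where they belong. Once they sit side by side in a
flat array, the generated labels are the only thing that still ties them to
the annotation.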


> That a JSON developer can look at an RDF
> serialization and instantly understand what is going on, without
> knowing the underlying model, is /the/ saving grace for the semantic
> web, IMO.  It is all about being easy /and/ semantic.

Fully agreed. That's exactly the reason why I said you shouldn't use @graph
in your examples. That however doesn't mean that a (JSON-LD) client can
safely assume that it will never be there. Or that there won't be a
top-level array... or different property names.
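
As a small illustration, a client could at least cope with the different
top-level shapes along these lines. The helper name is hypothetical, and the
sketch only deals with the shape: it treats a top-level object with @graph as
a plain wrapper and does nothing about different property names, for which
you would run the document through a JSON-LD processor (expansion or
compaction against your own context).

```python
def top_level_nodes(doc):
    """Return the node objects of a parsed JSON-LD document.

    Handles a top-level array, an object wrapping the nodes in @graph,
    and a single node object.
    """
    if isinstance(doc, list):
        return [node for node in doc if isinstance(node, dict)]
    if isinstance(doc, dict):
        if "@graph" in doc:
            graph = doc["@graph"]
            return graph if isinstance(graph, list) else [graph]
        # A single node object: drop the context, keep the data.
        return [{k: v for k, v in doc.items() if k != "@context"}]
    raise TypeError("expected a JSON object or array")


# Hypothetical node, serialized in three different shapes.
node = {"@id": "http://example.org/a", "name": "A"}

as_array = [node]
as_graph = {"@context": {}, "@graph": [node]}
as_single = {"@context": {}, **node}

for doc in (as_array, as_graph, as_single):
    print(top_level_nodes(doc))
```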


> So I would implore you to please reconsider the "anti-pattern" stance.

When I talked about anti-pattern I meant the coupling of the client to a
specific document structure. JSON-LD is all about eliminating that. If that
weren't the case, all we would have to do is to add a profile media type
parameter to application/json and call it a day. The profile would then tell
your client how to interpret that specific structure. But we are not
interested in the structure, we are interested in the data.

We want to be able to mix it with other data. We want to be able to use
different vocabularies. We want to empower consumers so that they can
declaratively reshape the data to the most useful form for their use case.
We want them to be able to easily create an in-memory representation of the
serialized graph so that they can walk it as they want.
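
A minimal sketch of such an in-memory representation, assuming the document
has already been flattened so that every node has an @id and links are
{"@id": ...} references. The function names and the sample nodes are
hypothetical.

```python
def index_by_id(nodes):
    """Map each node's @id to the node object."""
    return {node["@id"]: node for node in nodes}


def reachable(index, start_id):
    """Ids of all nodes reachable from start_id, in visiting order."""
    order = []
    seen = set()
    stack = [start_id]
    while stack:
        current = stack.pop()
        # Skip nodes already visited (cycles) and ids not in this document.
        if current in seen or current not in index:
            continue
        seen.add(current)
        order.append(current)
        for key, value in index[current].items():
            if key == "@id":
                continue
            values = value if isinstance(value, list) else [value]
            for v in values:
                if isinstance(v, dict) and "@id" in v:
                    stack.append(v["@id"])
    return order


# Hypothetical flattened nodes; note the cycle c -> a.
nodes = [
    {"@id": "http://example.org/a", "knows": {"@id": "http://example.org/b"}},
    {"@id": "http://example.org/b", "knows": {"@id": "http://example.org/c"}},
    {"@id": "http://example.org/c", "knows": {"@id": "http://example.org/a"}},
]

index = index_by_id(nodes)
print(reachable(index, "http://example.org/b"))
```

The walk can start from any node, not just from whatever the publisher chose
as the root, and the cycle is harmless because visited ids are remembered.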

I hope this helps you understand my position.


Cheers,
Markus


--
Markus Lanthaler
@markuslanthaler
Received on Friday, 2 August 2013 08:13:55 UTC