- From: glenn mcdonald <glenn@furia.com>
- Date: Fri, 1 Jul 2011 17:49:50 +0000
- To: Linked JSON <public-linked-json@w3.org>
- Message-ID: <BANLkTimQWMfPnb6c_w=yKNsjn5nUPKyw8A@mail.gmail.com>
I've been pondering RDFa and Microdata and what unifying them (or replacing them with a single new thing) might mean, in parallel to thinking about JSON graph-serialization.

The thing that bothers me about all the approaches to embedding ids and predicates and scopes and types and such in HTML is that, well, they involve embedding, or attempting to embed, machine-audience data structures inside human-audience presentation structures. Why are we doing this at all? It's terrible and we know better, and it isn't even helpful. We may very reasonably want to know when a bit of presentation *corresponds* to a bit of data, but I see no argument at all for why the entire structures need or even want to be interleaved.

Here is what might be a much, much simpler and yet better idea:

1. Add to HTML5 a new global attribute called "data". This takes, as a value, a space-separated list of absolute or relative IRIs, which identify data objects represented by the contents of the HTML element so marked. The exact semantics of "represented by" are human, not technical, but we could provide many guiding examples.

2. Add to HTML5 a new element called "DATA". The contents of this are a canonical JSON serialization of the data structure underlying the contents of the page, presumably including (but not limited to) the objects referred to by "data" attributes on elements in the BODY.

Isn't this vastly simpler to understand, produce and consume than any of the existing embedding schemes, to at least the same benefit?

The embedding part of this is now concerned *only* with associating the visible content and its corresponding data, so we get from 3 embedding schemes to 1. But maybe even more importantly, by separating the data-structure from the embedding we eliminate the need to have embedded encodings separate from the regular non-embedded encodings.
And we provide the most compelling possible justification for using JSON as the canonical serialization (i.e., DATA is effectively a SCRIPT block with an implicit jsonp callback). And we eliminate the need for content negotiation in a vast number of cases, because a machine agent can just take the DATA from the page. And we ensure that people use IRIs for everything, because that's how it all works. And we start to establish the expectation that a data-backed page *should* have its data included.

And then the task of this mailing-list/group/whatever would become very specific: provide the rules for how the DATA element is written. That is, it's not just *a* JSON serialization, but *the* JSON serialization. In fact, it's not just *a* web data-graph serialization, but basically *the* web data-graph serialization.

glenn
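
To make the proposal concrete, here is a rough sketch of how a machine agent might "just take the DATA from the page" and match it to the marked-up elements. Nearly everything in it is an assumption for illustration only: the sample page, the example.org IRIs, the JSON layout (a single object keyed by IRI) and the DataPageParser helper are all hypothetical, since the actual rules for writing the DATA element are exactly what would still need to be defined.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page using the proposed "data" attribute and DATA element.
# The IRIs and the JSON layout (one object keyed by IRI) are assumptions.
PAGE = """
<html>
  <head>
    <data>
      {"/albums/1": {"type": "Album", "title": "Example Album", "artist": "/artists/7"},
       "/artists/7": {"type": "Artist", "name": "Example Artist"}}
    </data>
  </head>
  <body>
    <h1 data="/albums/1">Example Album</h1>
    <p>by <span data="/artists/7 /albums/1">Example Artist</span></p>
  </body>
</html>
"""


class DataPageParser(HTMLParser):
    """Collects the DATA element's JSON and every element's "data" IRIs."""

    def __init__(self, base_iri):
        super().__init__()
        self.base_iri = base_iri
        self.in_data = False
        self.json_chunks = []
        self.references = []  # list of (tag name, [absolute IRIs])

    def handle_starttag(self, tag, attrs):
        if tag == "data":  # HTMLParser lowercases tag names
            self.in_data = True
            return
        value = dict(attrs).get("data")
        if value:
            # The attribute is a space-separated list of IRIs, possibly relative.
            iris = [urljoin(self.base_iri, iri) for iri in value.split()]
            self.references.append((tag, iris))

    def handle_endtag(self, tag):
        if tag == "data":
            self.in_data = False

    def handle_data(self, text):
        if self.in_data:
            self.json_chunks.append(text)

    def graph(self):
        """Return the DATA contents as a dict keyed by absolute IRI."""
        raw = json.loads("".join(self.json_chunks))
        return {urljoin(self.base_iri, key): obj for key, obj in raw.items()}


parser = DataPageParser("http://example.org/page")
parser.feed(PAGE)
parser.close()
graph = parser.graph()

# Report which data object each marked-up element corresponds to.
for tag, iris in parser.references:
    for iri in iris:
        print(tag, iri, graph[iri]["type"])
```

Note that the presentation markup carries only IRIs, while the whole data structure lives in one place, so the agent never has to reconstruct objects from interleaved attributes.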
Received on Friday, 1 July 2011 17:50:37 UTC