RE: Defining a common convention for marking up JSON

Hi Michael,

Just some quick thoughts/comments/questions.

On Tuesday, August 27, 2013 10:52 PM, Michael Pizzo wrote:
> Dear JSON-LD Community;
> 
> JSON-LD, OData's JSON format, and other formats built on JSON are
> trying to do very similar things (add "markup" to a JSON payload for
> things like ids, types, etc.). Unfortunately, since JSON doesn't
> define a way to differentiate properties from markup, each
> specification invents its own naming conventions to differentiate
> properties from markup.
> 
> We have a real opportunity to align efforts here in defining a common
> convention for marking up JSON payloads.
> 
> JSON-LD adds markup to JSON payloads by defining a set of keywords
> that begin with the "@" symbol. JSON-LD parsers understand these
> keywords and treat them differently than other properties.

More or less the only reason we chose to prefix JSON-LD's keywords with an
@ symbol was to reduce the likelihood of a collision with already existing
property names.


> OData's JSON format separates properties from markup through a
> namespacing mechanism similar to XML. Properties that contain a dot
> (.) (which most JSON parsers already treat differently) are "namespace
> qualified" names - the prefix before the dot is the namespace and the
> part after the last dot is the keyword within that namespace.

By "treat differently" you mean that you can't access them using the dot
notation anymore, right?


> This general mechanism allows anyone to extend a JSON payload with
> "markup", and JSON clients to differentiate markup from data, and
> ignore markup that they don't know/care about.

Unless properties with a dot in them are already used. The same obviously
applies to properties colliding with one of the JSON-LD keywords (please
note that it is fine to use properties starting with an @ as long as it is
not a defined keyword from a JSON-LD perspective). Also, I don't think (at
least for JSON-LD) that we can differentiate between "markup" and "data".
It's not like HTML, where you just mark up some text. Losing, e.g., an
identifier of an entity is not really desired and most people wouldn't
classify that as markup - at least I wouldn't.


> OData uses this *general mechanism* to add odata-specific markup,
> defined in the "odata" namespace. So "odata.id" is clearly recognized
> as the id keyword defined by the OData specification, and "odata.type"
> is clearly recognized as the type keyword defined by the OData
> specification (there is clearly an opportunity to align in some of
> these moving forward, but for right now I'm more interested in having
> a common "markup" convention).

I haven't had a look at the latest OData draft yet, but how does a processor
know what odata (or any other prefix) stands for? Who owns it? Is there a
central registry for those prefixes?


> Following this same common convention, JSON-LD could mark up a payload
> as:
> 
> {
>   "jsonld.context": "http://json-ld.org/contexts/person.jsonld",
>   "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
>   "name": "John Lennon",
>   "born": "1940-10-09",
>   "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
> }

You can do that already, although you would have to add a context (which in
the case of a JSON document could also be referenced by an HTTP Link header
[1]) aliasing the keywords [2]. For the sake of simplicity, I embed it
directly in the following example:

{
  "@context": [
    { "jsonld.id": "@id" },
    "http://json-ld.org/contexts/person.jsonld"
  ],
  "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
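
To illustrate what the alias does, here is a rough Python sketch. It is not a
JSON-LD processor: the helper names (keyword_aliases, unalias) are made up for
this mail, remote contexts are simply skipped, and only top-level keys are
handled. Note that @context itself is the one keyword that cannot be aliased,
which is why the example above still uses "@context".

```python
# Sketch only, under the assumptions stated above; not a JSON-LD processor.

# JSON-LD 1.0 keywords that may be aliased (@context is deliberately absent).
ALIASABLE_KEYWORDS = {
    "@id", "@type", "@value", "@language", "@container", "@list",
    "@set", "@reverse", "@index", "@base", "@vocab", "@graph",
}


def keyword_aliases(context):
    """Collect term -> keyword aliases, e.g. "jsonld.id" -> "@id"."""
    entries = context if isinstance(context, list) else [context]
    aliases = {}
    for entry in entries:
        # Strings are references to remote contexts; this sketch skips them.
        if isinstance(entry, dict):
            for term, value in entry.items():
                if value in ALIASABLE_KEYWORDS:
                    aliases[term] = value
    return aliases


def unalias(doc):
    """Return a copy of doc whose aliased top-level keys use the real keywords."""
    aliases = keyword_aliases(doc.get("@context", []))
    return {aliases.get(key, key): value for key, value in doc.items()}


doc = {
    "@context": [
        {"jsonld.id": "@id"},
        "http://json-ld.org/contexts/person.jsonld",
    ],
    "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
}

resolved = unalias(doc)
print(resolved["@id"])
```

A real processor does considerably more (it dereferences remote contexts and
walks nested objects), but the basic idea is the same: "jsonld.id" is treated
exactly like "@id".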


> Regardless the syntax, providing a common convention for namespace
> qualifying "markup" keywords give us a real opportunity to foster
> consistency, reuse, and interoperability.

If we are talking about namespacing, we shouldn't talk about JSON-LD's
keywords but about its compact IRIs [3], which use a colon as the separator.
That is aligned with XML CURIEs and all RDF serialization formats. In
contrast to keywords, that's something you can't change in JSON-LD. You can,
however, work around it by explicitly mapping terms (as we call them) to
CURIEs, e.g.,

   "foaf.name": "foaf:name"
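
A small Python sketch of how such a mapping would resolve. It assumes the
context also defines the "foaf" prefix (I use the usual FOAF namespace here);
expand_term is a hypothetical helper for illustration, not the expansion
algorithm from the spec.

```python
# Sketch only: resolves a term via its context mapping, then expands a
# prefix:suffix value. The real JSON-LD IRI expansion handles more cases.

def expand_term(term, context):
    """Map a term to its definition and expand a compact IRI if possible."""
    value = context.get(term, term)
    prefix, sep, suffix = value.partition(":")
    # Treat the value as a compact IRI only if the prefix is defined in the
    # context and the suffix does not start with "//" (an absolute IRI).
    if sep and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value


context = {
    "foaf": "http://xmlns.com/foaf/0.1/",  # assumed prefix definition
    "foaf.name": "foaf:name",              # dotted term mapped to a CURIE
}

print(expand_term("foaf.name", context))
print(expand_term("foaf:name", context))
```

Both the dotted term and the compact IRI end up as the same full IRI, so the
dot is just part of a term name; the colon is what does the namespacing.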


> Both JSON-LD and OData are close to releasing an initial standard
> (OData has just progressed to a Committee Specification in OASIS), so
> the window is very close to closing on alignment, but the potential
> upside could be huge. Imagine being able to mark up the same JSON
> payload with JSON-LD keywords, odata keywords, and other
> "annotations".

Is there anything that prevents that today? JSON-LD processors would ignore
all odata.xyz properties unless they are mapped to something in a context.
What are OData processors doing with JSON-LD keywords?
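
Roughly, and only as a Python sketch: the function name and the "odata.type"
value below are made up, and I assume a context without @vocab and property
names without a colon (with @vocab, or with names that look like IRIs, the
properties would be mapped and kept).

```python
# Sketch only: mimics how properties without a mapping disappear when a
# document is interpreted as JSON-LD. Not the actual expansion algorithm.

KEYWORDS = {"@context", "@id", "@type", "@value", "@language", "@graph"}


def mapped_properties(doc, context):
    """Keep keywords and properties that have a term definition; drop the rest."""
    return {
        key: value
        for key, value in doc.items()
        if key in KEYWORDS or key in context
    }


context = {"name": "http://xmlns.com/foaf/0.1/name"}

doc = {
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "odata.type": "Example.Person",  # hypothetical OData annotation
}

kept = mapped_properties(doc, context)
print(sorted(kept))
```

So the OData annotations would simply not show up in the JSON-LD view of the
data unless someone maps them in a context.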


> JSON parsers would have a common way to differentiate
> markup from data, and could consume/ignore/expose whatever markup they
> chose.

As already said above, I don't think we can differentiate between markup and
data in JSON-LD.


> Would the JSON-LD community be open to working with the OData
> community to agree on a standard, extensible, namespaced mechanism
> that all JSON-based formats could use to extend JSON?

We are a very open community and open to all suggestions that simplify
developers' lives. I can't say much at the moment because I haven't had a
look at OData for quite a while. Maybe it becomes a bit clearer to me when
you answer my questions above. From what I understand, a JSON-LD processor
wouldn't have any problem ignoring "OData markup".


[1] http://json-ld.org/spec/latest/json-ld/#interpreting-json-as-json-ld
[2] http://json-ld.org/spec/latest/json-ld/#aliasing-keywords
[3] http://json-ld.org/spec/latest/json-ld/#compact-iris



--
Markus Lanthaler
@markuslanthaler

Received on Wednesday, 28 August 2013 07:50:11 UTC