- From: Michael Pizzo <mikep@microsoft.com>
- Date: Thu, 29 Aug 2013 22:34:53 +0000
- To: "public-linked-json@w3.org" <public-linked-json@w3.org>
- Message-ID: <d36e5bf9a84b4918a61ee6477cb3111c@BN1PR03MB220.namprd03.prod.outlook.com>
Thanks for the quick response and thoughts Markus. I'm glad to see, from the responses so far, that there is interest in exploring some type of alignment.
A few comments below:
>Just some quick thoughts/comments/questions.
>> Dear JSON-LD Community;
>>
>> JSON-LD, OData's JSON format, and other formats built on JSON are
>> trying to do very similar things (add "markup" to a JSON payload for
>> things like ids, types, etc.). Unfortunately, since JSON doesn't
>> define a way to differentiate properties from markup, each
>> specification invents its own naming conventions to differentiate
>> properties from markup.
>>
>> We have a real opportunity to align efforts here in defining a common
>> convention for marking up JSON payloads.
>>
>> JSON-LD adds markup to JSON payloads by defining a set of keywords
>> that begin with the "@" symbol. JSON-LD parsers understand these
>> keywords and treat them differently than other properties.
>
>More or less the only reason we chose to prefix JSON-LD's keywords with an
>@ symbol was to reduce the likelihood of a collision with already existing
>property names.
Right. Microsoft did the same thing in an early OData JSON format by prefixing keywords with double underscore to try and avoid collisions ("__metadata", "__count", etc). We later found we needed a more general/extensible mechanism that allowed third parties to annotate JSON objects and properties.
>> OData's JSON format separates properties from markup through a
>> namespacing mechanism similar to XML. Properties that contain a dot
>> (.) (which most JSON parsers already treat differently) are "namespace
>> qualified" names - the prefix before the dot is the namespace and the
>> part after the last dot is the keyword within that namespace.
>
>By "treat differently" you mean that you can't access them using the dot
>notation anymore, right?
Exactly.
>> This general mechanism allows anyone to extend a JSON payload with
>> "markup", and JSON clients to differentiate markup from data, and
>> ignore markup that they don't know/care about.
>
>Unless properties with a dot in them are already used. The same obviously
>applies to properties colliding with one of the JSON-LD keywords
Right. Since JSON doesn't define a means of annotating data with additional information, both OData and JSON-LD have defined conventions that attempt to differentiate data properties from other types of meta-information. We could discuss whether we need a convention less likely to conflict, but having a common mechanism seems incredibly valuable.
> (please
>note that it is fine to use properties starting with an @ as long as it is
>not a defined keyword from a JSON-LD perspective).
I was wondering whether the list of keywords was hard-coded or whether the @ prefix was a general mechanism. There are advantages to both, of course: a fixed list is less restrictive for general property names, while a general prefix is more extensible.
>Also, I don't think (at
>least for JSON-LD) that we can differentiate between "markup" and "data".
>It's not like HTML where you just markup some text. Losing, e.g., an
>identifier of an entity is not really desired and most people wouldn't
>classify that as markup - at least I wouldn't.
Markup may be a poor choice of words. The general idea is that there is "data" and "meta" or "control information" (such as type, etc.). A simple JSON processor wouldn't know what to do with type, and wouldn't have to; it could just skip it.
Even for the identifier, a general control that's just trying to paint data on a screen may be perfectly fine ignoring the identifier for an entity. It's only a consumer that understands that this JSON is JSON-LD, and wants to do something like link to the object, that cares about the identifier. That doesn't mean it's not there for consumers that do care about it, just that a namespacing mechanism for properties enables generic parsers to be trained to look for the meta-information they care about and ignore the rest.
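To make that concrete, here is a minimal Python sketch of such a generic consumer. It assumes the convention discussed in this thread (any property name containing a dot is namespaced control information); the helper name `data_properties` and the "odata.type" value are made up for illustration and are not taken from either specification.

```python
def data_properties(payload):
    """Return only the plain data properties of a JSON object.

    Assumption: a property whose name contains a dot is namespaced
    control information and can be skipped by a generic consumer.
    """
    return {name: value for name, value in payload.items() if "." not in name}


payload = {
    "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
    "odata.type": "Example.Person",  # hypothetical type name
    "name": "John Lennon",
    "born": "1940-10-09",
}

# A control that only paints data on a screen never looks at the annotations.
for name, value in data_properties(payload).items():
    print(f"{name}: {value}")
```

The annotations are still present in `payload` for consumers that care about them; the generic consumer just doesn't need to know what any of them mean.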
>> OData uses this *general mechanism* to add odata-specific markup,
>> defined in the "odata" namespace. So "odata.id" is clearly recognized
>> as the id keyword defined by the OData specification, and "odata.type"
>> is clearly recognized as the type keyword defined by the OData
>> specification (there is clearly an opportunity to align in some of
>> these moving forward, but for right now I'm more interested in having
>> a common "markup" convention).
>
>I haven't had a look at the latest OData draft yet, but how does a processor
>know what odata (or any other prefix) stands for? Who owns it? Is there a
>central registry for those prefixes?
Good question. The answer today is somewhat specific to OData ("odata" is reserved, and the document references a metadata document that defines the prefixes). This is certainly an area that we could collaborate on as well. We could define a registry of well-known prefixes, together with a mechanism, like the one XML has, to define ad-hoc prefixes.
>> Following this same common convention, JSON-LD could mark up a payload
>> as:
>>
>> {
>> "jsonld.context": "http://json-ld.org/contexts/person.jsonld",
>> "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
>> "name": "John Lennon",
>> "born": "1940-10-09",
>> "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
>> }
>
>You can do that already, although you would have to add a context (which in
>the case of a JSON document could also be referenced by an HTTP Link header
>[1]) aliasing the keywords [2]. For the sake of simplicity, I embed it
>directly in the following example:
>
>{
> "@context": [
> { "jsonld.id": "@id" },
> "http://json-ld.org/contexts/person.jsonld"
> ],
> "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
> "name": "John Lennon",
> "born": "1940-10-09",
> "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
>}
Interesting. So (except for the context) you could make the JSON-LD keywords look like OData JSON annotations. That's actually really encouraging, but it still feels like a one-off for making JSON-LD look (mostly) like OData JSON, and not a general solution for custom/third-party annotations.
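If I understand the aliasing correctly, the effect on the example above could be sketched roughly as follows in Python. This is only a simplified illustration, not how a real JSON-LD processor works: the function name `expand_aliases` is hypothetical, it only looks at alias maps embedded inline in "@context", it treats any string value starting with "@" as a keyword, and it does not fetch remote contexts.

```python
def expand_aliases(doc):
    """Rename aliased keys back to the JSON-LD keywords they stand for.

    Simplified assumption: only inline context objects are inspected, and
    a term is an alias if its value is a string starting with "@".
    """
    context = doc.get("@context", [])
    entries = context if isinstance(context, list) else [context]
    aliases = {}
    for entry in entries:
        if isinstance(entry, dict):
            for term, target in entry.items():
                if isinstance(target, str) and target.startswith("@"):
                    aliases[term] = target
    # Keys without an alias are passed through unchanged.
    return {aliases.get(key, key): value for key, value in doc.items()}


doc = {
    "@context": [
        {"jsonld.id": "@id"},
        "http://json-ld.org/contexts/person.jsonld",
    ],
    "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
}

expanded = expand_aliases(doc)
print(expanded["@id"])
```

Note that "@context" itself is left untouched, which is the one keyword that still has to appear in its JSON-LD form.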
>> Regardless of the syntax, providing a common convention for namespace
>> qualifying "markup" keywords gives us a real opportunity to foster
>> consistency, reuse, and interoperability.
>
>If we are talking about namespacing, we shouldn't talk about JSON-LD's
>keywords but its compact IRIs [3] which use colons as separator which is
>aligned with XML CURIEs and all RDF serialization formats. In contrast to
>keywords, that's something you can't change in JSON-LD. You can however,
>work around it by explicitly mapping terms (as we call them) to CURIEs,
>e.g.,
>
> "foaf.name": "foaf:name"
>
>
>> Both JSON-LD and OData are close to releasing an initial standard
>> (OData has just progressed to a Committee Specification in OASIS), so
>> the window is very close to closing on alignment, but the potential
>> upside could be huge. Imagine being able to mark up the same JSON
>> payload with JSON-LD keywords, odata keywords, and other
>> "annotations".
>
>Is there anything that prevents that today? JSON-LD processors would ignore
>all odata.xyz properties unless they are mapped to something in a context.
>What are OData processors doing with JSON-LD keywords?
I'm sure we could train processors to understand both OData's JSON format and JSON-LD as one-offs, but the problem arises when the next JSON-based format comes along and defines its own way to add control information, or when someone simply wants to add custom annotations to a JSON payload.
A namespacing mechanism allows a processor to understand a single, simple rule (such as "names containing a dot are namespaced"), so anybody can add their own specific information to a payload without worrying about conflicts.
Processors/applications can pick and choose what they want to pay attention to.
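As a rough sketch of that single rule, the Python below splits a payload into data properties and annotations grouped by namespace, taking the part after the last dot as the keyword. The function name `split_annotations`, the "acme" namespace, and the sample values are hypothetical and only there for illustration.

```python
from collections import defaultdict


def split_annotations(payload):
    """Separate data properties from namespaced annotations.

    Rule assumed here: a name containing a dot is namespaced; the text
    before the last dot is the namespace, the text after it the keyword.
    """
    data = {}
    annotations = defaultdict(dict)
    for name, value in payload.items():
        namespace, dot, keyword = name.rpartition(".")
        if dot:
            annotations[namespace][keyword] = value
        else:
            data[name] = value
    return data, dict(annotations)


payload = {
    "odata.id": "People(1)",  # hypothetical OData identifier
    "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
    "acme.reviewed": True,  # hypothetical third-party annotation
    "name": "John Lennon",
}

data, annotations = split_annotations(payload)

# A consumer picks the namespaces it understands and ignores the rest.
known = {ns: kw for ns, kw in annotations.items() if ns in ("odata", "jsonld")}
print(data, known)
```

The processor never needs a hard-coded list of keywords; it only needs the one rule plus the set of namespaces it chooses to pay attention to.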
>> JSON parsers would have a common way to differentiate
>> markup from data, and could consume/ignore/expose whatever markup they
>> chose.
>
>As already said above, I don't think we can differentiate between markup and
>data in JSON-LD.
Really? I think it would be very useful for a general JSON processor to recognize the data properties of a JSON-LD payload, even if just to paint it on a screen, without needing to know/understand/ignore all of the JSON-LD keywords.
>> Would the JSON-LD community be open to working with the OData
>> community to agree on a standard, extensible, namespaced mechanism
>> that all JSON-based formats could use to extend JSON?
>
>We are a very open community and open for all suggestions that simplify
>developers' lives. I can't say much at the moment because I haven't had a
>look at OData for quite a while. Maybe it becomes a bit clearer to me when
>you answer my questions above. From what I understand, a JSON-LD processor
>wouldn't have any problem ignoring "OData markup".
Again, thanks for taking the time for a detailed response. I actually learned a lot, and am encouraged that there may be a happy path here. I hope my answers above make sense, and help clarify the goal of moving from static, predefined keywords in each JSON-based format to a general, extensible, customizable annotation mechanism that everyone can use/understand.
>[1] http://json-ld.org/spec/latest/json-ld/#interpreting-json-as-json-ld
>[2] http://json-ld.org/spec/latest/json-ld/#aliasing-keywords
>[3] http://json-ld.org/spec/latest/json-ld/#compact-iris
>
>
>
>--
>Markus Lanthaler
>@markuslanthaler
>
>
>
Received on Thursday, 29 August 2013 22:35:25 UTC