- From: Michael Pizzo <mikep@microsoft.com>
- Date: Thu, 29 Aug 2013 22:34:53 +0000
- To: "public-linked-json@w3.org" <public-linked-json@w3.org>
- Message-ID: <d36e5bf9a84b4918a61ee6477cb3111c@BN1PR03MB220.namprd03.prod.outlook.com>
Thanks for the quick response and thoughts Markus. I'm glad to see, from the responses so far, that there is interest in exploring some type of alignment. A few comments below:

Just some quick thoughts/comments/questions.

>> Dear JSON-LD Community;
>>
>> JSON-LD, OData's JSON format, and other formats built on JSON are
>> trying to do very similar things (add "markup" to a JSON payload for
>> things like ids, types, etc.). Unfortunately, since JSON doesn't
>> define a way to differentiate properties from markup, each
>> specification invents its own naming conventions to differentiate
>> properties from markup.
>>
>> We have a real opportunity to align efforts here in defining a common
>> convention for marking up JSON payloads.
>>
>> JSON-LD adds markup to JSON payloads by defining a set of keywords
>> that begin with the "@" symbol. JSON-LD parsers understand these
>> keywords and treat them differently than other properties.
>
>More or less the only reason we chose to prefix JSON-LD's keywords with an
>@ symbol was to reduce the likelihood of a collision with already existing
>property names.

Right. Microsoft did the same thing in an early OData JSON format by prefixing keywords with a double underscore to try to avoid collisions ("__metadata", "__count", etc.). We later found we needed a more general/extensible mechanism that allowed third parties to annotate JSON objects and properties.

>> OData's JSON format separates properties from markup through a
>> namespacing mechanism similar to XML. Properties that contain a dot
>> (.) (which most JSON parsers already treat differently) are "namespace
>> qualified" names - the prefix before the dot is the namespace and the
>> part after the last dot is the keyword within that namespace.
>
>By "treat differently" you mean that you can't access them using the dot
>notation anymore, right?

Exactly.
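
A minimal sketch of the naming rule quoted above, assuming the namespace is everything before the last dot and the keyword is everything after it. The function name `split_qualified_name` and the `com.example.rating` property are hypothetical and do not come from the OData specification.

```python
def split_qualified_name(name):
    """Return (namespace, keyword) for a dot-qualified property name.

    Names without a usable dot are treated as plain data properties
    and returned as (None, name).
    """
    # rpartition splits on the LAST dot, matching "the part after the
    # last dot is the keyword within that namespace".
    namespace, dot, keyword = name.rpartition(".")
    if not dot or not namespace or not keyword:
        return None, name
    return namespace, keyword


print(split_qualified_name("odata.type"))          # ('odata', 'type')
print(split_qualified_name("com.example.rating"))  # ('com.example', 'rating')
print(split_qualified_name("name"))                # (None, 'name')
```

Splitting on the last dot lets a namespace itself contain dots, which is one plausible reading of the rule; a different convention would need a different split.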
>> This general mechanism allows anyone to extend a JSON payload with
>> "markup", and JSON clients to differentiate markup from data, and
>> ignore markup that they don't know/care about.
>
>Unless properties with a dot in them are already used. The same obviously
>applies to properties colliding with one of the JSON-LD keywords

Right. Since JSON doesn't define a means of annotating data with additional information, both OData and JSON-LD have defined conventions that attempt to differentiate data properties from other types of meta-information. We could discuss whether we need a convention less likely to conflict, but having a common mechanism seems incredibly valuable.

> (please
>note that it is fine to use properties starting with an @ as long as it is
>not a defined keyword from a JSON-LD perspective).

I was wondering whether the list of keywords was hard-coded or whether the @ prefix was a general mechanism. There are advantages to both, of course; one is less restrictive for general property names and the other is more extensible.

>Also, I don't think (at
>least for JSON-LD) that we can differentiate between "markup" and "data".
>It's not like HTML where you just markup some text. Losing, e.g., an
>identifier of an entity is not really desired and most people wouldn't
>classify that as markup - at least I wouldn't.

"Markup" may be a poor choice of words. The general idea is that there is "data" and "meta" or "control information" (such as type, etc.). A simple JSON processor wouldn't know what to do with type, and wouldn't have to; it could just skip it. Even for the identifier, a general control that's just trying to paint data on a screen may be perfectly fine ignoring the identifier for an entity. It's only a consumer that understands that this JSON is JSON-LD, and wants to do something like link to the object, that cares about the identifier.
That doesn't mean it's not there for consumers that do care about it, just that a namespacing mechanism for properties enables generic parsers to be trained to look for the meta-information they care about and ignore the rest.

>> OData uses this *general mechanism* to add odata-specific markup,
>> defined in the "odata" namespace. So "odata.id" is clearly recognized
>> as the id keyword defined by the OData specification, and "odata.type"
>> is clearly recognized as the type keyword defined by the OData
>> specification (there is clearly an opportunity to align in some of
>> these moving forward, but for right now I'm more interested in having
>> a common "markup" convention).
>
>I haven't had a look at the latest OData draft yet, but how does a processor
>know what odata (or any other prefix) stands for? Who owns it? Is there a
>central registry for those prefixes?

Good question. The answer today is somewhat specific to OData ("odata" is reserved, and the document references a metadata document that defines the prefixes). This is certainly an area that we could collaborate on as well. We could define a registry of well-known prefixes, together with a mechanism like the one XML has for defining ad-hoc prefixes.

>> Following this same common convention, JSON-LD could mark up a payload
>> as:
>>
>> {
>>   "jsonld.context": "http://json-ld.org/contexts/person.jsonld",
>>   "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
>>   "name": "John Lennon",
>>   "born": "1940-10-09",
>>   "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
>> }
>
>You can do that already, although you would have to add a context (which in
>the case of a JSON document could also be referenced by an HTTP Link header
>[1]) aliasing the keywords [2].
>For the sake of simplicity, I embed it
>directly in the following example:
>
>{
>  "@context": [
>    { "jsonld.id": "@id" },
>    "http://json-ld.org/contexts/person.jsonld"
>  ],
>  "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
>  "name": "John Lennon",
>  "born": "1940-10-09",
>  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
>}

Interesting. So (except for context) you could make the JSON-LD keyword information look like OData JSON annotations. That's actually really encouraging, but it still feels like a one-off for making JSON-LD look (mostly) like OData JSON, and not a general solution for custom/third-party annotations.

>> Regardless of the syntax, providing a common convention for namespace
>> qualifying "markup" keywords gives us a real opportunity to foster
>> consistency, reuse, and interoperability.
>
>If we are talking about namespacing, we shouldn't talk about JSON-LD's
>keywords but its compact IRIs [3] which use colons as separator which is
>aligned with XML CURIEs and all RDF serialization formats. In contrast to
>keywords, that's something you can't change in JSON-LD. You can however,
>work around it by explicitly mapping terms (as we call them) to CURIEs,
>e.g.,
>
>  "foaf.name": "foaf:name"
>
>
>> Both JSON-LD and OData are close to releasing an initial standard
>> (OData has just progressed to a Committee Specification in OASIS), so
>> the window is very close to closing on alignment, but the potential
>> upside could be huge. Imagine being able to mark up the same JSON
>> payload with JSON-LD keywords, odata keywords, and other
>> "annotations".
>
>Is there anything that prevents that today? JSON-LD processors would ignore
>all odata.xyz properties unless they are mapped to something in a context.
>What are OData processors doing with JSON-LD keywords?

I'm sure we could train processors to understand both OData's JSON format and JSON-LD as one-offs, but the problem arises when the next JSON-based format comes along and defines its own way to add control information, or when someone simply wants to add custom annotations to a JSON payload. A namespacing mechanism lets a processor understand a single, simple rule (such as "names containing a dot are namespaced"), so anybody can add their own specific information to a payload without worrying about conflicts. Processors/applications can pick and choose what they want to pay attention to.

>> JSON parsers would have a common way to differentiate
>> markup from data, and could consume/ignore/expose whatever markup they
>> chose.
>
>As already said above, I don't think we can differentiate between markup and
>data in JSON-LD.

Really? I think it would be very useful for a general JSON processor to recognize the data properties of a JSON-LD payload, even if just to paint it on a screen, without needing to know/understand/ignore all of the JSON-LD keywords.

>> Would the JSON-LD community be open to working with the OData
>> community to agree on a standard, extensible, namespaced mechanism
>> that all JSON-based formats could use to extend JSON?
>
>We are a very open community and open for all suggestions that simplify
>developers' lives. I can't say much at the moment because I haven't had a
>look at OData for quite a while. Maybe it becomes a bit clearer to me when
>you answer my questions above. From what I understand, a JSON-LD processor
>wouldn't have any problem ignoring "OData markup".

Again, thanks for taking the time for a detailed response. I actually learned a lot, and am encouraged that there may be a happy path here. I hope my answers above make sense, and help clarify the goal of moving from static, predefined keywords in each JSON-based format to a general, extensible, customizable annotation mechanism that everyone can use/understand.
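
The "single, simple rule" discussed in this message (names containing a dot are namespaced) could be sketched roughly as follows. This is only an illustration under stated assumptions: `partition_payload` is a hypothetical helper, the `odata.type` value `Example.Person` is invented for the example, and neither the OData nor the JSON-LD specification defines this function.

```python
import json


def partition_payload(payload):
    """Split a JSON object into plain data and namespaced annotations.

    Assumed rule: a property name containing a dot is an annotation;
    the text before the last dot is the namespace and the rest is the
    keyword. Everything else is treated as ordinary data.
    """
    data = {}
    annotations = {}
    for name, value in payload.items():
        namespace, dot, keyword = name.rpartition(".")
        if dot and namespace and keyword:
            annotations.setdefault(namespace, {})[keyword] = value
        else:
            data[name] = value
    return data, annotations


# Payload modelled on the example in this thread; "odata.type" and its
# value are hypothetical additions for illustration.
payload = json.loads("""
{
  "jsonld.id": "http://dbpedia.org/resource/John_Lennon",
  "odata.type": "Example.Person",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
""")

data, annotations = partition_payload(payload)
print(sorted(data))                 # ['born', 'name', 'spouse']
print(sorted(annotations))          # ['jsonld', 'odata']
print(annotations["jsonld"]["id"])  # http://dbpedia.org/resource/John_Lennon
```

A consumer that only paints data on a screen could use `data` and ignore `annotations` entirely, while a consumer that understands one namespace could read just that entry and skip the rest.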

>[1] http://json-ld.org/spec/latest/json-ld/#interpreting-json-as-json-ld
>[2] http://json-ld.org/spec/latest/json-ld/#aliasing-keywords
>[3] http://json-ld.org/spec/latest/json-ld/#compact-iris
>
>--
>Markus Lanthaler
>@markuslanthaler

Received on Thursday, 29 August 2013 22:35:25 UTC