- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Sun, 30 May 2010 11:57:40 -0400
- To: RDFa Community <public-rdfa@w3.org>
On 05/30/2010 04:56 AM, Toby Inkster wrote:
> On Sat, 29 May 2010 21:06:31 -0400
> Manu Sporny <msporny@digitalbazaar.com> wrote:
>
>> http://rdfa.digitalbazaar.com/specs/source/json-ld/
>
> The document at several times uses the term "unambiguous", but I don't
> think it is. For example, it says:
>
> In order to differentiate between plain text and IRIs, the < and >
> are used around IRIs.
>
> But what about plain text that happens to start with "<" and end with
> ">"?

Escape characters:

http://rdfa.digitalbazaar.com/specs/source/json-ld/#escape-character

I'll need to beef up that section, but the general idea is that any
special characters like "<", ">", and "^" MUST be escaped to not be
interpreted as IRIs or TypedLiterals. Really, "<" for establishing IRIs
only needs to be escaped if it's at the beginning of a string. ">" only
needs to be escaped for IRIs if it's at the end of a string, and "^^"
needs to be escaped if it is at the beginning of a string /and/ is the
second element in an array.

I haven't quite worked through whether or not these values should
/always/ be escaped, or only in those instances. The key, though, is
that as long as they're escaped, the markup is unambiguous.
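As a rough illustration of the minimal escaping rules described above
(this is not part of the spec draft), here is a Python sketch. The
function names escape_plain_text and classify are made up for this
example. It assumes the least aggressive rule set, where only a leading
"<" or a leading "^^" is escaped, and it ignores plain text that already
begins with a backslash, which a complete rule set would also have to
handle.

```python
import json


def escape_plain_text(text):
    """Escape plain text that would otherwise read as an IRI or datatype.

    Hypothetical helper: only the leading marker is escaped, which is the
    least aggressive of the escaping options discussed in this thread.
    """
    if text.startswith("<") and text.endswith(">"):
        return "\\" + text              # \<abbr> versus <acronym>
    if text.startswith("^^"):
        return "\\^\\^" + text[2:]      # \^\^xsd:dateTime
    return text


def classify(value):
    """Classify a decoded JSON value as 'iri', 'typed-literal',
    'multiple-values' or 'plain' (hypothetical labels)."""
    if isinstance(value, list):
        # "^^" only marks a datatype when it opens the second element
        # of a two-element array.
        if (len(value) == 2
                and all(isinstance(v, str) for v in value)
                and value[1].startswith("^^")):
            return "typed-literal"
        return "multiple-values"
    if (isinstance(value, str)
            and len(value) >= 2
            and value.startswith("<")
            and value.endswith(">")
            and not value.endswith("\\>")):
        return "iri"
    return "plain"


if __name__ == "__main__":
    doc = {"dc:title": escape_plain_text("<abbr> versus <acronym>")}
    # json.dumps doubles the backslash, as in the examples in this thread.
    print(json.dumps(doc))
    print(classify(doc["dc:title"]))
    print(classify(["2010-05-29T14:17:39+02:00", "^^xsd:dateTime"]))
```

Whether every special character should be escaped, or only the ones in
these positions, is still open; the sketch only shows that the minimal
rules are enough to tell the cases apart.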
> For example:
>
> {
>   "dc:abstract" : "A discussion of the abbreviations in HTML.",
>   "dc:title" : "<abbr> versus <acronym>"
> }

This would be:

{
  "dc:abstract" : "A discussion of the abbreviations in HTML.",
  "dc:title" : "\\<abbr\\> versus \\<acronym\\>"
}

or (if we employ some more involved escaping rules):

{
  "dc:abstract" : "A discussion of the abbreviations in HTML.",
  "dc:title" : "\\<abbr> versus <acronym\\>"
}

or

{
  "dc:abstract" : "A discussion of the abbreviations in HTML.",
  "dc:title" : "\\<abbr> versus <acronym>"
}

> Also, if you imagine the following two RDFa snippets, with different
> meanings, they seem to have the same representation in JSON-LD:
>
> <!-- snippet 1 -->
> <div typeof="">
>   <span property="dc:modified"
>     datatype="xsd:dateTime">2010-05-29T14:17:39+02:00</span>
> </div>
>
> <!-- snippet 2 -->
> <div typeof="">
>   <span property="dc:modified">2010-05-29T14:17:39+02:00</span>
>   <span property="dc:modified">^^xsd:dateTime</span>
> </div>
>
> Both are represented as:
>
> {
>   "dc:modified" : ["2010-05-29T14:17:39+02:00", "^^xsd:dateTime"]
> }

Not if they're escaped... properly encoding the values is up to the
application, but the first would be:

{
  "dc:modified" : ["2010-05-29T14:17:39+02:00", "^^xsd:dateTime"]
}

and the second would be:

{
  "dc:modified" : ["2010-05-29T14:17:39+02:00", "\\^\\^xsd:dateTime"]
}

> This could possibly be addressed by representing datatyped values like
> this (i.e. similarly to RDF/JSON):
>
> {
>   "dc:modified" : {
>     "value" : "2010-05-29T14:17:39+02:00",
>     "datatype" : "xsd:dateTime",
>   }
> }

One of the goals of JSON-LD is being as terse as possible. The primary
issue I have with RDF/JSON is that it is incredibly verbose. That
verbosity turns most developers away because the JSON ends up being huge
for real-world uses. We tried using RDF/JSON for our web services at one
point and it ballooned the data sent to API calls by 200%-500%. So
JSON-LD asserts the following lessons learned:

1. Deeply nested structures are very bad.
2. Terseness improves readability and reduces data size requirements.

The key concept that makes JSON-LD stand out from the pack is "The
Context". The Context makes compression of the JSON data possible.

> How language tags are represented is not mentioned in the document, but
> they could perhaps be handled similarly to datatypes.

Yeah, I haven't put enough thought into that yet, but this may be where
we end up:

{
  "dc:title" : ["Abbreviations in HTML", "@en"]
}

or even:

{
  "dc:title" : ["Abbreviations in HTML@en"]
}

The second is unambiguous if you use this algorithm:

1. Check the last 4 characters of the string
2. If it starts with "\", then it's a PlainLiteral.
3. If it starts with "@" it is a PlainLiteral with a language.

> It seems pretty far out of scope for HTMLWG. Perhaps SWIG?

SWIG can't publish REC-track documents, IIRC.

From HTML WG Charter (Scope):

"""Data and canvas are reasonable areas of work for the group."""

HTML WG is also the group that is publishing HTML+RDFa /and/ Microdata.
Doesn't hurt to ask... and one only needs 3 supporters to publish a
document via HTML WG. :)

WebApps may be another option... from WebApps Charter (Scope):

"""The scope of the Web Applications Working Group covers the
technologies related to developing client-side applications on the Web,
including both markup vocabularies for describing and controlling
client-side application behavior .... Additionally, server-side APIs for
support of client-side functionality will be defined as needed."""

W3C really needs a Working Group to create POSIX for the Web. A set of
Web APIs and calling conventions to enable websites to expose common
APIs (like login, logout, certificate registration, sign up, etc.).

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.2.2 - Good Relations and Ditching Apache+PHP
http://blog.digitalbazaar.com/2010/05/06/bitmunk-3-2-2/2/
Received on Sunday, 30 May 2010 15:58:10 UTC