Re: Mapping RDFa to microdata+json

From: Jeni Tennison <jeni@jenitennison.com>
Date: Fri, 25 Nov 2011 09:45:30 +0000
Cc: Gregg Kellogg <gregg@kellogg-assoc.com>, Ivan Herman <ivan@w3.org>
Message-Id: <8BCA85E5-76E8-4A2E-A6AF-4294AE456EEF@jenitennison.com>
To: HTML Data Task Force WG <public-html-data-tf@w3.org>

On 25 Nov 2011, at 08:11, Ivan Herman wrote:
> What remains is the question on how JSON-LD could/should be mapped on the microdata-JSON. However, wouldn't that be a lossy mapping? Wouldn't we hit the issue of multiple typing, datatypes, etc, again?


Multiple typing isn't a problem: in microdata+json, any item can have multiple types, and they are all just strings. The restriction that all of an item's types must come from the same vocabulary is in the microdata specification, not in the definition of the JSON format.
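
To illustrate the point about types being plain strings, here is a minimal Python sketch that builds one microdata+json item carrying two types. The item layout (an "items" list, with "type" and "properties" on each item) follows the JSON extraction format in the microdata specification; the type URLs and the name value are made up for the example.

```python
import json

# Hypothetical item: the type URLs and property value are illustrative only.
# The two types deliberately come from different vocabularies, which the
# JSON structure itself does nothing to prevent.
item = {
    "type": [
        "http://schema.org/Person",
        "http://xmlns.com/foaf/0.1/Person",
    ],
    "properties": {
        # Every property maps to a list of values, even a single value.
        "name": ["Alice Example"],
    },
}

document = {"items": [item]}
print(json.dumps(document, indent=2))
```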

Datatypes and languages are obviously not represented in microdata+json, but the fact is that consumers who use these formats don't care. If they did, they would be using a richer data model and a richer syntax to support it.
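
As a rough sketch of what that loss looks like, assume a converter that keeps only the lexical form of each RDF literal. The function name flatten_literal and its parameters are hypothetical, invented for this illustration rather than taken from any specification.

```python
def flatten_literal(value, datatype=None, language=None):
    """Return only the lexical form of a literal.

    The datatype and language arguments are accepted and then discarded,
    because microdata+json has nowhere to put them.
    """
    return str(value)


typed = flatten_literal(
    "2011-11-25", datatype="http://www.w3.org/2001/XMLSchema#date"
)
tagged = flatten_literal("chat", language="fr")
plain = flatten_literal("chat")

# After flattening, the French-tagged literal and the plain literal
# can no longer be told apart.
print(typed, tagged == plain)
```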

The main question as far as I'm concerned is the source of the data for the mapping:

  1. the RDFa markup itself
  2. the two graphs that RDFa produces
  3. any RDF
  4. JSON-LD

#1 (directly from RDFa markup) would mean repeating the RDFa parsing algorithm within a mapping specification, with the risk of the two specs getting out of sync. On the other hand, the generated microdata+json tree could closely match the structure of the original RDFa markup, in the same way that it does for microformats-2 or microdata. The mapping could also use @vocab to decide when to emit short property names, which again makes the resulting microdata+json simpler to use.
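
One way the @vocab idea might work, sketched in Python: a property IRI that starts with the vocabulary IRI in scope is emitted as its short name, and anything else stays a full IRI. The function property_name is hypothetical, and the rule it applies is an assumption about how such a mapping could behave, not something any spec defines.

```python
def property_name(iri, vocab=None):
    """Pick the name to emit for a property in microdata+json.

    Assumed rule: if the IRI lies inside the @vocab that was in scope in
    the markup, emit only the part after the vocabulary IRI; otherwise
    emit the full IRI unchanged.
    """
    if vocab and iri.startswith(vocab) and len(iri) > len(vocab):
        return iri[len(vocab):]
    return iri


vocab = "http://schema.org/"
short = property_name("http://schema.org/name", vocab)
foreign = property_name("http://xmlns.com/foaf/0.1/name", vocab)
no_vocab = property_name("http://schema.org/name")
print(short, foreign, no_vocab)
```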

#2 (using the graphs generated from RDFa) means that the converter can just plug into the results of an RDFa parser, and the spec can be simpler, but I'm not sure that having information from both graphs gives much benefit over #3 (any RDF).

#3 (converting from any RDF) has the advantage of being usable from any RDF, not just that generated from RDFa, but means the generated microdata+json could be quite different from the original RDFa markup. It raises gnarly questions about when/whether/how to flatten the graph and when/whether/how to use short names for properties.
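
To make the flattening question concrete, here is a small Python sketch of one possible policy: nest a resource inside its parent item only when it is the object of exactly one triple, and otherwise leave it as a top-level item referred to by its identifier. The triple representation (a 4-tuple with a flag saying whether the object is a resource), the function triples_to_items, and the nesting rule are all assumptions made for illustration.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triples_to_items(triples):
    """Convert (subject, predicate, object, object_is_resource) tuples
    into a microdata+json-shaped dict, under the assumed nesting policy."""
    by_subject = defaultdict(list)
    ref_count = defaultdict(int)
    for s, p, o, is_res in triples:
        by_subject[s].append((p, o, is_res))
        if is_res and p != RDF_TYPE:
            ref_count[o] += 1

    def build(subject, seen):
        item = {"type": [], "properties": {}}
        if not subject.startswith("_:"):
            item["id"] = subject  # blank nodes get no id
        for p, o, is_res in by_subject[subject]:
            if p == RDF_TYPE:
                item["type"].append(o)
                continue
            # Nest only described resources that have a single referrer.
            nest = (is_res and o in by_subject
                    and ref_count[o] == 1 and o not in seen)
            value = build(o, seen | {o}) if nest else o
            item["properties"].setdefault(p, []).append(value)
        return item

    # Subjects with exactly one referrer appear nested, not at top level.
    roots = [s for s in by_subject if ref_count[s] != 1]
    return {"items": [build(s, {s}) for s in roots]}


example = [
    ("http://example.org/alice", RDF_TYPE, "http://schema.org/Person", True),
    ("http://example.org/alice", "http://schema.org/name", "Alice", False),
    ("http://example.org/alice", "http://schema.org/address", "_:a", True),
    ("_:a", "http://schema.org/streetAddress", "1 Example St", False),
]
result = triples_to_items(example)
```

Even this simple policy has gaps: a cycle in which every node has exactly one referrer yields no top-level items at all, and property names are left as full IRIs. A real mapping would have to settle both.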

#4 (converting from JSON-LD) introduces a dependency on something that is not a W3C standard (might that mean IP issues?) and is still changing. On the other hand, it may well mean that the mapping to microdata+json can be defined easily because it shunts the difficult questions into the mapping from RDF(a) to JSON-LD instead.

I think overall my preference is probably for #1.

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Friday, 25 November 2011 09:45:54 GMT