Re: Mapping RDFa to microdata+json from Jeni Tennison on 2011-11-25 (public-html-data-tf@w3.org from November 2011)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Fri, 25 Nov 2011 09:45:30 +0000
To: HTML Data Task Force WG <public-html-data-tf@w3.org>
Cc: Gregg Kellogg <gregg@kellogg-assoc.com>, Ivan Herman <ivan@w3.org>
Message-Id: <8BCA85E5-76E8-4A2E-A6AF-4294AE456EEF@jenitennison.com>

On 25 Nov 2011, at 08:11, Ivan Herman wrote:
> What remains is the question on how JSON-LD could/should be mapped on the microdata-JSON. However, wouldn't that be a lossy mapping? Wouldn't we hit the issue of multiple typing, datatypes, etc, again?

Multiple typing isn't a problem: in microdata+json, any item can have multiple types and they are all just strings (the limitation that they have to be part of the same vocabulary is in the microdata specification, not the definition of the JSON vocabulary).
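
As a rough illustration of that point, here is a minimal sketch of one microdata+json item carrying two types from different vocabularies. The "items"/"type"/"properties" layout follows the JSON form described in the microdata specification; the particular type URLs and the property value are made up for the example.

```python
import json

# Hypothetical item: the type URLs and the "name" value are illustrative only.
# Each type is just a string, so nothing in the JSON itself stops the types
# from coming from different vocabularies.
item = {
    "type": [
        "http://schema.org/Person",
        "http://xmlns.com/foaf/0.1/Person",
    ],
    "properties": {
        "name": ["Example Person"],
    },
}

document = {"items": [item]}
serialized = json.dumps(document, indent=2)
print(serialized)
```

The same-vocabulary restriction would only bite when the JSON is produced from microdata markup, not when it is produced from some other source.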

Datatypes and languages are obviously not represented in microdata+json, but the fact is that consumers who use these formats don't care. If they did, they would be using a richer data model and a richer syntax to support it.
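
To make the loss concrete, here is a small sketch, assuming RDF literals are modelled as a (value, datatype, language) triple. The `Literal` class and `to_microdata_value` function are hypothetical names invented for this example, not part of any specification or library.

```python
from typing import NamedTuple, Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal."""
    value: str
    datatype: Optional[str] = None
    language: Optional[str] = None


def to_microdata_value(literal: Literal) -> str:
    # microdata+json property values are plain strings (or nested items),
    # so the datatype and language tag are simply dropped here.
    return literal.value


typed = Literal("2011-11-25", datatype=XSD + "date")
tagged = Literal("2011-11-25", language="en")
plain = Literal("2011-11-25")

# Three distinct RDF literals collapse to a single string value.
values = {to_microdata_value(lit) for lit in (typed, tagged, plain)}
print(values)
```

A consumer that needed to tell those three literals apart would have to work from the RDF rather than from the microdata+json.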

The main question as far as I'm concerned is the source of the data for the mapping:

1. the RDFa markup itself
2. the two graphs that RDFa produces
3. any RDF
4. JSON-LD

#1 (directly from RDFa markup) would mean repeating the RDFa parsing algorithm within a mapping specification (with the potential for the two specs to get out of sync), but the generated microdata+json tree could then closely match the structure of the original RDFa markup, in the same way that it does for microformats-2 or microdata. It could also mean that @vocab is picked up to determine when short property names are used, for example, which again makes the resulting microdata+json simpler to use.
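
One possible reading of the @vocab idea, sketched under the assumption that a property gets a short name exactly when its IRI starts with the @vocab in scope. The `short_name` function is hypothetical and the rule itself is only a guess at what a mapping spec might say.

```python
from typing import Optional


def short_name(property_iri: str, vocab: Optional[str]) -> str:
    """Return the short property name if the IRI falls under the vocab in scope.

    Assumption: anything outside the vocab keeps its full IRI as the name.
    """
    if vocab and property_iri.startswith(vocab) and len(property_iri) > len(vocab):
        return property_iri[len(vocab):]
    return property_iri


vocab = "http://schema.org/"
print(short_name("http://schema.org/name", vocab))
print(short_name("http://xmlns.com/foaf/0.1/name", vocab))
```

A converter working only from the generated graph would not normally see which @vocab was in scope, which is why this shortening is easier to define for option #1.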

#2 (using the graphs generated from RDFa) means that the converter can just plug into the results of an RDFa parser, and the spec can be simpler, but I'm not sure that having information from both graphs gives much benefit over #3 (any RDF).

#3 (converting from any RDF) has the advantage of being usable from any RDF, not just that generated from RDFa, but means the generated microdata+json could be quite different from the original RDFa markup. It raises gnarly questions about when/whether/how to flatten the graph and when/whether/how to use short names for properties.
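
To show why the flattening question is awkward, here is a sketch of one arbitrary policy: nest a resource inside the item that refers to it only if it is referenced exactly once, and otherwise emit it at the top level and refer to it by IRI. The `triples_to_items` function, the `(subject, predicate, object, is_resource)` tuple shape, and the example IRIs are all assumptions made for illustration; other policies are equally defensible.

```python
def triples_to_items(triples):
    """Convert (subject, predicate, object, is_resource) tuples to a list of items."""
    properties = {}  # subject -> {predicate -> [(object, is_resource), ...]}
    ref_count = {}   # resource -> number of times it appears as an object
    for s, p, o, is_res in triples:
        properties.setdefault(s, {}).setdefault(p, []).append((o, is_res))
        if is_res:
            ref_count[o] = ref_count.get(o, 0) + 1

    emitted = set()

    def build(subject):
        emitted.add(subject)
        item = {"id": subject, "properties": {}}
        for p, objects in properties[subject].items():
            values = []
            for o, is_res in objects:
                # Nest only described resources that are referenced once and
                # have not been emitted yet (the last check also breaks cycles).
                nestable = (is_res and o in properties
                            and ref_count.get(o) == 1 and o not in emitted)
                values.append(build(o) if nestable else o)
            item["properties"][p] = values
        return item

    # Top-level items: subjects that are never referenced, or referenced more than once.
    items = [build(s) for s in properties if ref_count.get(s, 0) != 1]
    # Anything still missing is part of a cycle; emit it at the top level too.
    items += [build(s) for s in properties if s not in emitted]
    return items


EX = "http://example.org/"
nested = triples_to_items([
    (EX + "a", "name", "Alice", False),
    (EX + "a", "knows", EX + "b", True),
    (EX + "b", "name", "Bob", False),
])
shared = triples_to_items([
    (EX + "a", "knows", EX + "b", True),
    (EX + "c", "knows", EX + "b", True),
    (EX + "b", "name", "Bob", False),
])
```

In `nested` the second resource ends up inside the first; in `shared` the same description becomes a separate top-level item because two items point at it. Neither output is recoverable from the shape of any original markup, which is the point of the concern above.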

#4 (converting from JSON-LD) introduces a dependency on something that is not a W3C standard (might that mean IP issues?) and is still changing. On the other hand, it may well mean that the mapping to microdata+json can be defined easily because it shunts the difficult questions into the mapping from RDF(a) to JSON-LD instead.

I think overall my preference is probably for #1.

Jeni
--
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 25 November 2011 09:45:54 UTC