Re: Absorbing Microdata

On Wed, 15 Sep 2010 19:58:17 +0200
Ivan Herman <ivan@w3.org> wrote:

> thank you, but this seems to be the same document that Toby was
> referring to. But that is not what I was looking for:-( That document
> defines a Microdata->RDF transformation; what I am wondering is
> whether it is possible to have a Microdata->RDFa transformation, too.
> If so, then existing RDFa implementations could be easily extended to
> include microdata, too.

If RDFa implementations wish to support Microdata, then converting
Microdata to RDFa before parsing is just about the most complicated way
they could offer said support. I've been trying to think of more
complicated ways they could do it, but I can't. ;-)

Most of the conversion is relatively easy. It's generally just a matter
of detecting certain Microdata attributes, or combinations thereof, and
replacing them with RDFa equivalents. There are two main exceptions:

1. The combination of @itemprop, @itemscope and @itemtype on the same
element. In RDFa terms this is roughly equivalent to:

	<div rel="itemprop"
	     resource="itemscope"
	     resource-typeof="itemtype">

i.e. it sets the type for the object of the triple rather than the
subject. I get around this by adding a hidden <span> child element and
including an rdf:type triple on that hidden element. Nothing too tricky.
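
A rough sketch of that workaround in Python, just to show the shape of
the output. The function name and arguments are invented for
illustration and are not taken from my converter. It also assumes RDFa
1.1, where @rel accepts a full IRI (RDFa 1.0 would need a CURIE and a
prefix declaration instead):

```python
from html import escape

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def typed_item_open_tag(tag, itemprop, item_id, itemtype):
    """Build RDFa markup for an element that carried @itemprop,
    @itemscope and @itemtype together (hypothetical helper).

    item_id is the identifier chosen for the item, e.g. "[_:item1]".
    """
    # The element itself links the enclosing subject to the item.
    open_tag = '<%s rel="%s" resource="%s">' % (
        tag, escape(itemprop), escape(item_id))
    # The hidden child has no @about, so its subject is the parent's
    # object, i.e. the item. It therefore types the item, not the
    # enclosing subject.
    hidden = '<span style="display:none" rel="%s" resource="%s"></span>' % (
        escape(RDF_TYPE), escape(itemtype))
    return open_tag + hidden


print(typed_item_open_tag(
    "div",
    "http://xmlns.com/foaf/0.1/knows",
    "[_:item1]",
    "http://xmlns.com/foaf/0.1/Person"))
```

The original child content and the closing tag follow the hidden span
unchanged.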

2. The @itemref attribute - as I previously mentioned, RDFa has no
equivalent. Consider this:

	<p id="age">
	  The following people are all aged
	  <span itemprop="http://xmlns.com/foaf/0.1/age"
	  >30</span>:
	</p>
	<ul>
	  <li itemscope itemref="age">
	    <b itemprop="http://xmlns.com/foaf/0.1/name">Alice</b>
	  </li>
	  <li itemscope itemref="age">
	    <b itemprop="http://xmlns.com/foaf/0.1/name">Bob</b>
	  </li>
	  <li itemscope itemref="age">
	    <b itemprop="http://xmlns.com/foaf/0.1/name">Carol</b>
	  </li>
	</ul>

From what I can tell, the only way this can be converted to RDFa is:

If an element has @itemscope and @itemref, hand that element (call it
'X') over to a Microdata parser. For each triple you get back, skip the
triple if the element it was generated on was X itself, or a
descendant of X. For each triple that has not been skipped, create a
hidden <span> element as a child of X and add RDFa attributes to encode
the triple.

It's actually slightly more complicated than that, because the bnode
identifiers in the generated RDFa need to line up with the ones the
Microdata parser used. But the upshot is that I don't think you can get
away with doing Microdata-to-RDFa without embedding a full Microdata
parser.
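
To make the @itemref step a bit more concrete, here is a minimal sketch
in Python. It is not my actual implementation: Element and Triple are
made-up stand-ins for a DOM node and for whatever a Microdata parser
would hand back, and the parsing itself is assumed to have happened
already. Only literal-valued triples are handled, and @property with a
full IRI assumes RDFa 1.1.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Optional


@dataclass(eq=False)
class Element:
    """Tiny stand-in for a DOM element (hypothetical)."""
    name: str
    parent: Optional["Element"] = None
    hidden_spans: List[str] = field(default_factory=list)


@dataclass
class Triple:
    subject: str     # URI, or a bnode label such as "_:alice"
    predicate: str   # full property URI
    obj: str         # literal value
    origin: Element  # element the triple was generated on


def is_self_or_descendant(el, x):
    """True if el is x or sits somewhere below x in the tree."""
    while el is not None:
        if el is x:
            return True
        el = el.parent
    return False


def rdfa_subject(subject):
    # Reusing the parser's bnode label keeps identifiers aligned;
    # bnodes are written as safe CURIEs in @about: [_:label].
    return "[%s]" % subject if subject.startswith("_:") else subject


def absorb_itemref(x, triples):
    """Add a hidden <span> to x for each triple pulled in via @itemref."""
    for t in triples:
        if is_self_or_descendant(t.origin, x):
            continue  # generated on X or below it: skip
        x.hidden_spans.append(
            '<span style="display:none" about="%s" property="%s" '
            'content="%s"></span>' % (
                escape(rdfa_subject(t.subject)),
                escape(t.predicate),
                escape(t.obj)))
    return x.hidden_spans


# Usage, mirroring the Alice item from the example above.
p = Element("p")
age_span = Element("span", parent=p)
li = Element("li")
b = Element("b", parent=li)
triples = [
    Triple("_:alice", "http://xmlns.com/foaf/0.1/name", "Alice", b),
    Triple("_:alice", "http://xmlns.com/foaf/0.1/age", "30", age_span),
]
for span in absorb_itemref(li, triples):
    print(span)
```

Only the age triple produces a hidden span here, since the name triple
was generated on a descendant of the <li> and is skipped.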

Microdata-to-RDFa has potential uses for publishers, and middle-man
conversion services, but I don't think it's especially useful for RDFa
consumers.

I'd be happy to write it up, though. My algorithm hasn't been tested on
a wide variety of input, but it seems to work in theory.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Received on Wednesday, 15 September 2010 20:10:40 UTC