Re: Microdata implied property order

Gregg,

On 2 Oct 2011, at 18:13, Gregg Kellogg wrote:
> The Microdata spec [1] has a defined property order for the JSON serialization. The (former) RDF processing rules [2] did not take this into consideration, but ordering of properties in RDF is commonly presumed, even though it doesn't exist.
> 
> If an RDF conversion were to be specified to preserve order, it should probably make use of RDF collections (rdf:List). The question would be, are single-valued properties also in a collection? If multi-valued properties are presumed to be ordered in a collection, how can you express unordered properties in Microdata?

I think we have to assume that vocabulary-aware applications that use RDF extracted from microdata would usually transform the resulting RDF into something that's more consistent with what they expect from RDFa, which might include:

  * adding datatypes to values
  * reconciliation of strings into URIs
  * mapping into other common vocabularies
  * mapping properties into property chains linking the original subject and object

Part of this arises from the fact that microdata doesn't capture things like datatypes that you might want in RDF and part of it arises from wanting to make it easy for people to use embedded data in HTML pages. For example, although Facebook uses RDFa syntax, the RDF that they expose is nothing like the RDF that they would get from simply parsing RDFa from the relevant page (eg [4]).

So if we assume that there's always a transformation step involved, there are two guidelines for how any mapping should be done:

  1. preserve any information that is likely to inform the transformation

  2. make it as easy as possible to do the transformation (ie as close as possible to what is likely to be the desired post-transformation result)

On that basis, my answer to the questions above would be:

  * single-valued properties should not be in a collection (as this is unlikely to be the desired output of a transformation)
  * multi-valued properties should always be put in an ordered collection so that this information is preserved

The big difficulty is working out whether a given property with a single value is a single-valued property or a multi-valued property. I'd be inclined to say that if, within a page, any item ever has more than one value for a given property, then it's a multi-valued property; otherwise it's a single-valued property. Of course, this will sometimes mistakenly treat what should be a multi-valued property as single-valued (when no item on the page happens to have more than one value for it), but that is better than deciding on an item-by-item basis.
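A minimal sketch of that page-level check, in Python. It assumes the items have already been extracted into the shape of the microdata JSON serialization (each item a dict with "type" and "properties", each property an array of values, nested items as dicts). The function name and the sample data are made up for illustration, and keying on the bare property name, ignoring the item type, is a simplification.

```python
def multi_valued_properties(items):
    """Return the names of properties that have more than one value
    on at least one item anywhere in the page (nested items included)."""
    multi = set()
    stack = list(items)
    while stack:
        item = stack.pop()
        for name, values in item.get("properties", {}).items():
            if len(values) > 1:
                multi.add(name)
            # Nested items are dicts; plain values are strings.
            stack.extend(v for v in values if isinstance(v, dict))
    return multi


# Hypothetical page with two playlists; the second has only one track.
page = [
    {"type": ["http://schema.org/MusicPlaylist"],
     "properties": {
         "name": ["Road trip"],
         "tracks": [
             {"type": ["http://schema.org/MusicRecording"],
              "properties": {"name": ["One"]}},
             {"type": ["http://schema.org/MusicRecording"],
              "properties": {"name": ["Two"]}},
         ]}},
    {"type": ["http://schema.org/MusicPlaylist"],
     "properties": {
         "name": ["Short"],
         "tracks": [
             {"type": ["http://schema.org/MusicRecording"],
              "properties": {"name": ["Three"]}},
         ]}},
]

multi = multi_valued_properties(page)
print(sorted(multi))
```

Under this rule the second playlist's single track would still go into a collection, because "tracks" has more than one value elsewhere on the same page; "name" stays single-valued throughout.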

> For example (from schema.org):
> 
> 	<div itemscope itemtype="http://schema.org/MusicPlaylist">
> 	  1. <div itemprop="tracks" itemscape itemtype="http://schema.org/MusicRecording">...</div>
> 	  2. <div itemprop="tracks" itemscape itemtype="http://schema.org/MusicRecording">...</div>
> 	</div>

s/itemscape/itemscope/

> There is an assumed ordering of the tracks in a playlist, but no clear processing rules (for RDF anyway) to describe the ordering. With previously existing rules, this would render the following:
> 
> 	@prefix : <http://schema.org/> .
> 	[ a :MusicPlaylist
> 	  :tracks [ a :MusicRecording; ...], [a :MusicRecording; ...]
> 	]

Yes. I think it should be:

	@prefix : <http://schema.org/> .
	[ a :MusicPlaylist ;
	  :tracks (
	    [ a :MusicRecording; ...]
	    [ a :MusicRecording; ...]
	  ) ;
	]
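As a rough illustration of the two rules above (single-valued properties as plain objects, multi-valued ones as an ordered collection), here is a Python sketch that renders an extracted item as a Turtle blank node. Everything in it is an assumption made for the example rather than part of any spec: the item shape follows the microdata JSON serialization, `multi` is the set of property names judged multi-valued for the page, property names are written against a default `:` prefix, and all plain values are emitted as untyped string literals.

```python
import json


def term(value, multi):
    """Render one property value: nested items become blank nodes,
    anything else a plain string literal with no datatype."""
    if isinstance(value, dict):
        return node(value, multi)
    # JSON string escaping is close enough to Turtle's for a sketch.
    return json.dumps(value, ensure_ascii=False)


def node(item, multi):
    """Render an item as a Turtle blank node property list."""
    parts = [f"a <{t}>" for t in item.get("type", [])]
    for name, values in item.get("properties", {}).items():
        objects = [term(v, multi) for v in values]
        if name in multi:
            # Multi-valued: keep document order in an rdf:List.
            parts.append(f":{name} ( {' '.join(objects)} )")
        else:
            # Single-valued: plain object, no collection.
            parts.append(f":{name} {', '.join(objects)}")
    return "[ " + " ; ".join(parts) + " ]"


# Hypothetical playlist with two tracks.
playlist = {
    "type": ["http://schema.org/MusicPlaylist"],
    "properties": {
        "tracks": [
            {"type": ["http://schema.org/MusicRecording"],
             "properties": {"name": ["One"]}},
            {"type": ["http://schema.org/MusicRecording"],
             "properties": {"name": ["Two"]}},
        ]
    },
}

ordered = node(playlist, {"tracks"})
unordered = node(playlist, set())
print("@prefix : <http://schema.org/> .")
print(ordered, ".")
```

With `{"tracks"}` passed as `multi`, the tracks come out inside `( ... )` in document order; with an empty set the same item falls back to the comma-separated, unordered form that the previous rules produced.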

> RDFa recently added the @inlist attribute to indicate that property values should be added to a list, so the same thing could be represented as follows:
> 
> 	<div vocab="http://schema.org/" typeof="MusicPlaylist">
> 	  <div rel="tracks" inlist>
> 	    1. <div typeof="MusicRecording">...</div>
> 	    2. <div typeof="MusicRecording">...</div>
> 	  </div>
> 	</div>

Or:

	<div vocab="http://schema.org/" typeof="MusicPlaylist">
	  1. <div rel="tracks" inlist><div typeof="MusicRecording">...</div></div>
	  2. <div rel="tracks" inlist><div typeof="MusicRecording">...</div></div>
	</div>

where the parallels with the microdata equivalent are a little more obvious.

> which would render the following:
> 
> 	@prefix : <http://schema.org/> .
> 	[ a :MusicPlaylist
> 	  :tracks ([ a :MusicRecording; ...] [a :MusicRecording; ...])
> 	]
> 
> Gregg

Jeni

> [1] http://dev.w3.org/html5/md/#json
> [2] http://www.w3.org/TR/2011/WD-microdata-20110525/#rdf


[3]: https://gist.github.com/1262434
[4]: http://www.rottentomatoes.com/m/matrix/
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Tuesday, 4 October 2011 19:02:06 UTC