Re: Multiple itemtypes in microdata from Ian Hickson on 2011-10-13 (public-html-data-tf@w3.org from October 2011)

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 13 Oct 2011 17:26:04 +0000 (UTC)
To: Jeni Tennison <jeni@jenitennison.com>
cc: public-html-data-tf@w3.org
Message-ID: <Pine.LNX.4.64.1110131707430.27449@ps20323.dreamhostps.com>
On Thu, 13 Oct 2011, Jeni Tennison wrote:
> >
> > I don't understand what you mean. Surely if authors only provide data 
> > in one vocabulary, it's less of a legacy to maintain than if they 
> > provide the data in two vocabularies?
> 
> We're in a situation where not everyone in the world is consuming or 
> publishing data using the same format (syntax or vocabulary).

Sure. Most people aren't consuming data in any vocabulary, and most people 
aren't publishing data in any vocabulary.

If a particular author has a particular need to publish data for a 
particular consumer, e.g. a recipe site author who wants their data to 
trigger context-specific results in a search engine, then they would 
mark up their data using the vocabulary used by the search engine they are 
targeting.

In the unlikely event that they care about two consumers that want the 
same data but are using two different vocabularies, then they would mark 
up the information twice. It's unfortunate, but it's no different than, 
e.g., an author targeting RSS and Atom readers by providing both an RSS 
feed and an Atom feed, or video producers providing both an MPEG stream 
and a WebM stream. It's usually a short-term situation.

(I disagree with Henri that supporting multiple formats, especially 
multiple microdata vocabularies, is an especially great burden. It's a few 
lines of code to map each vocabulary supported to a common data structure. 
It's not anywhere near the complexity of supporting both XML and HTML, 
which itself is really an undue burden primarily due to complexities in 
the rendering and API layer of those two technologies that are the result 
of years of historical accidents, something that really shouldn't apply 
when we're talking about supporting two different microdata vocabularies.)
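
To give a rough idea of scale, here is a minimal Python sketch of such a 
mapping for two made-up cat vocabularies, http://example.org/feline and 
http://example.com/cat. The output keys, the lookup tables, and the 
normalise() helper are all assumptions for illustration; they are not 
taken from any spec or real consumer.

```python
# Hypothetical sketch: normalise items from two made-up cat vocabularies
# into one common record. The property names follow the example
# vocabularies; the lookup tables and output keys are assumptions.

FELINE = "http://example.org/feline"
CAT = "http://example.com/cat"

# Assumed lookup tables; a real consumer would need complete ones.
COLOR_NAMES_TO_HEX = {"white": "#FFFFFF", "black": "#000000"}
BREED_CODES = {"ASH": "American Shorthair"}


def normalise(itemtype, props):
    """Map a flat property dict for a known itemtype to a common record."""
    if itemtype == FELINE:
        return {
            "name": props.get("name"),
            "breed": props.get("species"),
            # This vocabulary gives the colour as a name, so look up its hex.
            "color": COLOR_NAMES_TO_HEX.get(props.get("color", "").lower()),
        }
    if itemtype == CAT:
        return {
            "name": props.get("common-name"),
            # Here "name" carries a breed code rather than the cat's name.
            "breed": BREED_CODES.get(props.get("name")),
            "color": props.get("color", "").upper() or None,
        }
    raise ValueError(f"unsupported itemtype: {itemtype}")
```

Each additional vocabulary would be one more branch (or one more entry in 
a dispatch table) producing the same record shape.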


> We could say that each publisher should use only one format, but (given 
> they're motivated to share their data as widely as possible)

I do not think there is much motivation for authors to publish their data 
in a machine-readable form at all, let alone to do it "widely". Nor do I 
think such motivation would usually be sufficient to convince them to use 
even one vocabulary, let alone two.

I'm happy to see evidence to the contrary though. What data is it that 
people are trying to publish to be used by two consumers that do not 
support a common format?


> >> If people are using multiple vocabularies they will very probably 
> >> want to use types from each of those vocabularies.
> > 
> > I'm not sure what you mean. What's the difference between "type" and 
> > "vocabulary" here?
> 
> A type is a class, such as http://schema.org/Place or 
> http://purl.org/goodrelations/v1#Location. A vocabulary is a set of 
> classes and properties, such as the schema.org vocabulary or the 
> GoodRelations vocabulary.

By those definitions, yes, one would imagine that if there were an author 
using multiple vocabularies, such as schema.org and GoodRelations, that 
each define multiple itemtypes, the author would want to use types from 
each of those vocabularies. That seems obvious. :-)


> > If you've got two different vocabularies, just provide the data twice, e.g.:
> > 
> >   <div itemscope itemtype="http://example.org/feline">
> >    <meta itemprop="name" content="Cat Adorable Pillar">
> >    <meta itemprop="species" content="American Shorthair">
> >    <meta itemprop="color" content="White">
> >   </div>
> >   <div itemscope itemtype="http://example.com/cat">
> >    <meta itemprop="common-name" content="Pillar">
> >    <meta itemprop="name" content="ASH"> <!-- American Shorthair code -->
> >    <meta itemprop="color" content="#FFFFFF">
> >   </div>
> > 
> > There's no sane way to use both vocabularies in parallel, since the 
> > vocabularies will almost certainly have different requirements (e.g. 
> > in this example, one has its colour as a string and the other as a hex 
> > code).
> 
> Sure. That workaround has the disadvantages of creating two items rather 
> than one

Why does that matter?


> and means that at least one of the copies is detached from the content 
> of the page, so can't be drag/dropped etc.

You can work around this pretty easily:

   <div itemscope itemtype="http://example.org/feline" itemref="a">
    <div itemscope itemtype="http://example.com/cat" itemref="b">
     ...whatever it is you want to drag and drop...
    </div>
   </div>
   <div id="a">
    <meta itemprop="name" content="Cat Adorable Pillar">
    <meta itemprop="species" content="American Shorthair">
    <meta itemprop="color" content="White">
   </div>
   <div id="b">
    <meta itemprop="common-name" content="Pillar">
    <meta itemprop="name" content="ASH"> <!-- American Shorthair code -->
    <meta itemprop="color" content="#FFFFFF">
   </div>

However, this seems like a rather hypothetical problem. I'm not aware of 
even a single vocabulary today that anyone is supporting in this way, let 
alone two for the same topic.
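
For what it's worth, a consumer sees the itemref markup above as two 
separate items, each with its own properties. Below is a simplified 
Python sketch of that extraction using only the standard library's 
html.parser. It is not the full microdata algorithm: it assumes 
well-formed markup, only reads values from content="" attributes, and 
does no cycle detection. The names Node, TreeBuilder, item_properties 
and extract_items are made up for this sketch.

```python
from html.parser import HTMLParser

# Elements that never have an end tag (partial list, enough for a sketch).
VOID = {"meta", "link", "br", "img", "input", "hr"}


class Node:
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = dict(attrs)
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a minimal element tree and an id index (assumes well-formed input)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root", [])
        self.stack = [self.root]
        self.by_id = {}

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if "id" in node.attrs:
            self.by_id[node.attrs["id"]] = node
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()


def item_properties(item, by_id):
    """Collect properties from the item's children plus any itemref targets."""
    pending = list(item.children)
    for ref in (item.attrs.get("itemref") or "").split():
        if ref in by_id:
            pending.append(by_id[ref])
    props = {}
    while pending:
        node = pending.pop(0)
        # Do not descend into nested items; they own their properties.
        if "itemscope" not in node.attrs:
            pending.extend(node.children)
        for name in (node.attrs.get("itemprop") or "").split():
            props.setdefault(name, []).append(node.attrs.get("content"))
    return props


def extract_items(html):
    """Return (itemtype, properties) for each itemscope element lacking itemprop."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    items = []

    def walk(node):
        if "itemscope" in node.attrs and "itemprop" not in node.attrs:
            items.append((node.attrs.get("itemtype"),
                          item_properties(node, builder.by_id)))
        for child in node.children:
            walk(child)

    walk(builder.root)
    return items
```

Fed the markup above, extract_items() should yield one item typed 
http://example.org/feline with the properties from id="a", and one typed 
http://example.com/cat with the properties from id="b", even though both 
itemscope elements wrap the same visible content.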

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 13 October 2011 17:29:51 UTC