Re: Multiple itemtypes in microdata from Jeni Tennison on 2011-10-13 (public-html-data-tf@w3.org from October 2011)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 13 Oct 2011 16:48:12 +0100
To: Ian Hickson <ian@hixie.ch>
Cc: public-html-data-tf@w3.org, Henri Sivonen <hsivonen@iki.fi>
Message-Id: <CFD63044-BEEA-4888-AE82-D6583B6A5CD0@jenitennison.com>

On 12 Oct 2011, at 23:30, Ian Hickson wrote:
> On Wed, 12 Oct 2011, Jeni Tennison wrote:
>> One of the assumptions we're making within the HTML Data TF is that 
>> publishers will need to publish in multiple formats (rather than 
>> consumers understanding multiple formats)
> 
> That sounds like a horrible authoring experience. :-)

Yes, it sucks.

>> so that when/if there is eventual convergence on a single format, 
>> consumers aren't stuck having to be able to maintain a massive legacy. 
> 
> I don't understand what you mean. Surely if authors only provide data in 
> one vocabulary, it's less of a legacy to maintain than if they provide the 
> data in two vocabularies?

We're in a situation where not everyone in the world is consuming or publishing data using the same format (syntax or vocabulary).

We could say that each publisher should use only one format but, given that they're motivated to share their data as widely as possible, publishers can only do that if consumers understand multiple formats. I might be misrepresenting him, but I think Henri's argument is that consuming multiple formats places an unacceptable implementation burden on consumers, and one that causes long-term problems because it's hard for consumers to remove support for a particular format once they have it.

So the alternative is for consumers to each recognise a restricted range of formats, and for publishers to publish in whatever formats the consumers they care about understand, presumably dropping support for particular formats as they revise their websites and the consumers they care about change, fade and die.

The current draft guidance about this choice is at

  http://www.w3.org/wiki/Choosing_an_HTML_Data_Format

Comments and suggestions welcome.

>> If people are using multiple vocabularies they will very probably want 
>> to use types from each of those vocabularies.
> 
> I'm not sure what you mean. What's the difference between "type" and 
> "vocabulary" here?

A type is a class, such as http://schema.org/Place or http://purl.org/goodrelations/v1#Location. A vocabulary is a set of classes and properties, such as the schema.org vocabulary or the GoodRelations vocabulary.

By the way, it would be great to have an example in the spec of multiple itemtypes from the same vocabulary being used together, particularly to make it clearer what the spec means by "same vocabulary".

>> An example of the kind of workaround that's currently being recommended 
>> is shown with the use of GoodRelations with schema.org where the ugly 
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type property is used to 
>> provide the GoodRelations type for the item while the itemtype holds the 
>> schema.org type. As you know, I've also blogged about this from the 
>> perspective of a publisher targeting both browsers and search engines.
> 
> If you've got two different vocabularies, just provide the data twice, e.g.:
> 
>   <div itemscope itemtype="http://example.org/feline">
>    <meta itemprop="name" content="Cat Adorable Pillar">
>    <meta itemprop="species" content="American Shorthair">
>    <meta itemprop="color" content="White">
>   </div>
>   <div itemscope itemtype="http://example.com/cat">
>    <meta itemprop="common-name" content="Pillar">
>    <meta itemprop="name" content="ASH"> <!-- American Shorthair code -->
>    <meta itemprop="color" content="#FFFFFF">
>   </div>
> 
> There's no sane way to use both vocabularies in parallel, since the 
> vocabularies will almost certainly have different requirements (e.g. in 
> this example, one has its colour as a string and the other as a hex code).

Sure. That workaround has two disadvantages: it creates two items rather than one, and at least one of the copies is detached from the content of the page, so it can't be dragged and dropped, etc.
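
By contrast, the rdf:type workaround mentioned above keeps a single item. As a rough illustration only, here is a toy Python model of such an item. The dictionary layout, the all_types helper and the "Example Shop" value are hypothetical, and treating the rdf:type property as an extra type is a choice a consumer would have to make for itself, not something microdata defines.

```python
# Toy model of ONE microdata item that carries a schema.org itemtype plus a
# GoodRelations type supplied through the rdf:type property.
# The data layout and helper below are illustrative assumptions.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

item = {
    "type": "http://schema.org/Place",
    "properties": {
        RDF_TYPE: ["http://purl.org/goodrelations/v1#Location"],
        "name": ["Example Shop"],  # hypothetical value
    },
}


def all_types(item):
    """Return the itemtype followed by any types given via rdf:type."""
    types = [item["type"]]
    types.extend(item["properties"].get(RDF_TYPE, []))
    return types


for type_url in all_types(item):
    print(type_url)
```

A consumer that doesn't know about this convention just sees an ordinary property with a long name, which is part of why the workaround feels ugly.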

> Note that the property "name" in the vocabulary "http://example.org/feline"
> and the property "http://example.org/feline#name" have absolutely no 
> relationship in microdata. They are different properties and cannot be 
> mechanically considered to be equivalent in any way. Any use of microdata 
> that claims that a full URL property name is the same property as a short 
> name in a specific vocabulary is wrong. It's two properties. They might 
> have the same semantics and can be used as equivalent, but they are 
> different properties and any specification that defines or uses both would 
> need to define how to handle clashes.

Understood. I'll make sure that we take that into account in the work on mapping microdata to RDF.
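
To make that concrete, a minimal Python sketch of keeping the two kinds of property name apart might look like the following. It is an illustration under stated assumptions, not an algorithm from the microdata spec or from the TF's mapping work: property_key and is_absolute_url are hypothetical helpers, and the URL check is deliberately rough.

```python
# Illustrative only: give each property an identity so that a short name and
# a full-URL name are never merged mechanically.
from urllib.parse import urlsplit


def is_absolute_url(name):
    """Rough check (assumption): a name with a scheme and a host is a URL."""
    parts = urlsplit(name)
    return bool(parts.scheme and parts.netloc)


def property_key(item_type, name):
    """Return a hashable identity for a property of an item.

    A full-URL name identifies itself. A short name is only meaningful
    relative to the item's type, so the type stays part of the key.
    """
    if is_absolute_url(name):
        return ("url", name)
    return ("short", item_type, name)


short = property_key("http://example.org/feline", "name")
full = property_key("http://example.org/feline", "http://example.org/feline#name")
other = property_key("http://example.com/cat", "name")

# All three keys differ, so none of these properties is treated as the same.
print(short != full, short != other, full != other)
```

Any equivalence between such properties would then have to be stated explicitly by whoever defines the vocabularies, along with a rule for handling clashes.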

>> If support for multiple types from different vocabularies is definitely 
>> out of scope for microdata, it would be really helpful to understand the 
>> rationale so that we can document it for users.
> 
> Nobody has asked for it in an actionable manner (giving concrete use cases 
> that demonstrate real need, e.g. showing real Web pages where there really 
> are two incompatible vocabularies that are nonetheless compatible enough 
> that it actually makes sense to have some sort of special syntax for 
> mixing the vocabularies).

OK, that's helpful. I'll see if the members of the TF know of any such pages that we can point you to.

Thanks,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Thursday, 13 October 2011 15:48:40 UTC