
Re: Multiple types from different vocabularies (ACTION-7)

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Tue, 18 Oct 2011 21:47:17 +0200
Cc: Jeni Tennison <jeni@jenitennison.com>, public-html-data-tf@w3.org
Message-Id: <3883D11D-B604-427A-8C7D-3265988FA530@ebusiness-unibw.org>
To: Ivan Herman <ivan@w3.org>
Hi Ivan:

I still think that defining a reserved keyword for itemprop is simpler than 
a) defining a new attribute or
b) allowing multiple types in the itemtype attribute.

a) adds, IMO, more to the core Microdata constructs that a Webmaster must learn than a reserved property name does, and
b) is terrible, IMO, because it leaves implicit which value sets the itemtype and thus the scope for the properties. People will almost certainly get this wrong. Not us, but typical developers. It is really awkward to adapt the RDFa-style "multiple types" syntax, which comes from RDF with its global property IDs, to a syntax that follows a frame-based approach with local property names inside the context of a single itemtype.
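
To make the scoping concern in b) concrete, here is a minimal Python sketch. It assumes a naive expansion rule that is not taken from the Microdata spec: a short property name is resolved against the vocabulary of an item type, and the vocabulary is taken to be the type URI up to its last '/' or '#'. The function names are hypothetical.

```python
from typing import List


def vocabulary_of(item_type: str) -> str:
    """Assumed rule: the vocabulary is the type URI up to its last '#' or '/'."""
    cut = max(item_type.rfind("#"), item_type.rfind("/"))
    return item_type[: cut + 1]


def candidate_property_uris(itemtype_attr: str, prop: str) -> List[str]:
    """Return one candidate full URI per type listed in the itemtype value."""
    return [vocabulary_of(t) + prop for t in itemtype_attr.split()]


# Two types in one itemtype value, as in option b).
multi = "http://schema.org/Product http://www.productontology.org/id/Hammer"
for uri in candidate_property_uris(multi, "name"):
    print(uri)
```

With two types listed, the short name "name" has two candidate expansions under this rule, and nothing in the markup itself says which one the publisher meant. A real processor would need an extra convention (for example "the first type wins") to decide.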

PS: I do not think that the choice of this mechanism has to do with whether schema.org will be the only Microdata vocabulary or not. I personally think that most of the mature and broadly used vocabularies will evolve into RDF+Microdata vocabularies, usable in both syntaxes.

Martin


On Oct 16, 2011, at 11:25 AM, Ivan Herman wrote:

> (Martin, this is a joint response on different mails)
> 
> I respectfully disagree (with your first sentence, not with the use case:-)...
> 
> The somewhat higher level question is where we imagine microdata to go in future. *If* we envisage that microdata (to push things a bit to the extreme for the purpose of this discussion) is merely a syntax to express schema.org vocabularies, with no other goals, then, of course, using a schema.org type is fine. After all, in this scenario, microdata and schema.org are so tightly coupled that we might as well make use of this. 
> 
> However, *if* we consider microdata as a simple syntax to add structured data to HTML which happens to be used by schema.org as well (even if schema.org is its biggest 'customer'), but can also be used, eg, to encode microformat vocabularies, then using a schema.org/type is not really a good solution. Indeed, I do not see any major difference between using schema.org/type or www.w3.org/ns/type or, for that matter, the current rdf:type: in my view *all* these options are equally bad insofar as they bind microdata to a particular vocabulary which, as far as I can understand it, is not the design of microdata. (Let us forget about the microdata->RDF mapping which is a different matter.)
> 
> Personally, I happen to think about microdata in terms of the second option. If so, I see two clean solutions:
> 
> 1. keeping the current microdata syntax with the absolutely tiny change that allows several types listed in @itemtype (that is the option currently discussed with the HTML group)
> 2. adding a @secondaryType to the microdata syntax
> 
> both of these are vocabulary neutral and keep the notions strictly within the range of the microdata spec.
> 
> Though I happen to believe that approach #2 is better, I can see the major advantage of #1 insofar as it requires only very few changes to the current tooling of microdata and it would cover all the use cases that we know of.
> 
> (By the way: I do not see these two options as particularly intrusive. Actually, even less than using a special attribute value: the latter indeed requires generic microdata processors to interpret a particular attribute value for a specific purpose, which is not really the case now.)
> 
> Cheers
> 
> Ivan
> 
> On Oct 15, 2011, at 21:22 , Martin Hepp wrote:
> 
>> Hi Ivan:
>> I think that a reserved or recommended itemprop name is much less intrusive to the spec and tooling than a new Microdata keyword.
>> 
>> By the way, if you want use cases for when multiple types are important, you should look at all the parts of the schema.org spec that try to capture taxonomic knowledge, e.g. the subtypes of
>> 
>>  http://schema.org/LocalBusiness
>> 
>> It is a sheer impossibility to find one globally valid partitioning of the world at this level of granularity, so a typical setting would be
>> 
>> - you want to please a search engine and other clients that just support the basic type
>> - but you also want to preserve more fine-grained type information from another standard.
>> 
>> Having multiple types in the markup, i.e. at the data level, does not require any reasoning (e.g. the OWL/RDFS subclassOf approach) or other sophisticated technology while still allowing basic and sophisticated types to evolve in parallel.
>> 
>> Of course, you could also use the schema.org extension mechanism, like
>> 
>>  http://schema.org/LocalBusiness/SomeWeirdType
>> 
>> but this
>> 
>> 1. ties in a lot worse with RDF and other vocabulary technology,
>> 2. does not give you a way of defining the type, at least textually,
>> 3. implies the risk of unintended collisions, because independently invented type extensions can use the same identifier and there is no way to spot this easily (other than from the data, based on some heuristics),
>> 4. does not work with languages other than English. Imagine if Eskimos needed to distinguish 100 types of snow, but there were no English words for those; the current schema.org extension mechanism supports this only in very cumbersome ways.
>> 
>> Best
>> 
>> Martin 
>> 
>> On Oct 15, 2011, at 10:54 AM, Ivan Herman wrote:
>> 
>>> Ah! I see now that I misunderstood Martin's proposal (again... just as I misunderstood it the first time he raised that at the schema.org workshop:-(. My thought was to introduce a new 'secondaryType' (or whatever) microdata _attribute_ for, well, secondary types. Ie, Martin's examples would become:
>>> 
>>> <div itemscope itemtype="http://schema.org/Product" 
>>>   secondaryType="http://www.productontology.org/id/Hammer http://example.org/my_ontology.owl#Tool">
>>> <!-- other schema.org properties go in here -->
>>> </div>
>>> 
>>> It'd require an add-on to the microdata spec but, hey, that is what we are talking about here.
>>> 
>>> The only difference between this and the multiple type feature that is discussed elsewhere is that it would remove the possible ambiguity about which URI in an itemtype also governs the mapping of terms that are not full URIs. This would make things clear for all parties involved...
>>> 
>>> Ivan
>>> 
>>> 
>>> 
>>> On Oct 15, 2011, at 06:39 , Jeni Tennison wrote:
>>> 
>>>> Martin,
>>>> 
>>>> On 14 Oct 2011, at 09:32, Martin Hepp wrote:
>>>>> My simple take is to define a reserved keyword "secondaryType" for itemprop that is used for attaching additional types without creating ambiguity about the scope of the properties. Semantically, this would be equivalent to rdf:type, and you will want to allow multiple of those.
>>>>> 
>>>>> The nice thing of this approach is:
>>>>> 
>>>>> 1. You could start by simply adding this to http://schema.org/Thing without a need to update the Microdata spec first. So you could get this done within a day. 
>>>>> 2. You could later add this to the spec.
>>>>> 3. There is no confusion about the scope of local properties.
>>>>> 4. The mapping to RDF is trivial (simply use the full URI of rdf:type).
>>>>> 5. You do not introduce a new Microdata core keyword, just a predefined property.
>>>>> 
>>>>> Example
>>>>> 
>>>>> <div itemscope itemtype="http://schema.org/Product">
>>>>> <link itemprop="secondaryType" href="http://www.productontology.org/id/Hammer" />
>>>>> <link itemprop="secondaryType" href="http://example.org/my_ontology.owl#Tool" />
>>>>> <!-- other schema.org properties go in here -->
>>>>> </div>	
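
As a rough illustration of point 4 (the mapping to RDF), the following Python sketch collects the primary itemtype and every "secondaryType" link from markup like the example above and emits one rdf:type triple per type. It is a sketch under stated assumptions, not a microdata processor: "secondaryType" is only the name proposed in this thread, the class and function names are hypothetical, the subject "_:item" is an assumed blank-node label, and only a single top-level item is handled.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class TypeCollector(HTMLParser):
    """Collects the first itemtype and all secondaryType links (assumed name)."""

    def __init__(self):
        super().__init__()
        self.primary = None
        self.secondary = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a and "itemtype" in a and self.primary is None:
            self.primary = a["itemtype"]
        elif tag == "link" and a.get("itemprop") == "secondaryType" and "href" in a:
            self.secondary.append(a["href"])


def type_triples(markup, subject="_:item"):
    """Return (subject, rdf:type, type URI) for the primary and secondary types."""
    collector = TypeCollector()
    collector.feed(markup)
    types = ([collector.primary] if collector.primary else []) + collector.secondary
    return [(subject, RDF_TYPE, t) for t in types]


example = """
<div itemscope itemtype="http://schema.org/Product">
<link itemprop="secondaryType" href="http://www.productontology.org/id/Hammer" />
<link itemprop="secondaryType" href="http://example.org/my_ontology.owl#Tool" />
</div>
"""
for triple in type_triples(example):
    print(triple)
```

The primary type and the secondary types end up as the same kind of statement in RDF, while in the markup only the itemtype value sets the scope for short property names.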
>>>>> 
>>>>> One prop, and you will be all set, and the SW community and the search engines can live and prosper in love, peace and harmony ;-)
>>>> 
>>>> 
>>>> You have a dream, huh? ;)
>>>> 
>>>> I think this is a promising direction, but there are several variants to consider.
>>>> 
>>>> 1. A reserved short property name. I don't think this is effective unless it's in the microdata spec and I think that the only chance of that is if Hixie were convinced of the requirement to support multiple types from different vocabularies, in which case he could well decide to support that requirement in some other way.
>>>> 
>>>> 2. A recommended short property name. We could have a guideline for vocabulary authors that said "always define a property 'type' (or 'secondaryType' or whatever) so publishers can associate types from other vocabularies to items when they use your vocabulary". The problem with that is that not everyone will do it (eg you're not going to find Hixie adding a 'type' property to the vCard or iCalendar vocabularies unless you convince him of the requirement etc etc) and that you couldn't be certain that every vocabulary was using 'type' with those semantics (this becomes less of an issue the more obscure you make the name).
>>>> 
>>>> 3. Vocabulary-specific properties. We could have a guideline for vocabulary authors that said "to enable your types to be used where publishers are using a different primary type, define a property that can be used for attaching your types to items; for consistency across vocabularies, we recommend this having a local name of 'type' and taking values that are the local names of types in your vocabulary". In your example above this would mean:
>>>> 
>>>> <div itemscope itemtype="http://schema.org/Product">
>>>> <meta itemprop="http://www.productontology.org/id/type" content="Hammer" />
>>>> <meta itemprop="http://example.org/my_ontology.owl#type" content="Tool" />
>>>> <!-- other schema.org properties go in here -->
>>>> </div>	
>>>> 
>>>> which isn't bad.
>>>> 
>>>> This puts the onus on vocabulary authors to decide whether they want their vocabulary to be usable when a publisher is primarily using a different type. Vocabulary authors already have to make that decision: if they don't specify URI equivalents for their properties then publishers can't use those properties on items which have types outside their vocabulary. So while this can mean an inconsistent picture for publishers, it's no more inconsistent than it currently is.
>>>> 
>>>> There are two disadvantages from an RDF perspective.
>>>> 
>>>> The first is that it would mean adding this property to existing vocabularies to make them microdata ready. You could argue that to use existing RDF vocabularies in microdata you have to do some extra work anyway (as Hixie has pointed out, you can't just port them because there are extra semantics you have to define for microdata use), so adding another property isn't a big deal, but some vocabularies may be very hard to change.
>>>> 
>>>> The second is that it wouldn't be possible for a generic microdata-to-RDF mapping to map these properties into an rdf:type relationship, so data that used this pattern would always have to go through a vocabulary-specific conversion to become usable RDF.
>>>> 
>>>> 4. A global property. This could be rdf:type or we could recommend that the W3C define an equivalent property but with a more approachable URI, such as 'http://w3.org/ns/global/type'. In your example, that would mean:
>>>> 
>>>> <div itemscope itemtype="http://schema.org/Product">
>>>> <link itemprop="http://w3.org/ns/global/type" 
>>>>    href="http://www.productontology.org/id/Hammer" />
>>>> <link itemprop="http://w3.org/ns/global/type" 
>>>>    href="http://example.org/my_ontology.owl#Tool" />
>>>> <!-- other schema.org properties go in here -->
>>>> </div>	
>>>> 
>>>> This has the advantage of having a consistent way of adding types, but makes the markup more cluttered than the previous solutions. However easy you make the URL for the type, it's always going to be something that people have to work to remember; given it'll be cut-and-pasted anyway, you might as well use the existing rdf:type rather than inventing something with equivalent semantics.
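
The difference between variants 3 and 4 for a generic converter can be sketched in Python as follows. This is an illustration under assumptions, not an existing mapping: the converter is handed already-extracted (property URI, value) pairs, "http://w3.org/ns/global/type" is only the hypothetical URI floated above, and the function name is made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Properties a generic converter could know in advance (variant 4).
# The second URI is the hypothetical one from the discussion, not a real W3C term.
GLOBAL_TYPE_PROPERTIES = {RDF_TYPE, "http://w3.org/ns/global/type"}


def generic_map(subject, pairs):
    """Map (property, value) pairs to triples with no vocabulary-specific knowledge."""
    triples = []
    for prop, value in pairs:
        predicate = RDF_TYPE if prop in GLOBAL_TYPE_PROPERTIES else prop
        triples.append((subject, predicate, value))
    return triples


# Variant 4: a global property is recognised and becomes rdf:type.
print(generic_map("_:item", [
    ("http://w3.org/ns/global/type", "http://www.productontology.org/id/Hammer"),
]))

# Variant 3: a vocabulary-specific property passes through unchanged,
# with the local name "Hammer" as a plain value.
print(generic_map("_:item", [
    ("http://www.productontology.org/id/type", "Hammer"),
]))
```

In the second call the converter has no way to know that the property carries type semantics or how to turn "Hammer" into a type URI, which is why that pattern would need a vocabulary-specific conversion step.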
>>>> 
>>>> My summary is that I don't think that a reserved or recommended short property name will work, but either vocabulary-specific properties or a global property (or a combination of both) might.
>>>> 
>>>> Thoughts?
>>>> 
>>>> Jeni
>>>> -- 
>>>> Jeni Tennison
>>>> http://www.jenitennison.com
>>>> 
>>>> 
>>> 
>>> 
>>> ----
>>> Ivan Herman, W3C Semantic Web Activity Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
Received on Tuesday, 18 October 2011 19:47:48 GMT
