Re: additionalType property, vs extending Microdata syntax for multiple types

On Jun 16, 2012, at 8:59 AM, Ivan Herman wrote:

> Extending the microdata->RDF algorithm through vocabulary expansion: it does not sound like such a crazy idea, does it? ;-)
> 
> It would help in overcoming the problems around additionalType that Péter just emphasized in another mail...

In RDFa, vocabulary expansion is optional; in fact, there are a number of RDFa processors which are completely conformant, but do not implement this option.

For vocabulary expansion to be useful in microdata, _all_ microdata-to-RDF processors would have to include this feature. Given that microdata is not intended for general RDF usage, I wonder how likely we would be to get reasonable conformance to a vocabulary expansion requirement?

The way this would probably work is that the @itemtype values in the document would be used to identify one or more vocabularies (in the RDFa sense). For each vocabulary, the processor would create a triple similar to the one generated in Step 2 of the RDFa processing rules [2]:

<> rdfa:usesVocabulary _vocab_ .

We can then just reference the RDFa Vocabulary Expansion [3] as a normative requirement, or duplicate that section in the Microdata to RDF spec.
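
To make the idea concrete, here is a rough Python sketch of how a processor might derive those triples. It is only an illustration under stated assumptions: the function names are made up, and the vocabulary heuristic (cut the type URI after "#" or after the last "/") is a simplification of my own, not the registry-based lookup a real microdata-to-RDF processor would use.

```python
RDFA_USES_VOCABULARY = "http://www.w3.org/ns/rdfa#usesVocabulary"


def vocabulary_of(item_type):
    """Guess the vocabulary URI for one @itemtype value.

    Assumption: the vocabulary is everything up to and including the
    fragment separator, or else up to and including the last '/'.
    """
    if "#" in item_type:
        return item_type.split("#", 1)[0] + "#"
    return item_type.rsplit("/", 1)[0] + "/"


def uses_vocabulary_triples(document_uri, item_types):
    """Return one (subject, predicate, object) triple per distinct vocabulary."""
    vocabularies = []
    for item_type in item_types:
        vocabulary = vocabulary_of(item_type)
        if vocabulary not in vocabularies:  # keep first-seen order, no duplicates
            vocabularies.append(vocabulary)
    return [(document_uri, RDFA_USES_VOCABULARY, v) for v in vocabularies]


if __name__ == "__main__":
    types = [
        "http://schema.org/Product",
        "http://schema.org/Offer",
        "http://purl.org/goodrelations/v1#Offering",
    ]
    for triple in uses_vocabulary_triples("http://example.org/doc", types):
        print(triple)
```

The document URI and the type list in the usage part are placeholders; in a real processor they would come from the parsed document.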

For those of us who tend to build polyglot processors (RDFa, RDF/XML in HTML, Turtle in HTML, microdata), it would be simplest if the microdata processor simply generated the same triples as RDFa (even though some might not like the idea of a microdata processor outputting properties in the rdfa namespace). Then vocabulary expansion can work in parallel for both RDFa and microdata extracted from the document.
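
As a sketch of why a shared expansion step is attractive: once the triples are in a common form, the expansion code does not need to know which syntax they came from. The Python below handles only owl:equivalentProperty, in a single pass; RDFa vocabulary expansion covers more rules than this. The axiom equating schema:additionalType with rdf:type is the one floated earlier in this thread and is assumed here, not something a vocabulary currently publishes, and the function name is hypothetical.

```python
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA_ADDITIONAL_TYPE = "http://schema.org/additionalType"  # proposed property


def expand_equivalent_properties(data, vocabulary):
    """Add a triple for every property declared equivalent to one in use.

    `data` and `vocabulary` are iterables of (subject, predicate, object)
    tuples. Equivalence is treated as symmetric, but this is a single
    pass: chains of equivalences are not followed.
    """
    equivalents = {}
    for s, p, o in vocabulary:
        if p == OWL_EQUIVALENT_PROPERTY:
            equivalents.setdefault(s, set()).add(o)
            equivalents.setdefault(o, set()).add(s)

    expanded = set(data)
    for s, p, o in data:
        for other in equivalents.get(p, ()):
            expanded.add((s, other, o))
    return expanded


if __name__ == "__main__":
    # Assumed axiom, as suggested in this thread.
    vocabulary = [(SCHEMA_ADDITIONAL_TYPE, OWL_EQUIVALENT_PROPERTY, RDF_TYPE)]
    # Triples as they might come out of either a microdata or an RDFa processor.
    data = [("_:item", SCHEMA_ADDITIONAL_TYPE,
             "http://purl.org/goodrelations/v1#Offering")]
    for triple in sorted(expand_equivalent_properties(data, vocabulary)):
        print(triple)
```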

Gregg

[2] http://www.w3.org/TR/rdfa-core/#PS-default-vocabulary
[3] http://www.w3.org/TR/rdfa-core/#s_vocab_expansion

> Ivan
> 
> On Jun 16, 2012, at 16:36 , Gregg Kellogg wrote:
> 
>> On Jun 15, 2012, at 11:46 PM, gregg@kellogg-assoc.com wrote:
>> 
>>> On Jun 15, 2012, at 10:49 PM, "Dan Brickley" <danbri@danbri.org> wrote:
>>> 
>>>> On 16 June 2012 06:55, Ivan Herman <ivan@w3.org> wrote:
>>>>> On Jun 15, 2012, at 20:26 , Dan Brickley wrote:
>>>>>> HTML5 Microdata, as defined in
>>>>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#encoding-microdata
>>>>>> 
>>>>>> ... has only limited support for describing multiple types that
>>>>>> something belongs to. In particular it requires they are described
>>>>>> using a single schema.
>>>>>> 
>>>>>> 
>>>>>> For Good Relations integration (and other scenarios) people have asked
>>>>>> for a way of listing more types within schema.org markup.
>>>>>> 
>>>>>> * One model is to use RDFa 1.1 (Lite), where this is quite natural.
>>>>>> * Another is to add (as a workaround) a new property, e.g. called
>>>>>> 'type' or 'additionalType', to schema.org's vocab (Martin requests
>>>>>> this in http://www.w3.org/wiki/WebSchemas/GoodRelations )
>>>>>> * A 3rd is to stretch
>>>>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#items
>>>>>> to allow different namespaces to be used.
>>>>> 
>>>>> I think that the third option is not obvious. The issue is that, in microdata, two notions, namely the typing of an item and the vocabulary used in that item, are conflated. The vocabulary information (hence the interpretation of the terms in the item) is deduced from the itemtype value(s). Hence the requirement in the current spec that multiple types be in the same vocabulary. Technically, it would be possible to declare that the order in the attribute's value counts, i.e., the first type value determines the vocabulary, but that would be terribly error-prone and I hence do not believe it would be a good idea.
>>>>> 
>>>>> It would require a more substantial change in the microdata spec (essentially introducing the equivalent of RDFa's @vocab) to solve that properly. I do not see that happening.
>>>> 
>>>> Yes. I think it is well appreciated that RDFa 1.1 handles this better.
>>>> I don't think anyone expects Microdata to change a lot more, and I
>>>> don't see anyone with the energy/interest to keep pushing for these
>>>> kinds of changes/improvements in Microdata. As you say, it wouldn't be
>>>> easy. The "main type comes first" design was all I could think of,
>>>> too.
>>>> 
>>>>> Which leaves us with the first or the second option. I am obviously biased here, but I think the first option is clearly better in this respect...
>>>> 
>>>> The question is then, what do we say for all those publishers who are
>>>> on board the Microdata train? A lot of people have worked hard to get
>>>> colleagues/stakeholders (and tools!) adopting Microdata, over the last
>>>> year. While RDFa is a good thing, we want to be careful to support
>>>> early adopters too, and have sensible advice for them (other than
>>>> 're-do everything with a new syntax'). Which is what makes
>>>> 'additionalType' attractive. But the concern there is ... if we go
>>>> that route, validators/checkers will need to understand the attribute
>>>> since it is almost an extension of the underlying syntax...
>>> 
>>> We've built in some processing rules for the Microdata to RDF note that are there specifically for schema.org, but thus far, it's been done in a way that is still fairly generic. In that light, we could potentially add some vocabulary expansion rules, similar to RDFa's vocabulary expansion, except here it would be done for itemtype, using rules similar to those used to find the vocabulary for constructing property URIs. You could then add an owl:equivalentProperty definition equating schema:additionalType with rdf:type.
>>> 
>>> Ivan may recall, I think we discussed this a while ago, and decided not to go there at the time.
>>> 
>>> That said, in spite of the Microdata restriction against using types from different vocabularies, the way the algorithm is written, only the first type is used for constructing property URIs. Not that I would recommend that anyone do this, since it goes against the requirements of Microdata.
>> 
>> Actually, that wasn't it. It was a different conversation [1], from when we were wrapping up the Microdata to RDF spec. It seems that we left this open as an idea that wasn't explored.
>> 
>> Here's a snippet from the conversation [1].
>> 
>> On Nov 22, 2011, at 12:22 AM, Ivan Herman wrote:
>> 
>>> On Nov 22, 2011, at 02:15 , Gregg Kellogg wrote:
>>> [snip]
>>> Another thought: we may think about folding into the md->RDF conversion the @vocab expansion mechanism of RDFa (maybe needless to say, but as an optional mechanism!). Some vocabularies, eg, schema.org, may set up such @vocab files anyway (we are already in discussion with DanBri on that), why not make use of those for this conversion, too?
>>>> 
>>>> If we can "extend" Microdata in this way, I think @vocab would be a fine solution. Of course at that point, the only real difference between Microdata and RDFa 1.1 Lite is the name of the attributes!
>>> 
>>> You misunderstood on one point. I do not mean to extend microdata. What I am saying is: in the 'vocabulary' scheme at least, what happens, from the processor's point of view, is that a vocabulary URI is extracted from the typing. I.e., while RDFa makes the vocabulary explicit in its own syntax, @itemtype conflates two different roles. But once the vocabulary URI is extracted from microdata, it could be treated similarly, i.e., have that optional additional mechanism on it that RDFa has already defined.
>>> 
>>> Ivan
>> 
>> Gregg
>> 
>> [1] http://lists.w3.org/Archives/Public/public-html-data-tf/2011Nov/0155.html
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Saturday, 16 June 2012 17:42:15 UTC