- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 19 Oct 2011 00:00:40 +0000 (UTC)
- To: Gregg Kellogg <gregg@kellogg-assoc.com>
- cc: Bradley Allen <bradley.p.allen@gmail.com>, Stéphane Corlosquet <scorlosquet@gmail.com>, "public-html-data-tf@w3.org" <public-html-data-tf@w3.org>
On Tue, 18 Oct 2011, Gregg Kellogg wrote:
>
> Hixie, note that I raised property URI generation as ISSUE-1 [1] (along
> with other transformation issues). From reading the HTML/Microdata spec,
> it would seem that processors really need to have vocabulary-specific
> rules for interpreting these rules. This is important for property URI
> generation, but also for maintaining value order and specifying
> per-property literal datatypes.

Yes, all the use cases for microdata were things where it made no sense
for software to do anything with the data unless it knew what the data
meant, so the assumption is that the microdata processing software knows
the vocabulary.

This is similar to how XML processors are expected to have
namespace-specific knowledge to be useful. Sure, you can have generic XML
or microdata (or JSON or...) parsers, but to do anything useful with the
data, you have to stick those parsers onto a frontend that knows about the
data itself.

> The alternatives are:
>
> 1) bake in support for each vocabulary into a conformant processor

This is the assumption that microdata is built around.

> 2) read a vocabulary document (i.e., RDFS or OWL) and determine
> processing rules from rdfs:range/rdfs:domain specifications

Generally speaking, no language exists that is expressive enough to
actually describe vocabularies in sufficient detail to make this practical
for the kinds of vocabularies that microdata's use cases involve.

> 3) do nothing, use a single processing algorithm that is generic across
> all vocabularies and leave it to post-processing to perform
> vocabulary-specific modifications. (Although this does not really
> address property URI generation variation between vocabularies defined
> in HTML and other RDF vocabularies).

I don't really understand what this means. What does RDF have to do with
microdata in this context?
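
The division of labour described earlier in this reply, a generic parser feeding a frontend that knows the vocabulary, can be sketched roughly as follows. This is only an illustration: the item shape (a dict with "type" and "properties"), the `FRONTENDS` registry, and the function names are assumptions made for the sketch and do not come from the HTML specification or from any particular microdata library.

```python
from typing import Any, Callable, Dict

# Hypothetical output shape of a generic microdata parser: each item is a
# dict with an optional "type" and a "properties" mapping from property
# name to a list of values.
Item = Dict[str, Any]

HCARD = "http://microformats.org/profile/hcard"


def describe_hcard(item: Item) -> str:
    """Vocabulary-specific frontend: it knows that 'fn' is the formatted name."""
    names = item["properties"].get("fn", [])
    return f"vCard for {names[0]}" if names else "vCard with no name"


# Registry of frontends keyed by item type (an assumption of this sketch).
FRONTENDS: Dict[str, Callable[[Item], str]] = {HCARD: describe_hcard}


def process(item: Item) -> str:
    """Hand the parsed item to a frontend that knows its vocabulary, if any."""
    handler = FRONTENDS.get(item.get("type"))
    if handler is None:
        # The generic parser did its job, but nothing here knows what the
        # data means, so there is nothing useful to do with it.
        return "unknown vocabulary: nothing useful to do"
    return handler(item)
```

The parser side stays generic; all knowledge about what a property means lives in the per-vocabulary handler.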
> Note, that if the HTML spec specified
> http://microformats.org/profile/hcard# as the vCard type, instead of
> just http://microformats.org/profile/hcard, properties would be
> generated relative to the type using processing rules currently
> described in [2], which is intended to be compatible with
> schema.org<http://schema.org> and other RDF vocabularies.

The properties in the microdata vCard vocabulary aren't URLs, and it would
be incorrect to treat them as URLs. They are "defined property names" in
the sense defined in the HTML specification.

This has implications. For example, it would be invalid to treat these two
microdata fragments as equivalent in any way:

   <address itemscope itemtype="http://microformats.org/profile/hcard">
    Written by
    <span itemprop="fn">
     <span itemprop="n" itemscope>
      <span itemprop="given-name">Jill</span>
      <span itemprop="family-name">Darpa</span>
     </span>
    </span>
   </address>

   <address itemscope itemtype="http://microformats.org/profile/hcard">
    Written by
    <span itemprop="http://microformats.org/profile/hcard#fn">
     <span itemprop="http://microformats.org/profile/hcard#n" itemscope>
      <span itemprop="http://microformats.org/profile/hcard#n/given-name">Jill</span>
      <span itemprop="http://microformats.org/profile/hcard#n/family-name">Darpa</span>
     </span>
    </span>
   </address>

Any software that handled the above in equivalent ways (e.g. finding a
vCard with a name "Jill Darpa" in the second case) would be non-conforming
implementations of the vCard microdata vocabulary.

(This is why when there was a generic HTML to RDF conversion algorithm in
the HTML spec, it went to some lengths to ensure that the URLs generated
on the RDF side could not be present in conforming microdata -- it ensured
that there was no way to end up in this confusing situation where two
different conforming property names had the same semantic.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
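
The non-equivalence of the two vCard fragments in the message above can be illustrated with a small sketch. It assumes a hypothetical parsed-item shape (a dict with "type" and "properties") and uses `urllib.parse.urlparse` as a rough stand-in for the URL check; it is not the parsing algorithm the HTML specification defines. `HCARD_DEFINED_NAMES` is an illustrative subset taken from the example, not the full vocabulary.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

HCARD = "http://microformats.org/profile/hcard"

# Illustrative subset of the names used in the example above (assumption:
# this is not the complete list defined by the vCard vocabulary).
HCARD_DEFINED_NAMES = {"fn", "n", "given-name", "family-name"}


def classify(name: str) -> str:
    """Classify an itemprop value as a URL, a defined name, or unknown."""
    if urlparse(name).scheme:  # rough absolute-URL check for this sketch
        return "url"
    if name in HCARD_DEFINED_NAMES:
        return "defined"
    return "unknown"


def find_formatted_name(item: Dict[str, Any]) -> Optional[str]:
    """Return the formatted name, reading only the defined name 'fn'.

    URL-shaped property names are deliberately not mapped back onto
    'fn', so the second fragment yields no name.
    """
    if item.get("type") != HCARD:
        return None
    values = item["properties"].get("fn", [])
    return values[0] if values else None


# Hypothetical parser output for the two fragments (top-level names only).
first = {"type": HCARD, "properties": {"fn": ["Jill Darpa"]}}
second = {"type": HCARD, "properties": {HCARD + "#fn": ["Jill Darpa"]}}
```

With these inputs, `find_formatted_name(first)` finds the name while `find_formatted_name(second)` returns `None`, which is the behaviour the message describes as conforming.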
Received on Wednesday, 19 October 2011 00:04:43 UTC