- From: Tim van Oostrom <tim@depulz.nl>
- Date: Sun, 29 Nov 2009 18:57:39 +0100
Tim van Oostrom wrote:
> Philip Jägenstedt wrote:
>> On Sun, 29 Nov 2009 12:46:16 +0100, Tim van Oostrom <tim at depulz.nl>
>> wrote:
>>
>>> Philip Jägenstedt wrote:
>>>> On Thu, 26 Nov 2009 22:30:41 +0100, Tim van Oostrom <tim at depulz.nl>
>>>> wrote:
>>>>
>>>>> Hi, I made a forumpost:
>>>>> http://forums.whatwg.org/viewtopic.php?t=4176, concerning a
>>>>> possible "microdata specification bug" and a bug in the
>>>>> james.html5.org microdata extractor.
>>>>>
>>>>> Comes down to <link/> and <meta/> elements possibly being unfit
>>>>> for use with the itemscope attribute.
>>>>>
>>>>> I made an example in the forum post with some nice ubb formatting.
>>>>>
>>>> There are some other issues with <link> and <meta> you might want
>>>> to review first: [1]
>>> Ok
>>>> Your second example was:
>>>>
>>>> <div itemtype="http://url.to/geoVocab#country" itemscope>
>>>>   <span itemprop="http://xmlns.com/foaf/spec/index.rdf#name"
>>>>   lang="cn">???????</span>
>>>>   <span itemprop="http://xmlns.com/foaf/spec/index.rdf#name"
>>>>   lang="en">China</span>
>>>>   <link itemprop="http://url.to/city"
>>>>   href="http://url.to/shanghai" itemscope itemref="city-shanghai" />
>>>>   <div id="city-shanghai">
>>>>     <span
>>>>     itemprop="http://xmlns.com/foaf/spec/index.rdf#name">Shanghai</span>
>>>>     <span itemprop="http://url.to/demoVocab#population">14.61
>>>>     million people</span>
>>>>     <span itemprop="http://url.to/physicsVocab#time"
>>>>     datetime="2009-11-26 11:43">11:43 pm (CT)</span>
>>>>   </div>
>>>> </div>
>>>>
>>>> <link>, <meta> and any other void elements are usually the wrong
>>>> choice for itemprop+itemscope because they don't have child
>>>> elements, so itemref is the only way to add properties.
>>> Yes, see forumpost. Shouldn't this be noted in the Spec then?
>>
>> Yes, the spec certainly needs some notes on how to use <link> and
>> <meta>.
> And other void elements such as: area, base, br, col, command, embed,
> hr, img, input, link, meta, param, source
> (http://dev.w3.org/html5/markup/syntax.html). Basically, the microdata
> can't really be on all elements as stated in: HTML5 spec, 5.2.2 Items
>>> According to this an "itemref" attribute can never be added to an
>>> "item" within an itemscope of another "item" without the crawled
>>> prop/val pairs also applying to the ancestor's itemscope.
>>
>> Ah, I think you've found the root of the problem. By allowing a
>> property to be part of several items at once, we get different kinds
>> of strange problems. Apart from messing up your example, it seems it
>> is the real cause of the infinite recursion bug I wrote about in
>> [1]. Then I was so focused on the recursion that I suggested a rather
>> complex solution to detect loops in the microdata, when it seems it
>> could be solved simply by making sure that a property belongs to only
>> 1 item. Detailed suggestion below.
>>
>> Now, back to the problem of one property, multiple items. The
>> algorithm for finding the properties of an item [2] is an attempt at
>> optimizing the search for properties starting at an item element. I
>> think we should replace this algorithm with an algorithm for finding
>> the item of a property. This was previously the case with the spec
>> before the itemref mechanism. I would suggest something along these
>> lines:
>>
>> 1. let current be the element with the itemprop attribute
>> 2. if current has an ID, for each element e in document order:
>>    2.1. if e has an itemref attribute:
>>       2.1.1. split the value of that itemref attribute on spaces. for each
>>       resulting token, ID:
>>          2.1.1.1. if ID equals the ID of current, return e
>> 3. reaching this step indicates that the item wasn't found via
>> itemref on this element
>> 4. let parent be the parent element of current
>> 5. if parent is null, return null
>> 6. if parent has the itemscope attribute, return parent
>> 7. otherwise, let current be parent and jump to step 2.
>>
>> This algorithm will find the parent item of a property, if there is
>> one. itemref'ing takes precedence over "parent-child linking", so in
>> Tim's example the properties of Shanghai would be applied to only the
>> Shanghai sub-item. I'm not convinced writing markup like that is a
>> good idea, but at least this way it has sane processing.

Which is important in the markup-souped web of non-linked-data :-)

>> HTMLPropertiesCollection on any given element would simply match all
>> elements in the document for which the algorithm returns that
>> very element. It should be invalid for there to be any elements in
>> the document with itemprop where this algorithm returns null or the
>> element itself.
>>
>> I will try implementing this algorithm in MicrodataJS [3] and see if
>> it works OK. While it may look less efficient than the current
>> algorithm, consider that a browser won't implement either algorithm
>> as written, only act as if they did. The expensive step of going
>> through all elements with itemref attributes is actually no more
>> expensive than e.g. document.querySelector('.classname') if
>> implemented natively.

I did something like this in my experimental/unfinished/test/learn
microdata extractor based on jQuery, which is here:
http://www.depulz.nl/microdata/ (works at least in FF 3.5 and Opera 10.10).

>> [1]
>> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024095.html
>>
>> [2]
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
>>
>> [3] http://gitorious.org/microdatajs
>>
>
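For anyone who wants to play with the seven-step "find the item of a property" algorithm quoted above, here is a rough sketch of it in Python. It does not use a real DOM: the Element class and the names document_order, find_item and properties_of are made up for illustration, and the tree at the bottom is a cut-down version of the China/Shanghai example (the "cn" name span and the time span are left out). Treat it as one possible reading of the steps, not as what MicrodataJS or any browser does.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Element:
    """Hypothetical stand-in for a DOM element; not a real DOM API."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list, repr=False)
    parent: Optional["Element"] = field(default=None, repr=False)

    def append(self, child: "Element") -> "Element":
        child.parent = self
        self.children.append(child)
        return child


def document_order(root: Element) -> Iterator[Element]:
    """Yield root and all of its descendants in document (pre-order) order."""
    yield root
    for child in root.children:
        yield from document_order(child)


def find_item(prop: Element, root: Element) -> Optional[Element]:
    """Return the item that the itemprop element `prop` belongs to, or None."""
    current = prop                                   # step 1
    while True:
        current_id = current.attrs.get("id")
        if current_id:                               # step 2
            for e in document_order(root):
                # steps 2.1 to 2.1.1.1: split itemref on whitespace, compare IDs
                if current_id in e.attrs.get("itemref", "").split():
                    return e
        parent = current.parent                      # steps 3 and 4
        if parent is None:                           # step 5
            return None
        if "itemscope" in parent.attrs:              # step 6
            return parent
        current = parent                             # step 7: back to step 2


def properties_of(item: Element, root: Element) -> List[Element]:
    """All itemprop elements whose item, per find_item, is `item`."""
    return [e for e in document_order(root)
            if "itemprop" in e.attrs and find_item(e, root) is item]


# Cut-down version of the example markup quoted earlier in the thread.
FOAF_NAME = "http://xmlns.com/foaf/spec/index.rdf#name"
country = Element("div", {"itemscope": "",
                          "itemtype": "http://url.to/geoVocab#country"})
name_en = country.append(Element("span", {"itemprop": FOAF_NAME, "lang": "en"}))
city = country.append(Element("link", {"itemprop": "http://url.to/city",
                                       "href": "http://url.to/shanghai",
                                       "itemscope": "",
                                       "itemref": "city-shanghai"}))
box = country.append(Element("div", {"id": "city-shanghai"}))
city_name = box.append(Element("span", {"itemprop": FOAF_NAME}))
population = box.append(
    Element("span", {"itemprop": "http://url.to/demoVocab#population"}))
```

With this tree, find_item(city_name, country) walks up to the div with id "city-shanghai", finds the <link> that itemrefs it, and returns that <link> instead of the outer country div. properties_of(city, country) then contains only the two Shanghai spans, while properties_of(country, country) contains the English name and the <link> itself.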
Received on Sunday, 29 November 2009 09:57:39 UTC