[whatwg] Possible bugs : Microdata Itemscope on <link/> and <meta/>

On Sun, 29 Nov 2009 14:28:05 +0100, Philip Jägenstedt <philipj at opera.com>  
wrote:

> Now, back to the problem of one property, multiple items. The algorithm  
> for finding the properties of an item [2] is an attempt at optimizing  
> the search for properties starting at an item element. I think we should  
> replace this algorithm with an algorithm for finding the item of a  
> property. This was previously the case with the spec before the itemref  
> mechanism. I would suggest something along these lines:
>
> 1. let current be the element with the itemprop attribute
> 2. if current has an ID, for each element e in document order:
> 2.1. if e has an itemref attribute:
> 2.1.1. split the value of that itemref attribute on spaces. for each  
> resulting token, ID:
> 2.1.1.1. if ID equals the ID of current, return e
> 3. reaching this step indicates that the item wasn't found via itemref  
> on this element
> 4. let parent be the parent element of current
> 5. if parent is null, return null
> 6. if parent has the itemscope attribute, return parent
> 7. otherwise, let current be parent and jump to step 2.
>
> This algorithm will find the parent item of a property, if there is one.  
> itemref'ing takes precedence over "parent-child linking", so in Tim's  
> example the properties of Shanghai would be applied to only the Shanghai  
> sub-item. I'm not convinced writing markup like that is a good idea, but  
> at least this way it has sane processing. HTMLPropertiesCollection on  
> any given element would simply match all elements in the document for  
> which the algorithm returns that very element. It should be invalid  
> for there to be any elements in the document with itemprop where this  
> algorithm returns null or the element itself.
>
> I will try implementing this algorithm in MicrodataJS [3] and see if it  
> works OK. While it may look less efficient than the current algorithm,  
> consider that a browser won't implement either algorithm as written,  
> only act as if they did. The expensive step of going through all  
> elements with itemref attributes is actually no more expensive than e.g.  
> document.querySelector('.classname') if implemented natively.
>
> [1]  
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024095.html
> [2]  
> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
> [3] http://gitorious.org/microdatajs
>

With an added check to ignore self-referencing itemrefs, my algorithm  
seems to work. The only test cases I have where the result (as seen  
through HTMLPropertiesCollection) differs are one similar to Tim's  
"1 property 2 items" and one involving self-reference. Incidentally, the  
cases which caused the JSON and vCard extraction algorithms to recurse  
infinitely now terminate with sane results.

A consequence of this change is that when two elements add the same  
property by itemref, only one will get it (the first in document order).  
This means that it isn't possible to share properties between items, which  
is precisely the point: it avoids loops. If there is a use case that  
requires property sharing, this needs some more tinkering. I'm inclined to  
say that when such sharing is wanted, one should add a level of  
indirection, e.g. with an ID. This way the microdata model is kept  
strictly tree-like.
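
To illustrate the first-in-document-order rule, here is a minimal sketch
that runs without a DOM. It models elements as plain objects; the fields
`name`, `id`, `itemref`, `itemscope` and `parent` and the function
`findItem` are stand-ins invented for this example, not MicrodataJS code:

```javascript
// Hypothetical model: `elements` lists plain objects in document order.
function findItem(node, elements) {
  var current = node;
  while (current) {
    if (current.id) {
      // First element in document order whose itemref lists current's ID,
      // ignoring self-references.
      for (var i = 0; i < elements.length; i++) {
        var e = elements[i];
        if (e !== node && e.itemref && e.itemref.indexOf(current.id) !== -1)
          return e;
      }
    }
    // Not found via itemref: fall back to parent-child linking.
    current = current.parent;
    if (current && current.itemscope)
      return current;
  }
  return null;
}

// Two items reference the same property element by ID.
var a = { name: 'a', itemscope: true, itemref: ['shared'], parent: null };
var b = { name: 'b', itemscope: true, itemref: ['shared'], parent: null };
var prop = { name: 'prop', id: 'shared', parent: null };
var elements = [a, b, prop];

var owner = findItem(prop, elements);
// Properties of b: every element whose corresponding item is b.
var propsOfB = elements.filter(function (el) {
  return findItem(el, elements) === b;
});
```

Under these assumptions `owner` is `a` and `propsOfB` is empty: the second
item simply does not get the shared property.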

To make the limitations clear to authors, an element with itemprop for  
which the algorithm returns null should be invalid. For elements with  
itemref, it should be invalid for any of the referenced elements to either  
not exist or to have another item as their "owner". In short, itemref'ing  
must be consistent.
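
A rough sketch of what such a consistency check for itemref could look
like, again on plain objects rather than a real DOM. The function
`itemrefErrors` and the object fields are hypothetical names for this
example, and the error wording is made up:

```javascript
// Hypothetical model: elements in document order as plain objects with an
// optional `id` and an optional `itemref` (array of ID tokens).
function itemrefErrors(elements) {
  var errors = [];
  elements.forEach(function (el) {
    (el.itemref || []).forEach(function (token) {
      var target = elements.filter(function (e) { return e.id === token; })[0];
      if (!target) {
        errors.push(el.name + ': "' + token + '" does not exist');
        return;
      }
      // itemref takes precedence over parent-child linking, so the owner of
      // the target is the first element in document order referencing it
      // (self-references ignored).
      var owner = elements.filter(function (e) {
        return e !== target && e.itemref && e.itemref.indexOf(token) !== -1;
      })[0];
      if (owner !== el)
        errors.push(el.name + ': "' + token + '" is owned by ' +
                    (owner ? owner.name : 'no item'));
    });
  });
  return errors;
}

var errors = itemrefErrors([
  { name: 'a', itemref: ['shared'] },
  { name: 'b', itemref: ['shared', 'missing'] },
  { name: 'prop', id: 'shared' }
]);
```

Here `a` is consistent, while `b` is reported twice: once for referencing
an element already owned by `a`, and once for a nonexistent ID.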

For the curious, from the (not so optimized) JavaScript implementation:

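function getCorrespondingItem(node) {
   var current = node;
   while (current) {
     if (current.id) {
       // First element in document order whose itemref lists this ID.
       // The value is quoted so that IDs which are not valid CSS
       // identifiers (e.g. ones starting with a digit) don't throw.
       var referrer = document.querySelector('[itemref~="' + current.id + '"]');
       // Ignore self-referencing itemrefs.
       if (referrer && referrer != node)
         return referrer;
     }
     // Not found via itemref: fall back to parent-child linking.
     current = current.parentNode;
     if (current && current.itemScope)
       return current;
   }
   return null;
}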

-- 
Philip Jägenstedt

Received on Sunday, 29 November 2009 17:03:45 UTC