Re: Microdata: The Itemref element

On Sun, 18 Oct 2009 21:18:40 +0200, Nicholas Stimpson  
<nicholas.stimpson@ntlworld.com> wrote:

> Dear HTML WG,
>
> I have been looking at the practicalities of using microdata.
>
> One issue that concerns me is the itemref element. The itemref element  
> is defined as empty, but current browsers don't recognise the element,  
> so non-IE browsers treat it as having content.
>
> Take this example from the draft:
>
> <div itemscope>
>  <itemref refid="x">
>  <p itemprop="b">test</p>
>  <p itemprop="a">2</p>
> </div>
> <div id="x">
>  <p itemprop="a">1</p>
> </div>
>
> Currently in non-IE browsers this will produce this DOM - Taken in  
> Firefox 3.5 from  
> http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cbody%3E%0A%3Cdiv%20itemscope%3E%0A%20%3Citemref%20refid%3D%22x%22%3E%0A%20%3Cp%20itemprop%3D%22b%22%3Etest%3C%2Fp%3E%0A%20%3Cp%20itemprop%3D%22a%22%3E2%3C%2Fp%3E%0A%3C%2Fdiv%3E%0A%3Cdiv%20id%3D%22x%22%3E%0A%20%3Cp%20itemprop%3D%22a%22%3E1%3C%2Fp%3E%0A%3C%2Fdiv%3E%0A%0A
>
>     * DOCTYPE: |HTML|
>     * |HTML|
>           o |HEAD|
>                 + |#text|:
>           o |BODY|
>                 + |#text|:
>                 + |DIV| |itemscope|="||"
>                       # |#text|:
>                       # |ITEMREF| |refid|="|x|"
>                             * |#text|:
>                             * |P| |itemprop|="|b|"
>                                   o |#text|: test
>                             * |#text|:
>                             * |P| |itemprop|="|a|"
>                                   o |#text|: 2
>                             * |#text|:
>                 + |#text|:
>                 + |DIV| |id|="|x|"
>                       # |#text|:
>                       # |P| |itemprop|="|a|"
>                             * |#text|: 1
>                       # |#text|:
>                 + |#text|:
>
> In IE now, and presumably, at some point in the future, the other  
> browsers will be changed to treat itemref as empty, at which point, this  
> DOM will be produced.
>
>     * DOCTYPE: |HTML|
>     * |HTML|
>           o |HEAD|
>                 + |#text|:
>           o |BODY|
>                 + |#text|:
>                 + |DIV| |itemscope|="||"
>                       # |#text|:
>                       # |ITEMREF| |refid|="|x|"
>                       # |#text|:
>                       # |P| |itemprop|="|b|"
>                             * |#text|: test
>                       # |#text|:
>                       # |P| |itemprop|="|a|"
>                             * |#text|: 2
>                       # |#text|:
>                 + |#text|:
>                 + |DIV| |id|="|x|"
>                       # |#text|:
>                       # |P| |itemprop|="|a|"
>                             * |#text|: 1
>                       # |#text|:
>                 + |#text|:
>
> Clearly, for many web pages this may require different CSS selectors and  
> scripting to handle both cases, and it is likely to be a major source of  
> bugs and confusion.
>
> Is it really necessary for microdata to mint a new empty element here?
>
> Instead, would it be possible for the element with the "itemscope"  
> attribute to have an itemref attribute that was a space separated list  
> of ids?
>
>
> Or if that's not possible, could an existing empty element be overloaded  
> to replace itemref? Both "link" and "param" seem to check out in the  
> live DOM viewer as possibles, providing that on encountering them the  
> parser has already reached the parsing-the-body mode. Link is already  
> being overloaded for the itemprop attribute so extending that with an  
> itemref attribute doesn't seem a huge leap. e.g. instead of
>
> <itemref refid="x">
>
> have
>
> <link itemref="x">
>
> I don't pretend to understand what the side effects of doing something  
> like this would be, though.
>
>
> If that can't be done, then one authoring approach to solve the problem  
> would be to write
>
> <itemref refid="x"></itemref>
>
> However, that is invalid HTML5, which personally I'd live with, but  
> doesn't seem the happiest outcome. Maybe this could be made valid? (The  
> HTML shiv could be used to avoid getting the /ITEMREF tag in IE)
>
>
> If none of the above is practical, then could the draft contain advice  
> to place itemref elements at the end of the itemscope div like this?
>
> <div itemscope>
>  <p itemprop="b">test</p>
>  <p itemprop="a">2</p>
>  <itemref refid="x">
> </div>
> <div id="x">
>  <p itemprop="a">1</p>
> </div>
>
> Although the DOMs will still be different if there's more than one itemref  
> element, the effect on css selectors and scripting will be minimised.
>
> - Nicholas Stimpson.
>
>

Thanks Nicholas for bringing this up. I'm also not a big fan of the  
itemref element, for two reasons. First, it's a new void element, which is  
always problematic, especially when we expect it to be sprinkled all over  
existing markup (unlike <source> which will be inside <video>).

Second, it makes HTMLPropertyCollection more difficult to implement; see  
yesterday's bug [1] and the IRC discussion [2] for some details. Because  
<itemref> allows looped references, the algorithm for determining the  
properties of an item is complicated by keeping a list of visited nodes.  
This will be added complexity for browser implementations and external  
parsers alike.
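
To make the bookkeeping concrete, here is a rough Python sketch of a crawl  
that follows <itemref> elements. It is only an illustration, not the spec's  
algorithm: Node, index_ids and properties_via_itemref_elements are names I  
made up, and Node is a toy stand-in for a DOM element. The point is that  
the visited set has to cover every node reached through any chain of  
references.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity hashing, so nodes can go in a set
class Node:
    """Hypothetical stand-in for a DOM element."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def index_ids(root):
    """Map each id to the first element in tree order that carries it."""
    ids = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if "id" in node.attrs:
            ids.setdefault(node.attrs["id"], node)
        stack.extend(reversed(node.children))
    return ids


def properties_via_itemref_elements(item, ids):
    """Collect property names of `item`, following <itemref> elements."""
    visited = set()  # every node crawled so far; needed to survive loops
    props = []

    def crawl(node):
        if node in visited:
            return  # already seen, possibly through a looped reference
        visited.add(node)
        if node.tag == "itemref":
            target = ids.get(node.attrs.get("refid"))
            if target is not None:
                crawl(target)  # references may chain through further itemrefs
            return
        if node is not item:
            if "itemprop" in node.attrs:
                props.append(node.attrs["itemprop"])
            if "itemscope" in node.attrs:
                return  # a nested item owns its own properties
        for child in node.children:
            crawl(child)

    crawl(item)
    return props


# A chained reference: the item points at "both", which points at "a" and "b".
a = Node("span", {"id": "a", "itemprop": "prop-a"})
b = Node("span", {"id": "b", "itemprop": "prop-b"})
both = Node("div", {"id": "both"},
            [Node("itemref", {"refid": "a"}), Node("itemref", {"refid": "b"})])
item = Node("div", {"itemscope": ""},
            [Node("itemref", {"refid": "both"}),
             Node("span", {"itemprop": "not-shared"})])
body = Node("body", {}, [Node("p", {}, [a]), Node("p", {}, [b]), both, item])
print(properties_via_itemref_elements(item, index_ids(body)))
```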

Using an attribute avoids both of these issues. It would simplify the  
algorithm as the only possibility of loops is if there's a reference to  
the @itemscope'd element or an ancestor of it, which is much easier to  
check (just one node instead of a list of all visited nodes).
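
As a rough sketch of what I mean, assuming a hypothetical itemref  
attribute holding a space-separated list of ids (as Nicholas suggests),  
the collection could look something like the Python below. Again, Node,  
find_by_id, contains and properties_via_itemref_attribute are made-up  
names and Node is a toy stand-in for a DOM element. The only guard needed  
is a single check per reference: does the referenced element contain the  
item?

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, like DOM elements
class Node:
    """Hypothetical stand-in for a DOM element."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def find_by_id(root, wanted):
    """Return the first element in tree order with the given id, or None."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.attrs.get("id") == wanted:
            return node
        stack.extend(reversed(node.children))
    return None


def contains(ancestor, node):
    """True if `node` is `ancestor` itself or lies somewhere inside it."""
    stack = [ancestor]
    while stack:
        current = stack.pop()
        if current is node:
            return True
        stack.extend(current.children)
    return False


def properties_via_itemref_attribute(item, root):
    """Collect property names of `item` using an assumed itemref attribute."""
    props = []

    def collect(node):
        if node is not item:
            if "itemprop" in node.attrs:
                props.append(node.attrs["itemprop"])
            if "itemscope" in node.attrs:
                return  # a nested item owns its own properties
        for child in node.children:
            collect(child)

    collect(item)
    for ref in item.attrs.get("itemref", "").split():
        target = find_by_id(root, ref)
        # The one loop check: skip the item itself and any of its ancestors.
        if target is None or contains(target, item):
            continue
        # A reference to a descendant would list its properties twice; that
        # is duplication rather than a loop and is not handled in this sketch.
        collect(target)
    return props


# Two items sharing the same two properties through the assumed attribute.
a = Node("span", {"id": "a", "itemprop": "prop-a"})
b = Node("span", {"id": "b", "itemprop": "prop-b"})
item1 = Node("div", {"itemscope": "", "itemref": "a b"},
             [Node("span", {"itemprop": "not-shared"})])
item2 = Node("div", {"itemscope": "", "itemref": "a b"})
body = Node("body", {}, [Node("p", {}, [a]), Node("p", {}, [b]), item1, item2])
print(properties_via_itemref_attribute(item1, body))
print(properties_via_itemref_attribute(item2, body))
```

Since references are not followed from inside a referenced subtree, there  
is no chain to track, which is where the simplification comes from.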

The question is whether there was ever a use case for chained references which  
we would break by doing this simplification. As far as I can see, it would  
be something like:

<!-- two properties in different subtrees -->
<p>
  <span id="a" itemprop="prop-a">Property A</span>
</p>

<p>
  <span id="b" itemprop="prop-b">Property B</span>
</p>

<!-- something to collect them both -->
<div id="both">
  <itemref refid="a">
  <itemref refid="b">
</div>

<!-- reuse in two different items -->
<div itemscope>
   <itemref refid="both">
   <span itemprop="not-shared">This property isn't shared</span>
</div>

<div itemscope>
   <itemref refid="both">
</div>

Sorry, I can't come up with an example that makes sense. It seems that the  
only reason to do the above is to share properties between items while  
saving some typing. In that case itemref="a b" on both @itemscope'd  
elements would still be less typing than this.

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7964
[2] http://krijnhoetmer.nl/irc-logs/whatwg/20091019#l-24
[3]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Monday, 19 October 2009 11:00:50 UTC