- From: Tomasz Jamroszczak <toja@opera.com>
- Date: Wed, 08 Jun 2011 09:38:09 +0200
Hello.

I'm implementing Microdata for Opera and I've run into problems with loops in graphs of Microdata items.

Summary:
1. Is there a bug in the "crawl the properties" algorithm of Microdata?
2. Is there a bug in the "get the object" algorithm for converting Microdata to JSON?
3. What is the true meaning of itemref?

I've been looking into the Microdata specification and it struck me that the crawling algorithm (http://dev.w3.org/html5/md/Overview.html#associating-names-with-items) is very complex for the simple ideas it expresses. I think the specification should first explain what the algorithm is supposed to do, and only then list the exact steps to be taken.

Let's see what the properties are of the Microdata item for the HTML element with id=up in the following HTML:

    <div itemscope id=up itemprop=prop0>
      <div itemscope id=down itemprop=prop1 itemref="up"></div>
    </div>

    CRAWL
    root = up
    memory = {}
    1. xxx
    2. COLLECT
       1. results = {}
          pending = {}
       3. pending = {down}
       4. xxx
       5. pending = {}
          current = down
       7. xxx
       8. results = {down}
       results = {down}
    3. xxx
    4. new_memory = {up}
    5. element = down
       CRAWL
       0. memory2 = {up}
          root2 = down
       1. xxx
       2. COLLECT
          1. results2 = {}
             pending2 = {}
          3. xxx
          4. pending2 = {up}
          5. pending2 = {}
             current2 = up
          7. xxx
          8. results2 = {up}
          results2 = {up}
       3. xxx
       4. new_memory2 = {up, down}
       5. element2 = up
          CRAWL
          0. memory3 = {up, down}
             root3 = up
          1. return FAIL
          !!! results2 = results2 - up = {}
       7. return results2 == {} (not FAIL).
    7. return results == {down}

In the end, the properties collection of the Microdata item for the HTML element with id=up has length 1.

The troubling part is the line marked with triple exclamation marks. It means that step 5 of the algorithm should be simplified to "For each element in results that has an itemscope attribute specified, if the element is equal to /root/, then remove the element from results [and increment errors]". Further recursive crawling is not needed.
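The trace above can be reproduced with a minimal Python sketch. This is a simplified model of the COLLECT and CRAWL steps as traced, not the specification's algorithm verbatim: the Element class, the by_id map and the CrawlFailed exception are hypothetical names introduced for illustration, and error counting and tree-order sorting are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for an HTML element with Microdata attributes."""
    id: str
    itemscope: bool = False
    itemprop: Optional[str] = None
    itemref: List[str] = field(default_factory=list)
    children: List["Element"] = field(default_factory=list)


class CrawlFailed(Exception):
    """Raised when the crawl reaches an item that is already in memory."""


def collect(root: Element, by_id: Dict[str, Element]) -> List[Element]:
    """Gather property elements reachable from root via children and itemref."""
    results: List[Element] = []
    pending = list(root.children)
    pending += [by_id[ref] for ref in root.itemref if ref in by_id]
    while pending:
        current = pending.pop()
        if any(current is seen for seen in results):
            continue  # already collected
        if not current.itemscope:
            # Only descend into elements that do not start a new item.
            pending.extend(current.children)
        if current.itemprop is not None:
            results.append(current)
    return results


def crawl(root: Element, by_id: Dict[str, Element],
          memory: FrozenSet[str] = frozenset()) -> List[Element]:
    """Return root's properties, dropping nested items whose crawl fails."""
    if root.id in memory:
        raise CrawlFailed(root.id)
    results = collect(root, by_id)
    new_memory = memory | {root.id}
    kept = []
    for element in results:
        if element.itemscope:
            try:
                crawl(element, by_id, new_memory)
            except CrawlFailed:
                continue  # the line marked "!!!" in the trace
        kept.append(element)
    return kept


# The two-element example from the trace.
down = Element("down", itemscope=True, itemprop="prop1", itemref=["up"])
up = Element("up", itemscope=True, itemprop="prop0", children=[down])
by_id = {"up": up, "down": down}

print([e.id for e in crawl(up, by_id)])    # ['down']
print([e.id for e in crawl(down, by_id)])  # ['up']
```

In this model the item "up" ends up with one property, as in the trace. Crawling "down" on its own also yields one property, although the nested crawl of "down" (with memory = {up}) yields none, so what an item appears to contain depends on where the crawl started.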
But then there's a problem with infinite recursion when running the stringification algorithm of http://dev.w3.org/html5/md/Overview.html#json on the HTML given above. We can proceed in two ways:

a) Allow loops of Microdata items and make JSONification of a Microdata item behave just like JSONification of any JavaScript object, that is, throw an exception when a loop is found.

b) Exclude loops of Microdata items (so in the example above, the Microdata item for the HTML element with id=up would have no Microdata properties). This would cripple the functionality of a quite nice HTML API, but it would also produce consistent results in HTMLPropertiesCollection and in stringification.

A third solution, c) cut only the offending links, is not good. Take a graph of Microdata items with the paths "A->B->C->D->B" and "E->D": stringification of item A would result in item D having no properties, while stringification of E would result in D having property B. The presence of a property would therefore depend on where the path starts.

I can imagine good uses for loops of Microdata items, for example "John knows Amy, Amy knows John":

    <div itemscope id="john" itemprop>
      <div itemprop="friends" itemref="fred1 jenny2 amy1"></div>
    </div>
    <div itemscope id="amy1" itemprop>
      <div itemprop="friends" itemref="john"></div>
    </div>

There's a loop: john->amy1->john->... . If loops, and thus recursion, are to be excluded, the same data could be written as:

    <div itemscope>
      <div itemprop=addressbook_id>1</div>
      <div itemprop=name>John</div>
      <div itemprop=knows>2</div>
    </div>
    <div itemscope>
      <div itemprop=addressbook_id>2</div>
      <div itemprop=name>Amy</div>
      <div itemprop=knows>1</div>
    </div>
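Returning to options (a) and (c) above, the path dependence can be checked with a small Python sketch. It models the example graph "A->B->C->D->B" and "E->D" as a plain adjacency map and uses the target item's name as the property key; GRAPH, to_tree, on_loop and LoopError are hypothetical names for illustration and are not part of any Microdata API.

```python
import json

# Example graph from the text: A->B->C->D->B and E->D.
GRAPH = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["B"], "E": ["D"]}


class LoopError(ValueError):
    """Raised in 'raise' mode when an item refers back to an ancestor."""


def to_tree(item, graph, on_loop="cut", path=()):
    """Expand item into nested dicts, handling links back to the current path.

    on_loop="cut"   drops only the offending link (option c).
    on_loop="raise" aborts the whole conversion (option a).
    """
    path = path + (item,)
    tree = {}
    for target in graph.get(item, []):
        if target in path:
            if on_loop == "raise":
                raise LoopError(" -> ".join(path + (target,)))
            continue  # cut only the offending link
        tree[target] = to_tree(target, graph, on_loop, path)
    return tree


print(json.dumps(to_tree("A", GRAPH)))  # {"B": {"C": {"D": {}}}}
print(json.dumps(to_tree("E", GRAPH)))  # {"D": {"B": {"C": {}}}}
```

Starting from A, item D comes out empty; starting from E, the same item D contains B. With on_loop="raise" both conversions fail instead, which is the consistent but stricter behaviour of option (a).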
The same could be done with <meta> instead of <div>, or more verbosely:

    <p itemscope itemid="#john" id="#john">John knows <a itemprop="http://xmlns.com/foaf/0.1/knows" href="#amy">Amy</a>.</p>
    <p itemscope itemid="#amy" id="#amy">Amy knows <a itemprop="http://xmlns.com/foaf/0.1/knows" href="#john">John</a>.</p>

The problem I'm addressing revolves around the meaning of the link between the itemref and id attributes. Is it meant to be part of the Microdata data model? Or is it introduced to cope with the fact that the Microdata graph is defined on top of existing data (that is, on top of the HTML tree), which is something completely different and is meant to be rendered to the user? So the intended meaning of the itemref attribute should also guide how it is interpreted within the specification.

-- 
Best Regards,
Tomasz Jamroszczak
Received on Wednesday, 8 June 2011 00:38:09 UTC