- From: pghj <pghjvanblokland@gmail.com>
- Date: Fri, 12 Oct 2012 16:28:14 +0200
- To: whatwg@whatwg.org
Hello,

I am writing a set of tools to work with microdata, and ran into a number of issues. Is there still room at this point for discussion of, and improvements to, the specification? For what it is worth, here are some of the things I ran into, along with proposals to address them:

== Usage of URLs that do not point to anything interesting ==

I'm not sure whether this has been discussed at length, though it seems that Philip Jägenstedt brought it up once [1].

For a variety of reasons, I would much rather use <data> and <a> than <meta> and <link> for microdata: the markup is less ugly, scripts have easy access to the user-visible representation of the data, CSS can style that representation based on microdata attributes (itemref complicates this - see below), and so on. However, for enumerations like http://schema.org/InStock a clickable <a> would not be desirable, yet using <data> would violate the microdata specification, section "Values":

"If a property's value, as defined by the property's definition, is an absolute URL, the property must be specified using a URL property element."

I do not see much merit in this requirement: the URL is already absolute, so it does not need resolving, and it is already defined to be a URL by the property's definition. Storing it in a <data> element would therefore do little harm. Because there are many benefits to being able to wrap visible content in a microdata property, I would like to propose that this requirement be dropped, so that the <data> element may also carry an absolute URL.

Nevertheless, I see how it would be useful to store a URL in such a way that it is clear it is a URL, and to have it properly resolved. As far as I can tell, no HTML element combines the following three properties:

1. Stores a definite URL type value,
2. Can have phrasing content,
3. Has no side effects (clickable, etc).
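The three properties listed above can be laid out as a small table and checked mechanically. The Python sketch below covers only the five elements named in this message, and the trait assignments are one reading of how those elements behave, not something taken from the specification. The names ELEMENT_TRAITS and elements_with_all_traits are made up for illustration.

```python
# Illustrative only: the trait values are assumptions about the five elements
# mentioned in this message, not statements quoted from the specification.
ELEMENT_TRAITS = {
    # <a>: href is a URL and it can wrap text, but it is clickable.
    "a": {"url_value": True, "phrasing_content": True, "no_side_effects": False},
    # <link>: href is a URL, but the element cannot wrap visible content.
    "link": {"url_value": True, "phrasing_content": False, "no_side_effects": True},
    # <img>: src is a URL, but it cannot wrap content and it renders an image.
    "img": {"url_value": True, "phrasing_content": False, "no_side_effects": False},
    # <data>: wraps visible content, but its value is a plain string.
    "data": {"url_value": False, "phrasing_content": True, "no_side_effects": True},
    # <meta>: plain string value and no visible content.
    "meta": {"url_value": False, "phrasing_content": False, "no_side_effects": True},
}


def elements_with_all_traits(traits):
    """Return the element names for which every trait is true."""
    return [name for name, flags in traits.items() if all(flags.values())]


print(elements_with_all_traits(ELEMENT_TRAITS))
```

Under these assumptions the list comes back empty, which is the gap described above.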

Therefore, as an alternative to dropping the requirement mentioned above, I would also be in favor of allowing an additional attribute on the <data> element (for example named 'url'), mutually exclusive with the 'value' attribute, that is resolved the same way as the URLs obtained from <a>, <link>, <img>, etc.

== Incompatible property names when using itemrefs ==

Consider the following piece of HTML:

<div itemscope itemtype="http://schema.org/Book" itemref="a"> ... </div>
<div itemscope itemtype="http://schema.org/LiteraryEvent" itemref="b"> ... </div>
<div id="a" itemprop="author" itemscope itemtype="http://schema.org/Person" itemref="c"></div>
<div id="b" itemprop="performer" itemscope itemtype="http://schema.org/Person" itemref="c"></div>
<div id="c">
  Name: <span itemprop="name">Amanda</span>
</div>

The 'Book' item and the 'LiteraryEvent' item actually both want to refer to the same person: the first as the author, the second as a performer. Because the property names differ, I can't find a proper way to do this using itemrefs without either polluting other items or creating two 'Person' items (as I did above). Both approaches are undesirable.

An alternative way of using the itemref attribute, which makes much more sense to me, would lead to this:

<div itemscope itemtype="http://schema.org/Book">
  Author: <a itemprop="author" itemref href="#a">Amanda</a>
  ...
</div>
<div id="b" itemscope itemtype="http://schema.org/LiteraryEvent">
  Speaking: <a itemprop="performer" itemref href="#a">Amanda</a>
  ...
</div>
<div id="a" itemscope itemtype="http://schema.org/Person">
  Name: <span itemprop="name">Amanda</span>
  Near you: <a itemprop="performerIn" itemref href="#b">reading from her new book</a>
</div>

Formally: If an element has both the itemprop and itemref attributes but not itemscope, and itemref is empty, then it should have a URL type value that points to another element that is an item.
This item, if it exists in the same document, will be the property's value. If not, the URL will be used.

This has a few consequences:

1. It opens the door to pointing to microdata in other documents. Although a browser probably shouldn't try to fetch it, this can be useful for search engines.
2. It makes more sense to allow directed cycles in the graph formed by the items in a page, such as the one created by the 'performerIn' property on 'Person' in the example.

I think these changes are compatible with current use, because right now itemref is not to be used on elements without itemscope. The only issue I see is that the microdata DOM API could now present cyclic graphs. It is not yet deployed anywhere, is it? In any case, for people using it on their own data it shouldn't be a problem.

In my opinion, the alternative itemref approach has great benefits:

1. The issue with incompatible property names is eliminated.
2. It becomes possible to refer to external data.
3. For most purposes, microdata would better match the document structure as presented to the user.
4. It more closely resembles common data models, making it easier to serialize them into microdata.
5. It becomes possible to mark up more complex graphs in HTML documents.
6. With only this use of itemref, and forsaking nested items, CSS styling based on microdata attributes becomes very feasible.

I'd be interested to hear what people think.

Thank you for reading,
Josh

[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024116.html
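
Appended for illustration: the formal rule in the itemref section could be read roughly as in the Python sketch below. It is only one interpretation under stated assumptions: it looks at start tags only, it takes the URL from the href attribute only, it treats a target as "in the same document" when the resolved URL differs from the base URL only by its fragment, and the base URL http://example.org/page is a placeholder. The names StartTagCollector and resolve_references are hypothetical and not part of any microdata API.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class StartTagCollector(HTMLParser):
    """Collects the attributes of every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # Attributes written without a value (itemscope, itemref) map to None.
        self.elements.append(dict(attrs))


def resolve_references(html, base_url):
    """Return (property name, target) pairs for the proposed itemref usage.

    target is ("item", id) when the URL points at an item in the same
    document, and ("url", absolute_url) otherwise.
    """
    collector = StartTagCollector()
    collector.feed(html)
    collector.close()

    # ids of the elements that are items, i.e. that carry itemscope
    item_ids = {e["id"] for e in collector.elements
                if "itemscope" in e and e.get("id")}
    base_document = urldefrag(base_url)[0]

    results = []
    for e in collector.elements:
        is_reference = (
            "itemprop" in e
            and "itemref" in e
            and not e["itemref"]        # attribute present but empty
            and "itemscope" not in e
            and e.get("href")           # assumption: the URL comes from href
        )
        if not is_reference:
            continue
        absolute = urljoin(base_url, e["href"])
        document, fragment = urldefrag(absolute)
        if document == base_document and fragment in item_ids:
            target = ("item", fragment)   # the referenced item is the value
        else:
            target = ("url", absolute)    # fall back to the URL itself
        results.append((e["itemprop"], target))
    return results


EXAMPLE = """
<div itemscope itemtype="http://schema.org/Book">
  Author: <a itemprop="author" itemref href="#a">Amanda</a>
</div>
<div id="b" itemscope itemtype="http://schema.org/LiteraryEvent">
  Speaking: <a itemprop="performer" itemref href="#a">Amanda</a>
</div>
<div id="a" itemscope itemtype="http://schema.org/Person">
  Name: <span itemprop="name">Amanda</span>
  Near you: <a itemprop="performerIn" itemref href="#b">reading from her new book</a>
</div>
"""

for name, target in resolve_references(EXAMPLE, "http://example.org/page"):
    print(name, target)
```

Run on the second example from the message, 'author' and 'performer' both resolve to the item with id "a", and 'performerIn' resolves back to the item with id "b", which is the directed cycle mentioned in the consequences. A link to another document comes back as a plain URL.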
Received on Friday, 12 October 2012 14:28:43 UTC