- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Fri, 4 Nov 2011 18:48:58 +0000
- To: public-vocabs@w3.org, Guha <guha@google.com>, Dan Brickley <danbri@danbri.org>
- Cc: HTML Data Task Force WG <public-html-data-tf@w3.org>
Hi schema.orgers,

I'd like to get some clarification around the use of URLs within schema.org, particularly to help the HTML Data TF frame guidelines around microdata and RDFa usage of the vocabulary. I'm afraid there are rather a lot of questions here...

First, am I correct that schema.org does not use the @itemid attribute? I can't see it used in any examples, but I haven't found a clear statement about its use either. The microdata spec states that the vocabulary determines both whether an item can have an id and what the meaning of that id is (eg whether items with the same id can be considered the same item by consumers). To avoid doubt, it would be helpful for schema.org to explicitly state that @itemid is not allowed if that's the case.

Second, it looks as though the 'url' property acts as a kind of identifier for an item. It would be helpful for other vocabulary authors to understand why schema.org uses the 'url' property rather than @itemid, as the reasoning behind that design would probably apply elsewhere as well. If two items have the same value for their 'url' property (within a page or across pages), should they be considered to be the same item by consumers?

Third, there are a number of properties in schema.org that look like they can take a URL. There are discrepancies between the schema.org ontology [1], the schema.org microdata description [2] (which I believe is consistent with what appears on the pages themselves) and the examples.

  property              OWL          OWL desc        MD           example
  --------------------  -----------  --------------  -----------  --------------
  url                   Literal      -               URL          @href/@content
  contentURL            Literal      -               URL          @src/@content
  embedURL              Literal      -               URL          -
  audio                 AudioObject  object or URL   AudioObject  @href
  video                 VideoObject  object or URL   VideoObject  VideoObject
  image                 URL          -               URL          @href/@src
  acceptsReservations   Literal      Yes/No or URL   Text/URL     -
  menu                  Literal      menu or URL     Text/URL     -
  breadcrumb            Literal      set of links    Text         HTML
  maps                  Literal      URL             URL          -
  significantLinks      URL          -               URL          -
  discussionUrl         -            -               URL          -
  publishingPrinciples  -            -               URL          -
  thumbnailUrl          -            -               URL          -
  replyToUrl            -            -               URL          -

The OWL ontology is clearly out of date with the current spec, so perhaps it's just best to ignore what it says.

The last four ('discussionUrl', 'publishingPrinciples', 'thumbnailUrl' and 'replyToUrl') have obviously been added recently; they follow a different naming scheme (*Url rather than *URL). These four, together with 'maps' and 'significantLinks', are all described as links to another document; there aren't examples of any of them on the site.

'breadcrumb' is described as "A set of links that can help a user understand and navigate a website hierarchy." The sole example of it is:

  <div itemprop="breadcrumb">
    <a href="category/books.html">Books</a> >
    <a href="category/books-literature.html">Literature & Fiction</a> >
    <a href="category/books-classics">Classics</a>
  </div>

Microdata processing dictates that the value of the 'breadcrumb' property in this case is the element's text content, which is (normalising whitespace for brevity) "Books > Literature & Fiction > Classics". The HTML content of this property isn't preserved by microdata processing, so it isn't actually a set of links but rather a textual description of the context of the page.
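
To illustrate the point about the links being lost, here is a minimal sketch in Python of what taking the text content of that element amounts to. It is not a microdata processor: the class and function names (TextContent, text_value) are made up for illustration, and the fragment is the schema.org example above with the separators and ampersand written as entities.

```python
from html.parser import HTMLParser


class TextContent(HTMLParser):
    """Collects character data only; tags and attributes are discarded."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_value(fragment: str) -> str:
    """Return the whitespace-normalised text content of an HTML fragment."""
    parser = TextContent()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Hypothetical input: the 'breadcrumb' example, entities escaped.
BREADCRUMB = """<div itemprop="breadcrumb">
  <a href="category/books.html">Books</a> &gt;
  <a href="category/books-literature.html">Literature &amp; Fiction</a> &gt;
  <a href="category/books-classics">Classics</a>
</div>"""

value = text_value(BREADCRUMB)
print(value)
```

None of the href values survive in the result. The ">" separators do, because they are character data sitting between the links, so what a consumer receives is a single string rather than a set of links.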

Perhaps it would be better for this property to be called 'breadcrumbs' (pluralised to indicate the expectation that there will be more than one value), which could be given a type of URL, in which case the example would be rewritten as:

  <div>
    <a itemprop="breadcrumbs" href="category/books.html">Books</a> >
    <a itemprop="breadcrumbs" href="category/books-literature.html">Literature & Fiction</a> >
    <a itemprop="breadcrumbs" href="category/books-classics">Classics</a>
  </div>

resulting in (assuming a base URI of http://books.example.org) the 'breadcrumbs' property having the values:

  ["http://books.example.org/category/books.html",
   "http://books.example.org/category/books-literature.html",
   "http://books.example.org/category/books-classics"]

Alternatively, it might be that you do want to retain the HTML content of this property, in which case it would be good to make a comment on the bug [3] about supporting structured HTML content for microdata values, citing this use case. Let me know if you want me to do that on your behalf.

There are discrepancies in the 'audio', 'video' and 'image' properties. All three have within schema.org a related object type (AudioObject, VideoObject, ImageObject), but only 'audio' and 'video' are defined to take object values, while 'image' takes a URL. But then, in the examples, the 'audio' property takes a URL rather than an AudioObject.

This makes me think that schema.org might allow a property that takes an object to be given a URL instead, in which case perhaps it's treated the same as providing an object whose 'url' property is that URL? So for example:

  <a href="foo-fighters-rope-play.html" itemprop="audio">Play</a>

is equivalent to:

  <span itemprop="audio" itemscope itemtype="http://schema.org/AudioObject">
    <a href="foo-fighters-rope-play.html" itemprop="url">Play</a>
  </span>

Is that the case?
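
To make the question concrete, here is a minimal sketch in Python of the interpretation I'm asking about. It assumes a consumer that, on finding a bare URL where an object is expected, expands it into an item whose 'url' property is that URL. This is a guess at the intended behaviour, not something schema.org documents; the function name normalise_audio and the dictionary layout for items are purely illustrative.

```python
AUDIO_OBJECT = "http://schema.org/AudioObject"


def normalise_audio(value, object_type=AUDIO_OBJECT):
    """Expand a bare URL into an item whose 'url' property is that URL.

    Items (represented here as dicts) are returned unchanged.
    """
    if isinstance(value, str):
        return {"type": object_type, "properties": {"url": [value]}}
    return value


# First form: the 'audio' property given directly as a URL.
from_link = normalise_audio("foo-fighters-rope-play.html")

# Second form: the 'audio' property given as a nested AudioObject item.
from_item = normalise_audio(
    {"type": AUDIO_OBJECT, "properties": {"url": ["foo-fighters-rope-play.html"]}}
)

print(from_link == from_item)
```

Under that assumption the two forms of markup would end up as the same data, which is the equivalence I'd like confirmed or denied.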

Finally, there are a few examples where the @content attribute is being used instead of the @href or @src attribute to provide a URL, for example the 'url' property in:

  <div itemprop="tracks" itemscope itemtype="http://schema.org/MusicRecording">
    <span itemprop="name">Rope</span>
    <meta itemprop="url" content="foo-fighters-rope.html">
    ...
  </div>

This isn't conformant with the microdata spec, which states:

  The URL property elements are the a, area, audio, embed, iframe, img, link, object, source, track, and video elements.

  If a property's value, as defined by the property's definition, is an absolute URL, the property must be specified using a URL property element.

I imagine that these are bugs in the documentation rather than anything more, in which case it would be good to correct them if possible. On the other hand, if schema.org URL properties are meant to be resolved even when they don't appear in a URL property element, then it would be good to have that documented.

Thanks, and sorry for the rather long email!

Cheers,

Jeni

P.S. These questions arose from mails by Jayson Lorenzen [4] about RDF generated from schema.org microdata.

[1] http://schema.org/docs/schemaorg.owl
[2] http://schema.org/docs/full_md.html
[3] http://dev.w3.org/html5/md/Overview.html#url-property-elements
[4] http://lists.w3.org/Archives/Public/public-html-data-tf/2011Oct/0197.html

--
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 4 November 2011 18:51:57 UTC