Re: @itemid and URL properties in schema.org

From: Peter Mika <pmika@yahoo-inc.com>
Date: Fri, 04 Nov 2011 21:44:06 +0100
Message-ID: <4EB44E96.6080906@yahoo-inc.com>
To: Jeni Tennison <jeni@jenitennison.com>
CC: "public-vocabs@w3.org" <public-vocabs@w3.org>, Guha <guha@google.com>, Dan Brickley <danbri@danbri.org>, HTML Data Task Force WG <public-html-data-tf@w3.org>
Hi Jeni,

We will strive to keep the OWL description in sync with the microdata 
description.

In general, as you know, OWL makes a distinction between datatype- and 
object-properties, while microdata doesn't (though it makes the slightly 
different distinction between URL-valued and non-URL-valued properties).

So the OWL description will always be stricter (more complete in a way) 
than the microdata description. It captures datatypes (URL, Number etc.) 
as subclasses of Literal and the object/datatype-property distinction, 
which is probably the most faithful translation of the intent of the spec.


On 11/4/11 7:48 PM, Jeni Tennison wrote:
> Hi schema.orgers
> I'd like to get some clarification around the use of URLs within schema.org, particularly to help the HTML Data TF frame guidelines around microdata and RDFa usage of the vocabulary. I'm afraid there are rather a lot of questions here...
> First, am I correct that schema.org does not use the @itemid attribute? I can't see it used in any examples, but I haven't found a clear statement about its use either. The microdata spec states that the vocabulary determines both whether an item can have an id and what the meaning of that id is (eg whether items with the same id can be considered the same item by consumers). To avoid doubt, it would be helpful for schema.org to explicitly state that @itemid is not allowed if that's the case.
> Second, it looks as though the 'url' property acts as a kind of identifier for an item. It would be helpful for other vocabulary authors to understand why schema.org uses the 'url' property rather than @itemid, as the reasoning behind that design would probably apply elsewhere as well. If two items have the same value for their 'url' property (within a page or across pages), should they be considered to be the same item by consumers?
> Third, there are a number of properties in schema.org that look like they can take a URL. There are discrepancies between the schema.org ontology [1] and the schema.org microdata description [2] (which I believe is consistent with what appears on the pages themselves) and the examples.
> property             OWL          OWL desc       MD            example
> url                  Literal      -              URL           @href/@content
> contentURL           Literal      -              URL           @src/@content
> embedURL             Literal      -              URL           -
> audio                AudioObject  object or URL  AudioObject   @href
> video                VideoObject  object or URL  VideoObject   VideoObject
> image                URL          -              URL           @href/@src
> acceptsReservations  Literal      Yes/No or URL  Text/URL      -
> menu                 Literal      menu or URL    Text/URL      -
> breadcrumb           Literal      set of links   Text          HTML
> maps                 Literal      URL            URL           -
> significantLinks     URL          -              URL           -
> discussionUrl        -            -              URL           -
> publishingPrinciples -            -              URL           -
> thumbnailUrl         -            -              URL           -
> replyToUrl           -            -              URL           -
> The OWL ontology is clearly out of date with the current spec, so perhaps it's just best to ignore what it says.
> The latter four ('discussionUrl', 'publishingPrinciples', 'thumbnailUrl' and 'replyToUrl') have obviously been added recently; they follow a different naming scheme (*Url rather than *URL). These, 'maps' and 'significantLinks' are all described as links to another document; there aren't examples of any of them on the site.
> 'breadcrumb' is described as "A set of links that can help a user understand and navigate a website hierarchy." The sole example of it is:
>    <div itemprop="breadcrumb">
>      <a href="category/books.html">Books</a>  >
>      <a href="category/books-literature.html">Literature & Fiction</a>  >
>      <a href="category/books-classics">Classics</a>
>    </div>
> Microdata processing dictates that the value of the 'breadcrumb' property in this case is (normalising whitespace for brevity) "Books Literature & Fiction Classics". The HTML content of this property isn't preserved by microdata processing, so it isn't actually a set of links but rather a textual description of the context of the page.
> Perhaps it would be better for this property to be called 'breadcrumbs' (pluralised to indicate the expectation that there will be more than one value) which could be given a type of URL, in which case the example would be rewritten as:
>    <div>
>      <a itemprop="breadcrumbs" href="category/books.html">Books</a>  >
>      <a itemprop="breadcrumbs"
>         href="category/books-literature.html">Literature & Fiction</a>  >
>      <a itemprop="breadcrumbs" href="category/books-classics">Classics</a>
>    </div>
> resulting in (assuming a base URI of http://books.example.org) the 'breadcrumbs' property having the values:
>    ["http://books.example.org/category/books.html",
>     "http://books.example.org/category/books-literature.html",
>     "http://books.example.org/category/books-classics"]
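The two readings above can be compared with a minimal Python sketch that uses only the standard library. It is a simplified approximation of microdata value extraction, not the full algorithm: it ignores nested itemscope items and multi-token itemprop values, and the names PropertyValues and property_values are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attribute carrying the URL on each "URL property element".
URL_ATTR = {"a": "href", "area": "href", "link": "href", "object": "data",
            "audio": "src", "embed": "src", "iframe": "src", "img": "src",
            "source": "src", "track": "src", "video": "src"}
# Void elements never get an end tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PropertyValues(HTMLParser):
    """Collect (name, value) pairs for elements carrying an itemprop attribute."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.values = []
        self._open = []  # text-valued properties still open: [name, depth, parts]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag not in VOID:
            for collector in self._open:
                collector[1] += 1
        name = attrs.get("itemprop")
        if name is None:
            return
        if tag in URL_ATTR:
            # URL property element: resolve the attribute against the base URI.
            raw = attrs.get(URL_ATTR[tag]) or ""
            self.values.append((name, urljoin(self.base, raw)))
        elif tag == "meta":
            # meta: the content attribute is taken as-is, never resolved.
            self.values.append((name, attrs.get("content") or ""))
        elif tag not in VOID:
            # Any other element: the value is its text content.
            self._open.append([name, 1, []])

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        for collector in self._open:
            collector[1] -= 1
        while self._open and self._open[-1][1] == 0:
            name, _, parts = self._open.pop()
            # Whitespace is normalised here only to make the result readable.
            self.values.append((name, " ".join("".join(parts).split())))

    def handle_data(self, data):
        for collector in self._open:
            collector[2].append(data)


def property_values(html, base):
    parser = PropertyValues(base)
    parser.feed(html)
    parser.close()
    return parser.values


BASE = "http://books.example.org"
AS_TEXT = """<div itemprop="breadcrumb">
  <a href="category/books.html">Books</a> >
  <a href="category/books-literature.html">Literature & Fiction</a> >
  <a href="category/books-classics">Classics</a>
</div>"""
AS_URLS = """<div>
  <a itemprop="breadcrumbs" href="category/books.html">Books</a> >
  <a itemprop="breadcrumbs" href="category/books-literature.html">Literature & Fiction</a> >
  <a itemprop="breadcrumbs" href="category/books-classics">Classics</a>
</div>"""

text_values = property_values(AS_TEXT, BASE)
url_values = [v for n, v in property_values(AS_URLS, BASE) if n == "breadcrumbs"]
```

In this sketch text_values holds a single 'breadcrumb' string built from the link text alone, with no href information, while url_values holds the three resolved URLs listed above.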
> Alternatively, it might be that you do want to retain the HTML content of this property, in which case it would be good to make a comment on the bug [3] about supporting structured HTML content for microdata values, citing this use case. Let me know if you want me to do that on your behalf.
> There are discrepancies in the 'audio', 'video' and 'image' properties. All three have within schema.org a related object type (AudioObject, VideoObject, ImageObject), but only 'audio' and 'video' are defined to take object values, while 'image' takes a URL. But then, in the examples, the 'audio' property takes a URL rather than an AudioObject. This makes me think that schema.org might allow a property that takes an object to be given a URL instead, in which case perhaps it's treated the same as providing an object whose 'url' property is that URL? So for example:
>    <a href="foo-fighters-rope-play.html" itemprop="audio">Play</a>
> is equivalent to:
>    <span itemprop="audio" itemscope itemtype="http://schema.org/AudioObject">
>      <a href="foo-fighters-rope-play.html" itemprop="url">Play</a>
>    </span>
> Is that the case?
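If the answer were yes, a consumer might normalise the two forms as in the Python sketch below. This is an illustration of the question only, not confirmed schema.org behaviour: the EXPECTED_TYPE table, the normalise function, the dictionary layout for items, and the example.org URL are all assumptions made for the example.

```python
# Hypothetical mapping from a property name to the item type it expects.
EXPECTED_TYPE = {
    "audio": "http://schema.org/AudioObject",
    "video": "http://schema.org/VideoObject",
}


def normalise(prop, value):
    """Wrap a bare URL in an item of the expected type; pass items through unchanged."""
    if isinstance(value, str) and prop in EXPECTED_TYPE:
        return {"type": EXPECTED_TYPE[prop], "properties": {"url": [value]}}
    return value


# Assumed absolute URL, standing in for the resolved href in the markup above.
PLAY_URL = "http://example.org/foo-fighters-rope-play.html"

short_form = normalise("audio", PLAY_URL)
long_form = normalise("audio", {
    "type": "http://schema.org/AudioObject",
    "properties": {"url": [PLAY_URL]},
})
```

Under that assumed rule, short_form and long_form compare equal, so downstream code would only ever see the object form.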
> Finally, there are a few examples where the @content attribute is being used instead of the @href or @src attribute to provide a URL, for example the 'url' property in:
>    <div itemprop="tracks" itemscope itemtype="http://schema.org/MusicRecording">
>      <span itemprop="name">Rope</span>
>      <meta itemprop="url" content ="foo-fighters-rope.html">
>      ...
>    </div>
> This isn't conformant with the microdata spec, which states:
>    The URL property elements are the a, area, audio, embed, iframe, img,
>    link, object, source, track, and video elements.
>    If a property's value, as defined by the property's definition, is an
>    absolute URL, the property must be specified using a URL property
>    element.
> I imagine that these are bugs in the documentation rather than anything more, in which case it would be good to correct them if possible. On the other hand, if schema.org URL properties are meant to be resolved even when they don't appear in a URL property element, then it would be good to have that documented.
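The conformance rule quoted above can be sketched as a small Python check. The element list is copied from the quoted spec text and the set of URL-valued properties from the MD column of the table earlier in this message; the function name is_conformant is hypothetical, and the check only looks at which element is used, not at whether the value is actually an absolute URL.

```python
# URL property elements, as listed in the quoted microdata spec text.
URL_PROPERTY_ELEMENTS = {"a", "area", "audio", "embed", "iframe", "img",
                         "link", "object", "source", "track", "video"}

# Properties typed purely as URL in the schema.org microdata description.
URL_VALUED = {"url", "contentURL", "embedURL", "image", "maps",
              "significantLinks", "discussionUrl", "publishingPrinciples",
              "thumbnailUrl", "replyToUrl"}


def is_conformant(tag, itemprop):
    """A URL-valued property must be carried by a URL property element."""
    return itemprop not in URL_VALUED or tag in URL_PROPERTY_ELEMENTS


meta_url_ok = is_conformant("meta", "url")   # the MusicRecording example above
link_url_ok = is_conformant("a", "url")
span_name_ok = is_conformant("span", "name")
```

Here meta_url_ok is False, which matches the observation that the meta/@content example does not conform, while the other two cases pass.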
> Thanks, and sorry for the rather long email!
> Cheers,
> Jeni
> P.S. These questions arose from mails by Jayson Lorenzen [4] about RDF generated from schema.org microdata
> [1] http://schema.org/docs/schemaorg.owl
> [2] http://schema.org/docs/full_md.html
> [3] http://dev.w3.org/html5/md/Overview.html#url-property-elements
> [4] http://lists.w3.org/Archives/Public/public-html-data-tf/2011Oct/0197.html
Received on Friday, 4 November 2011 20:45:24 UTC
