Re: Microdata itemid and src / href from Jeni Tennison on 2011-10-21 (public-html-data-tf@w3.org from October 2011)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Fri, 21 Oct 2011 20:29:40 +0100
To: Jayson Lorenzen <Jayson.Lorenzen@businesswire.com>
Cc: <public-html-data-tf@w3.org>
Message-Id: <B8C465AE-CCE5-4CA0-8D0D-68037129B6CD@jenitennison.com>

Jayson,

Thanks for writing about this, and particularly for the examples. I'm going to snip them out for brevity...

On 21 Oct 2011, at 18:48, Jayson Lorenzen wrote:
> When using the *src* or *href*, RDF distillers create a
> relation, and the resulting RDF (at least in Turtle) can look like an
> endless loop waiting to happen, or just an odd relation. Here is an
> endless loop example:

In RDF, it is absolutely fine to have a statement like:

  <http://businesswire.com> schema:url <http://businesswire.com> .

It's like an object having a property whose value is a pointer to that same object. The only endless loop would come if an application that traversed the data didn't account for potential loops, which is more to do with the application than the data.
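
As a rough illustration of what "accounting for potential loops" means, here is a minimal Python sketch. The triple list and the reachable() helper are hypothetical, not part of any RDF library; the point is only that a traversal which remembers visited nodes terminates on a self-referencing statement.

```python
# Hypothetical in-memory graph of (subject, predicate, object) triples.
# The single triple is the self-referencing statement from the email.
TRIPLES = [
    ("http://businesswire.com", "schema:url", "http://businesswire.com"),
]


def reachable(start, triples):
    """Return every node reachable from start, visiting each node once."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            # Already visited: skipping it is what stops the self-loop.
            continue
        seen.add(node)
        for subject, _predicate, obj in triples:
            if subject == node:
                stack.append(obj)
    return seen


print(reachable("http://businesswire.com", TRIPLES))
```

Without the seen set, the same walk would push the node back onto the stack forever, which is a property of the traversal rather than of the data.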

> Changing from using <a> , to a hidden <meta> tag 
[snip]
> produces a URL property that is just a text string. 
[snip]
> For properties that require a URL (like the contentURL from
> Schema.org), which is correct?

The microdata spec [1] says:

  "If a property's value, as defined by the property's definition, 
   is an absolute URL, the property must be specified using a URL 
   property element."

The reason for this constraint is that values within the href or src attribute will be resolved into absolute URLs, whereas those that are put in a content attribute or just embedded within the content of the page will not. So if you had (assuming we're on the businesswire.com site):

  <a itemprop="url" href="/">
    <img itemprop="image"
         src="/images/Powered-by-Business-Wire.gif" />
  </a>

then the microdata would include:

  url:   http://www.businesswire.com/
  image: http://www.businesswire.com/images/Powered-by-Business-Wire.gif

whereas if you use a meta element as in:

  <a href="/">
    <meta itemprop="url" content="/" />
    <img itemprop="image"
         src="/images/Powered-by-Business-Wire.gif" />
  </a>

then the properties would be:

  url:   /
  image: http://www.businesswire.com/images/Powered-by-Business-Wire.gif
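
The difference between the two cases can be sketched in Python using the standard library's urljoin. The base URL and the two helper names are assumptions made for illustration; a real microdata parser would take the base from the document's address or its base element. Note that resolving "/" yields the site root with a trailing slash.

```python
from urllib.parse import urljoin

# Assumed address of the page being parsed.
BASE = "http://www.businesswire.com/"


def url_property_value(attr_value, base=BASE):
    """Value taken from href/src: resolved against the page's base URL."""
    return urljoin(base, attr_value)


def meta_content_value(attr_value):
    """Value taken from a meta element's content attribute: kept as written."""
    return attr_value


print("url (href):  ", url_property_value("/"))
print("url (meta):  ", meta_content_value("/"))
print("image (src): ", url_property_value("/images/Powered-by-Business-Wire.gif"))
```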

The RDF generation is another step on top of that. If the value comes from a URL property element, then it creates a reference to a resource, which is how we usually treat URLs in RDF; otherwise it creates a plain literal string.
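
A minimal sketch of that last step, in Python, might look like the following. The to_ntriples() helper, the subject URL and the use of http://schema.org/url as the predicate are all assumptions for illustration; an actual microdata-to-RDF mapping has more rules (blank nodes for items without an itemid, how predicate URIs are derived from the itemtype, and so on).

```python
def to_ntriples(subject, predicate, value, from_url_element):
    """Serialise one property as an N-Triples line.

    Values from URL property elements become resource references;
    anything else becomes a plain literal.
    """
    if from_url_element:
        obj = f"<{value}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj} ."


# Assumed subject and predicate, chosen only to mirror the examples above.
SUBJECT = "http://www.businesswire.com/"
PREDICATE = "http://schema.org/url"

# Value from <a itemprop="url" href="/">, already resolved to an absolute URL.
print(to_ntriples(SUBJECT, PREDICATE, "http://www.businesswire.com/", True))
# Value from <meta itemprop="url" content="/">, left as a string.
print(to_ntriples(SUBJECT, PREDICATE, "/", False))
```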

So, assuming that schema.org means the url property to hold an absolute URL, the right thing to do is to use the href attribute, not a meta element.

> Other examples of this are with Schem.org/ImageObject s where the
> *itemid* is the same as the contentURL (interestingly URL is
> upper case for this property but camel case for thumbnailUrl :)

I think that these are things to take up on the public-vocabs@w3.org mailing list. I don't know what the relationship between the microdata itemid and the schema.org url or contentURL properties is supposed to be.

> I imagine I, or other new to RDF implementors of Microdata, just use
> the *itemid* and or *href*/*src* wrongly for  these cases,
> but if a guide is produced to help them/me/us, and it had an explanation
> of how to do this correctly, it would be a big help. 

Does the above help? Would you like to add your example to the wiki, perhaps at [2]?

It would also be great to hear about how you're using microdata, and particularly why you're looking at the RDF that's extracted from it.

Thanks,

Jeni

[1] http://dev.w3.org/html5/md/Overview.html#url-property-elements
[2] http://www.w3.org/wiki/Mapping_Microdata_to_RDF
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 21 October 2011 19:29:59 UTC