Microdata itemid and src / href

I would like to point out something that may trip up web developers who
are new to RDF and are annotating HTML with Microdata. It may be worth
outlining a best practice for it in the guidelines that Jeni said the
group will produce. (On 10/19 Jeni mentioned *Guidelines for publishers
and consumers about how to publish and consume data embedded in HTML*.)

This may be old hat to seasoned RDF developers, but it is bothering me,
and maybe others, and at the least I would like some help.

The issue is with adding *itemprop* to the elements <link>, <a>, and
<img> and using the *href* or *src* as the value of a property that
should be a URL, when the same URL might also be used as the *itemid*
of another object on the page, or even of the object the *itemprop* is
part of. I came across this while trying not to add too much hidden
content, as Google recommends, by using the value that was already on
the page in the <a>'s href attribute.

When *src* or *href* is used, RDF distillers create a relation, and the
resulting RDF (at least in Turtle) can look like an endless loop
waiting to happen, or just an odd relation. Here is an endless loop
example:

A Schema.org/NewsArticle has a copyrightHolder which is a
Schema.org/Organization. That org has an *itemid* which is a URL (as
per the Microdata spec), pointing to the Business Wire home page in
this case, an *image*, and then a *url* which comes from the <a>.


<span itemprop="provider publisher copyrightHolder"
      itemscope="itemscope"
      itemtype="http://schema.org/Organization"
      itemid="http://businesswire.com">
  <meta itemprop="name" content="Business Wire"/>
  <a itemprop="url" href="http://www.businesswire.com">
    <img itemprop="image"
         src="http://www.businesswire.com/images/Powered-by-Business-Wire.gif"
         title="Business Wire is the leading source for full-text breaking news and press releases, multimedia and regulatory filings for companies and groups throughout the world"
         alt="Powered by Business Wire"/>
  </a>
</span>

In an RDF extract we get a copyrightHolder for the article identified
as:

 schema:copyrightHolder <http://www.businesswire.com>;


but then for the Organization, there is a URL property which is a
reference back to Business Wire, the Organization.

<http://businesswire.com> a schema:Organization;
   schema:image <http://www.businesswire.com/images/Powered-by-Business-Wire.gif>;
   schema:name "Business Wire";
   schema:url <http://www.businesswire.com> .


Changing from using <a> to a hidden <meta> tag


<span itemprop="provider publisher copyrightHolder"
      itemscope="itemscope"
      itemtype="http://schema.org/Organization"
      itemid="http://businesswire.com">
  <meta itemprop="name" content="Business Wire"/>
  <meta itemprop="url" content="http://www.businesswire.com"/>
  <a href="http://www.businesswire.com">
    <img itemprop="image"
         src="http://www.businesswire.com/images/Powered-by-Business-Wire.gif"
         title="Business Wire is the leading source for full-text breaking news and press releases, multimedia and regulatory filings for companies and groups throughout the world"
         alt="Powered by Business Wire"/>
  </a>
</span>


produces a URL property that is just a text string. 


<http://businesswire.com> a schema:Organization;
   schema:image <http://www.businesswire.com/images/Powered-by-Business-Wire.gif>;
   schema:name "Business Wire";
   schema:url "http://www.businesswire.com" .
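
Here is a minimal sketch of why the two versions come out differently.
It assumes the Microdata rule that a property on <a>, <link>, or <img>
takes its value from *href*/*src* as an absolute URL, while a property
on <meta> takes the *content* string. The names PropertySketch and
extract are made up for illustration, nested items are skipped, and
this is not how any particular distiller is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Subset of elements whose Microdata property value is taken from a URL
# attribute (assumption: enough for the <a>/<link>/<img> cases here).
URL_ATTRS = {"a": "href", "link": "href", "img": "src"}


class PropertySketch(HTMLParser):
    """Hypothetical helper: records (name, kind, value) for each itemprop."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Skip elements without itemprop, and nested items (itemscope),
        # whose value would be the item itself rather than a URL or string.
        if "itemprop" not in attrs or "itemscope" in attrs:
            return
        if tag in URL_ATTRS:
            raw = attrs.get(URL_ATTRS[tag]) or ""
            kind, value = "url", urljoin(self.base_url, raw)
        elif tag == "meta":
            kind, value = "string", attrs.get("content") or ""
        else:
            return  # text-content properties are out of scope here
        for name in (attrs["itemprop"] or "").split():
            self.found.append((name, kind, value))


def extract(html, base_url="http://example.org/"):
    """Return the (name, kind, value) triples found in an HTML fragment."""
    parser = PropertySketch(base_url)
    parser.feed(html)
    parser.close()
    return parser.found
```

Feeding it the two fragments above should report *url* with kind "url"
for the <a> version and kind "string" for the <meta> version, which
mirrors the two Turtle outputs.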


For properties that require a URL (like the contentURL from
Schema.org), which is correct?

Other examples of this are Schema.org/ImageObject items where the
*itemid* is the same as the contentURL (interestingly, URL is
upper case for this property but camel case for thumbnailUrl :)
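
To make the ImageObject case concrete, here is a rough sketch that
flags URL-valued properties pointing back at their own item. The
function name self_references, the tuple layout, and the example.org
URLs are all assumptions for illustration and do not come from any
distiller or from Schema.org.

```python
def self_references(itemid, properties):
    """Return names of URL-valued properties whose value equals the itemid.

    `properties` is a list of (name, kind, value) tuples, where kind is
    "url" or "string". Comparison is exact string equality; no URL
    normalization is attempted.
    """
    return [name for name, kind, value in properties
            if kind == "url" and value == itemid]


# Hypothetical ImageObject whose itemid doubles as its contentURL.
image_item = {
    "itemid": "http://example.org/photo.jpg",
    "properties": [
        ("contentURL", "url", "http://example.org/photo.jpg"),
        ("thumbnailUrl", "url", "http://example.org/photo-thumb.jpg"),
        ("name", "string", "A photo"),
    ],
}

flagged = self_references(image_item["itemid"], image_item["properties"])
print(flagged)  # ['contentURL']
```

A string-valued contentURL (from a <meta> tag) would not be flagged,
since it is a literal rather than a reference to the item.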

I imagine that I, and other Microdata implementors who are new to RDF,
are just using the *itemid* and/or *href*/*src* wrongly in these cases.
If a guide is produced to help them/me/us, an explanation of how to do
this correctly would be a big help.

Thanks for listening.

Jayson




Jayson Lorenzen
Senior Software Engineer
____________________________ 
B  U  S  I  N  E  S  S       W  I  R  E 
A Berkshire Hathaway Company
 
+1.415.986.4422, ext. 766 
+1.415.956.2609 (fax) 
www.BusinessWire.com
 
Business Wire/San Francisco 
44 Montgomery St. 39th Floor
San Francisco, CA 94104





Received on Friday, 21 October 2011 17:51:52 UTC