Re: Flattening Microdata

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Thu, 9 Aug 2012 01:48:33 -0400
To: Aaron Bradley <aaranged@yahoo.com>
CC: "Sandhaus, Evan" <sandhes@nytimes.com>, Public Vocabs <public-vocabs@w3.org>
Message-ID: <12E01AB8-CC5B-41DE-903B-288886C842D3@kellogg-assoc.com>
On Aug 8, 2012, at 10:40 AM, "Aaron Bradley" <aaranged@yahoo.com> wrote:

> I believe this is expected behavior because, indeed, you restrict properties to those allowed by the declared itemtype.
> 
> 
> <meta> declarations are a little problematic in this regard insofar as they
> are self-closing, and so don't permit an itemscope to be declared beyond
> that <meta> content.
> 
> In <body> this isn't problematic because <span> and <div> tags can be used to define itemscope.
> 
> This...
> <span itemprop='creator' itemscope itemtype='http://schema.org/Person' itemid='the_creator_id'>
>     <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>    
>     <meta id='author_url' itemprop='url' content='http://sandha.us'/>    
> </span>
> </head>
> <body>
> </body>
> </html>
> 
> ... results in this RSST result - is this the desired behavior Evan?
> 
> Item
> 
> Type: http://schema.org/newsarticle
>    headline = A Test Headline
>    creator = Item( 1 )
> 
> Item 1
> 
> Type: http://schema.org/person
>    name = Evan S Sandhaus
>    url = http://sandha.us
> 
> I'm not certain <div> and <span> are permitted within
> <head> (I think not), though no validation errors occur when they
> are used.

The HTML parser I use for my microdata parser will not handle div
or span in head, as they are illegal there. Really, only script can have content in head afaik, and that's not too useful.

If you really want invisible markup in head, I'd consider Turtle in a script tag.
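
A minimal sketch of that idea, assuming the page is generated by a script: the same article/creator data is written as Turtle and wrapped in a script element. The "text/turtle" media type is the registered one for Turtle; the fragment IRIs (<#the_article_id>, <#the_creator_id>) and the helper name head_with_turtle are illustrative assumptions, not something from this thread.

```python
# Sketch only: emit a <head> that carries the description as Turtle in a
# <script> element. Subject IRIs and the function name are hypothetical.

TURTLE = """\
@prefix schema: <http://schema.org/> .

<#the_article_id> a schema:NewsArticle ;
    schema:headline "A Test Headline" ;
    schema:creator <#the_creator_id> .

<#the_creator_id> a schema:Person ;
    schema:name "Evan S Sandhaus" ;
    schema:url <http://sandha.us> .
"""


def head_with_turtle(turtle: str) -> str:
    """Return a <head> element whose only content is a Turtle script block."""
    # A literal "</script" inside the payload would end the element early.
    if "</script" in turtle.lower():
        raise ValueError("Turtle payload must not contain '</script'")
    return (
        "<head>\n"
        '<script type="text/turtle">\n'
        f"{turtle}"
        "</script>\n"
        "</head>"
    )


if __name__ == "__main__":
    print(head_with_turtle(TURTLE))
```

Browsers ignore script elements whose type they do not recognise, so nothing is rendered or executed; whether a given consumer actually reads embedded Turtle is a separate question.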

Gregg

> A note from personal experience (re NewsArticle)
> is that I've found itemscope declarations in the <html> tag (and
> <body>) to be limiting, as you're out of luck if you need/want to
> mark up a property in the content that's not permitted for that
> itemtype.  At least I've found myself rewriting a lot of code because of
> an <html itemscope itemtype="WebPage"> declaration that prevented
> me from marking up properties that aren't within the scope of WebPage (e.g.
> Event).
> 
> (You get two of these Evan as I forgot to reply all:)
> ________________________________
>> From: "Sandhaus, Evan" <sandhes@nytimes.com>
>> To: Public Vocabs <public-vocabs@w3.org> 
>> Sent: Wednesday, August 8, 2012 8:43:42 AM
>> Subject: Flattening Microdata
>> 
>> 
>> Hello all!
>> 
>> 
>> I'm interested in 'flattening' schema.org object markup into the
>> <head> element using <meta> elements.  In theory one should
>> be able to use the "itemref" and "id" attributes to 'flatten' an object
>> hierarchy into a set of meta tags, but in practice this leads to
>> unexpected results.
>> 
>> 
>> For example:
>> 
>> 
>> Suppose we have a NewsArticle with the headline 'A Test Headline' that has
>> a creator that is a Person that has the name 'Evan S Sandhaus' and
>> the url 'http://sandha.us'.  Here is an example of how to flatten that out in the <head> using id and itemref:
>> 
>> 
>> <html itemid='the_article_id' itemscope itemtype='http://schema.org/NewsArticle'>
>> <head>
>> <!-- Article properties in global scope -->
>> <meta itemprop='headline' content='A Test Headline'/>
>> 
>> 
>> <!-- Author Properties Flattened with itemref and ids -->
>> <meta itemprop='creator' itemscope itemtype='http://schema.org/Person' itemid='the_creator_id' itemref='author_name author_url'/>
>> <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
>> <meta id='author_url' itemprop='url' content='http://sandha.us'/>
>> </head>
>> <body>
>> </body>
>> </html>
>> 
>> 
>> So that's the theory.
>> 
>> 
>> In practice, however, both the Rich Snippets Tool and the Python microdata library I'm using locally (http://pypi.python.org/pypi/microdata) insist on adding the creator-specific properties to the scope of both the creator and the NewsArticle.
>> 
>> 
>> More concretely - my local tools give me this: 
>> [{
>>     "id": "the_article_id",
>>     "properties": {
>>         "creator": [{
>>             "id": "the_creator_id",
>>             "properties": {
>>                 "name": ["Evan S Sandhaus"],
>>                 "url": ["http://sandha.us"]
>>             },
>>             "type": "http://schema.org/Person"
>>         }],
>>         "headline": ["A Test Headline"],
>>         "name": ["Evan S Sandhaus"],
>>         "url": ["http://sandha.us"]
>>     },
>>     "type": "http://schema.org/NewsItem"
>> }]
>> 
>> 
>> And the Rich Snippets tool gives me this:
>> Item 
>> 
>> Type: http://schema.org/newsarticle
>> headline = A Test Headline 
>> creator = Item( 1 ) 
>> name = Evan S Sandhaus 
>> url = http://sandha.us
>> 
>> Item 1 
>> 
>> Type: http://schema.org/person
>> name = Evan S Sandhaus 
>> url = http://sandha.us
>> 
>> 
>> So the question is: is this expected behavior?  If so, is there anything I
>> could do besides this to "flatten" the markup into the <head>
>> element?
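
The duplication reported above is consistent with how the microdata property crawl is specified: an item's properties are collected from its descendants (stopping at nested itemscope elements) plus anything named in its itemref. The two <meta id=...> elements are descendants of <html itemscope> and are also referenced by the creator's itemref, so they are picked up twice. Below is a simplified sketch of that crawl, not the code of either tool mentioned; the Node class and function names are hypothetical, and tree-order sorting and itemref loop detection are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Tiny stand-in for a DOM element (hypothetical, for illustration)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def find_by_id(root, wanted):
    """Depth-first search for the element whose id attribute equals wanted."""
    if root.attrs.get("id") == wanted:
        return root
    for child in root.children:
        hit = find_by_id(child, wanted)
        if hit is not None:
            return hit
    return None


def item_properties(document, item):
    """Collect the properties of one item, roughly following the spec's crawl."""
    # Start from the item's children plus every element its itemref names.
    pending = list(item.children)
    for ref in item.attrs.get("itemref", "").split():
        target = find_by_id(document, ref)
        if target is not None:
            pending.append(target)

    seen, props = set(), {}
    while pending:
        node = pending.pop(0)
        if id(node) in seen:
            continue
        seen.add(id(node))
        # Do not descend into nested items; their contents belong to them.
        if "itemscope" not in node.attrs:
            pending.extend(node.children)
        if "itemprop" in node.attrs:
            if "itemscope" in node.attrs:
                value = item_properties(document, node)
            else:
                value = node.attrs.get("content", "")
            props.setdefault(node.attrs["itemprop"], []).append(value)
    return props


# The flattened markup from the message above, rebuilt as a tree.
head = Node("head", children=[
    Node("meta", {"itemprop": "headline", "content": "A Test Headline"}),
    Node("meta", {"itemprop": "creator", "itemscope": "",
                  "itemtype": "http://schema.org/Person",
                  "itemid": "the_creator_id",
                  "itemref": "author_name author_url"}),
    Node("meta", {"id": "author_name", "itemprop": "name",
                  "content": "Evan S Sandhaus"}),
    Node("meta", {"id": "author_url", "itemprop": "url",
                  "content": "http://sandha.us"}),
])
html = Node("html", {"itemscope": "",
                     "itemtype": "http://schema.org/NewsArticle",
                     "itemid": "the_article_id"}, [head, Node("body")])

article = item_properties(html, html)
```

Here article ends up with headline, creator, name and url, matching the JSON above. For name and url to reach only the Person, their elements would have to sit outside every itemscope subtree, which is not possible while <html> itself carries the itemscope.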
>> 
>> 
>> Thanks!
>> 
>> 
>> ~Evan
>> --
>> Evan Sandhaus
>> Lead Architect, Semantic Platforms
>> The New York Times Company
>> @kansandhaus
>> 
>>   
> 
> 
Received on Thursday, 9 August 2012 05:49:15 GMT
