
Re: Flattening Microdata

From: Will Norris <will@willnorris.com>
Date: Mon, 13 Aug 2012 08:12:08 -0700
Message-ID: <CAJqAn3zK=AEMpraGKq2_KR27xzQ=3v_h3haqfbukaxvuhCuAZQ@mail.gmail.com>
To: Kevin Marks <kevinmarks@gmail.com>
Cc: Gregg Kellogg <gregg@greggkellogg.net>, Aaron Bradley <aaranged@yahoo.com>, Public Vocabs <public-vocabs@w3.org>, "Sandhaus, Evan" <sandhes@nytimes.com>
(I'm pretty sure the idea is to have it parseable by standard microdata
libraries)

On Thu, Aug 9, 2012 at 11:37 AM, Kevin Marks <kevinmarks@gmail.com> wrote:

> Why not declare the data as JSON in a script tag if you want invisible
> data in the head?
> Then you can be DRY compliant by using it client side too.
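
A rough sketch of that idea, assuming a script element of type
application/json with a hypothetical id "article-data" and an ad hoc JSON
shape (neither comes from this thread). It only shows that the same blob can
be read back with the Python standard library; a standard microdata parser
would not see this data.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: the JSON shape and the script id are illustrative only.
PAGE = """<html>
<head>
<script type="application/json" id="article-data">
{"type": "http://schema.org/NewsArticle",
 "headline": "A Test Headline",
 "creator": {"type": "http://schema.org/Person",
             "name": "Evan S Sandhaus",
             "url": "http://sandha.us"}}
</script>
</head>
<body></body>
</html>"""


class JsonScriptExtractor(HTMLParser):
    """Collects the text content of <script type="application/json">."""

    def __init__(self):
        super().__init__()
        self.in_json = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self.in_json = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_json = False

    def handle_data(self, data):
        # Script content is delivered as raw text, so it can be joined as-is.
        if self.in_json:
            self.chunks.append(data)


extractor = JsonScriptExtractor()
extractor.feed(PAGE)
article_data = json.loads("".join(extractor.chunks))
print(article_data["creator"]["name"])
```
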
> On Aug 8, 2012 10:49 PM, "Gregg Kellogg" <gregg@greggkellogg.net> wrote:
>
>> On Aug 8, 2012, at 10:40 AM, "Aaron Bradley" <aaranged@yahoo.com> wrote:
>>
>> > I believe this is expected behavior because, indeed, you restrict
>> > properties to those allowed by the declared itemtype.
>> >
>> >
>> > <meta> declarations are a little problematic in this regard insofar as
>> > they are self-closing, and so an itemscope declared on a <meta> cannot
>> > extend beyond that <meta> element.
>> >
>> > In <body> this isn't problematic because <span> and <div> tags can be
>> > used to define itemscope.
>> >
>> > This...
>> > <span itemprop='creator' itemscope itemtype='http://schema.org/Person'
>> >     itemid='the_creator_id'>
>> >     <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
>> >     <meta id='author_url' itemprop='url' content='http://sandha.us'/>
>> > </span>
>> > </head>
>> > <body>
>> > </body>
>> > </html>
>> >
>> > ... results in this RSST result - is this the desired behavior Evan?
>> >
>> > Item
>> >
>> > Type: http://schema.org/newsarticle
>> > headline = A Test Headline
>> > creator = Item( 1 )
>> >
>> > Item 1
>> >
>> > Type: http://schema.org/person
>> > name = Evan S Sandhaus
>> > url = http://sandha.us
>> >
>> > That said, I'm not certain <div> and <span> are permitted within
>> > <head> (I think not), although no validation errors occur when they
>> > are used.
>>
>> The HTML parser I use for my microdata parser will not handle div
>> or span in head, as they are illegal. Really, only script can have
>> content in head afaik, and that's not too useful.
>>
>> If you really want invisible markup in head, I'd consider turtle in a
>> script tag.
>>
>> Gregg
>>
>> > A note from personal experience (re NewsArticle) is that I've found
>> > itemscope declarations in the <html> tag (and <body>) to be limiting,
>> > as you're out of luck if you need/want to mark up a property in the
>> > content that's not permitted for that itemtype.  At least I've found
>> > myself rewriting a lot of code because of an
>> > <html itemscope itemtype="WebPage"> declaration that prevented me from
>> > marking up properties that aren't within the scope of WebPage (e.g.
>> > Event).
>> >
>> > (You get two of these Evan as I forgot to reply all:)
>> > ________________________________
>> >> From: "Sandhaus, Evan" <sandhes@nytimes.com>
>> >> To: Public Vocabs <public-vocabs@w3.org>
>> >> Sent: Wednesday, August 8, 2012 8:43:42 AM
>> >> Subject: Flattening Microdata
>> >>
>> >>
>> >> Hello all!
>> >>
>> >>
>> >> I'm interested in 'flattening' schema.org object markup into the
>> >> <head> element using <meta> elements.  In theory one should be able
>> >> to use the "itemref" and "id" attributes to 'flatten' an object
>> >> hierarchy into a set of meta tags - but in practice this leads to
>> >> unexpected results.
>> >>
>> >>
>> >> For example:
>> >>
>> >>
>> >> Suppose we have a NewsArticle with the headline 'A Test Headline' that
>> >> has a creator that is a Person that has the name 'Evan S Sandhaus' and
>> >> the url 'http://sandha.us'.  Here is an example of how to flatten
>> >> that out in the <head> using id and itemref:
>> >>
>> >>
>> >> <html itemid='the_article_id' itemscope itemtype='http://schema.org/NewsArticle'>
>> >> <head>
>> >> <!-- Article properties in global scope -->
>> >> <meta itemprop='headline' content='A Test Headline'/>
>> >>
>> >>
>> >> <!-- Author Properties Flattened with itemref and ids -->
>> >> <meta itemprop='creator' itemscope itemtype='http://schema.org/Person'
>> >>     itemid='the_creator_id' itemref='author_name author_url'/>
>> >> <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
>> >> <meta id='author_url' itemprop='url' content='http://sandha.us'/>
>> >> </head>
>> >> <body>
>> >> </body>
>> >> </html>
>> >>
>> >>
>> >> So that's the theory.
>> >>
>> >>
>> >> In practice, however, both the Rich Snippets Tool and the Python
>> >> microdata library I'm using locally
>> >> (http://pypi.python.org/pypi/microdata) insist on adding the
>> >> creator-specific properties to the scope of both the creator and the
>> >> NewsItem.
>> >>
>> >>
>> >> More concretely - my local tools give me this:
>> >> [{
>> >>     "id": "the_article_id",
>> >>     "properties": {
>> >>         "creator": [{
>> >>             "id": "the_creator_id",
>> >>             "properties": {
>> >>                 "name": ["Evan S Sandhaus"],
>> >>                 "url": ["http://sandha.us"]
>> >>             },
>> >>             "type": "http://schema.org/Person"
>> >>         }],
>> >>         "headline": ["A Test Headline"],
>> >>         "name": ["Evan S Sandhaus"],
>> >>         "url": ["http://sandha.us"]
>> >>     },
>> >>     "type": "http://schema.org/NewsItem"
>> >> }]
>> >>
>> >>
>> >> And the Rich Snippets tool gives me this:
>> >> Item
>> >>
>> >> Type: http://schema.org/newsarticle
>> >> headline = A Test Headline
>> >> creator = Item( 1 )
>> >> name = Evan S Sandhaus
>> >> url = http://sandha.us
>> >>
>> >> Item 1
>> >>
>> >> Type: http://schema.org/person
>> >> name = Evan S Sandhaus
>> >> url = http://sandha.us
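
A minimal sketch of why both tools report name and url twice, assuming the
simplified crawl below approximates the microdata rule for finding an item's
properties: every descendant of the itemscope element is visited until a
nested itemscope is reached, and itemref targets are added on top. The
<meta> elements for name and url are descendants of <html>, so they land on
the article, and itemref then attaches them to the creator as well. Node,
TreeBuilder, item_properties and to_item are hypothetical helpers written
for this illustration, not the API of the pypi microdata package, and the
sketch skips cycle detection and HTML5 error recovery.

```python
from html.parser import HTMLParser

VOID = {"meta", "link", "br", "img", "input", "hr"}  # elements with no children


class Node:
    def __init__(self, tag, attrs):
        self.tag, self.attrs, self.children = tag, dict(attrs), []


class TreeBuilder(HTMLParser):
    """Builds a bare-bones element tree and an id index."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.stack = [self.root]
        self.by_id = {}

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if node.attrs.get("id"):
            self.by_id[node.attrs["id"]] = node
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()


def item_properties(root, by_id):
    """Collect properties from the item's descendants plus itemref targets."""
    pending = list(root.children)
    for ref in (root.attrs.get("itemref") or "").split():
        if ref in by_id:
            pending.append(by_id[ref])
    props = {}
    while pending:
        el = pending.pop(0)
        if "itemscope" not in el.attrs:
            pending.extend(el.children)  # keep descending until a nested item
        if "itemprop" in el.attrs:
            if "itemscope" in el.attrs:
                value = to_item(el, by_id)
            else:
                value = el.attrs.get("content") or ""
            for name in (el.attrs["itemprop"] or "").split():
                props.setdefault(name, []).append(value)
    return props


def to_item(el, by_id):
    return {"type": el.attrs.get("itemtype"),
            "properties": item_properties(el, by_id)}


HTML = """<html itemid='the_article_id' itemscope itemtype='http://schema.org/NewsArticle'>
<head>
<meta itemprop='headline' content='A Test Headline'/>
<meta itemprop='creator' itemscope itemtype='http://schema.org/Person'
      itemid='the_creator_id' itemref='author_name author_url'/>
<meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
<meta id='author_url' itemprop='url' content='http://sandha.us'/>
</head>
<body>
</body>
</html>"""

builder = TreeBuilder()
builder.feed(HTML)
article = to_item(builder.root.children[0], builder.by_id)
print(sorted(article["properties"]))
```

Under these assumptions the article ends up with headline, creator, name and
url, while the nested creator item carries name and url too, which matches
the shape of both outputs quoted above.
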
>> >>
>> >>
>> >> So the question is: is this expected behavior?  If so, is there
>> >> anything I could do besides this to "flatten" the markup into the
>> >> <head> element?
>> >>
>> >>
>> >> Thanks!
>> >>
>> >>
>> >> ~Evan
>> >> --
>> >> Evan Sandhaus
>> >> Lead Architect, Semantic Platforms
>> >> The New York Times Company
>> >> @kansandhaus
>> >>
>> >>
>> >
>> >
>>
>>
Received on Monday, 13 August 2012 15:12:58 GMT
