
Re: Flattening Microdata

From: Sandhaus, Evan <sandhes@nytimes.com>
Date: Wed, 8 Aug 2012 12:01:13 -0400
To: "LeVan,Ralph" <levan@oclc.org>
CC: Public Vocabs <public-vocabs@w3.org>
Message-ID: <467CE7F1-4887-4A95-BE97-03BC4A416D5D@nytimes.com>
Ralph et al.

I would rather not sidetrack this thread into a discussion about the virtues of hidden/visible metadata.  You raise an important issue to be sure, but I'd like to keep this thread focused on the itemref/id parsing issue.

~Evan


On Aug 8, 2012, at 11:57 AM, LeVan,Ralph wrote:

Evan, could you explain why you want to do this?

My understanding is that this is discouraged behavior.  Search engines don't trust metadata that isn't visible to users.  The library community got very excited about using meta tags years ago and then discovered that they were being ignored.

Could someone else verify my understanding of the meta tag?

Thanks!

Ralph

Ralph LeVan
Research Scientist
OCLC

From: Sandhaus, Evan [mailto:sandhes@nytimes.com]
Sent: Wednesday, August 08, 2012 11:44 AM
To: Public Vocabs
Subject: Flattening Microdata

Hello all!

I'm interested in 'flattening' schema.org object markup into the <head> element using <meta> elements.  In theory one should be able to use the "itemref" and "id" attributes to 'flatten' an object hierarchy into a set of meta tags, but in practice this leads to unexpected results.

For example:

Suppose we have a NewsArticle with the headline 'A Test Headline' that has a creator that is a Person that has the name 'Evan S Sandhaus' and the url 'http://sandha.us'.  Here is an example of how to flatten that out in the <head> using id and itemref:

<html itemid='the_article_id' itemscope itemtype='http://schema.org/NewsArticle'>
  <head>
    <!-- Article properties in global scope -->
    <meta itemprop='headline' content='A Test Headline'/>

    <!-- Author Properties Flattened with itemref and ids -->
    <meta itemprop='creator' itemscope itemtype='http://schema.org/Person' itemid='the_creator_id' itemref='author_name author_url'/>
    <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
    <meta id='author_url' itemprop='url' content='http://sandha.us'/>
  </head>
  <body>
  </body>
</html>

So that's the theory.

In practice, however, both the Rich Snippets Tool and the Python microdata library I'm using locally (http://pypi.python.org/pypi/microdata) insist on adding the creator-specific properties to the scope of both the creator and the NewsArticle.

More concretely - my local tools give me this:
[{
    "id": "the_article_id",
    "properties": {
        "creator": [{
            "id": "the_creator_id",
            "properties": {
                "name": ["Evan S Sandhaus"],
                "url": ["http://sandha.us"]
            },
            "type": "http://schema.org/Person"
        }],
        "headline": ["A Test Headline"],
        "name": ["Evan S Sandhaus"],
        "url": ["http://sandha.us"]
    },
    "type": "http://schema.org/NewsArticle"
}]

And the Rich Snippets tool gives me this:
Item
Type: http://schema.org/newsarticle
headline = A Test Headline
creator = Item( 1 )
name = Evan S Sandhaus
url = http://sandha.us
Item 1
Type: http://schema.org/person
name = Evan S Sandhaus
url = http://sandha.us
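
For what it's worth, the duplication can be reproduced with a small standalone sketch.  It assumes a simplified reading of the microdata property crawl: an item collects every itemprop found among its descendants (not descending into nested itemscopes), plus any element named in its itemref.  The names Element, TreeBuilder and crawl_properties are hypothetical and are not the API of the pypi microdata library; cycle detection and most value types are left out.

```python
import json
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID = {"meta", "link", "br", "img", "input", "hr"}

MARKUP = """
<html itemid='the_article_id' itemscope itemtype='http://schema.org/NewsArticle'>
  <head>
    <meta itemprop='headline' content='A Test Headline'/>
    <meta itemprop='creator' itemscope itemtype='http://schema.org/Person'
          itemid='the_creator_id' itemref='author_name author_url'/>
    <meta id='author_name' itemprop='name' content='Evan S Sandhaus'/>
    <meta id='author_url' itemprop='url' content='http://sandha.us'/>
  </head>
  <body></body>
</html>
"""


class Element:
    """Minimal DOM node: tag name, attributes, children (hypothetical helper)."""

    def __init__(self, tag, attrs, parent):
        self.tag = tag
        self.attrs = dict(attrs)
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Builds a tiny element tree and an id -> element index."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []
        self.by_id = {}

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        element = Element(tag, attrs, parent)
        if self.root is None:
            self.root = element
        if "id" in element.attrs:
            self.by_id[element.attrs["id"]] = element
        if tag not in VOID:
            self.stack.append(element)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()


def crawl_properties(item, by_id):
    """Simplified crawl: the item's children plus its itemref targets."""
    pending = list(item.children)
    for ref in (item.attrs.get("itemref") or "").split():
        if ref in by_id:
            pending.append(by_id[ref])
    props = {}
    while pending:
        element = pending.pop(0)
        # Only descend into elements that do not start a new item.
        if "itemscope" not in element.attrs:
            pending.extend(element.children)
        if "itemprop" in element.attrs:
            if "itemscope" in element.attrs:
                value = crawl_properties(element, by_id)
            else:
                value = element.attrs.get("content")
            props.setdefault(element.attrs["itemprop"], []).append(value)
    return props


builder = TreeBuilder()
builder.feed(MARKUP)
result = crawl_properties(builder.root, builder.by_id)
print(json.dumps(result, indent=2))
```

Under that reading, the two author <meta> elements are reached twice: once as ordinary descendants of the <html> item, and once through the creator's itemref.  That would account for name and url showing up on both items.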

So the question is: is this expected behavior?  If so, is there anything I could do besides this to "flatten" the markup into the <head> element?

Thanks!

~Evan
--
Evan Sandhaus
Lead Architect, Semantic Platforms
The New York Times Company
@kansandhaus
Received on Wednesday, 8 August 2012 16:01:46 GMT
