Microdata Issues [was Microdata design philosophies]

From: Martin McEvoy <martin@weborganics.co.uk>
Date: Sat, 17 Oct 2009 23:52:56 +0100
Message-ID: <4ADA4AC8.5060403@weborganics.co.uk>
To: "Tab Atkins Jr." <jackalmage@gmail.com>
CC: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Ian Hickson <ian@hixie.ch>, public-html@w3.org

Martin McEvoy wrote:
> Tab Atkins Jr. wrote:
>> On Fri, Oct 16, 2009 at 4:19 PM, Martin McEvoy 
>> <martin@weborganics.co.uk> wrote:
>>  
>>> look at this example:
>>>
>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-basic-syntax 
>>>
>>>
>>> <div itemscope id="amanda"><itemref refid="a"><itemref refid="b"></div>
>>> <p id="a">Name: <span itemprop="name">Amanda</span></p>
>>> <div id="b" itemprop="band" itemscope id="jazzband"><itemref
>>> refid="c"></div>
>>> <div id="c">
>>> <p>Band: <span itemprop="name">Jazz Band</span></p>
>>> <p>Size: <span itemprop="size">12</span> players</p>
>>> </div>
>>>
>>>
>>> What is the above example trying to attempt?
>>>     
>>
>> It's marking up someone's participation in some band, apparently.
>>   
>
> Really? If you say so...
>>  
>>> What does itemscope mean?
>>>     
>>
>> Have you read the Microdata section?  
>
> Of course I have...
>
>> @itemscope says "This chunk of
>> html defines a chunk of microdata."  It scopes any children of the
>> element to be part of that parent item (rather than being just random
>> unconnected bits of data).
>>   
>
> And you want me to tell that to my students, or anyone else for that
> matter?
>
>>  
>>> look at those funny little bits of mark up <itemref refid="a"><itemref
>>> refid="b">, do itemref and refid confuse you? again what do they mean?
>>>     
>>
>> Again, have you read the Microdata section?  
>
> Again, yes I have...
>> <itemref> allows you to
>> include data from elements that aren't children of the @itemscope.
>>   
>
> Kind of like the include pattern in microformats, would you say?
>
>>> Look at every bit of content, for example
>>> <span itemprop="size">12</span>: what does size mean, or band, or
>>> any of the attribute contents?
>>> What is a newcomer to HTML or the semantic web going to make of all
>>> that?
>>> Does the above seem a little much just to mark up around 18
>>> characters of data?
>>> Do you think a search engine will understand the above example,
>>> knowing that they can't reason like humans?
>>>     
>>
>> It's some example vocabulary used to illustrate the principles.
>>   
>
> An example that may get copied and pasted around the internet...
>> Assume, for a moment, that a similar vocabulary existed in RDF, and
>> the example was instead marked up in RDFa.
>>
>> What is a newcomer to HTML or the semantic web going to make of all
>> that RDFa?
>> Doesn't the RDFa seem a bit much just to mark up around 18 characters
>> of data?
>> Do you think a search engine would understand the RDFa, knowing that
>> they can't reason like humans?
>>   
>
> Well at least you have a chance with either microformats or RDFa.
>
> You still didn't answer my question...
>> All of these concerns you have are *exactly* applicable to RDFa, or
>> really *any* method of marking up metadata in a page (such as CRDF,
>> GRDDL, etc.).
>>   
>
> Thank you for that last paragraph, I'm glad you worked that one out.
> Microdata doesn't actually solve any problems, does it?

Tab, after answering my questions with other questions, you could have
made this point...

"The examples in the previous section show how information could be 
marked up on a page that doesn't expect its microdata to be re-used"
http://dev.w3.org/html5/spec/Overview.html#typed-items

Pardon?

You mean I have to go through all that (see the example at the top of
this email), thinking I am embedding some real semantics using some
pretty fancy attributes and elements, for markup that really has no
semantic value outside my own website?

Why can't I just use good semantic class names?

Here is another example of microdata "falling over" quite badly, again from
http://dev.w3.org/html5/spec/Overview.html#typed-items:

<section itemscope itemtype="http://example.org/animals#cat">
 <h1 itemprop="name">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic shorthair, with a
 fluffy black fur with white paws and belly.</p>
 <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>

Apart from the *obvious* indirection mechanism, the spec says this about
the above:

"In this example the "http://example.org/animals#cat" item has three
properties, a "name" ("Hedral"), a "desc" ("Hedral is..."), and an
"img" ("hedral.jpeg")."

OK, we will take that as said. How about if I add some more microdata to
that, just some plain vanilla stuff that I want to use for my website
and don't expect to be re-used elsewhere...

<section itemscope itemtype="http://example.org/animals#cat">
 <h1 itemprop="name title">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic shorthair, with a
 fluffy black fur with white paws and belly.</p>
 <img itemprop="img my-cat" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>

Good, eh? Notice there is now a "title", and my picture is called "my-cat".

According to the description of the original markup, my item now has
*five* properties, but only three of them are part of the vocabulary
defined at http://example.org/animals#cat. How does a parser tell which
property is part of my cat vocabulary and which is not? It's not clear, is it?
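
To make the question concrete, here is a minimal Python sketch of what a
naive consumer might do with the second Hedral example. It is not taken
from the spec: `ItempropCollector` and `CAT_VOCABULARY` are hypothetical
names of my own, and the set of "known" properties is an assumption that
has to be supplied by hand, since nothing in the markup itself separates
the vocabulary's names from my own.

```python
from html.parser import HTMLParser

# Assumption: the property names the cat vocabulary is taken to define.
# This list comes from outside the markup; the markup does not carry it.
CAT_VOCABULARY = {"name", "desc", "img"}

MARKUP = """
<section itemscope itemtype="http://example.org/animals#cat">
 <h1 itemprop="name title">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic shorthair, with a
 fluffy black fur with white paws and belly.</p>
 <img itemprop="img my-cat" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>
"""


class ItempropCollector(HTMLParser):
    """Collects every space-separated token found in itemprop attributes."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        for attr, value in attrs:
            # Valueless attributes such as itemscope arrive with value None.
            if attr == "itemprop" and value:
                self.names.extend(value.split())


collector = ItempropCollector()
collector.feed(MARKUP)

# Split the collected names using the hand-supplied vocabulary list.
in_vocabulary = [n for n in collector.names if n in CAT_VOCABULARY]
unknown = [n for n in collector.names if n not in CAT_VOCABULARY]

print("all properties:", collector.names)
print("in vocabulary: ", in_vocabulary)
print("unknown:       ", unknown)
```

In this sketch all five names come out of the markup looking exactly
alike; "title" and "my-cat" are only set apart because the script was
told in advance which three names belong to the cat vocabulary.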

Thanks.

-- 
Martin McEvoy

http://weborganics.co.uk/

"You may find it hard to swallow the notion that anything as large and apparently inanimate as the Earth is alive."
Dr. James Lovelock, The Ages of Gaia
Received on Saturday, 17 October 2009 22:53:19 GMT
