Re: Microdata Issues from Martin McEvoy on 2009-10-22 (public-html@w3.org from October 2009)

From: Martin McEvoy <martin@weborganics.co.uk>
Date: Thu, 22 Oct 2009 02:17:51 +0100
To: Philip Jägenstedt <philipj@opera.com>
CC: "Tab Atkins Jr." <jackalmage@gmail.com>, public-html@w3.org
Message-ID: <4ADFB2BF.80105@weborganics.co.uk>
Hello All

Philip Jägenstedt wrote:
>  I'll explain in a few paragraphs when I think the microdata DOM API 
> might be useful.
>
> First, assume that you've marked up your HTML with microdata for the 
> benefit of external scrapers, to complement or replace an XML or JSON 
> webservice. If you want to make some visualizations of that data using 
> JavaScript (e.g. by generating a SVG or <canvas> graph), scraping the 
> raw data from the DOM will be very easy using document.getItems(). If 
> it isn't easy, then it won't be for the external parsers either and 
> you have a more serious problem to worry about. Changing property 
> values (e.g. in response to interaction with said graph) can also be 
> done via the same API from which you read it.
>
> If you use "semantic" class names, then you have double the work to 
> mark it up and have to write specialized code to scrape the data from 
> the DOM. If you want to read and write data via a consistent interface 
> you'll need to write wrapper objects using getters and setters (or 
> getFoo()/setFoo()) that hide the fact that some properties are the 
> child text nodes, while others are in different attributes, as is the 
> case with e.g. form elements.
>
> Of course everything above can be done without the microdata DOM API, 
> but having external and on-site scrapers use the same data rather than 
> duplicating it makes perfect sense.

Thank you Philip for giving me a few very good reasons why Microdata may 
be useful  ;)

........


Tab Atkins Jr. wrote:
> ... I am attempting to answer your questions as best as I can.
>  Unfortunately most of what you've said before demonstrates either a
> severe lack of understanding, or a very unfortunate language gap
> between us.  
...
> The rest of what you've said has been unproductive
> complaints.
>   
If it seemed like that I'm sorry, that was not my intention.

> If you still have problems with Microdata, can you restate them so
> that perhaps I can understand you better?
>   

Of course, Tab. In hindsight, I guess the subject of this conversation, 
"Microdata Issues", should have been "Microdata Feedback", as the former 
was a bit inflammatory.

I would like to suggest that Microdata use short, meaningful attribute 
names, i.e. remove the "item" prefix where possible.
My reasons are quite simple: most other attributes in HTML5 use short 
descriptive names (examples are "href", "id", "data", "src" and "class"), 
so shorter microdata names would be easier to learn, take less time to 
type, and result in smaller HTML.
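
As a rough illustration of the size argument, the sketch below counts the 
bytes saved on one small item. The short names "scope", "type" and "prop" 
are hypothetical, invented here only for the comparison, and the markup 
itself is a made-up example.

```python
# The long names are the microdata attributes; the short names are
# hypothetical and exist only for this comparison.
LONG = (
    '<div itemscope itemtype="http://example.org/animals#cat">'
    '<span itemprop="name">Hedral</span></div>'
)
RENAMES = {"itemscope": "scope", "itemtype": "type", "itemprop": "prop"}


def shorten(markup: str) -> str:
    """Replace each long attribute name with its hypothetical short form."""
    for long_name, short_name in RENAMES.items():
        markup = markup.replace(long_name, short_name)
    return markup


short = shorten(LONG)
# Each rename drops the four characters of "item", once per occurrence.
saved = len(LONG.encode("utf-8")) - len(short.encode("utf-8"))
print(f"{saved} bytes saved on one item")
```

The saving is four bytes per attribute occurrence, so it only adds up on 
pages that carry many items.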

The typed items section[1] says "An item can only have one type." It 
seems a little unclear to me what I should do if I want to combine 
two types. For example, I am trying to mark up my pet using both an animal 
ontology located at http://example.org/animals#cat and a pet ontology 
located at http://example.com/pet#cat. Can I define itemtype twice, 
and how do I avoid name clashes?

[1] http://dev.w3.org/html5/spec/Overview.html#typed-items

I have a feeling I know what the answer to that is: you can only define 
*one* @itemtype="", and the rest of the time I should select names as 
described in "Selecting names when defining vocabularies"[2]. It 
suggests I should use URLs because this avoids conflicts with other 
vocabularies. I am sorry, I absolutely cannot do that. It goes against one 
or two of my principles of "being easy to read" and "using short 
meaningful names". It seems to me that Microdata lacks a scoping 
or prefixing mechanism of some kind. The data-* prefix seems 
ideal for this, but I believe it can't be used by external scrapers. The 
value of making all that potential data available to external sources 
seems to far outweigh the value of withholding it: the data will already 
exist out in the wild, so why not make it useful?

[2] 
http://dev.w3.org/html5/spec/Overview.html#selecting-names-when-defining-vocabularies
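
To make the trade-off concrete, here is a minimal sketch of one item that 
has a single itemtype and borrows a property from a second vocabulary by 
spelling its name as a full URL. The two vocabulary URLs are the ones from 
the example above; the "name" properties, the values and the PropCollector 
class are assumptions made for illustration, and the collector is not a 
full microdata parser (it ignores nesting and attribute-held values).

```python
from html.parser import HTMLParser

# One item, one itemtype. The second "name" comes from the pet
# vocabulary, so it is written as a full URL to avoid a clash.
MARKUP = """
<div itemscope itemtype="http://example.org/animals#cat">
  <span itemprop="name">Hedral</span>
  <span itemprop="http://example.com/pet#name">Heddy</span>
</div>
"""


class PropCollector(HTMLParser):
    """Collect itemprop name -> text content for a flat, single item."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None  # itemprop name awaiting its text

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name is not None:
            self._current = name

    def handle_data(self, data):
        if self._current is not None:
            self.props[self._current] = data.strip()
            self._current = None


collector = PropCollector()
collector.feed(MARKUP)
props = collector.props
for name, value in props.items():
    print(name, "=", value)
```

The two properties no longer collide, but the second name is exactly the 
kind of long, hard-to-read value objected to above.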

As you can see, my problems are few and perhaps minor, not particularly 
important technical problems, but ones that would cost little to fix and 
would, I think, make microdata more useful in the end.

The rest of my "issues", i.e. the comparisons to RDFa and microformats, 
don't really matter; the spirit seems to be the same. Thanks, by the way, 
Ian, for removing the reverse DNS idea; it was a good move.

You are of course welcome to ignore everything I said; I won't mind ;)

Thanks

-- 
Martin McEvoy

http://weborganics.co.uk/

"You may find it hard to swallow the notion that anything as large and apparently inanimate as the Earth is alive."
Dr. James Lovelock, The Ages of Gaia
Received on Thursday, 22 October 2009 01:18:12 UTC