Re: Some comments re: microformats [OK?]

From: Dan Connolly <connolly@w3.org>
Date: Wed, 16 May 2007 12:24:11 -0500
To: Ryan King <ryan@technorati.com>
Cc: public-grddl-comments@w3.org
Message-Id: <1179336251.15441.23.camel@pav.dm93.org>

On Thu, 2007-05-10 at 16:59 -0600, Ryan King wrote:
> Prompted by Harry's message to microformats-discuss[1], I'd like to make  
> a few comments. Part of the group's charter calls for  
> interoperability with microformats. As an active member of the  
> community around microformats.org, I'd like to review the  
> specification from that perspective.

Thanks for the review and comments.

Please let us know whether you're satisfied by this response...

> Here are a few issues/comments I'd like to make:
> 
> 1. tagsoup? html?
> 
> The spec describes how to apply a transformation from "Valid XHTML",  
> but fails to define any way to deal with other content on the web.  
> Given that the majority of the web is something other than "Valid  
> XHTML" [2], this spec doesn't seem to be very useful on the Web.
> 
> There also doesn't appear to be any normative way to deal with non  
> XML HTML (like HTML 4, for example).
> 
> Unfortunately, this appears to be out of scope for the group's  
> charter[3]:
> 
> > It binds XML documents, especially XHTML documents, XHTML profiles  
> > and XML namespace documents
> 
> and there's no mention of a requirement to work with existing  
> content on the web, so I'm not sure there's anything that can be done  
> at this stage.

Indeed, the applicability of GRDDL is, strictly speaking, limited
to well-formed XML. As I'm sure you know, there are various
not-yet-standard ways to parse tag soup as XML. While we don't
specify the details of how to use those with GRDDL, we do touch
on it in our use cases document...
  http://www.w3.org/TR/grddl-scenarios/#html_tidy_use_case
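
To illustrate the well-formedness constraint, here is a minimal sketch
(not part of the spec) of how an agent might decide whether a document
can be handed to a GRDDL transformation as-is or needs a tag-soup
repair step first. The function name and the sample documents are
assumptions made up for this example; only the Python standard library
XML parser is used.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# A small well-formed XHTML document (illustrative only).
xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>hi</p></body></html>"
)

# Typical tag soup: <p> and <br> are never closed.
soup = "<html><body><p>unclosed<br></body></html>"

for name, doc in (("xhtml", xhtml), ("soup", soup)):
    if is_well_formed(doc):
        print(name, "-> well-formed XML, GRDDL can apply")
    else:
        print(name, "-> not well-formed, repair it before any GRDDL step")
```

A document that fails this check would have to go through some repair
tool, along the lines of the HTML Tidy use case, before a
transformation could be applied.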


> 2. profiles, editing <head>
> 
> AFAICT from my reading of the spec, authors producing content to be  
> consumed via GRDDL will need to add a profile uri to the <head> of  
> their XHTML documents.
> 
> This requirement will reduce the compatibility with existing  
> microformats content on the web. Most content does not have a profile.

Yes... and without a profile, an agent can't be sure whether
the author meant class="vevent" in the sense of
http://microformats.org/wiki/hcalendar or not. While many
agents are willing to take the risk, as discussed
in the "Faithful Renditions" section
  http://www.w3.org/TR/grddl/#sec_rend
GRDDL is designed for the case where the author uses URI-based
mechanisms to license extraction of data from their document.
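
As a rough sketch of that distinction (an illustration only, not the
normative algorithm), an agent could check which profile URIs the
author declared on <head> before treating class="vevent" as calendar
data. The profile URI below is a hypothetical placeholder, and the
function names are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical profile URI, used only for this illustration.
EXAMPLE_PROFILE = "http://example.org/profiles/hcalendar"


def declared_profiles(xhtml_text):
    """Return the profile URIs listed on the XHTML <head> element."""
    root = ET.fromstring(xhtml_text)
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        return []
    # The profile attribute is a whitespace-separated list of URIs.
    return head.get("profile", "").split()


def author_licenses(xhtml_text, profile_uri):
    """True only if the author explicitly declared the given profile."""
    return profile_uri in declared_profiles(xhtml_text)


with_profile = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head profile="http://example.org/profiles/hcalendar">'
    "<title>Events</title></head>"
    '<body><div class="vevent">...</div></body></html>'
)

without_profile = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Events</title></head>"
    '<body><div class="vevent">...</div></body></html>'
)

print(author_licenses(with_profile, EXAMPLE_PROFILE))
print(author_licenses(without_profile, EXAMPLE_PROFILE))
```

Both sample documents carry the same class="vevent" markup; only the
first gives the agent a URI-based statement of what the author meant.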

> Also, there are many web authors for whom editing the <head> of their  
> documents is either prohibited or much more difficult than adding  
> content in the <body>

Yes, we discussed that as

  issue-tx-element: is there a way to push the grddl:transformation
  attribute down from the document element to individual elements
  without breaking the chain of authority?
  http://www.w3.org/2004/01/rdxh/spec#issue-tx-element

but we didn't come up with any solutions. So we postponed the issue.


> Lastly, the current HTML5 draft removes @profile. Of course, this is  
> just a draft and things may change, but there doesn't appear to be a  
> story about future compatibility here.

At worst, GRDDL will continue to work with XHTML 1.x. We'll
see how the HTML 5 discussion goes, I suppose.

> -ryan
> 
> 1. http://microformats.org/discuss/mail/microformats-discuss/2007-May/009624.html
> 
> 2. I can't find the reference, but Ian Hickson did a study at Google  
> which showed that more than 90% of pages on the web had lexical-level  
> validity issues. Most of the web is not well-formed, much less  
> conformant XHTML.
> 
> 3. http://www.w3.org/2006/07/grddl-charter.html

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 16 May 2007 17:24:14 GMT
