- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Tue, 20 Mar 2007 12:55:03 +0100
- To: public-grddl-wg <public-grddl-wg@w3.org>
- Cc: "Microformats Discuss" <microformats-discuss@microformats.org>
I've completed a survey of the current state of microformats with regard to the GRDDL mechanism for data extraction. In principle GRDDL is usable on all documents which use microformats.

Results are tabulated at:
http://esw.w3.org/topic/CustomRdfDialects/GrddableMicroformats
(It's on the ESW Wiki - please correct any errors/omissions directly)

Short version: following a strict interpretation of the relevant specs, no official microformats are currently usable with GRDDL. Taking a loose view, around a third are usable right now.

Summary:

As anticipated, the weakest link is the non-existence of profile URIs. Of the 18 microformats listed, only 3 have profile URIs directly usable by GRDDL-aware agents (hCard, hCalendar & hReview), and none of these URIs are endorsed by microformats.org. Only 1 of the 18 has an endorsed profile URI (XFN), and that isn't GRDDL-enabled.

It was suggested on microformats-discuss that the relevant Wiki pages for the microformats could be used as interim profile URIs, but again these aren't GRDDL-enabled.

Most of the microformats do have an XMDP expression of their profile, yet with a couple of exceptions these are listed as source markup, i.e. not really human- or machine-readable. It isn't obvious what the intended purpose of this might be.

XSLT to RDF/XML is available in various stages of completion for 6 of the 18 microformats listed (including the 3 with unofficial profile URIs).

In other words, only 4 of these 18 formats exploit the HTML specification fully for disambiguation. Because the profile URIs corresponding to 3 of these have appeared outside the microformats.org process, only one format+profileURI combination may properly be called a microformat (rather than 'semantic HTML'), and that one isn't GRDDLable.

While this limits the publisher's options when it comes to publishing data in HTML, consumers may still use heuristics based on GRDDL or similar mechanisms to extract data from microformat-enhanced documents (i.e. screenscraping), with the obvious impact on reliability & authority of the data, questions of provenance etc.

Cheers,
Danny.

--
http://dannyayers.com
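For anyone unfamiliar with what "profile URI" and "GRDDL-enabled" mean in practice, here is a minimal Python sketch of the first check a GRDDL-aware agent might make on an HTML page. It only reads the head profile attribute and any rel="transformation" / rel="profileTransformation" links; it does not fetch profile documents or run any XSLT, so it is not a GRDDL implementation. The class and function names and the example.org URIs are hypothetical, made up for illustration; the data-view URI and the two rel values are the ones defined by GRDDL.

```python
from html.parser import HTMLParser

# Profile URI that marks an XHTML document as a GRDDL source document.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


class ProfileScanner(HTMLParser):
    """Collects profile URIs and GRDDL transformation links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.profiles = []                 # URIs from <head profile="...">
        self.transformations = []          # hrefs with rel="transformation"
        self.profile_transformations = []  # hrefs with rel="profileTransformation"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute holds a whitespace-separated list of URIs.
            self.profiles = (attrs.get("profile") or "").split()
        elif tag in ("link", "a"):
            rels = (attrs.get("rel") or "").split()
            href = attrs.get("href")
            if href and "transformation" in rels:
                self.transformations.append(href)
            if href and "profileTransformation" in rels:
                self.profile_transformations.append(href)


def scan(html_text):
    """Parse html_text and return the populated scanner."""
    scanner = ProfileScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner


def declares_grddl(scanner):
    """True if the page itself names the GRDDL profile and a transformation."""
    return GRDDL_PROFILE in scanner.profiles and bool(scanner.transformations)


# Hypothetical page: the example.org URIs are placeholders, not real profiles.
page = (
    '<html><head profile="http://www.w3.org/2003/g/data-view '
    'http://example.org/hcard-profile">'
    '<link rel="transformation" href="http://example.org/hcard2rdf.xsl"/>'
    '</head><body><div class="vcard">...</div></body></html>'
)

result = scan(page)
print(result.profiles)
print(result.transformations)
print(declares_grddl(result))
```

A page that carries microformat class names but no profile attribute would give an empty profiles list here, which is the situation described above for most of the 18 formats: an agent then has nothing authoritative to follow and falls back on heuristics.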
Received on Tuesday, 20 March 2007 11:55:08 UTC