Re: HTML 4 Profile for RDFa

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Sat, 23 May 2009 21:59:31 +0100
Message-ID: <4A1863B3.1000706@cam.ac.uk>
To: Shelley Powers <shelleyp@burningbird.net>
CC: Shane McCarron <shane@aptest.com>, Julian Reschke <julian.reschke@gmx.de>, Sam Ruby <rubys@intertwingly.net>, RDFa Community <public-rdfa@w3.org>, "public-rdf-in-xhtml-tf.w3.org" <public-rdf-in-xhtml-tf@w3.org>, HTML WG <public-html@w3.org>

Shelley Powers wrote:
> Philip Taylor wrote:
>> [...]
>> A survey of random pages from dmoz.org about a year ago found that 
>> ~18% used an XHTML doctype, and ~0.03% were served as 
>> application/xhtml+xml. On the Alexa top 200 a bit earlier 
>> (http://lists.w3.org/Archives/Public/public-html/2007Aug/1248.html), a 
>> third used an XHTML doctype and three quarters of those were not 
>> well-formed XML. So: Any new markup will be overwhelmingly served as 
>> text/html, and most of it that claims to be XHTML won't be usable with 
>> an XML parser.
>> Thus, the XHTML syntax will mostly be processed using the 
>> RDFa-in-text/html processing rules. If those rules don't do what 
>> people expect (after they've read the XHTML-focused specs and guides 
>> and tutorials and examples), then they will be surprised and unhappy 
>> and it will be a bad situation.
>> [...]
> Can I take a leap of faith and guess that of the 18% of web pages served 
> up with the XHTML doctype not using well formed XML probably are also 
> not using RDFa?

They aren't, because approximately no pages (regardless of doctype or 
well-formedness) are using RDFa. Looking at some more recent data 
(~425000 pages from http://www.dotnetdotcom.org/ collected in the past 
few months), about 0.04% of pages in the sample appear to contain RDFa 
attributes (specifically 'property' containing a colon).
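
For illustration only, the check described above (a 'property' attribute
whose value contains a colon) could be sketched roughly as follows. This
is not the script used for the survey. The class and function names are
hypothetical, and it assumes each page is already available as a string
and that Python's standard html.parser is a tolerable stand-in for a real
text/html parser.

```python
from html.parser import HTMLParser


class PropertyColonFinder(HTMLParser):
    """Collects values of 'property' attributes that contain a colon.

    Hypothetical helper: a rough heuristic, not a real RDFa processor.
    """

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs. HTMLParser lowercases
        # attribute names; value is None when the attribute has no value.
        for name, value in attrs:
            if name == "property" and value and ":" in value:
                self.matches.append(value)

    # Self-closing tags such as <meta ... /> reach handle_starttag through
    # the default handle_startendtag implementation.


def looks_like_rdfa(html_text):
    """Return True if any element has a 'property' value containing ':'."""
    finder = PropertyColonFinder()
    finder.feed(html_text)
    finder.close()
    return bool(finder.matches)
```

A heuristic like this only suggests that a page contains RDFa. It does
not check that the prefix before the colon is declared or that the
markup yields any triples.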

But I presume the idea is for RDFa to become much more widely used, and 
I have no reason to doubt that it would end up with roughly the same 
spread of text/html vs application/xhtml+xml and well-formed vs 
ill-formed, so the numbers are still relevant.

Philip Taylor
Received on Saturday, 23 May 2009 21:00:15 UTC