Re: HTML 4 Profile for RDFa

Shelley Powers wrote:
> Philip Taylor wrote:
>> [...]
>> A survey of random pages from dmoz.org about a year ago found that 
>> ~18% used an XHTML doctype, and ~0.03% were served as 
>> application/xhtml+xml. On the Alexa top 200 a bit earlier 
>> (http://lists.w3.org/Archives/Public/public-html/2007Aug/1248.html), a 
>> third used an XHTML doctype and three quarters of those were not 
>> well-formed XML. So: Any new markup will be overwhelmingly served as 
>> text/html, and most of it that claims to be XHTML won't be usable with 
>> an XML parser.
>>
>> Thus, the XHTML syntax will mostly be processed using the 
>> RDFa-in-text/html processing rules. If those rules don't do what 
>> people expect (after they've read the XHTML-focused specs and guides 
>> and tutorials and examples), then they will be surprised and unhappy 
>> and it will be a bad situation.
>> [...]
> 
> Can I take a leap of faith and guess that the 18% of web pages served 
> up with the XHTML doctype that are not using well-formed XML are 
> probably also not using RDFa?

They aren't, because approximately no pages (regardless of doctype or 
well-formedness) are using RDFa. Looking at some more recent data 
(~425,000 pages from http://www.dotnetdotcom.org/ collected in the past 
few months), about 0.04% of pages in the sample appear to contain RDFa 
attributes (specifically, a 'property' attribute whose value contains a 
colon).
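
For illustration, here is a minimal Python sketch of that kind of 
heuristic. It is not the script used for the survey: the names 
RDFaPropertyDetector and looks_like_rdfa are hypothetical, and it assumes 
each page is already available as a decoded string. It uses the standard 
library's lenient html.parser, so ill-formed markup is still scanned 
rather than rejected. (For scale, 0.04% of ~425,000 pages is on the order 
of 170 pages.)

```python
from html.parser import HTMLParser


class RDFaPropertyDetector(HTMLParser):
    """Flags a page if any element has a 'property' attribute containing a colon."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names arrive
        # lowercased, and value is None for attributes written without a value.
        for name, value in attrs:
            if name == "property" and value and ":" in value:
                self.found = True


def looks_like_rdfa(html_text):
    """Return True if the page appears to use RDFa, by the colon heuristic."""
    detector = RDFaPropertyDetector()
    detector.feed(html_text)
    detector.close()
    return detector.found
```

A check like this will overcount (any 'property' value with a colon 
matches, whether or not a prefix is declared) and undercount (pages using 
only 'about', 'rel' or 'typeof' are missed), so it only gives a rough 
estimate.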

But I presume the idea is for RDFa to become much more widely used, and 
I have no reason to doubt that it would end up with roughly the same 
spread of text/html vs application/xhtml+xml and well-formed vs 
ill-formed, so the numbers are still relevant.

-- 
Philip Taylor
pjt47@cam.ac.uk

Received on Saturday, 23 May 2009 21:00:15 UTC