Re: HTML 4 Profile for RDFa

Philip Taylor wrote:
> Shane McCarron wrote:
>> Julian Reschke wrote:
>>>
>>> It's clear that if RDFa is to be used with prefix declarations done 
>>> with xmlns, then mixing uppercase and lowercase declarations is not 
>>> going to work.
>>>
>>> I think restricting prefixes to be lower-case (insert proper Unicode 
>>> terminology here) would be acceptable; it's easy to live with, and 
>>> avoids introducing yet another prefix declaration mechanism.
>>
>> I would not be opposed to adding text in the RDFa in HTML definition 
>> like "prefix names SHOULD be defined in lower-case to help ensure 
>> maximum portability among parsers, since it is common for DOM-based 
>> parsers to not preserve the case of attribute names."
>
> If portability isn't guaranteed in a very simple case like this, then 
> it sounds like the specification would have failed at the fundamental 
> task of specifying behaviour that will be interoperably implemented.
>
> (Once portability is guaranteed, it might be good to recommend against 
> using non-lowercase prefixes because they might have surprising (but 
> portable) behaviour, but that's a very different reason.)
>
>> I don't see there being any need to change the definition of 
>> XML-based languages like RDFa for XHTML.  After all, in XML case is 
>> preserved.  Or is it someone's goal that documents be able to be 
>> parsed as EITHER XML or HTML?  It's not my goal.  If I define a 
>> document using an HTML family language, I expect it to be parsed 
>> using an HTML family parser.  If I define it using an XHTML family 
>> language then I expect it to be parsed using an XML-conforming 
>> parser.  Such a parser would preserve the case of element and 
>> attribute names.
>
> People will read the RDFa-in-XHTML specs and guides and tutorials and 
> examples, and use the same syntax in their own pages. Then they'll 
> serve their pages as text/html and expect it to work the same.
>
> A survey of random pages from dmoz.org about a year ago found that 
> ~18% used an XHTML doctype, and ~0.03% were served as 
> application/xhtml+xml. On the Alexa top 200 a bit earlier 
> (http://lists.w3.org/Archives/Public/public-html/2007Aug/1248.html), a 
> third used an XHTML doctype and three quarters of those were not 
> well-formed XML. So: Any new markup will be overwhelmingly served as 
> text/html, and most of it that claims to be XHTML won't be usable with 
> an XML parser.
>
> Thus, the XHTML syntax will mostly be processed using the 
> RDFa-in-text/html processing rules. If those rules don't do what 
> people expect (after they've read the XHTML-focused specs and guides 
> and tutorials and examples), then they will be surprised and unhappy 
> and it will be a bad situation.
>
> To make the situation better, either (a) the RDFa-in-XHTML 
> documentation should all be removed and replaced with 
> RDFa-in-text/html documentation so that people won't be encouraged to 
> use the wrong syntax in their pages; or (b) the RDFa-in-XHTML syntax 
> should give the same results (as far as possible, given the 
> backward-compatibility constraints) when processed with the 
> RDFa-in-text/html processing rules.
>
> I presume (a) isn't going to happen. That leaves (b), which would 
> require coordination between RDFa-in-XHTML and RDFa-in-text/html, and 
> seems likely to require changes to the RDFa-in-XHTML spec to smooth 
> out the differences.
>
Wow, Philip, you're using an 8-gauge shotgun to hunt baby bunnies here.

Can I take a leap of faith and guess that, of the 18% of web pages served 
up with the XHTML doctype, those not using well-formed XML are probably 
also not using RDFa?

The RDFa in XHTML spec doesn't need to change if a new document covering 
RDFa in HTML is created. Does it? Maybe a cross-reference between the 
documents, with a general warning about the differences between the two, 
would be good.
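
As a rough illustration of the kind of difference such a warning would 
need to cover, here is a minimal sketch in Python. It uses the standard 
library's html.parser and expat modules as stand-ins for "an HTML family 
parser" and "an XML-conforming parser"; neither is an RDFa processor, and 
the sample markup and the helper names (AttrNames, html_attr_names, 
xml_attr_names) are made up for this example.

```python
from html.parser import HTMLParser
import xml.parsers.expat

# Hypothetical sample: a prefix declared in upper case via xmlns.
MARKUP = ('<div xmlns:FOAF="http://xmlns.com/foaf/0.1/" '
          'property="FOAF:name">Shelley</div>')


class AttrNames(HTMLParser):
    """Collects attribute names as the HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over attribute names already lower-cased.
        self.names.extend(name for name, _value in attrs)


def html_attr_names(markup):
    parser = AttrNames()
    parser.feed(markup)
    parser.close()
    return parser.names


def xml_attr_names(markup):
    names = []
    # No namespace processing requested, so xmlns:* arrives as a plain
    # attribute with its original case intact.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.extend(attrs)
    parser.Parse(markup, True)
    return names


print(html_attr_names(MARKUP))  # ['xmlns:foaf', 'property']
print(xml_attr_names(MARKUP))   # ['xmlns:FOAF', 'property']
```

The value of the property attribute is still "FOAF:name" in both cases, 
so a processor that looks the prefix up by exact match would find it in 
the XML result but not in the HTML one.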

As it is, there's probably going to be confusion about XHTML versus HTML 
with the HTML5 spec. I'm rather waiting for someone to use <br> in XHTML5.

Shelley

Received on Saturday, 23 May 2009 20:36:10 UTC