- From: Terje Bless <link@pobox.com>
- Date: Sat, 5 Jul 2003 02:00:50 +0200
- To: W3C Validator <www-validator@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

>>>>Granted XML rules suggest UTF-8 or -16 for this case, but as it's
>>>>served as text/html -- e.g. Appendix C rules -- these do not, IMO,
>>>>apply.
>>>
>>>But the MarkUp Validator honors the XML declaration in such
>>>documents...
>>
>>So it does, but I'm inclined to consider this a bug where text/html
>>documents are concerned. And note that it only considers an explicitly
>>given encoding from the XML Declaration and does not apply XML
>>defaulting rules here.
>
>That's inconsistent as it is inconsistent to honor both, the meta
>element and the XML declaration, they are mutually exclusive.

Right, and this is why I consider it a bug. IMO when served as text/html
we should not pay attention to the XML Declaration, except in the case
where it is the only source of encoding information (in which case it is
a useful heuristic but should generate a warning).

Unfortunately, in 0.6.x, we don't have the requisite smarts about
Content-Types and what they mean to be able to handle this distinction.
In 0.7 we do, so it will likely be smarter about this (at least
potentially, as I still need to figure out what behaviour constitutes
"smart" for all possible Content-Types ;D).

>XHTML user agents must ignore the meta element and HTML user agents
>must ignore the XML declaration,

Hmmm. I can't recall these two requirements from anywhere. Care to cite
me a reference for them?

>.... Maybe I should bring this issue up to the HTML WG?

I'm not sure what good it would do. The underlying problem is that, due
to Appendix C and other bits of the XHTML 1.0 Rec, the text/html MIME
Content-Type is used in an ambiguous way to indicate incompatible object
classes. In fact, SGML vs. XML is impossible to resolve reliably (you
need to guess; perhaps guess very reliably, but it is still a guess)
without out-of-band hinting in the Content-Type (cf. Hixie's comment
example).

Exactly what steps need to be taken depends on whether your two
requirements above are supported by the current language in the relevant
Recommendations, but in either case there is quite a bit of cleanup that
needs to be done, and some of it in places (e.g. the HTML 4.01 Rec)
where I think it would be hard to effect change at this juncture. Mainly
the problem is that of whether text/html is SGML or XML, and Appendix C
of XHTML 1.0 forces us to treat it as "a bit of both, really".

OTOH, if we can get unambiguous specs on this it would make my life
soooo much easier, and would let us tell a much more compelling story to
web developers.

- -- 
If you believe that will stop spammers, you're sadly misled. Rusty hooks,
rectally administered fuel oil enemas, and the gutting of their machines,
*that* stops spammers!                                         -- Saundo

-----BEGIN PGP SIGNATURE-----
Version: PGP SDK 3.0.2

iQA/AwUBPwYVMaPyPrIkdfXsEQIyFwCdFvoQKE8RCyGXy00h0VWkMyS5N/oAniT1
arzM8mx5/OcO13JqkVffkGAq
=GVUr
-----END PGP SIGNATURE-----
Received on Friday, 4 July 2003 20:00:53 UTC
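
For readers who want the rule proposed in the message above spelled out, here is a minimal sketch of encoding selection for a document served as text/html: the XML declaration is ignored unless it is the only source of encoding information, in which case it is used as a heuristic and a warning is raised. This is not the validator's actual code. The function name, parameter names, warning texts and lowercase normalisation are all hypothetical, and ranking the HTTP charset parameter ahead of the meta element is an assumption taken from HTML 4.01 rather than from the message. As in the message, no XML defaulting (UTF-8 or UTF-16) is applied.

```python
from typing import List, Optional, Tuple


def pick_text_html_encoding(
    http_charset: Optional[str],
    meta_charset: Optional[str],
    xml_decl_encoding: Optional[str],
) -> Tuple[Optional[str], List[str]]:
    """Choose an encoding for a text/html document (hypothetical helper).

    Returns (encoding or None, list of warning strings).
    """
    warnings: List[str] = []

    # Assumption (HTML 4.01 precedence, not stated in the message):
    # the HTTP charset parameter outranks the meta element.
    if http_charset:
        return http_charset.lower(), warnings
    if meta_charset:
        return meta_charset.lower(), warnings

    # Proposed rule: the XML declaration counts only when nothing else
    # supplies an encoding, and using it should produce a warning.
    if xml_decl_encoding:
        warnings.append(
            "Encoding taken from the XML declaration of a text/html "
            "document; this is only a heuristic."
        )
        return xml_decl_encoding.lower(), warnings

    # No XML defaulting rules are applied for text/html.
    warnings.append("No character encoding information found.")
    return None, warnings


if __name__ == "__main__":
    # meta element present: the XML declaration is ignored silently.
    print(pick_text_html_encoding(None, "ISO-8859-1", "UTF-8"))
    # XML declaration is the only source: used, with a warning.
    print(pick_text_html_encoding(None, None, "UTF-8"))
```

Returning the warnings alongside the result, rather than printing them, keeps the decision separate from how it is reported to the user.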