- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Mon, 25 Apr 2005 13:33:57 +0200
- To: www-validator@w3.org
leif halvard silli wrote:

> <http://www.w3.org/TR/charmod/#C023>

If you found #C023 you MUST have seen #C022:

| Character encodings that are not in the IANA registry SHOULD
| NOT be used, except by private agreement.

Now you need a private agreement with the validator.w3 team ;-)

Really, what's the purpose of this discussion?  The validator is a
tool, it tries to find errors.  In some cases like <br /> in HTML
it's lost: in theory that's legal, in practice it does not work as
expected.  You have a choice, you can use the WDG validator, where
you get a warning for this issue.

> It is not written anywhere that Validators, browsers or
> whatever should _not_ accept x- encodings.

But it is written that validating XML processors are expected to
support a _limited_ set of encodings, the minimum is UTF-8 plus
some others.  I'm lost with SGML (the validator.w3 is AFAIK based
on SGML), but probably the SGML rules are similar.

>> If even you are not sure, the validator is lost.
> Ah, I was waiting for you to say that. May be I am the
> authority on this issue?

Maybe.  Send them a patch for their code.  And tell me exactly how
you've done it.  I can't test perl on my box, but I'd like the
validator to support "437" and "858", just because I think that
it's bad to exclude minority platforms by ignorance.

OTOH publishing XHTML pages in really obscure charsets without
convincing reasons would also be bad.  "Best viewed with Lynx on
OS/2 and codepage 850" isn't a good idea.

I really have texts in these charsets, but they are plain text.
Fortunately the validator is too stupid to complain about a link
like:

<a href="xhtml.kex" type="text/plain" charset="PC-Multilingual-850+euro">

> If I understand you correctly, we agree that they should
> offer a more enlightening text. Why not offer the option to
> validate with Character Encoding Override with the click of a
> button instead of this unhelpful text?

It doesn't help you that we agree.
IIRC the missing support for some (registered) charsets is also
listed in the bugzilla, maybe I've even voted for it.

> It is customary to put [ Valid XHTML ] buttons on one's web
> pages.

Not really.  If you have 50 (?) or fewer pages you can validate
them with WDG, and then you don't need these funny buttons on
every page.

> Is there a way to get those [Valid] buttons to automatically
> use the extended interface?

Not with the referer trick, but you can specify all parameters
you like with an absolute URL, here's a macintosh example:

http://validator.w3.org/check?uri=http%3A%2F%2Fpurl.net%2Fxyzzy&charset=macintosh

The best you can get in this case is a "tentatively valid".

> You can read about x-mac-roman, x-mac-cyrillic etc at
> Unicode.org, btw.

I'm more interested in what my browser supports, and it doesn't
know x-mac-whatever or pc-multi-thingy.  It doesn't know UTF-8
or windows-1252 either.  But it's so stupid that it does the right
thing for 1252 without knowing why.  I love stupid software.

Can't you create and read windows-1252?  It's almost the same as
Latin-1, except for 128..159, where it offers the Euro at 128 etc.

For backwards compatible pages (in other words pages with a Euro)
I use windows-1252, otherwise I use us-ascii plus some symbolic
character references for Latin-1.  Which I shouldn't if I followed
W3C charmod, because Latin-1 is "better" than us-ascii + character
references.

But my pages and XHTML tool are older than W3C charmod, my local
charset is normally pc-multilingual-850+euro, not windows-1252,
and for English pages I really need only us-ascii with few
exceptions.  Visitors of my German pages will survive it when I
"force" them to download some &Ouml; instead of simple Latin-1 Ös.

> Not Icelandic either, btw ;-)

The DOS and OS/2 Icelandic codepage 861 used to be my favourite
test case, because I was sure that it's strange and irrelevant
from my POV.  Maybe the Unicode conspira^H^H^H^Hortium should
decorate the Euro as their most fearsome ally.
It certainly killed a lot of legacy charsets.

> OS X uses UTF-8 for all practical purposes, except in the
> Carbon layer.

Whatever that is, do you really need it for (X)HTML?  If the only
reason to use x-mac-whatever is for fun, then I guess that the
developers of the validator.w3 have their own priorities of what
they consider as fun.

Bye, Frank
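
A footnote on the absolute-URL example with the macintosh charset:
such a link can be put together mechanically instead of being
percent-encoded by hand.  What follows is only a minimal Python sketch.
It assumes nothing beyond the two parameters visible in that URL (uri
and charset), and the helper name check_url is made up for illustration;
it is not part of the validator.

```python
from urllib.parse import urlencode

# Base of the check URL quoted in the message.
VALIDATOR = "http://validator.w3.org/check"


def check_url(page, charset=None):
    """Build an absolute validator link (hypothetical helper).

    Only the two parameters seen in the example URL are used:
    uri (the page to check) and charset (the encoding override).
    """
    params = {"uri": page}
    if charset:
        params["charset"] = charset
    # urlencode percent-encodes ':' and '/' in the values, which
    # yields the %3A and %2F seen in the example URL.
    return VALIDATOR + "?" + urlencode(params)


if __name__ == "__main__":
    print(check_url("http://purl.net/xyzzy", "macintosh"))
```

Called with the page and charset from the example, this should give
back the same URL as quoted in the message; leaving out the charset
drops the override parameter.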
Received on Monday, 25 April 2005 11:39:14 UTC