- From: leif halvard silli <hyperlekken@lenk.no>
- Date: Sat, 23 Apr 2005 19:33:59 +0200
- To: Frank Ellermann <nobody@xyzzy.claranet.de>
- CC: www-validator@w3.org
Frank Ellermann answered me:

>> for instance the result of a puristic wish to only support
>> the IANA registred charsets
>
> If it's not IANA registered it doesn't exist.

What you said there is merely a tautological statement.

> The validator
> tries to catch invalid character encodings depending on the
> document charset, maybe also depending on XML 1.1 vs. 1.0, so
> it can't handle unknown encodings. E.g. byte 133, it depends
> on several factors how to handle it.

I am pretty sure that the x-mac encodings have enough in common that it should be pretty easy to find one common method for handling them all.

The XML 1.1 specification, <http://www.w3.org/TR/2004/REC-xml11-20040204/#charencoding>, says that a processor MAY treat an [_IANA registered_] encoding as unknown, and explains this by saying that processors are, «of course», not required to know _all_ IANA encodings. The encodings we are talking about here, the x-mac encodings, fall under what it calls «other encodings», about which the recommendation says that they «SHOULD use names starting with an "x-" prefix».

It goes without saying that processors therefore _may_ deem x-mac encodings _unknown_ (but not 'invalid'). But a processor _may_ also deem the 'macintosh' encoding invalid, for that matter. And for any practical purpose on the web, the 'macintosh' encoding is more incompatible than the 'x-mac' encodings. I am not even sure that 'x-mac-roman' and 'macintosh' (from Unicode 1.0 in 1991) are 100% identical.

If we compare this to what the validator currently says about x-mac-roman, we find that it uses the wordings «fatal error» and «non-existent character encoding» and finally «The error was "x-mac-roman undefined; replace by macintosh".». It would have been more appropriate to warn against using the x-mac encodings, pointing to their status as «private» encodings. But to completely refuse to validate is not very meaningful.
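The distinction I have in mind, a warning for a private «x-» name rather than a fatal error, can be sketched roughly as follows. This is only an illustration and not the validator's actual code; the function name `classify_charset` and the tiny `REGISTERED` set (a stand-in for the much longer IANA list) are made up for the example.

```python
# Hypothetical sketch, not the W3C validator's implementation.
# REGISTERED is a tiny stand-in for the IANA character-set registry.
REGISTERED = {"utf-8", "iso-8859-1", "windows-1252", "macintosh"}


def classify_charset(label):
    """Return a (severity, message) pair for a declared charset label."""
    name = label.strip().lower()
    if name in REGISTERED:
        return ("ok", f"{name} is a registered encoding")
    if name.startswith("x-"):
        # Private-use name: possibly unknown to the processor, but not invalid.
        return ("warning", f"{name} is a private (x-) encoding; it may not be portable")
    # Neither registered nor marked as private with the x- prefix.
    return ("error", f"{name} is neither registered nor a private x- name")


if __name__ == "__main__":
    for label in ("macintosh", "x-mac-roman", "no-such-charset"):
        print(label, "->", classify_charset(label))
```

With such a split, 'x-mac-roman' would produce a warning about portability, while only a name that is neither registered nor prefixed with «x-» would be reported as an error.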
Since the x- encodings are mentioned in the XML 1.1 spec, such «fatal error» responses from the validator are directly misleading.

> OTOH you could always enforce "assume windows-1252" for all
> MIME-compatible 8-bits charsets where codepoints 128..159
> are valid. You could enforce Latin-1 where that's not true.
> And of course UTF-8 etc. are directly supported.

This is a bit difficult to do «on the fly», for instance for an online document, or for a document you view locally and which works perfectly in your browser.

>> I was very suprised to find out that x-mac-roman was not
>> accepted.
>
> Compare <http://www.iana.org/assignments/character-sets> :
> x-mac-roman does not exist, if you think that this is wrong
> register it (but maybe x-... is reserved for private use).

Exactly: the x- names are reserved for encodings that do not exist in the IANA registry, so neither I nor anyone else can register them there unless we change their names ... There is nothing illegal about the x-mac encodings per se. And they do exist.

>> the validator adviced me to use 'macintosh' as charset name.
>
> Yes, that exists, why not use it ?

Do you need more reasons than those I have given? If any 'x-mac-' encoding could be treated as 'macintosh' purely for validation purposes, then I think that this is a task that should be given to the validator. For the Euro sign and some others, the validator would then perhaps have to demand a character reference (the Euro sign doesn't occupy the same place in all the x-mac encodings, if I remember correctly). The x-mac encodings are all 8-bit encodings, so I guess it could have been possible.

>> We Mac users live in this very perfect world where all
>> encodings are named x-mac-something.
>
> validator.w3.org is for the WWW. Maybe you could patch the
> sources for a parallel universe of Mac users and a similar
> validator.mac.org ?

Thank you for your irony. Indeed, if the Validator will not validate all valid documents, why not have another validator as well?
Besides, you are wrong if you interpreted me as saying that x-mac encodings are not used on the WWW. They are not used to a large degree, but they are used. At least they are used much more than anything called 'macintosh', which you, blindly using the IANA registry as a guide to the WWW, recommend.

-- 
leif halvard
Received on Saturday, 23 April 2005 17:34:04 UTC