- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Thu, 29 Nov 2007 17:05:30 +0100
- To: www-validator@w3.org
Andreas Prilop wrote:

> (b) Take ISO-8859-1 as fallback encoding (the default of RFC 2616).
> This will "work" if no bytes from 0x80 to 0x9F are present -
> hence with many of the traditional 8-bit character sets.
> Otherwise (if some bytes from 0x80 to 0x9F are found),
> give the usual errors about "non SGML character number ..."

That's a variation of the current UTF-8 default, it could result in a flood of errors for say windows-1252 pages with lots of Euros.

I'd prefer a completely unlikely "SBCS" with proper subset ASCII permitting all octets from 0x80 up to 0xFF. And at the end, after all other errors based on this assumption are reported, one final "you lose - unknown charset" (optional as gimmick: "whatever it is, it's certainly not UTF-8", if that is known in your scenario).

Frank
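
A minimal sketch of the fallback proposed above, not the validator's actual code. The function name `fallback_decode` is hypothetical, and mapping every octet from 0x80 to 0xFF to U+FFFD is an assumption standing in for the "completely unlikely SBCS": ASCII passes through unchanged, no high octet triggers a per-character error, and exactly one final message is reported.

```python
from typing import List, Tuple


def fallback_decode(data: bytes) -> Tuple[str, List[str]]:
    """Decode a page of unknown charset as a made-up single-byte charset.

    Hypothetical helper: ASCII octets are kept as-is, and every octet
    from 0x80 to 0xFF is accepted and replaced by the placeholder U+FFFD,
    so no "non SGML character" errors are raised for them.
    """
    # The ascii codec with errors="replace" substitutes U+FFFD for each
    # non-ASCII octet instead of raising.
    text = data.decode("ascii", errors="replace")

    messages: List[str] = []
    # (All other validation errors would be collected here first.)

    final = "you lose - unknown charset"
    try:
        # Strict UTF-8 decoding fails on any ill-formed sequence.
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Optional gimmick, only added when UTF-8 can be ruled out.
        final += " (whatever it is, it's certainly not UTF-8)"
    messages.append(final)
    return text, messages
```

For example, a windows-1252 page containing a Euro sign (octet 0x80) yields a single closing message, however many such octets it contains, rather than one error per octet.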
Received on Thursday, 29 November 2007 16:04:09 UTC