- From: Nick Kew <nick@webthing.com>
- Date: Mon, 10 Dec 2001 20:49:53 +0000 (GMT)
- To: Michael Everson <everson@evertype.com>
- cc: <www-validator@w3.org>
On Mon, 10 Dec 2001, Michael Everson wrote:

> I have a lot of pages with a few Latin 1 (non ASCII) characters in
> them. I want to convert them all to UTF-8. This isn't always
> straightforward.

Won't iconv do it?

> Where the Validator fails BADLY is that if I am converting to UTF-8
> and I miss one of the characters (usually this means there is a
> single Latin 1 character in the file instead of a pair) I get a very
> unhelpful message like this:
>
> "Sorry, I am unable to validate this document because on line 63 it
> contained some byte(s) that I cannot interpret as utf-8. Please check
> both the content of the file and the character encoding indication. "

That'll be when the parser refuses your document outright because
it's incompatible with your declared charset. It also means that
the source is (technically at least) too broken even to try and display.

> But the Validator is broken. It doesn't display the source, and so I
> have NO IDEA how to find line 63.

Erm - open your document in a text editor?

BTW: do you have a need to convert, or is this an exercise?

-- 
Nick Kew

Site Valet - the essential service for anyone with a website.
<URL:http://valet.webthing.com/>
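
For the "line 63" problem quoted above, here is a minimal Python sketch of one way to locate stray Latin 1 bytes in a file that is supposed to be UTF-8, and to convert a file that is entirely Latin 1. The function names find_invalid_utf8 and latin1_to_utf8 are hypothetical, chosen for illustration; this is not what the Validator or iconv does internally, only an approximation of the same check using Python's built-in codecs.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, offset_in_line, offending_bytes) for the first
    undecodable sequence on each line that is not valid UTF-8.

    Splitting on b"\n" is safe because the byte 0x0A never occurs inside
    a multi-byte UTF-8 sequence.
    """
    problems = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start / err.end delimit the bytes the decoder rejected.
            problems.append((lineno, err.start, line[err.start:err.end]))
    return problems


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are assumed to be pure ISO-8859-1 as UTF-8.

    Assumption: the input has NOT already been partly converted; running
    this over a file that is already UTF-8 would double-encode it.
    """
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    # A small sample: line 2 holds a lone Latin 1 e-acute (0xE9).
    sample = b"plain ascii\ncaf\xe9 au lait\nmore ascii\n"
    for lineno, offset, raw in find_invalid_utf8(sample):
        print(f"line {lineno}, byte offset {offset}: {raw!r}")
```

To use it on a real page, read the file in binary mode (open(path, "rb").read()) and pass the bytes to find_invalid_utf8; the reported line numbers should then match what a text editor shows.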
Received on Monday, 10 December 2001 15:50:02 UTC