- From: Martin Duerst <duerst@w3.org>
- Date: Sun, 28 Oct 2001 23:39:37 +0900
- To: Michael Everson <everson@evertype.com>
- Cc: www-validator@w3.org
At 12:37 01/10/28 +0000, Michael Everson wrote:

>As it happens, the UTF8 error was in a line further on, where it talks
>about quotation marks and lists the left and right double angle quotes «
>and ». I fixed that UTF-8 but the point is that if you have a UTF-8 error
>the validator just says what line it is in and doesn't provide you with
>marked up text, which it does for invalid characters in, say, Latin 1.

I went back to the code (mostly mine) and checked, but exactly the same thing is done for conversion errors from Latin-1 to UTF-8 as for UTF-8 byte sequence errors.

It may be that you mean errors such as ‚. These are not Latin-1 errors, and are not related to the character encoding used for the page. These are markup errors, and are detected in a completely different part of the code.

Other than that, the only thing I can think of currently is that you are comparing with an older version of the validator. Older versions indeed didn't check character encoding and relied on ad-hoc errors produced by SP. In some cases, that led to a huge list of errors (e.g. for a Shift_JIS page), while other errors were not caught. So we decided to just give a list of line numbers, because when something goes wrong with character encoding, it goes wrong quite a bit.

Anyway, you can always check the 'show source code' box to get a source code listing with line numbers. If that doesn't help, you should try to set up two dummy pages that show the two behaviors that are different but you think should be the same.

Regards, Martin.
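
For illustration only: the validator's actual code is not shown in this message, so the following is just a minimal Python sketch of the general idea of reporting line numbers for UTF-8 byte sequence errors. The function name `find_invalid_utf8_lines` and the sample data are hypothetical, and the sketch assumes the page is available as raw bytes with LF line endings.

```python
def find_invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that are not valid UTF-8.

    Hypothetical helper for illustration; not the validator's own code.
    """
    bad_lines = []
    # Splitting on the LF byte is safe here: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8", errors="strict")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines


# Line 2 holds the angle quotes as Latin-1 bytes (0xAB, 0xBB), which are
# not valid UTF-8; line 3 holds the same characters correctly encoded.
sample = (
    b"plain ASCII line\n"
    + "quotes: \u00ab \u00bb\n".encode("latin-1")
    + "quotes: \u00ab \u00bb\n".encode("utf-8")
)
print(find_invalid_utf8_lines(sample))  # prints [2]
```

A line-by-line report like this stays short even when a whole page is in the wrong encoding, which matches the reasoning above for listing only line numbers.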
Received on Sunday, 28 October 2001 09:39:49 UTC