- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Thu, 13 Sep 2007 11:51:14 +0200
- To: www-validator@w3.org
Hi, I haven't read this list for about three months, so maybe this
bug report isn't new:
When I try to validate an XML file with encoding="UTF-8" using the
file upload interface I get an error for the first non-ASCII byte.
Apparently (as reported by the validator) my browser sends
Content-Type: text/xml without a charset parameter. The validator
therefore expects US-ASCII, ignoring the first input line:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
There are several problems with this:
1 - After reporting a spurious \xD3 in line 35119 the source isn't
shown, i.e. using option "show source" has no effect. At the
moment I'm forced to use an editor where I don't see the line
numbers.
2 - The reported non-ASCII character in line 35119 is actually the
last non-ASCII character in the file, not the first one, which is
in line 626.
3 - Option "UTF-8 only if necessary" doesn't help. Only a "hard"
character encoding override gives me a "tentatively valid"
result showing the source with line numbers.
4 - Why is the encoding="UTF-8" declaration completely ignored for
text/xml?
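Regarding item 4, a rough sketch of the precedence that RFC 3023
describes: for text/xml a missing charset parameter means us-ascii
and the encoding declaration inside the document is not consulted,
whereas for application/xml the declaration is used. The function
name below is hypothetical, BOM detection is left out, and this is
not the validator's actual code, only an illustration of the rule:

```python
import re


def effective_encoding(content_type, body):
    """Pick an encoding for an XML upload along the lines of RFC 3023.

    Hypothetical helper for illustration; BOM sniffing is omitted.
    """
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()

    # An explicit charset parameter always wins.
    match = re.search(r'charset\s*=\s*"?([^";\s]+)"?', params, re.IGNORECASE)
    if match:
        return match.group(1).lower()

    # text/xml without charset: us-ascii, the XML declaration is ignored.
    if media_type == "text/xml":
        return "us-ascii"

    # Otherwise (e.g. application/xml) fall back to the XML declaration,
    # and to UTF-8 if there is none.
    decl = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']',
        body,
    )
    if decl:
        return decl.group(1).decode("ascii").lower()
    return "utf-8"
```

With the first line of the tested file as body, "text/xml" yields
us-ascii, while "text/xml; charset=UTF-8" or "application/xml" yield
utf-8.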
See http://xyzzy.webhop.info/home/ltru/4645bisU.xml (1175 KB) for
the tested file. I used Firefox 2.x under Win XP to upload it.
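As long as the source is not shown, the line numbers of the first
and last non-ASCII bytes (items 1 and 2) can be checked locally.
A minimal sketch, assuming lines are separated by LF; the function
name is made up for this example:

```python
def non_ascii_lines(data):
    """Return the 1-based numbers of all lines containing a byte >= 0x80."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if any(byte >= 0x80 for byte in line)
    ]


# Small in-memory sample; for a real file, read it in binary mode
# and pass the bytes to non_ascii_lines().
sample = b"plain\ncaf\xc3\xa9\nplain\n\xc3\x93\n"
hits = non_ascii_lines(sample)
first_hit, last_hit = hits[0], hits[-1]
```

For the sample, first_hit is 2 and last_hit is 4; a validator that
stops at the first error would be expected to report first_hit.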
Frank
Received on Thursday, 13 September 2007 09:53:30 UTC