- From: olivier Thereaux <ot@w3.org>
- Date: Wed, 12 Mar 2008 00:54:17 -0400
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: www-validator@w3.org
Hi Frank, all.
On Mar 11, 2008, at 20:33 , Frank Ellermann wrote:
>> I think the validator does look at the xml declaration as
>> a source. See e.g the following test case:
>> http://qa-dev.w3.org/wmvs/HEAD/dev/tests/charset-xmldecl.xhtml
>
> Valid and UTF-8, do you have a similar test not using UTF-8 ?
> With a default UTF-8 it is not obvious what triggered UTF-8.
>
> My example was <http://xyzzy.webhop.info/home/ltru/4645bisU.xml>
> sending text/xml without charset resulting in US-ASCII and a
> fatal validation error for the UTF-8 XML.
Indeed, I just checked the code (look at the check script, around line
500), and found out that it has a special case for text/(something+)xml:
    elsif ($File->{ContentType} =~ m(^text/([-.a-zA-Z0-9]\+)?xml$)) {
      # Act as if $http_charset was 'us-ascii'. (MIME rules)
      $File->{Charset}->{Use} = 'us-ascii';
      &add_warning('W01', {
        W01_upload => $File->{'Is Upload'},
        W01_agent  => $File->{Server},
        W01_ct     => $File->{ContentType},
      });
    }
That code may be a mistake… I don't recall being around when it was
added; it may come from an overzealous interpretation of RFC 3023
(XML Media Types)…
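
To make the effect of that branch easier to see, here is a minimal standalone sketch. It is not the check script itself: the helper name assumed_charset and the sample content types are made up for illustration, and it assumes the branch is only reached when HTTP sent no charset parameter (which matches Frank's text/xml example). The pattern is copied from the snippet as quoted.

```perl
use strict;
use warnings;

# Pattern copied from the quoted snippet (different delimiters only).
my $text_xml = qr{^text/([-.a-zA-Z0-9]\+)?xml$};

# Hypothetical helper, not part of the validator: returns the charset
# that would be assumed before looking at the document itself.
sub assumed_charset {
    my ($content_type, $http_charset) = @_;

    # Assumption: an explicit HTTP charset always wins.
    return $http_charset
        if defined $http_charset && length $http_charset;

    # The special case: text/xml without a charset is treated as us-ascii.
    return 'us-ascii' if $content_type =~ $text_xml;

    # Otherwise nothing is assumed here; other sources (such as the
    # XML declaration) would be consulted.
    return;
}

for my $ct ('text/xml', 'application/xml', 'text/x+xml', 'text/vnd.foo+xml') {
    my $cs = assumed_charset($ct, undef);
    printf "%-18s => %s\n", $ct, defined $cs ? $cs : '(look elsewhere)';
}
```

Under these assumptions, text/xml served without a charset ends up as us-ascii regardless of what the XML declaration says, while application/xml is left to the other sources. Note also that, as quoted, the character class before the escaped + has no quantifier of its own, so a subtype such as text/vnd.foo+xml does not match; whether that is intended cannot be told from the snippet alone.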
--
olivier
Received on Wednesday, 12 March 2008 04:54:24 UTC