
Re: character encoding

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 14 Jun 2004 11:15:35 +0300 (EEST)
To: www-validator@w3.org
Message-ID: <Pine.GSO.4.58.0406141034520.18008@korppi.cs.tut.fi>

On Mon, 14 Jun 2004, Juergen Kayser wrote:

> I used Netscape 7.1 for file upload and the file was validated.
> With IE 6.0 it did not work. The only way I found to make it
> work with IE 6.0 is to use the manual override.

I checked with your document, stored in a file on my Windows 98 system
using a file name ending with .html, and submitted it to the validator
using IE 6. It complains about incorrect characters and explains that
there is a "strong default" of charset=us-ascii for text/xml, and I guess
this is what you got too.

Checking what IE 6 actually sends, I find that it really says
Content-Type: text/xml
for the file included in the form data. And the consequences are then
inevitable, due to (questionable, IMHO) principles that say that US-ASCII
must then be implied, no matter what the document's content says (in the
XML prolog or in a <meta> tag).
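
To make the rule concrete, here is a rough sketch in Python of how a strict
consumer might pick the charset for an uploaded part. It is not the
validator's actual code. The function name effective_charset and the
fallback for non-XML types (reading the XML declaration or a <meta> tag)
are my assumptions; only the text/xml default of us-ascii comes from the
behaviour described above.

```python
import re

def effective_charset(content_type_header, body):
    """Hypothetical helper: charset a strict consumer would assume."""
    parts = [p.strip() for p in content_type_header.split(";")]
    media_type = parts[0].lower()

    # Collect parameters such as charset=iso-8859-1 from the header.
    params = {}
    for p in parts[1:]:
        if "=" in p:
            key, value = p.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"').lower()

    # An explicit charset parameter always wins.
    if "charset" in params:
        return params["charset"]

    # The "strong default": text/xml without a charset parameter means
    # us-ascii, and the document's own declarations are not consulted.
    if media_type == "text/xml":
        return "us-ascii"

    # Assumption: for other types, fall back to in-document declarations.
    head = body[:1024]
    m = re.search(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', head)
    if m:
        return m.group(1).decode("ascii").lower()
    m = re.search(rb'<meta[^>]*charset=["\']?([A-Za-z0-9._-]+)', head,
                  re.IGNORECASE)
    if m:
        return m.group(1).decode("ascii").lower()
    return None
```

With this sketch, a body starting with the ISO-8859-1 XML declaration yields
us-ascii when labelled text/xml, but iso-8859-1 when labelled text/html,
which matches the difference seen between the two uploads.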

> Perhaps there may be a way to validate with IE by changing the
> source?

If I remove the XML prolog

<?xml version="1.0" encoding="iso-8859-1"?>

then IE 6 sends the file as text/html, and it passes validation.
Apparently IE looks both at the file name suffix and the (first few lines
of the) file content in guessing what Content-Type should be included in
the form data.
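
As an illustration only, the guessing observed above could be modelled
roughly like this in Python. This is not IE's real algorithm, which I have
not seen. The function name guess_upload_type, the order of the checks and
the application/octet-stream fallback are all assumptions chosen to
reproduce the two observations: an XML prolog leads to text/xml, and a
.html suffix without a prolog leads to text/html.

```python
def guess_upload_type(filename, head):
    """Hypothetical model of the observed guessing; not IE's actual logic.

    filename: name of the file being uploaded
    head: the first bytes of the file content
    """
    # Content check first: a leading XML prolog overrides the suffix.
    if head.lstrip().startswith(b"<?xml"):
        return "text/xml"

    # Otherwise fall back to the file name suffix.
    if filename.lower().endswith((".html", ".htm")):
        return "text/html"

    # Assumed fallback for anything else.
    return "application/octet-stream"
```

For a file named page.html this returns text/xml while the prolog is
present and text/html once it has been removed.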

Whether omitting the prolog is acceptable is a different matter.
I would not recommend doing so on actual Web pages that declare themselves
as ISO-8859-1 encoded XHTML documents - and using different versions of
the file for uploading to a server and for validation via the file upload
would be no easier than using the extended interface that lets you
override the charset information that the validator otherwise implies.

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Monday, 14 June 2004 04:15:37 UTC
