[Bug 2275] Charset/encoding issue

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2275

           Summary: Charset/encoding issue
           Product: Validator
           Version: 0.7.0
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: major
          Priority: P2
         Component: Parser
        AssignedTo: link@pobox.com
        ReportedBy: jh@awake.dk
         QAContact: www-validator-cvs@w3.org


When using the "Validate by direct input" (great feature!), I get an error,
which I believe is related to encoding or charset.

I try to validate this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xml:lang="se" xmlns="http://www.w3.org/1999/xhtml">
<head>
	<title>Encoding/charset test</title>
	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<p>ÄÖÅäöå</p>
</body>
</html>

and I get 3 errors that relate to "non SGML character". All 3 errors are on
this line:
<p>ÄÖÅäöå</p>

If I change the charset from iso-8859-1 to utf-8, the XHTML validates without
errors. I would have expected this behaviour to be the other way around.

If I save the XHTML as .html and use the "Validate by File Upload" feature, I get
no errors; the XHTML validates fine. Likewise, if this .html file is placed on a
web server, the "Validate by URL" feature validates it fine.

So, to sum up, I believe there is an issue with how "Validate by direct input"
handles encoding/charset - it does not seem to be set properly, even though it
does say "Encoding: iso-8859-1" at the top of the validation results page.
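
A possible explanation (an assumption, not confirmed against the validator's
code): the browser submits the direct-input form as UTF-8, and the validator
then reads those bytes as iso-8859-1. The sketch below checks which bytes of
the test string would land in the 0x80-0x9F range, which ISO-8859-1 maps to C1
control characters. The helper name c1_controls is hypothetical and only used
for illustration.

```python
# Assumption: the form data arrives as UTF-8 bytes but is interpreted as
# iso-8859-1. This only inspects byte values; it does not call the validator.
text = "ÄÖÅäöå"


def c1_controls(data: bytes) -> list[int]:
    """Return the byte values in 0x80-0x9F (C1 controls in ISO-8859-1)."""
    return [b for b in data if 0x80 <= b <= 0x9F]


utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

# Bytes of the UTF-8 form that fall in the C1 range when read as iso-8859-1.
print([hex(b) for b in c1_controls(utf8_bytes)])

# Same check on the string when it really is encoded as iso-8859-1.
print(c1_controls(latin1_bytes))
```

Under that assumption, three bytes of the UTF-8 form (from the uppercase
letters) fall in the C1 range and none of the iso-8859-1 form do, which would
be consistent with the 3 errors above and with the file upload validating fine.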

Received on Wednesday, 21 September 2005 12:18:28 UTC