- From: <Jarno.Elovirta@nokia.com>
- Date: Tue, 2 Jul 2002 13:43:59 +0300
- To: <www-validator@w3.org>
Hi,
Character encoding is currently detected incorrectly when the document uses SGML SHORTTAG constructs. The following is a valid SGML document and parses without errors (using SP 1.3.4):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"><TITLE/test document/<META http-equiv=Content-Type content="text/html;charset=ISO-8859-1"<P>
However, the W3C Validator fails to read the character encoding information from the META element and issues a warning. The following is the same document, but without the SHORTTAG construct in the META element (the start tag is closed with an explicit ">"):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"><TITLE/test document/<META http-equiv=Content-Type content="text/html;charset=ISO-8859-1"><P>
This version passes validation without warnings. Both documents have exactly the same parse tree:
AVERSION CDATA -//IETF//DTD HTML 2.0 Strict//EN
<HTML>
<HEAD>
<TITLE>
test document
</TITLE>
AHTTP-EQUIV TOKEN CONTENT-TYPE
ACONTENT CDATA text/html;charset=ISO-8859-1
<META>
</META>
</HEAD>
<BODY>
<P>
</P>
</BODY>
</HTML>
C
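Since both inputs produce the same parse tree, the difference presumably arises before parsing, when the encoding is sniffed from the raw text. The sketch below only illustrates how that could happen; it is not the validator's actual code, and the function name and both patterns are hypothetical. A pattern that insists on a closing ">" misses the unclosed META start tag, while one that also accepts "<" as a terminator finds the charset in both documents.

```python
import re

# Hypothetical sniffers for illustration; NOT the W3C Validator's implementation.
# STRICT requires the META start tag to be closed with ">".
STRICT = re.compile(
    r'<META\b[^<>]*\bcharset=([A-Za-z0-9._:-]+)[^<>]*>', re.IGNORECASE)

# TOLERANT also accepts "<" (the start of the next tag) as the end of the
# META start tag, which is how an unclosed start tag appears in the source.
TOLERANT = re.compile(
    r'<META\b[^<>]*\bcharset=([A-Za-z0-9._:-]+)[^<>]*(?=[<>])', re.IGNORECASE)


def sniff_charset(document, pattern):
    """Return the charset named in the first matching META tag, or None."""
    match = pattern.search(document)
    return match.group(1) if match else None


PREFIX = ('<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN">'
          '<TITLE/test document/')
META = '<META http-equiv=Content-Type content="text/html;charset=ISO-8859-1"'

unclosed = PREFIX + META + '<P>'   # SHORTTAG form: META start tag not closed
closed = PREFIX + META + '><P>'    # META start tag closed with ">"

for name, doc in (("unclosed", unclosed), ("closed", closed)):
    print(name, sniff_charset(doc, STRICT), sniff_charset(doc, TOLERANT))
```

A regular expression is only a stand-in here; reading the attribute from the parser's own output would handle every SHORTTAG variant, not just this one.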
Cheers,
Jarno
Received on Tuesday, 2 July 2002 06:44:01 UTC