Re: [VE][139] Error Message Feedback

From: Lachlan Hunt <lachlan.hunt@iinet.net.au>
Date: Sat, 18 Dec 2004 00:08:13 +1100
Message-ID: <41C2DA3D.9010505@iinet.net.au>
To: dragonimp@impx.net
CC: www-validator community <www-validator@w3.org>

dragonimp wrote:
> 	I validated the page http://www.saferun.com/ with XHTML 1.0.
> 
> ==============================================
> Line 133, column 274: non SGML character number 30
> 
> ...字...</a> <span class="newsdatetime">(12-16 13:57)</span><br/>由于市场
> ======================================
> 
> I dont know  what's the meaning of "non SGML character number 30"
> I cannot find any error in the page at line 133:(

The error indicates that a character used in the file is invalid for
the declared character encoding, which is GB2312 (Simplified Chinese).
It is often caused by saving the file in a character encoding that
differs from the one declared for it.  It is therefore likely that your
editor is actually saving the file in a different, yet similar,
encoding.

Similar errors often occur when a file is saved as Windows-1252 (the
default for English versions of Windows) yet declared as ISO-8859-1.
The two differ only in the range 128 to 159: those code points are
control characters in ISO-8859-1 but mostly printable characters in
Windows-1252, and it is the use of these characters that causes this
error.
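
To make that difference concrete, here is a small sketch in Python 3
(my choice of language for illustration, nothing the validator itself
uses).  It decodes the single byte 0x93 both ways; the helper name
compare_decodings is made up for this example.

```python
import unicodedata

def compare_decodings(byte_value):
    """Decode one byte as ISO-8859-1 and as Windows-1252 (hypothetical helper)."""
    raw = bytes([byte_value])
    return raw.decode("iso-8859-1"), raw.decode("cp1252")

# 0x93 (decimal 147) lies in the 128-159 range where the two encodings differ.
latin1_char, cp1252_char = compare_decodings(0x93)

print(unicodedata.category(latin1_char))  # Cc, i.e. a control character
print(unicodedata.name(cp1252_char))      # LEFT DOUBLE QUOTATION MARK
```

A file saved as Windows-1252 with curly quotes but declared as
ISO-8859-1 therefore contains control characters as far as the
validator is concerned.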

Although I was unable to find any mappings for GB2312, and thus unable
to look up what character 30 is, it is likely to be a control character
that is used as a printable character in whatever encoding your editor
is actually using.  You'll need to check your editor's documentation
and/or settings to determine exactly which encoding that is.
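
If it helps to locate the offending byte, the following is a minimal
sketch in Python 3.  It assumes that "character number 30" refers to a
raw byte with decimal value 30 in the saved file, which I have not
confirmed against your page.  The function name find_control_bytes and
the sample data are made up for illustration, and columns are counted
in bytes, so they may not match the validator's column numbers.

```python
def find_control_bytes(data):
    """Yield (line, column, byte_value) for control bytes below 32,
    skipping tab, line feed and carriage return (hypothetical helper)."""
    allowed = {0x09, 0x0A, 0x0D}
    line, column = 1, 0
    for value in data:
        if value == 0x0A:
            # A line feed starts a new line and resets the column counter.
            line += 1
            column = 0
            continue
        column += 1
        if value < 0x20 and value not in allowed:
            yield line, column, value

# Made-up sample: byte 0x1E (decimal 30) hidden in the second line.
sample = b"<p>ok</p>\n<p>bad\x1ebyte</p>\n"
print(list(find_control_bytes(sample)))  # [(2, 7, 30)]
```

To run it on a real page, read the file in binary mode, for example
open("page.html", "rb").read(), so that no decoding happens before the
scan.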

It appears to be this character [1]: 研 (U+7814), in this section on
line 133:
   日本一家民间研究...
that is causing the problem.  I'm guessing that whatever character
encoding your editor is actually using uses the code point 30 for that
character, and it just happens that browsers are smart enough to
determine what you actually want in this case.  If the character does
not exist in the GB2312 character repertoire, then replace it with
either the hexadecimal or the decimal character reference: &#x7814; or
&#30740;.  If it does exist in the GB2312 character repertoire, then
either fix your editor so that it saves your file correctly, or change
the declared character encoding to match whatever your editor is using.
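
Both checks can be sketched in a few lines of Python 3.  This assumes
that Python's built-in "gb2312" codec is an acceptable stand-in for the
GB2312 repertoire; the function names char_refs and in_repertoire are
made up for this example.

```python
def char_refs(ch):
    """Return the hexadecimal and decimal character references for ch."""
    code_point = ord(ch)
    return f"&#x{code_point:X};", f"&#{code_point};"

def in_repertoire(ch, encoding="gb2312"):
    """Return True if ch can be encoded in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

ch = "\u7814"  # the character from line 133
print(char_refs(ch))      # ('&#x7814;', '&#30740;')
print(in_repertoire(ch))  # True means the codec can represent it
```

If in_repertoire returns False for a character, the character reference
is the way to go; otherwise the problem is more likely in how the file
is being saved.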

[1] 
http://software.hixie.ch/utilities/cgi/unicode-decoder/character-identifier?characters=%E7%A0%94

-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/    Rediscover the Web
http://SpreadFirefox.com/   Igniting the Web
Received on Friday, 17 December 2004 13:08:26 GMT
