[Bug 1920] New: Validator fails because of symbol not found in windows-1251 character set

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1920

           Summary: Validator fails because of symbol not found in
                    windows-1251 character set
           Product: Validator
           Version: 0.7.0
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: check
        AssignedTo: link@pobox.com
        ReportedBy: mcsi@mcsi.pp.ru
         QAContact: www-validator-cvs@w3.org


http://validator.w3.org fails on this HTML:

====
<?xml version="1.0" encoding="windows-1251"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru" lang="ru">
<head><title>test</title>
<body>

&#1048;

</body>
</html>
====

with this error:

====
Result:  	 Failed validation,
File:	upload://Form Submission
Encoding:	windows-1251
Doctype:	

Sorry, I am unable to validate this document because on line 7 it contained one
or more bytes that I cannot interpret as windows-1251 (in other words, the bytes
found are not valid values in the specified Character Encoding). Please check
both the content of the file and the character encoding indication.
====

However, the symbol on line 7 is the Russian capital letter I (И). This is a
perfectly valid, common character. Maybe your charset definition is wrong? In
windows-1251 this symbol has code 200 (decimal). The table at
http://dll.botik.ru/educ/clerk/Library/Method/kod-tabl.ru.html gives a clue
what this symbol looks like: the image under the CP1251 heading shows the
Russian capital I above code 200.
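
The code point claimed above can be checked independently of the validator. The
following is a minimal sketch, assuming Python 3 and assuming its built-in
"windows-1251" codec (an alias for cp1251) matches the mapping the validator is
expected to use. The variable names are only illustrative.

```python
import unicodedata

# Byte value 200 (0xC8), the code reported above for the character on line 7.
raw = bytes([200])

# Decode the single byte with Python's built-in windows-1251 (cp1251) codec.
char = raw.decode("windows-1251")

# Show the character, its Unicode name, and its decimal code point, which
# should correspond to the numeric reference used in the test document.
print(char, unicodedata.name(char), ord(char))

# Round trip: encoding the character again should give back the same byte.
assert char.encode("windows-1251") == raw
```

If the byte decodes cleanly here, the byte itself is valid windows-1251 and the
problem is more likely in the validator's conversion table than in the document.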

BTW, most other symbols are OK; however, I haven't checked them all.

Received on Wednesday, 31 August 2005 11:28:27 UTC