Re: gb2312 support broken?

* Martin Duerst wrote:
>It looks like the document isn't correct.

>The document contains two-byte sequences where the second
>byte is in the 20-7E range. This isn't GB2312 as defined
>above. I don't know what it is.

What about GBK (aka windows-936)? Windows maps GB2312, iso-ir-58 and GBK
to the same code page (936), so it uses the same decoder for all three
and gives them the canonical name "Chinese simplified (GB2312)". This
would probably explain why

http://www.fineart.com.tw/cn/news/news.asp
http://www.cadstudy.net/
http://www.solar-energy.com.tw/index-gb.htm
http://lists.w3.org/Archives/Public/www-i18n-comments/2002Jan/0079.html
http://lists.w3.org/Archives/Public/wai-tech-comments/2002May/0306.html
http://lists.w3.org/Archives/Public/smil-editors/2002JulSep/0846.html
http://lists.w3.org/Archives/Public/wai-tech-comments/2002Jun/0564.html
http://lists.w3.org/Archives/Public/wai-tech-comments/2002Jun/0585.html
http://lists.w3.org/Archives/Public/wai-tech-comments/2002May/0329.html
http://lists.w3.org/Archives/Public/www-rdf-validator/2002Mar/1120.html
http://lists.w3.org/Archives/Public/wai-tech-comments/2002Apr/0552.html
...

and many other documents marked as GB2312 are invalid GB2312.
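
A rough way to test this guess on a given document is to look at the
byte ranges alone. The sketch below is only that: it assumes GB2312 in
its EUC-CN form (both bytes of a pair in A1-FE) and GBK with a lead byte
in 81-FE and a trail byte in 40-7E or 80-FE. The function name
classify_gb is made up for illustration, and the check is structural
only; it does not verify that a byte pair is an assigned character.

```python
def classify_gb(data: bytes) -> str:
    """Classify raw bytes by byte ranges alone (hypothetical helper).

    Returns "ascii", "gb2312", "gbk-only" or "invalid".
    "gbk-only" means at least one pair is outside the EUC-CN ranges
    but still inside the GBK ranges.
    """
    result = "ascii"
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            # Single-byte ASCII character.
            i += 1
            continue
        if i + 1 >= n:
            # A lead byte with nothing after it.
            return "invalid"
        trail = data[i + 1]
        if 0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE:
            # Fits the GB2312 (EUC-CN) two-byte structure.
            if result == "ascii":
                result = "gb2312"
        elif 0x81 <= lead <= 0xFE and (
            0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE
        ):
            # Fits GBK only, e.g. a trail byte below 0x80.
            result = "gbk-only"
        else:
            return "invalid"
        i += 2
    return result


if __name__ == "__main__":
    print(classify_gb(b"\xd6\xd0\xce\xc4"))  # pairs in the EUC-CN ranges
    print(classify_gb(b"\x81\x40"))          # pair in the GBK ranges only
```

If a page labelled GB2312 comes out as "gbk-only", that would be
consistent with it having been written with a code page 936 encoder.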

Received on Tuesday, 29 April 2003 15:31:17 UTC