- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Wed, 24 Sep 2003 22:23:39 +0100 (BST)
- To: www-html@w3.org
> would be nice to see this even for languages such as Chinese: simplified and

I don't see how your points relate. Taking them in reverse order.

> problem I have seen is when using languages such as French with Chinese
> where the Unicode characters interfere with each other like letters in
> French displaying as Chinese characters.

In my experience, this only happens when you try to view an invalid
European document in a browser configured to compensate for invalid
Chinese documents. You are unlikely to have any more success in
convincing authors to correctly use new technology than you currently
have in getting them to correctly use old technology.

All valid documents in modern forms of HTML specify the transfer
character set in real or meta HTTP headers. Your problem arises when you
make the browser think that documents that have no character set are
really in GB2312 and you encounter an ISO 8859/1 or Windows 1252
document that doesn't specify a character set. (The correct default for
slightly older versions was ISO 8859/1, but there is now no default,
presumably because a lot of non-Latin countries treated the default as
being their favourite character set, not the specified one, and even
earlier browsers were not character set aware and passed codes through
to their font engines, untranslated.)

The canonical document is in ISO 10646, so does not have an ambiguity.

> would be nice to see this even for languages such as Chinese: simplified and
> traditional as it has been an issue using both languages on one document,

The simplified/traditional split is a rather complicated issue. Although
they are conventionally language tagged as though they were different
regional dialects, they are more like different fonts. Unicode adds to
the confusion by using different code points for the different writing
styles of the same logical character (a few characters don't have 1:1
mappings), but shares the code points for characters with the same
structure.

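A minimal Python sketch of the code point behaviour just described. The example characters and the helper name code_points are chosen here purely for illustration; they are not taken from the original discussion.

```python
def code_points(*chars):
    """Return the U+XXXX label for each character given."""
    return tuple(f"U+{ord(ch):04X}" for ch in chars)


# Different writing styles of the same logical character ("country"):
# the simplified and traditional forms have separate code points.
country = code_points("\u56fd", "\u570b")
print(country)  # ('U+56FD', 'U+570B')

# A character with the same structure in both styles ("person"):
# there is only one code point, shared by both.
person = code_points("\u4eba")
print(person)  # ('U+4EBA',)

# A case without a 1:1 mapping: one simplified form corresponds to
# two distinct traditional characters.
no_one_to_one = code_points("\u53d1", "\u767c", "\u9aee")
print(no_one_to_one)  # ('U+53D1', 'U+767C', 'U+9AEE')
```

Because the shared characters carry no style information of their own, telling simplified from traditional text by code point alone is unreliable, which is why language tagging matters.
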
Traditional characters are sometimes used in business signs in the PRC
(I can think of a case in Shanghai) in a similar way to the use of
gothic fonts in England, and my understanding is that formal calligraphy
is done strictly with traditional characters, even though zh-cn is used
as a computer synonym for simplified ones.

Note that, in a properly language tagged document - using the
zh-cn/zh-tw convention - you can use CSS2 to select an appropriate font.
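
The "French letters displayed as Chinese characters" symptom discussed earlier in this message can be reproduced outside a browser. The following is only a sketch under stated assumptions: the sample word and the helper name decode_as are hypothetical, and a real browser's fallback logic is more involved than a single forced decode.

```python
def decode_as(raw: bytes, charset: str) -> str:
    """Decode bytes as if the document had declared the given charset."""
    return raw.decode(charset, errors="replace")


# A French word with two adjacent accented letters ("creee" with accents),
# stored as ISO 8859-1 with no character set declared anywhere.
french = "cr\u00e9\u00e9e"
raw = french.encode("iso-8859-1")

# Decoded with the charset the author actually used: the text survives.
as_latin1 = decode_as(raw, "iso-8859-1")

# Decoded as GB2312, as a browser told to assume Chinese would do: the
# two high bytes are read as one double-byte Chinese character.
as_gb2312 = decode_as(raw, "gb2312")

print(repr(as_latin1))
print(repr(as_gb2312))
```

The bytes are identical in both cases; only the assumed character set differs, which is why declaring it in the HTTP header or meta element removes the problem.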
Received on Wednesday, 24 September 2003 18:12:37 UTC