Leif Halvard Silli wrote:
> Andrew Cunningham On 09-10-14 03.53:
>
> The reason, as much as I have picked up, is about market shares. And
> the "poster child" here is Windows-1252.

I realise that, but if market share is the issue, then the reality is
that Microsoft is setting the trends here, having the lion's share of
the market in terms of OS. And if you look at Microsoft policy, all new
languages, if not encompassed by an existing code page, are ONLY
supported via Unicode. It's been said often enough, in enough forums,
over the years.

>> 3) declare encoding as x-user-defined, e.g. http://www.anandabazar.com/
>>
>> although at least in IE (English UI) x-user-defined is parsed as
>> Windows-1252, so in that version of the browser declaring
>> x-user-defined was effectively the same as declaring iso-8859-1 or
>> windows-1252.
>>
>> Which is why a lot of legacy content in some SE Asian scripts was
>> always delivered as images or PDF files, rather than as text in HTML
>> documents.
>
> Which are served just as well as UTF-8?
>
>> Browsers assumed a win-1252 fall back so it was impossible to markup
>> up content in some languages using legacy content. The Karen
>> languages tended to fall into this category, and content is still
>> delivered this way by key websites in that language, although
>> bloggers are migrating to using pseudo-Unicode font solutions.
>
> What do you mean by "pseudo-Unicode"?

Pseudo-Unicode is the practice of remapping glyph-based 8-bit legacy
encodings to Unicode fonts.

In terms of the Myanmar script, for Burmese etc., this means mapping
some glyphs to their actual Unicode codepoints and assigning other
glyphs to codepoints in the same block that are unused by the language
in question, or to the PUA, with glyphs accessed directly by codepoint.

Unicode uses a character-based model. Pseudo-Unicode uses a glyph-based
model that in many instances reassigns glyphs to codepoints required by
other languages using the same script.
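The remapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glyph table, the byte values and the `used_by_language` set are all hypothetical and are not taken from any real font. Only the block ranges (Myanmar U+1000 to U+109F, BMP Private Use Area U+E000 to U+F8FF) are real.

```python
# Real Unicode ranges.
MYANMAR_BLOCK = range(0x1000, 0x10A0)  # U+1000..U+109F
PUA = range(0xE000, 0xF900)            # U+E000..U+F8FF

# HYPOTHETICAL glyph table for an imaginary 8-bit legacy font:
# legacy byte -> codepoint chosen by the pseudo-Unicode font.
LEGACY_GLYPH_TO_CODEPOINT = {
    0x41: 0x1000,  # glyph placed on a codepoint the language really uses
    0x42: 0x1022,  # glyph parked on a same-block codepoint the language does not use
    0x43: 0xE000,  # glyph parked in the Private Use Area
}


def remap(legacy_bytes: bytes) -> str:
    """Turn legacy glyph codes into a pseudo-Unicode string, one glyph per codepoint."""
    return "".join(chr(LEGACY_GLYPH_TO_CODEPOINT[b]) for b in legacy_bytes)


def classify(codepoint: int, used_by_language: set) -> str:
    """Say how a codepoint is being used from the point of view of one language."""
    if codepoint in PUA:
        return "pua"
    if codepoint in MYANMAR_BLOCK:
        # Inside the script's block: either a character of this language, or a
        # slot borrowed from another language that shares the script.
        return "character" if codepoint in used_by_language else "borrowed"
    return "other"


text = remap(b"ABC")
# Assumed (hypothetical) repertoire of the language the font targets.
labels = [classify(ord(ch), {0x1000}) for ch in text]
print(labels)
```

A Unicode-capable font would interpret the "borrowed" codepoint as the character it is actually assigned to, which is why data produced this way cannot be read with standard fonts.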
For instance, with Burmese, the majority of online content uses a
pseudo-Unicode font that reuses codepoints required for Mon, S'gaw
Karen, Shan and other languages.

Pseudo-Unicode data cannot be correctly displayed or read with
Unicode-capable fonts, either the Unicode 4.1/5.0 version fonts or the
Unicode 5.1+ fonts.

At the moment pseudo-Unicode is more common for Burmese web content
than Unicode. And in some projects it has led to splintering, for
example the Burmese Wikipedia project, which uses Unicode 5.1, vs a
splinter group that created a new wiki using pseudo-Unicode.

It's a political issue in the Burmese web development and IT
communities.

> Forgive me for being occupied with those languages which are already
> supported. Here is some Mozilla critic:

Nothing to forgive. I spent many, many years myself concerned about
those languages, but there are many languages whose needs are
forgotten by developers and specification writers.

Andrew

--
Andrew Cunningham
Senior Manager, Research and Development
Vicnet
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000

Ph: +61-3-8664-7430
Fax: +61-3-9639-2175

Email: andrewc@vicnet.net.au
Alt email: lang.support@gmail.com

http://home.vicnet.net.au/~andrewc/
http://www.openroad.net.au
http://www.vicnet.net.au
http://www.slv.vic.gov.au

Received on Wednesday, 14 October 2009 04:35:04 GMT