- From: Tex Texin <tex@xencraft.com>
- Date: Wed, 24 Aug 2005 04:58:03 -0700
- To: Frank Yung-Fong Tang <franktang@gmail.com>
- CC: Jony Rosenne <rosennej@qsm.co.il>, www-international@w3.org
I was going to make more or less the same comment, which is that conversion from legacy encodings to Unicode is a difficult but necessary subject. It is large, so it should be a separate FAQ or FAQs, and it should cover many encodings, not just bidi.

Any minute now, Richard is going to pipe up suggesting Jony submit a FAQ for Hebrew and Frank one for double-byte encoding conversions, so I'll preempt him and suggest that as well. ;-)

Although we could use a treatise on these issues, I wonder if it would be better to identify libraries or tools that do the job right and give users appropriate choices. I muck around with iconv, ICU, Perl, etc., and it is very hard to know which tools will do the entire job correctly, and which do the minimum or are several versions behind. For example, a converter written for Unicode 2.0 would not take advantage of the characters added in Unicode 4.x. It is correct in some sense and incorrect in others. Also, a pure encoding converter would not take into account the needs of the Web, and perhaps issues of conversion to bidi markup. And which tools offer a choice when it comes to converting backslash to yen, won, etc. when it is used as a currency sign?

Many users are confused about which conversions to use, e.g. when to use Windows-1252 instead of ISO 8859-1, or when to use Big5-HKSCS instead of Big5, since data is often mislabeled.

I think the tools view or roadmap may be more important than the character encoding details. But yes, it is a topic definitely needing expansion.

--
-------------------------------------------------------------
Tex Texin                      cell: +1 781 789 1898
mailto:Tex@XenCraft.com
Xen Master                     http://www.i18nGuy.com
XenCraft                       http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
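P.S. To make the mislabeling point concrete, here is a minimal Python sketch (Python and the sample bytes are my own illustration, not output from any of the tools above). The same bytes decode cleanly as Windows-1252 but come out as invisible C1 control characters as ISO 8859-1, which is exactly why mislabeled data causes so much confusion:

    # Bytes 0x80-0x9F are unused C1 control codes in ISO 8859-1, but
    # printable punctuation (curly quotes, the euro sign, ...) in
    # Windows-1252. Same bytes, two plausible labels:
    data = b"\x93mislabeled\x94 data costs \x80 5"

    print(data.decode("windows-1252"))       # “mislabeled” data costs € 5
    print(repr(data.decode("iso-8859-1")))   # '\x93mislabeled\x94 data costs \x80 5'

    # The backslash/yen question is the same kind of converter choice:
    # in Japanese code pages byte 0x5C traditionally displays as a yen
    # sign, so a converter has to pick between U+005C and U+00A5.
    print(repr(b"\x5c".decode("shift_jis")))  # Python's codec picks '\\'

A tool that silently makes one of these choices for you is "correct" by some table and wrong for somebody's data, which is why a roadmap of what each tool actually does would help.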
Received on Wednesday, 24 August 2005 11:58:26 UTC